I’m Sil. I’m a PhD student working with David Mimno and Matthew Wilkens at Cornell University in the Culture and Computation Lab. I'm also an AI Research Scientist at Epiq AI Labs. Here's my CV.

What do I study? I look for where humans and language models disagree on the world. My published research draws on narrative theory to show how LLMs (fail to) capture cultural concepts in areas like jurisprudence, journalism, and storytelling. This helps us improve them.

What else? I've consulted for news organizations like NYT and AP on language models. I've also presented at the Nieman Foundation, CS50 at Harvard, and the Computer History Museum.

Want to connect? Bluesky / email / LinkedIn

02/2026 "Too Long, Didn't Model" was accepted by SIGHUM 2026
01/2026 "NarraBench" was accepted by EACL 2026
10/2025 Organizing the next NLP4DH, co-located with ACL 2026
07/2025 Hired as an AI Research Scientist at Epiq AI Labs
07/2025 "The Zero Body Problem" was accepted by COLM 2025
05/2025 Began a summer internship at Epiq AI Labs
04/2025 Attended NAACL 2025 and presented "A City of Millions" at NLP4DH
04/2025 NSERC awarded me a 3-year scholarship to pursue research on extracting knowledge from neural networks
01/2025 Spoke on structuring data for the digital humanities at Concordia University

Simulation Papers

Blind Judgement: Agent-Based Supreme Court Modelling with GPT. Simulating the Supreme Court with chatbots.
The COVID That Wasn’t: Counterfactual Journalism using GPT. Estimating priors on COVID-19 with GPT-2.

Interpretability Papers

The Zero Body Problem: Probing LLM Use of Sensory Language. Generated stories don't reliably evoke the senses.
Lost in Space: Finding the Right Tokens for Structured Output. Structured output harms model accuracy.
Detecting Mode Collapse in Language Models via Narration. RLHF causes GPT to write more generic stories.
Mrs. Dalloway Said She Would Segment the Chapters Herself. Finding chapter borders in story sentiment.

Narrative Papers

Software

DocPlot. Private semantic search in the browser.
COVID-17. Showcases a counterfactual COVID narrative.
semantic-space. Generates thesauruses from latent space.
adsb-utils. Maps ADS-B packets with ncurses.
feature-space-explorer. Plots sentence embeddings in 3D.

Courses

Generative AI For Journalists. 350 students.
How to use ChatGPT and other generative AI tools in your newsrooms. 10k+ students and translated into Spanish & Portuguese.

Workshops

NLP4DH 2025. I served as session chair.
Workshop on AI & DH. Concordia University.
Run LLMs On Your Laptop. Media Party.
Workshop and Mixer on ChatGPT. Brown Institute for Media Innovation.