Presentations: FDS Workshop: AI for Social Science Research Methods

Keynote

Day 2 · May 22

Nicholas Christakis

Yale University

Social Artificial Intelligence

May 22 · 11:00 AM – 12:00 PM

About

Nicholas A. Christakis, MD, PhD, MPH, is the Sterling Professor of Social and Natural Science at Yale University. His work is in the fields of network science and biosocial science. He directs the Human Nature Lab and is the Co-Director of the Yale Institute for Network Science. He was elected to the National Academy of Medicine in 2006; the American Association for the Advancement of Science in 2010; the American Academy of Arts and Sciences in 2017; and the National Academy of Sciences in 2024.

Abstract

The impact of artificial intelligence (AI) transcends the simple case of human–machine interactions and extends to human–human interactions in the presence of machines. Here, I explore such "hybrid systems" of humans and AI. I show how the careful yet simple programming of AI agents can enhance the performance of human groups, making people within such groups better able to cooperate, coordinate, innovate, and communicate, ultimately contributing to their superior performance. On the other hand, both simple and complex forms of AI (such as large language models) can also do the opposite, harming groups of people and our society as a whole. Our experiments show how AI agents can affect social processes and human performance in settings as diverse as people working together online or coordinating their movement on roadways. Our work, in short, does not involve the development of super-smart AI to replace human cognition, but rather "dumb AI" to supplement human interaction. These findings reveal what the disruptive introduction of AI into our lives means for the future of human social behavior. And they suggest ways to design AI—as a type of "social catalyst"—so as to make sure it supports a utopian rather than dystopian future.

Oral Presentations

12 presenters · Days 1 & 2

Siwei Cheng, New York University
Who Talks to Whom? Using Images to Study Segregation in Informal Interactions at Academic Conferences

About

Siwei Cheng is an associate professor of sociology at New York University. She received her Ph.D. in Sociology and Public Policy (2015) and M.A. in Statistics (2012) from the University of Michigan, and her B.A. in Economics and Mathematical Statistics from Peking University (2009). Prior to joining the faculty of NYU, she was an assistant professor of sociology at UCLA (2015-2016). Cheng's research encompasses various areas of inequality, mobility, labor markets, networks, and quantitative methodology. Her current work examines the interconnectedness of jobs, occupations, and skills, leverages image data to study social dynamics in public spaces, and traces the unequal pathways to publication using two decades of digitized conference programs.

Abstract

Informal workplace networks (who interacts with whom, who shares information with whom, who mentors whom, who influences whom) serve as the fundamental mechanisms through which social closure operates, opportunities are hoarded, and group-based socioeconomic inequalities are reproduced. Yet real-time, large-sample data on such interactions are notoriously scarce, largely because private workspaces are, well, private. Occasionally, however, informal interpersonal connections in the workplace become visible in public settings, at least among one group of professionals: academics! Academic conferences are key sites for building the kinds of professional networks that shape scholarly careers. It is hardly surprising that these informal interactions are often deeply segregated along lines such as race, gender, and seniority. In this study, we introduce large-scale image data and computer vision models as a novel approach to understanding informal interaction segregation at academic conferences. First, we collect over 7,000 photographs of informal interaction groups at thirteen academic conferences organized by mainstream academic disciplines in the United States. Second, we identify interaction groups and code individual-level characteristics (gender, race, and age) in the images. The image annotation was carried out by both human coders and a locally fine-tuned open-source vision–language model. Finally, we compute various segregation metrics based on these group dynamics both within and across different conferences. Our findings challenge the notion that conferences are spaces where scholars simply "mingle by chance." Across conferences, informal interactions are strongly segregated along racial, gender, and age lines. The level of racial segregation in these face-to-face encounters is substantially higher than that observed in more stable and institutionalized settings such as residential neighborhoods, workplaces, or schools. This study also highlights the promise of harnessing new forms of data and methods to better understand "weak ties" in real-life settings.
Sharad Goel, Harvard University
Mitigating Label Bias With Rubric Embeddings

About

Sharad Goel is a Professor of Public Policy at Harvard Kennedy School. He looks at public policy through the lens of computer science, bringing a computational perspective to a diverse range of contemporary social and political issues, including education, the delivery of public benefits, and the equitable design of algorithms. He is the founder and director of the Harvard Computational Policy Lab. Prior to joining Harvard, Sharad was on the faculty at Stanford University, with appointments in management science & engineering, computer science, sociology, and the law school.

Abstract

Statistical decision algorithms are increasingly deployed in domains where ground-truth labels are hard to obtain, such as hiring, university admissions, and content moderation. In these settings, models are typically trained on historical human evaluations—for example, using past hiring decisions as a proxy for true applicant quality. However, if past evaluations unjustly favor certain groups, models trained on these labels may inherit those biases. To address this problem, we propose basing predictions on rubric embeddings, a representation framework that replaces standard black-box embeddings with features derived from expert-defined criteria that align with the underlying construct of interest. By anchoring predictions to semantically meaningful dimensions, this approach guards against biased proxy signals. We provide both theoretical and empirical evidence that rubric embeddings mitigate label bias under plausible conditions. Empirically, we evaluate our method on a novel dataset of applications to a large master's program. We find that models trained on rubric embeddings reduce group disparities while improving measures of cohort quality. Our results suggest that basing predictions on interpretable, domain-grounded representations offers a practical approach to learning in the presence of biased labels.
Valentina Gonzalez-Rostani, University of Southern California
Tracing AI Assistance and AI Agents in Survey Research

About

Valentina Gonzalez-Rostani is an Assistant Professor in the Department of Political Science and International Relations at the University of Southern California. Her research examines how technological change, especially automation and artificial intelligence, reshapes political attitudes, parties, and democratic institutions across advanced economies and the Global South. Using causal inference and computational text analysis, she studies populism, political behavior, inequality, international trade, and the measurement of political discourse. Before joining USC, she was a Postdoctoral Research Associate at Princeton University. Her work has appeared in The Journal of Politics, Political Science Research and Methods, Ecological Economics, and Legislative Studies Quarterly.

Abstract

Generative AI creates a measurement problem for survey research. When respondents can outsource summarization, explanation, or fact retrieval, open-ended answers may no longer reflect their own reasoning. We introduce an auditable toolkit that combines response-process paradata with prompt-specific semantic benchmarks to detect AI use in open-ended responses. We validate the approach in a randomized survey experiment in which participants complete a summary task under either blocked AI access or observed access to an embedded AI tool, and by administering the same instrument to a synthetic AI agent. AI-assisted and automated responses leave distinct behavioral and textual traces: they show reduced drafting and revision, greater similarity to AI benchmark output, and lower distinctiveness relative to peer responses. These findings show that, in the age of generative AI, authorship must be measured rather than assumed.
Kosuke Imai, Harvard University
GenAI Powered Inference

About

Kosuke Imai is a Professor in the Departments of Government and Statistics at Harvard University and an affiliate of the Institute for Quantitative Social Science. His research focuses on the development of statistical methods and machine learning algorithms for social science applications, with particular expertise in causal inference, computational social science, and survey methodology. He is the author of Quantitative Social Science: An Introduction and leads the Algorithm-Assisted Redistricting Methodology (ALARM) Project. Imai has served as President of the Society for Political Methodology.

Abstract

We introduce GenAI-Powered Inference (GPI), a statistical framework for both causal and predictive inference using unstructured data, including text, image, and video. GPI leverages open-source Generative AI models—such as large language models and diffusion models—not only to generate unstructured data at scale but also to extract low-dimensional representations that capture their underlying structure. Applying machine learning to these representations, GPI enables estimation of causal and predictive effects while quantifying associated estimation uncertainty. Unlike existing approaches to representation learning, GPI does not require fine-tuning of generative models, making it computationally efficient and broadly accessible.
Yingdan Lu, Northwestern University
Mitigating Confounding Bias in Observational Visual Media Research: An LLM-Informed, Machine Learning-Based Inference Framework

About

Yingdan Lu (Ph.D., Stanford University) is an Assistant Professor in the Department of Communication Studies at Northwestern University. She is the director of the Computational Media and Politics Lab, and the co-director of the Computational Multimodal Communication Lab. Her research focuses on digital technology, political communication, and information manipulation. Her work has appeared in PNAS, American Journal of Political Science, Political Communication, and New Media & Society.

Abstract

Observational studies of visual media effects are vulnerable to confounding bias due to the complexity of visual data, the difficulty in variable measurement, and the uncertainty of selection. We propose an inference framework that leverages multimodal LLMs and machine learning-based inference techniques to (1) identify potential confounders through reasoned visual assessment, (2) measure semantically complex features that are difficult to capture with conventional methods, and (3) apply double-Lasso for covariate selection and hierarchical OLS regression for effect estimation. We demonstrate this framework by revisiting the relationship between political personalization and social media engagement using 59,020 Instagram posts from U.S. politicians.
Matthew Salganik, Princeton University
Breaking out of the X matrix: Book of life and LLMs in social research

About

Matthew Salganik is the Alexander Stewart 1886 Professor of Sociology at Princeton University. He is also affiliated with several of Princeton's interdisciplinary research centers including the Princeton AI Lab, Princeton Precision Health, and Center for Information Technology Policy. He is the author of the award-winning book Bit by Bit: Social Research in the Digital Age, which has been translated into 5 languages.

Abstract

A dominant approach in quantitative social research is to represent data as a rectangle of numbers, with one row for each person and one column for each variable describing them. This data representation is natural for survey data and fits well with existing methods, such as the GLM and supervised machine learning. In this talk, we begin to explore an alternative pipeline that represents data about people as a text "book of life" and then analyzes the data using LLMs. This approach creates affordances that have no obvious analogue in existing approaches, and may be especially valuable for life course data with complex temporal, network, and hierarchical structure.

We study the book of life + LLM pipeline in two distinct settings focused on predicting life outcomes: one complex (the Dutch population registry) and one simple (the US American Community Survey). We compare approaches empirically and try to isolate the source of performance differences. The talk concludes with a brief discussion of the shared infrastructure—community-determined "model organisms" and open-source software—that might help us discover the value (or lack thereof) of the new approach more quickly and reliably.

Joint work with Sarah Pedersen, Mattie Niznik, Stephan Rabanser, Varun Satish, Flavio Hafner, Sayash Kapoor, Malte Luken, Lydia T. Liu, Tiffany Liu, Juan C. Perdomo, Benedikt Stroebl, Keyon Vafa, and Mark Verhagen.
Austin van Loon, MIT Sloan
Using LLMs as a source of data in experiments

About

Austin van Loon is the Class of 1956 Career Development Assistant Professor and an Assistant Professor of Work and Organization Studies at the MIT Sloan School of Management. He studies the various intersections of technology and communication, mass political polarization in the US, and how to use AI to do better social science.

Abstract

Large language models (LLMs) have prompted proposals to replace human subjects in social science experiments with simulated responses. Empirical evaluations suggest that this practice, often called silicon sampling, can sometimes approximate human behavior but is unreliable. We delineate where this approach may still provide value and where it may not, but primarily study an alternative approach: one in which model-based predictions are used not as substitutes for human data, but as auxiliary measurements within randomized experiments. We formalize the inference of causal estimands from mixed-subjects randomized controlled trials, in which outcomes are observed for a subset of units while predictions are available for all units. Under transparent design conditions, we derive a family of estimators that remain unbiased for the average treatment effect in finite samples while exploiting predictions to reduce variance. We characterize when prediction-powered, calibration-based, arm-specifically tuned, and difference-in-predictions estimators improve precision, and we provide a software package that operationalizes these results and helps researchers jointly select estimators and allocate budgets between human data collection and prediction generation. Together, our results show how generative artificial intelligence can improve experimental social science without compromising scientific validity.
Alexander Volfovsky, Duke University
Platforms, Prompts, and Proof: Reproducible Social Science in Human-AI Environments

About

Alexander Volfovsky is an Associate Professor of Statistical Science at Duke University, where he also serves as co-director of the Polarization Lab. His research lies at the intersection of causal inference, network analysis, and machine learning, with applications to understanding social behavior, online interactions, and decision-making in complex systems. His recent projects focus on human–AI interaction, trust calibration, and the design of artificial agents that foster constructive discourse.

Abstract

Social media research increasingly studies environments rather than static stimuli: feeds evolve, users interact, and generative AI can now populate platforms with synthetic actors that respond dynamically. We describe the Social Media Accelerator as research infrastructure for this setting: a controlled platform for randomized social-media experiments with embedded surveys, synthetic users, behavioral logs, and auditable exposure records. The talk then considers how GenAI enters the scientific workflow itself, including literature synthesis, coding, measurement, drafting, and validation. Prompts, model versions, generated outputs, coding decisions, validation checks, and human revisions can be treated as part of the research record. The broader aim is to support accountable AI-assisted social science in a world where both social behavior and the work of studying it increasingly involve AI.
Hannah Waight, University of Oregon
State Media Control Influences Large Language Models

About

Hannah Waight is an Assistant Professor of Sociology at the University of Oregon and a former Postdoctoral Research Associate at CSMaP. Waight studies the politics of media and information and their implications for social organization. She has also worked on popular perceptions of inequality and has an ongoing project on how people in authoritarian regimes "see the state."

Authors

Hannah Waight, Eddie Yang, Yin Yuan, Solomon Messing, Margaret E. Roberts, Brandon M. Stewart, Joshua A. Tucker

Abstract

Millions of people around the world query large language models for information. While several studies have compellingly documented the persuasive potential of these models, there is limited evidence of who or what influences the models themselves, leading to a flurry of concerns about which companies and governments build and regulate the models. We show through six studies that government control of the media across the world already influences the output of large language models (LLMs) via their training data.

We use a cross-national audit to show that LLMs exhibit a stronger pro-government valence in the languages of countries with lower media freedom than in those with higher media freedom. This result is correlational, so to triangulate the specific mechanism by which state media control can influence LLMs, we develop a multi-part case study on China's media. We demonstrate that media scripted and curated by the Chinese state appears in large language model training datasets. To evaluate the plausible effect of this inclusion, we use an open-weight model to show that additional pretraining on Chinese state-coordinated media generates more positive answers to prompts about Chinese political institutions and leaders.

We link this phenomenon to commercial models through two audit studies demonstrating that prompting models in Chinese generates more positive responses about China's institutions and leaders than do the same queries in English. The combination of influence and persuasive potential across languages suggests the troubling conclusion that states and powerful institutions have increased strategic incentives to leverage media control in the hopes of shaping large language model output.
Yiqing Xu, Stanford University
Scaling Reproducibility: An AI-Assisted Workflow for Large-Scale Replications and Reanalysis

About

Yiqing Xu is an Assistant Professor in the Department of Political Science at Stanford University and a faculty affiliate of the Stanford Causal Science Center and the Center on China's Economy and Institutions. He received his B.A. in Economics from Fudan University, his M.A. in Economics from Peking University, and his Ph.D. in Political Science from the Massachusetts Institute of Technology. His primary research areas are causal inference and comparative politics. In recent years, his work has focused on developing and applying causal inference methods for panel data.

Abstract

Computational reproducibility is central to scientific credibility, yet verifying published results at scale remains costly. We develop an AI-assisted workflow for automated full-paper replication—retrieving materials, reconstructing environments, executing code, and matching outputs to point estimates reported in regression tables. We define a universe of all empirical and quantitative papers from the three top political science journals (2010–2025) and measure stated data availability using automated extraction. For a stratified sample of 384 studies, we apply the workflow to conduct full-paper replication, totaling 3,382 empirical models. We find that journal verification requirements, combined with data archiving mandates, drive reproducibility: the full-paper reproducibility rate rises significantly due to DA-RT, and conditional on accessible replication packages, the vast majority of papers are fully or largely reproducible. As a secondary application, we apply standardized IV diagnostics to 92 studies (215 specifications), illustrating how automated execution enables systematic reanalysis across heterogeneous empirical settings.
Eddie Yang, Purdue University
Data Annotation with Large Language Models: Lessons from a Large Empirical Evaluation

About

Eddie Yang is an Assistant Professor of Political Science and a faculty member in the Cornerstone Integrated Liberal Arts program at Purdue University. He received his Ph.D. in political science from the University of California San Diego. Yang studies the politics of innovation and technology. His research has been published in the Proceedings of the National Academy of Sciences and at the ACM Conference on Fairness, Accountability, and Transparency, among other outlets.

Abstract

Large Language Models (LLMs) are increasingly used in social science research to annotate unstructured data, often replacing research assistants and experts. Yet using these predicted annotations in downstream statistical analyses can yield biased estimates—a problem compounded by the black-box and stochastic nature of LLMs. This study evaluates the consequences of LLM annotation for empirical political science research. We conduct a systematic replication and reanalysis of 14 recently published papers from leading political science journals, re-annotating data originally coded by humans or supervised models with 15 different open-weight and proprietary LLMs. Analyzing hundreds of millions of annotations, we find that LLM annotations have low intercoder reliability with the original annotations and moderate reliability among the LLMs themselves. Smaller and reasoning models are particularly sensitive to minor variations in artifacts such as prompt design. As a result, downstream estimates derived from different sets of annotations show significant variation, altering conclusions in more than one-third of cases. Mitigation strategies, such as in-context learning and bias correction techniques, are useful but have limitations. Based on these findings, we propose best practices for using LLMs for annotation and provide an open-source R package, localLLM, to support their implementation.
Han Zhang, Brown University
How Should Experimenters Construct Image Stimuli? A Framework for Controlled Image Construction

About

Han Zhang is Young Family Assistant Professor of Sociology and International and Public Affairs at Brown University. His research interests include political sociology, social movements, and computational social science. His work has been published in Social Forces, Sociological Methodology, Sociological Methods & Research, Socio-Economic Review, Science Advances, Mobilization, and The China Quarterly, among others. He has received best paper awards from the American Sociological Association and the International Communication Association.

Abstract

Images are increasingly used as stimuli in social science experiments, but image construction introduces two design challenges that text vignettes avoid. First, the image may fail to depict the intended treatment attributes clearly (construct validity). Second, because images carry a complete scene, they inevitably include non-treatment features that may differ across conditions and confound the treatment comparison (internal validity). We distinguish uncontrolled construction, where non-treatment features are left to the source or the model, from controlled construction, where the researcher deliberately decides which features vary and which stay fixed. We propose two controlled approaches: decomposed AI generation, which builds stimuli from scratch by generating reusable components, and image editing, which modifies only the treatment-relevant elements of an existing image. We develop measures of construct validity and unconfoundedness and evaluate these approaches in two empirical settings. Study 1 compares natural sourcing, simple AI generation, and decomposed AI generation on a 216-condition factorial design. Decomposed AI generation achieves the highest construct validity and lowest confounding. Study 2 compares manual and AI image editing on a 75-condition conjoint extension of Butler and Tavits (2017). Manual editing achieves near-zero confounding but at substantially higher effort than AI editing.

Panel Discussions

Day 1 · May 21

AI & Open Science

May 21 · 11:00 AM – 12:00 PM

Moderated by Jeremy Freese, Stanford University

Jessica Hullman, Northwestern University

About

Jessica Hullman is Ginni Rometty Professor of Computer Science and a Fellow at the Institute for Policy Research at Northwestern University. Her research develops methods for appraising and combining human and artificial intelligence. Jessica's work has received multiple best paper awards at top conferences, as well as a Microsoft Faculty award and NSF CAREER, Medium, and Small awards as PI, among others.
Chenhao Tan, University of Chicago

About

Chenhao Tan is an Associate Professor of Computer Science and Data Science at the University of Chicago, and directs the Chicago Human+AI Lab. He earned his PhD in Computer Science from Cornell University and dual bachelor's degrees in computer science and economics from Tsinghua University. His research focuses on human-centered AI, communication & intelligence, and AI alignment. He has received a Sloan research fellowship, an NSF CAREER award, and research awards from Amazon, IBM, JP Morgan, Google, and Salesforce.
Yiqing Xu, Stanford University
Yian Yin, Cornell University

About

Yian Yin is an Assistant Professor of Information Science at Cornell University in the Bowers College of Computing and Information Science. His research applies and develops computational tools to understand how individual, social, and environmental processes promote or inhibit scientific progress and innovation. As a computational social scientist, he also uses science and innovation as a lens to examine broader social processes, from cultural production to media attention and human conflict. His work has appeared in Science, Nature, and Nature Human Behaviour, and has been featured in outlets including The Atlantic and MIT Technology Review. He received his Ph.D. in industrial engineering and management science from Northwestern University and was named to the Forbes 30 Under 30: Science list in 2023.

AI & Publishing

May 21 · 4:30 – 5:30 PM

Moderated by Emma Zang, Yale University

Weihua An, Emory University

About

Dr. Weihua An is Professor of Sociology and Professor of Data and Decision Sciences at Emory University, with associated appointments in the East Asian Studies Program, the Goizueta Business School, and the Rollins School of Public Health. He received his Ph.D. in Sociology and A.M. in Statistics from Harvard University, where he also held doctoral and postdoctoral fellowships at the Harvard Kennedy School. His research advances theories and methods for network analysis and causal inference, with applications to inequality and social policy, health, and organizations, and has appeared in journals including the Annual Review of Sociology, Social Forces, Social Networks, Sociological Methodology, and the Journal of Statistical Software. He is the author of several widely used statistical packages—including "fglsnet," "LARF," "IUPS," and "keyplayer" in R and "DIDMatch" in Stata—which together have been downloaded more than 150,000 times. He currently serves as Editor of Sociological Methodology, has served on the editorial boards of leading journals such as the American Sociological Review, and is a recipient of the Clifford Clogg Award from the American Sociological Association and the Faculty Teaching Award from Emory Sociology.
Adam Berinsky, MIT

About

Adam Berinsky is the Mitsui Professor of Political Science at MIT and serves as the director of the MIT Political Experiments Research Lab (PERL). He is also a Faculty Affiliate at the Institute for Data, Systems, and Society (IDSS). Berinsky received his PhD from the University of Michigan in 2000. He is the author of Political Rumors: Why We Accept Misinformation and How to Fight It (Princeton University Press, 2023), In Time of War: Understanding American Public Opinion from World War II to Iraq (University of Chicago Press, 2009), and Silent Voices: Public Opinion and Political Participation in America (Princeton University Press, 2004). He has published articles in many journals and is currently the co-editor of the Chicago Studies in American Politics book series at the University of Chicago Press. He is the recipient of multiple grants from the National Science Foundation and was a fellow at the Center for Advanced Study in the Behavioral Sciences. Berinsky was appointed a John Simon Guggenheim Memorial Foundation Fellow to study how political rumors spread and how they can be effectively debunked.
Donald Tomaskovic-Devey, University of Massachusetts Amherst

About

Donald Tomaskovic-Devey is Professor of Sociology at the University of Massachusetts Amherst, founding Director of the UMass Center for Employment Equity, and coordinator of the Comparative Organizational Inequality Network. He studies the processes that generate workplace inequality, with projects on the impact of financialization on U.S. income distribution, workplace desegregation and equal opportunity, network models of labor market structure, and relational inequality as a theoretical and empirical project. He currently serves as co-editor of the American Sociological Review. He received his Ph.D. in Sociology from Boston University in 1984 and previously taught at North Carolina State University for 17 years before joining UMass in 2005.
Johan Ugander, Yale University

About

Johan Ugander is Associate Professor of Statistics & Data Science at Yale University and serves as Deputy Director of the Yale Institute for Foundations of Data Science (FDS). His research develops algorithmic and statistical frameworks for analyzing social networks, social systems, and other large-scale social and behavioral data. Prior to Yale, he was on the faculty at Stanford for ten years (2015 to 2025) in the Department of Management Science & Engineering, where he received tenure in 2022. He received his Ph.D. in Applied Mathematics from Cornell University in 2014, advised by Jon Kleinberg. His awards include an NSF CAREER Award and an Army Research Office Young Investigator Award.

Lightning Round Presentations

8 presenters · Day 2 · May 22

Soubhik Barari, NORC at the University of Chicago
AI-Assisted Conversational Interviewing: Effects on Data Quality and Respondent Experience

About

Soubhik Barari is a Senior Research Methodologist at NORC at the University of Chicago where he leads a variety of client and research projects in public policy domains such as health communication, public health, online trust & safety, and U.S. politics. Soubhik also oversees the robust and responsible implementation of several initiatives at the intersection of AI and survey research. Soubhik is also an Adjunct Associate Professor of Political Science at Columbia University where he lectures on quantitative social science methods and political analytics.
David Broska · Stanford University
Social Science Simulator: A Tool for Simulating Experiments with LLMs

About

David Broska joined Stanford's Department of Sociology as a PhD student after receiving an MSc in Social Research Methods at the London School of Economics and a BA in Sociology, Politics, and Economics at Zeppelin University in Germany. His research interests include computational social science, inequality, and social policy.
Jacob Crainic · Yee Collins Research Group
Prompt Degrees of Freedom in LLM-Based Social Science: A Specification Curve Framework for Transparent Reporting

About

Jacob Crainic is a Senior Researcher at the Yee Collins Research Group. His research examines free expression, democratic erosion, and legislative polarization, drawing on computational methods where they serve the question. His work has appeared or is forthcoming in IC2S2, AIM-3D, and the Harvard Undergraduate Research Journal, with writing on free expression in the South Florida Sun Sentinel.
Jiaxin (Allyson) Cui · Duke University
Same Prompt, Different Outcomes: Evaluating the Reproducibility of Data Analysis by LLMs

About

Allyson Cui is a graduate student at Duke University, studying economics and computation. She holds a Bachelor of Science from the University of Toronto with a major in Economics and minors in Computer Science and Mathematics.
Junsol Kim · University of Chicago / Google
Reasoning Models Generate Societies of Thought

About

Junsol Kim is a Ph.D. candidate in Sociology at the University of Chicago, advised by James Evans, and a student researcher at Google. His research examines how AI and other social technologies reshape information and knowledge ecosystems, and how AI can be harnessed to augment social science methods. His work has appeared in Nature Communications, PNAS, ICLR, ICML, and EMNLP.
Joe Siev · University of Virginia Darden School of Business
The Cross-Context Design: An AI-Enabled Method for Testing Behavioral Hypotheses across Archival Datasets

About

Joe Siev is a postdoctoral research associate at the University of Virginia Darden School of Business and an incoming Assistant Professor of Marketing at the University of Alabama. He holds a PhD in Social Psychology from The Ohio State University. His research focuses on political consumerism—how consumers express political values through marketplace choices, and how brands' political expressions shape consumer behavior. He also builds research tools that leverage AI and automation.
Yuehong Cassandra Tai · Penn State University
From Annotation to Measurement: Traceable Workflows for LLM Inference

About

Yuehong Cassandra Tai is an Assistant Research Professor at Penn State's Center for Social Data Analytics. She develops evaluation frameworks and oversight pipelines for AI-assisted measurement in social science, including the C-R-A-F-T framework for generative AI evaluation and multi-agent, codebook-grounded annotation workflows. Her applied work studies political communication, misinformation, and democratic accountability. Her research has appeared in American Political Science Review, Political Communication, Nature Scientific Data, and Proceedings of the ACM Web Science Conference.
Charles Eesley · Stanford University
Four Paradigms for AI-Enabled Social Science Research

Presenting for Guankai Zhai

About

Chuck Eesley is a Professor and W. M. Keck Foundation Faculty Scholar in the Department of Management Science and Engineering at Stanford University. He studies and designs systems that enable high-quality entrepreneurship and innovation under uncertainty, including in emerging technologies and sustainability transitions. He is Faculty Co-Director of the Stanford Technology Ventures Program and a faculty affiliate at the Stanford Center for AI Safety.

Poster Presentations

30 presenters · 32 posters · Evening Reception · May 21

Sidhika Balachandar · UC Berkeley
Using Sparse Autoencoders to Measure Social Biases in Podcast Data
Honglin Bao · University of Chicago
Improving AI for Scientific Discovery with 6749 Scientists
Youngjin Chae · Rutgers University
Survey as Life-course Annotation: Extracting Life Histories from Survey Data Using Large Language Models
Xi Chen · Yale University
The Hidden Costs of AI Diagnosis: Safety Tradeoffs and Demographic Disparities in Chatbot Performance
Yue Chu · The Ohio State University
AI as Measurement Infrastructure: Adaptive Survey Design in Verbal Autopsy
Erdem Demirtas · University of Copenhagen
Measuring Hidden Conceptual Change with AI: A Dual-Embedding Framework for Political Text
Tianyu Fan · Yale University
The Geopolitical Determinants of Economic Growth, 1960–2024
Mohamed Dhia Hammami · Syracuse University
Agentic Retrieval Models for Elite Network Data Collection: Application to the Israeli Case
Orestes Hastings · Colorado State University
Can AI Predict Life Outcomes? Large Language Models and the Birth Lottery
Nico Hernandez-Aguilera · Yale University
iKON in Colombia: Hybrid Human–AI Data Collection for Historical Climate Impacts via WhatsApp Pairwise Comparisons
Yiming Huang · The Ohio State University
When the Black Box Enters Leisure: A Cognitive–Affective Dual-Path Model of AI-Generated Video and Recovery
Liangze "Robert" Ke · Yale School of Public Health
Yolov5-based Computer Vision Model Measures Political Polarization: Visible Interactions Between Religious and Non-religious Citizens
So Kubota · Tohoku University
LLM-Assisted Replication as Social Science Infrastructure
Jack LaViolette · Columbia University
An LLM-enabled Hermeneutics of Fictional Personality: Social Difference and the Moral Structure of the American Literary Field, 1860–1900
Kyungho Lee · Yale University
Copyright and Competition: Estimating Supply and Demand with Unstructured Data
Jingyuan Liu · Boston University
What is Actually Being Annotated? Inter-Prompt Reliability as a Measurement Problem in LLM-Based Social Science Labeling
Menglin (Miley) Liu · The Chinese University of Hong Kong, Shenzhen
Hearing Democracy: Audio-Based Classification of Speakers in City Council Meetings Using Multimodal LLMs, Temporal Context, and Multi-Agent Deliberation

Abstract

Who speaks in local democracy, and how much? Despite growing scholarly interest in political participation beyond voting, we lack scalable tools to measure citizen voice in local government. We develop an automated audio analysis pipeline that classifies speakers in city council meetings as elected officials or members of the public—directly from audio, using neural speaker diarization combined with multimodal LLM few-shot classification.

We introduce two methodological innovations: a temporal exemplar selection with iterative bootstrapping strategy, and a multi-agent deliberation framework. An AND ensemble combining the baseline and bootstrapped classifiers achieves 93.2% accuracy (κ = 0.79), citizen precision of 0.82, and citizen F1 of 0.83—substantially outperforming any single method. Our pipeline targets 1,600+ meetings from seven U.S. rent control cities and their neighbors (2017–2023).
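The abstract does not say how the AND ensemble is wired or how its metrics are computed. As a minimal sketch, assume that a speaker segment is labeled a citizen only when both classifiers agree on that label, and that precision and F1 are reported for the citizen class. The label encoding, function names, and toy labels below are hypothetical and are not taken from the authors' pipeline.

```python
from typing import Dict, List, Sequence

CITIZEN, OFFICIAL = 1, 0  # hypothetical label encoding, not from the abstract


def and_ensemble(first: Sequence[int], second: Sequence[int]) -> List[int]:
    """Predict CITIZEN only when both classifiers predict CITIZEN (assumed rule)."""
    return [
        CITIZEN if a == CITIZEN and b == CITIZEN else OFFICIAL
        for a, b in zip(first, second)
    ]


def evaluate(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, Cohen's kappa, and precision/F1 for the CITIZEN class."""
    pairs = list(zip(y_true, y_pred))
    n = len(pairs)
    tp = sum(t == CITIZEN and p == CITIZEN for t, p in pairs)
    fp = sum(t == OFFICIAL and p == CITIZEN for t, p in pairs)
    fn = sum(t == CITIZEN and p == OFFICIAL for t, p in pairs)
    tn = n - tp - fp - fn  # valid because labels are binary
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # Chance agreement: product of the marginal rates, summed over both classes.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance < 1 else 0.0
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "citizen_precision": precision,
        "citizen_f1": f1,
    }


# Toy labels, invented for illustration only (not data from the study).
truth = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
baseline = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
bootstrapped = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

ensemble = and_ensemble(baseline, bootstrapped)
metrics = evaluate(truth, ensemble)
print(ensemble)
print(metrics)
```

On these toy labels each classifier makes one false citizen call, and the AND rule removes both at the cost of missing one true citizen. That is the usual trade-off of an AND ensemble: higher precision for the positive class, possibly lower recall.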
Pangpang Liu · Yale University
Uncertainty Quantification for Large Language Model Reward Learning under Heterogeneous Human Feedback
Riyang Liu · Yale School of Public Health
Reconstructing Gridded Historical Particulate Matter Concentrations: A Deep Learning Approach
Robin Na · MIT
1. Limits of Literature-Conditioned Large Language Models for Predicting Behavioral Experiments

2. Simulating Institutions with Borrowed Populations: Cross-Study Behavioral Portability via LLM-Mediated Simulation
Jake Nicoll · University of Chicago (Harris School of Public Policy)
(How) Do We Teach Emotions?
Jingyi Qiu · University of Michigan
Counterfactual LLM-based Framework for Measuring Rhetorical Style
Kyle Siler · University of Toronto
The Diffusion of Large Language Models in Published Academic Articles
Kexin Song · Yale School of the Environment
From Pixels to Evidence: AI and Satellite Data as Social Science Tools in the Rohingya Crisis
Zerui Tian · University of Oxford
Gendered Signals in LLM-Masked Adolescents' Writing: An Embedding-Based Index and Later-life Outcomes for a UK Cohort
Binglu Wang · Northwestern University
Human–AI Collaboration in Science at Scale: A Global Large-scale Randomized Field Experiment
Hanning Wang · University of Pittsburgh
Rethinking Codebook Development in Framing Analysis with LLMs
Weijun Yuan · The University of Chicago
Task Dependence, Communication Networks, and Small Group Coordination: A Multi-Agent AI Experiment of the PCANS Model
Yongjun Zhang · Stony Brook University
Vibe Researching as Wolf Coming: Can AI Agents with Skills Replace Social Scientists?
Yifei Zhu · University of Hong Kong
1. Agentic Framework for Political Biography Extraction

2. Using AI Predictions to Augment Rather Than Replace Surveys

Poster 2 by Lai Wei (The University of Hong Kong), displayed by Yifei Zhu