FDS Workshop: AI for Social Science Research Methods
Yale Institute for Foundations of Data Science

AI for Social Science Research Methods

Workshop Presentations
May 20–22, 2026 · Yale University

Keynote

Day 2 · May 22
Nicholas Christakis
Yale University
Talk title to be announced
May 22 · 11:00 AM – 12:00 PM

Oral Presentations

12 presenters · Days 1 & 2
  1. Siwei Cheng · New York University
    Who Talks to Whom? Measuring Segregation in Informal Interactions at Academic Conferences Using Images and Vision–Language Models
  2. Sharad Goel · Harvard University
    Mitigating Label Bias With Rubric Embeddings
    About
    Sharad Goel is a Professor of Public Policy at Harvard Kennedy School. He looks at public policy through the lens of computer science, bringing a computational perspective to a diverse range of contemporary social and political issues, including education, the delivery of public benefits, and the equitable design of algorithms. He is the founder and director of the Harvard Computational Policy Lab. Prior to joining Harvard, Sharad was on the faculty at Stanford University, with appointments in management science & engineering, computer science, sociology, and the law school.
  3. Valentina Gonzalez-Rostani · University of Southern California
    Tracing AI Assistance and AI Agents in Survey Research
    About
    Valentina Gonzalez-Rostani is an Assistant Professor in the Department of Political Science and International Relations at the University of Southern California. Her research examines how technological change, especially automation and artificial intelligence, reshapes political attitudes, parties, and democratic institutions across advanced economies and the Global South. Using causal inference and computational text analysis, she studies populism, political behavior, inequality, international trade, and the measurement of political discourse. Before joining USC, she was a Postdoctoral Research Associate at Princeton University. Her work has appeared in The Journal of Politics, Political Science Research and Methods, Ecological Economics, and Legislative Studies Quarterly.
    Abstract
    Generative AI creates a measurement problem for survey research. When respondents can outsource summarization, explanation, or fact retrieval, open-ended answers may no longer reflect their own reasoning. We introduce an auditable toolkit that combines response-process paradata with prompt-specific semantic benchmarks to detect AI use in open-ended responses. We validate the approach in a randomized survey experiment in which participants complete a summary task under either blocked AI access or observed access to an embedded AI tool, and by administering the same instrument to a synthetic AI agent. AI-assisted and automated responses leave distinct behavioral and textual traces: they show reduced drafting and revision, greater similarity to AI benchmark output, and lower distinctiveness relative to peer responses. These findings show that, in the age of generative AI, authorship must be measured rather than assumed.
  4. Kosuke Imai · Harvard University
    GenAI-Powered Inference
    About
    Kosuke Imai is a Professor in the Departments of Government and Statistics at Harvard University and an affiliate of the Institute for Quantitative Social Science. His research focuses on the development of statistical methods and machine learning algorithms for social science applications, with particular expertise in causal inference, computational social science, and survey methodology. He is the author of Quantitative Social Science: An Introduction and leads the Algorithm-Assisted Redistricting Methodology (ALARM) Project. Imai has served as President of the Society for Political Methodology.
    Abstract
    We introduce GenAI-Powered Inference (GPI), a statistical framework for both causal and predictive inference using unstructured data, including text, image, and video. GPI leverages open-source Generative AI models—such as large language models and diffusion models—not only to generate unstructured data at scale but also to extract low-dimensional representations that capture their underlying structure. Applying machine learning to these representations, GPI enables estimation of causal and predictive effects while quantifying associated estimation uncertainty. Unlike existing approaches to representation learning, GPI does not require fine-tuning of generative models, making it computationally efficient and broadly accessible.
  5. Yingdan Lu · Northwestern University
    Mitigating Confounding Bias in Observational Visual Media Research: An LLM-Informed, Machine Learning-Based Inference Framework
    About
    Yingdan Lu (Ph.D., Stanford University) is an Assistant Professor in the Department of Communication Studies at Northwestern University. She is the director of the Computational Media and Politics Lab, and the co-director of the Computational Multimodal Communication Lab. Her research focuses on digital technology, political communication, and information manipulation. Her work has appeared in PNAS, American Journal of Political Science, Political Communication, and New Media & Society.
    Abstract
    Observational studies of visual media effects are vulnerable to confounding bias due to the complexity of visual data, the difficulty in variable measurement, and the uncertainty of selection. We propose an inference framework that leverages multimodal LLMs and machine learning-based inference techniques to (1) identify potential confounders through reasoned visual assessment, (2) measure semantically complex features that are difficult to capture with conventional methods, and (3) apply double-Lasso for covariate selection and hierarchical OLS regression for effect estimation. We demonstrate this framework by revisiting the relationship between political personalization and social media engagement using 59,020 Instagram posts from U.S. politicians.
  6. Matthew Salganik · Princeton University
    Title to be announced
  7. Austin van Loon · MIT Sloan
    Using Large Language Models as a Source of Human Behavioral Data in Social Science Experiments
  8. Alexander Volfovsky · Duke University
    Title to be announced
    About
    Alexander Volfovsky is an Associate Professor of Statistical Science at Duke University, where he also serves as co-director of the Polarization Lab. His research lies at the intersection of causal inference, network analysis, and machine learning, with applications to understanding social behavior, online interactions, and decision-making in complex systems. His recent projects focus on human–AI interaction, trust calibration, and the design of artificial agents that foster constructive discourse.
  9. Hannah Waight · University of Oregon
    How Chatbots Change Expression
    About
    Hannah Waight is an Assistant Professor of Sociology at the University of Oregon and a former Postdoctoral Research Associate at CSMaP. Waight studies the politics of media and information and their implications for social organization. She has also worked on popular perceptions of inequality and has an ongoing project on how people in authoritarian regimes "see the state."
    Abstract
    An emerging body of research has demonstrated the persuasive power of conversational AI, or "chatbots," examining their effects on policy attitudes, belief in conspiracy theories, and agreement with propaganda statements. While important, this literature has overlooked a crucial distinction: the difference between what people believe and what they say. Discourse and attitudes are connected but distinct phenomena; what people are willing to say is, moreover, a social phenomenon, contextually and relationally specific.

    In this project, we use three experimental studies to compare the effects of chatbot conversations on individuals' discourse with their effects on attitudes. In the first experiment, we found larger effects on what people are willing to say than on their self-expressed attitudes, especially after interactions with a chatbot instructed to use facts. In the second experiment, we expanded the number of conversation topics and disentangled the mechanisms by which fact-based conversations change expression. In a third, planned experiment, we extend these findings by measuring the effect of chatbot conversations on dialogues with real individuals.

  10. Yiqing Xu · Stanford University
    Scaling Reproducibility: An AI-Assisted Workflow for Large-Scale Reanalysis
  11. Eddie Yang · Purdue University
    Data Annotation with Large Language Models: Lessons from a Large Empirical Evaluation
  12. Han Zhang · Brown University
    How Should Experimenters Construct Image Stimuli? A Framework for Controlled Image Construction

Panel Discussions

Day 1 · May 21
AI & Open Science
May 21 · 11:00 AM – 12:00 PM
Moderated by Jeremy Freese · Stanford University
  1. Jessica Hullman · Northwestern University
  2. Chenhao Tan · University of Chicago
    About
    Chenhao Tan is an Associate Professor of Computer Science and Data Science at the University of Chicago, where he directs the Chicago Human+AI Lab. He earned his PhD in Computer Science from Cornell University and dual bachelor's degrees in computer science and economics from Tsinghua University. His research focuses on human-centered AI, communication & intelligence, and AI alignment. He has received a Sloan Research Fellowship, an NSF CAREER Award, and research awards from Amazon, IBM, JP Morgan, Google, and Salesforce.
  3. Yiqing Xu · Stanford University
  4. Yian Yin · Cornell University
AI & Publishing
May 21 · 4:30 – 5:30 PM
Moderated by Emma Zang · Yale University
  1. Weihua An · Emory University
  2. Adam Berinsky · MIT
  3. Donald Tomaskovic-Devey · University of Massachusetts Amherst
  4. Johan Ugander · Yale University

Lightning Round Presentations

8 presenters · Day 2 · May 22
  1. Soubhik Barari · NORC at the University of Chicago
    AI-Assisted Conversational Interviewing: Effects on Data Quality and Respondent Experience
  2. David Broska · Stanford University
    Social Science Simulator: A Tool for Simulating Experiments with LLMs
    About
    David Broska joined Stanford's Department of Sociology as a PhD student after receiving an MSc in Social Research Methods at the London School of Economics and a BA in Sociology, Politics, and Economics at Zeppelin University in Germany. His research interests include computational social science, inequality, and social policy.
  3. Jacob Crainic · Yee Collins Research Group
    Prompt Degrees of Freedom in LLM-Based Social Science: A Specification Curve Framework for Transparent Reporting
  4. Jiaxin (Allyson) Cui · Duke University
    Same Prompt, Different Outcomes: Evaluating the Reproducibility of Data Analysis by LLMs
  5. Junsol Kim · University of Chicago / Google
    Reasoning Models Generate Societies of Thought
    About
    Junsol Kim is a Ph.D. candidate in Sociology at the University of Chicago, advised by James Evans, and a student researcher at Google. His research examines how AI and other social technologies reshape information and knowledge ecosystems, and how AI can be harnessed to augment social science methods. His work has appeared in Nature Communications, PNAS, ICLR, ICML, and EMNLP.
  6. Joe Siev · University of Virginia Darden School of Business
    The Cross-Context Design: An AI-Enabled Method for Testing Behavioral Hypotheses across Archival Datasets
    About
    Joe Siev is a postdoctoral research associate at the University of Virginia Darden School of Business and an incoming Assistant Professor of Marketing at the University of Alabama. He holds a PhD in Social Psychology from The Ohio State University. His research focuses on political consumerism—how consumers express political values through marketplace choices, and how brands' political expressions shape consumer behavior. He also builds research tools that leverage AI and automation.
  7. Yuehong Cassandra Tai · Penn State University
    From Annotation to Measurement: Traceable Workflows for LLM Inference
  8. Charles Eesley · Stanford University
    Four Paradigms for AI-Enabled Social Science Research
    Presenting for Guankai Zhai
    About
    Chuck Eesley is a Professor and W. M. Keck Foundation Faculty Scholar in the Department of Management Science and Engineering at Stanford University. He studies and designs systems that enable high-quality entrepreneurship and innovation under uncertainty, including in emerging technologies and sustainability transitions. He is Faculty Co-Director of the Stanford Technology Ventures Program and a faculty affiliate at the Stanford Center for AI Safety.

Poster Presentations

32 presenters · 33 posters · Evening Reception · May 21
  1. Sidhika Balachandar · UC Berkeley
    Using Sparse Autoencoders to Measure Social Biases in Podcast Data
  2. Honglin Bao · University of Chicago
    Improving AI for Scientific Discovery with 6749 Scientists
  3. Youngjin Chae · Rutgers University
    Survey as Life-course Annotation: Extracting Life Histories from Survey Data Using Large Language Models
  4. Xi Chen · Yale University
    The Hidden Costs of AI Diagnosis: Safety Tradeoffs and Demographic Disparities in Chatbot Performance
  5. Yue Chu · The Ohio State University
    AI as Measurement Infrastructure: Adaptive Survey Design in Verbal Autopsy
  6. Erdem Demirtas · University of Copenhagen
    Measuring Hidden Conceptual Change with AI: A Dual-Embedding Framework for Political Text
  7. Tianyu Fan · Yale University
    The Geopolitical Determinants of Economic Growth, 1960–2024
  8. Mohamed Dhia Hammami · Syracuse University
    Agentic Retrieval Models for Elite Network Data Collection: Application to the Israeli Case
  9. Orestes Hastings · Colorado State University
    Can AI Predict Life Outcomes? Large Language Models and the Birth Lottery
  10. Nico Hernandez-Aguilera · Yale University
    iKON in Colombia: Hybrid Human–AI Data Collection for Historical Climate Impacts via WhatsApp Pairwise Comparisons
  11. Yiming Huang · The Ohio State University
    When the Black Box Enters Leisure: A Cognitive–Affective Dual-Path Model of AI-Generated Video and Recovery
  12. Liangze "Robert" Ke · Yale School of Public Health
    YOLOv5-based Computer Vision Model Measures Political Polarization: Visible Interactions Between Religious and Non-religious Citizens
  13. So Kubota · Tohoku University
    LLM-Assisted Replication as Social Science Infrastructure
  14. Jack LaViolette · Columbia University
    An LLM-enabled Hermeneutics of Fictional Personality: Social Difference and the Moral Structure of the American Literary Field, 1860–1900
  15. Kyungho Lee · Yale University
    Copyright and Competition: Estimating Supply and Demand with Unstructured Data
  16. Jingyuan Liu · Boston University
    What is Actually Being Annotated? Inter-Prompt Reliability as a Measurement Problem in LLM-Based Social Science Labeling
  17. Menglin (Miley) Liu · The Chinese University of Hong Kong, Shenzhen
    Hearing Democracy: Audio-Based Classification of Speakers in City Council Meetings Using Multimodal LLMs, Temporal Context, and Multi-Agent Deliberation
    Abstract

    Who speaks in local democracy, and how much? Despite growing scholarly interest in political participation beyond voting, we lack scalable tools to measure citizen voice in local government. We develop an automated audio analysis pipeline that classifies speakers in city council meetings as elected officials or members of the public—directly from audio, using neural speaker diarization combined with multimodal LLM few-shot classification.

    We introduce two methodological innovations: a temporal exemplar selection with iterative bootstrapping strategy, and a multi-agent deliberation framework. An AND ensemble combining the baseline and bootstrapped classifiers achieves 93.2% accuracy (κ = 0.79), citizen precision of 0.82, and citizen F1 of 0.83—substantially outperforming any single method. Our pipeline targets 1,600+ meetings from seven U.S. rent control cities and their neighbors (2017–2023).

  18. Pangpang Liu · Yale University
    Uncertainty Quantification for Large Language Model Reward Learning under Heterogeneous Human Feedback
  19. Riyang Liu · Yale School of Public Health
    Reconstructing Gridded Historical Particulate Matter Concentrations: A Deep Learning Approach
  20. Robin Na · MIT
    1. Limits of Literature-Conditioned Large Language Models for Predicting Behavioral Experiments
    2. Simulating Institutions with Borrowed Populations: Cross-Study Behavioral Portability via LLM-Mediated Simulation
  21. Jake Nicoll · University of Chicago (Harris School of Public Policy)
    (How) Do We Teach Emotions?
  22. Jingyi Qiu · University of Michigan
    Counterfactual LLM-based Framework for Measuring Rhetorical Style
  23. Kyle Siler · University of Toronto
    The Diffusion of Large Language Models in Published Academic Articles
  24. Kexin Song · Yale School of the Environment
    From Pixels to Evidence: AI and Satellite Data as Social Science Tools in the Rohingya Crisis
  25. Zerui Tian · University of Oxford
    Gendered Signals in LLM-Masked Adolescents' Writing: An Embedding-Based Index and Later-life Outcomes for a UK Cohort
  26. Binglu Wang · Northwestern University
    Human–AI Collaboration in Science at Scale: A Global Large-scale Randomized Field Experiment
  27. Hanning Wang · University of Pittsburgh
    Rethinking Codebook Development in Framing Analysis with LLMs
  28. Lai Wei · The University of Hong Kong
    Using AI Predictions to Augment Rather Than Replace Surveys
  29. Shimmei Yamauchi · University of Tokyo
    Measuring Meaning on Video at Scale: From Classifier to Theory Interpreter in AI-Based Humor Annotation
  30. Weijun Yuan · The University of Chicago
    Task Dependence, Communication Networks, and Small Group Coordination: A Multi-Agent AI Experiment of the PCANS Model
  31. Yongjun Zhang · Stony Brook University
    Vibe Researching as Wolf Coming: Can AI Agents with Skills Replace Social Scientists?
  32. Yifei Zhu · University of Hong Kong
    Agentic Framework for Political Biography Extraction
    Agentic Framework for Political Biography Extraction