Time | April 3, 2025, 12:00 pm to 01:00 pm (America/New_York) |
---|---|
Event | The Center for Social Data Analytics Colloquium speaker: Dr. Cassandra Tai |
Location | 421 Susan Welch Liberal Arts Building |
Presenter(s) | Cassandra Tai |
Description |
GenAI vs. Human Fact-Checkers: Accurate Ratings, Flawed Rationales

Despite recent advances in understanding the capabilities and limits of generative artificial intelligence (GenAI) models, we are just beginning to understand their capacity to assess and reason about the veracity of content. We evaluate multiple GenAI models across tasks that involve the rating of, and reasoning about, the credibility of information. The information in our experiments comes from content that subnational U.S. politicians post to Facebook. We find that GPT-4o, one of the most used AI models in consumer applications, outperforms other models, but all models exhibit only moderate agreement with human coders. Importantly, even when GenAI models accurately identify low-credibility content, their reasoning relies heavily on linguistic features and hard criteria, such as the level of detail, source reliability and language formality, rather than an understanding of veracity. We also assess the effectiveness of summarized versus full content inputs, finding that summarized content holds promise for improving efficiency without sacrificing accuracy. While GenAI has the potential to support human fact-checkers in scaling misinformation detection, our results caution against relying solely on these models.

Dr. Cassandra Tai is an Assistant Research Professor at Penn State, co-funded by the College of the Liberal Arts and the Social Science Research Institute, and serves as Assistant Director of the Center for Social Data Analytics. Her research leverages large-scale text data, cross-national public opinion data, language models, generative AI, and Bayesian measurement models to examine elite political behavior and comparative public opinion. |