Name
Protecting the Protectors: Using Generative AI to Ensure Content Moderators’ Safety
Description

This talk is part of a two-presentation session running from 11:10 AM to 12:00 PM, with the presentations delivered back-to-back and Q&A after each.

Even as AI-based models perform content moderation, human input remains essential for many tasks, such as annotating training datasets or reviewing complaints against automated decisions. These tasks frequently expose moderators to toxic content and users, resulting in psychological distress and, in some cases, moderators quitting their jobs. To ensure moderators’ safety, I present two applications of Generative AI that reduce human exposure to toxicity.

First, I present AppealMod, a chatbot-based appeals process that induces self-selection among appealing users, discouraging toxic users while assisting users with sincere appeals. I demonstrate AppealMod’s effectiveness with results from a 4-month experiment in a Reddit community with 29 million users. Second, I present the best-performing and most cost-effective prompting strategies for annotating toxic content using Generative AI, derived from a large-scale experiment with over 100,000 data points. By reducing human exposure to toxicity, these applications can augment moderators’ decision-making and ensure their psychological safety.
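To give a flavor of the annotation setup, the sketch below shows a minimal zero-shot prompting approach to toxicity labeling. It is purely illustrative: the model name, label set, and prompt wording are assumptions for this example, not the strategies evaluated in the talk.

```python
# Hypothetical sketch: zero-shot LLM annotation of toxic content.
# Model name, labels, and prompt wording are illustrative assumptions,
# not the speaker's actual prompting strategies.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

LABELS = {"toxic", "not_toxic"}

def annotate(comment: str) -> str:
    """Ask the model for a single-word toxicity label on one comment."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model; any chat model would do
        temperature=0,        # deterministic output for annotation tasks
        messages=[
            {"role": "system",
             "content": "You are a content-moderation annotator. "
                        "Reply with exactly one word: toxic or not_toxic."},
            {"role": "user", "content": comment},
        ],
    )
    label = response.choices[0].message.content.strip().lower()
    # Fall back to the safe default if the model strays from the label set.
    return label if label in LABELS else "not_toxic"

if __name__ == "__main__":
    print(annotate("You are an idiot and should leave this subreddit."))
```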

Location Name
Seacliff CD
Date
Monday, July 22, 2024
Time
11:10 AM - 12:00 PM
Session Type
Presentation
Track
Research
Session Themes
Data & Metrics, Research, Scaling T&S, Wellness & Resilience
Audience
All TrustCon attendees
Will this session be recorded?
No