This talk is part of a two-presentation session running from 1:30 PM - 2:20 PM. The session will feature two presentations back-to-back, with Q&A after each presentation.
Trust and Safety teams have recently considered using LLMs for policy enforcement, but many are not ready to rely on them today. Still, we can and should leverage these developments to improve the policy iteration process. At Roblox, a Policy-Engineering working group created SAIPIEN, an LLM-driven policy drafting co-pilot. SAIPIEN allows policy experts to test new policies by having an LLM apply them to previously reported content. Using the results, we can find policy gaps that would likely lead to sub-par moderation, even before moderators analyze the content. That way, we can iterate on and improve policies well before they affect users. In this presentation, we will introduce the tool, explain its contribution to the policy iteration process, share lessons learned, and discuss the importance of direct collaboration between engineering and policy teams. We will also explain how the project aligns with Roblox’s broader vision of improving automated moderation.