Paul Röttger
77 works · 602 citations
University of Oxford · GB
Relevant works
Most-cited publications in Health & MedTech
SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models
2023 · 5 citations · arXiv (Cornell University)
No for Some, Yes for Others: Persona Prompts and Other Sources of False Refusal in Language Models
2025 · 0 citations · ArXiv.org
Large Language Model Hacking: Quantifying the Hidden Risks of Using LLMs for Text Annotation
2025 · 0 citations · ArXiv.org