This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
TrustSciAgent: Towards Rigorous and Trustworthy Agents for Scientific Research
Citations: 0
Authors: 6
Year: 2025
Abstract
Large language models have enabled automated agents to tackle complex scientific research tasks, yet most existing approaches struggle to deliver scientifically rigorous and verifiable outputs. In this work, we propose a novel agent framework, TrustSciAgent, which introduces a unified evidence–reasoning–validation pipeline explicitly governed by newly formulated scientific research trustworthiness principles. TrustSciAgent structurally organizes the entire research process into pre-research, in-research, and post-research phases, ensuring that each stage strictly adheres to these principles. This design compels the agent to generate transparent, logically sound reasoning chains and deliver auditable scientific conclusions. Comprehensive experiments across four scientific domains and three representative language models demonstrate that TrustSciAgent consistently improves both the structural completeness and the correctness of reasoning outputs, outperforming standard LLM-based agents. Our results provide strong evidence that embedding domain-agnostic trustworthiness principles into the agent workflow is critical for enabling credible, generalizable, and verifiable automated scientific research.