InspectorRAGet: An Introspection Platform for RAG Evaluation

Fadnis, Kshitij; Patel, Siva Sankalp; Boni, Odellia; Katsis, Yannis; Rosenthal, Sara; Sznajder, Benjamin; Danilevsky, Marina

(Submitted on 26 Apr 2024)

Abstract: Large Language Models (LLMs) have become a popular approach for implementing Retrieval Augmented Generation (RAG) systems, and a significant amount of effort has been spent on building good models and metrics. In spite of increased recognition of the need for rigorous evaluation of RAG systems, few tools exist that go beyond the creation of model output and automatic calculation. We present InspectorRAGet, an introspection platform for RAG evaluation. InspectorRAGet allows the user to analyze aggregate and instance-level performance of RAG systems, using both human and algorithmic metrics as well as annotator quality. InspectorRAGet is suitable for multiple use cases and is available publicly to the community. The demo video is available at this https URL

Subjects:	Software Engineering (cs.SE); Human-Computer Interaction (cs.HC)
Cite as:	arXiv:2404.17347 [cs.SE]
	(or arXiv:2404.17347v1 [cs.SE] for this version)

Submission history

From: Kshitij Fadnis
[v1] Fri, 26 Apr 2024 11:51:53 GMT (2583kb,D)
