RAGLens

/Evaluation Suite

Some expected outcomes are successful answers. Others are successful refusals. The Evaluation Suite checks whether RAGLens can tell the difference.

Run curated test cases, including expected failures, through the live pipeline and see where eval results agree or disagree with expectations. Disagreements reveal where the pipeline or the eval needs attention.


Choose a corpus and run tests to see results here.

Each row will show expected vs. actual pass, score, and eval diagnosis.
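
The expected-vs-actual comparison can be sketched in a few lines of Python. This is only an illustration, not RAGLens's actual code: the `CaseResult` class, its field names, the `find_disagreements` helper, and the sample cases and scores are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CaseResult:
    """One row of results (hypothetical shape, not the RAGLens schema)."""
    name: str
    expected_pass: bool  # what the curated test case says should happen
    actual_pass: bool    # what the eval reported for the live run
    score: float         # illustrative eval score, assumed to lie in [0, 1]


def find_disagreements(results):
    """Return the rows where the eval result differs from the expectation."""
    return [r for r in results if r.expected_pass != r.actual_pass]


# Made-up sample rows, including one expected failure.
results = [
    CaseResult("answerable question", True, True, 0.92),
    CaseResult("out-of-corpus question", False, False, 0.10),
    CaseResult("ambiguous question", True, False, 0.41),
]

for row in find_disagreements(results):
    print(f"{row.name}: expected pass={row.expected_pass}, "
          f"actual pass={row.actual_pass}, score={row.score}")
```

Only the rows returned by `find_disagreements` would need a closer look; a case that was expected to fail and did fail counts as agreement.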