RAGLens

Bulk Test

Run every test case through the live pipeline and compare each eval result with the expected outcome. A disagreement points to a problem in either the pipeline or the eval.

Choose a corpus and run tests to see results here.

Each row shows the expected and actual pass status, the score, and the eval diagnosis.
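As a rough illustration of the comparison described above, the sketch below models one result row and separates agreements from disagreements. It is not RAGLens code: the `BulkTestRow` class, its field names, the `split_by_agreement` helper, and the sample rows are all hypothetical, and the score scale is assumed.

```python
from dataclasses import dataclass


@dataclass
class BulkTestRow:
    """One row of a bulk test run (hypothetical field names, not the RAGLens schema)."""
    query: str
    expected_pass: bool  # what the test case says should happen
    actual_pass: bool    # what the eval reported for the live pipeline
    score: float         # eval score; a 0-1 scale is assumed here
    diagnosis: str       # free-text eval diagnosis

    @property
    def agrees(self) -> bool:
        # A row agrees when the eval outcome matches the expectation.
        return self.expected_pass == self.actual_pass


def split_by_agreement(rows):
    """Return (agreements, disagreements), preserving input order."""
    agreements = [r for r in rows if r.agrees]
    disagreements = [r for r in rows if not r.agrees]
    return agreements, disagreements


# Made-up sample rows, for illustration only.
rows = [
    BulkTestRow("q1", True, True, 0.9, "grounded answer"),
    BulkTestRow("q2", True, False, 0.3, "retrieval missed the source"),
    BulkTestRow("q3", False, False, 0.2, "correctly rejected"),
]

agreements, disagreements = split_by_agreement(rows)
for r in disagreements:
    print(f"{r.query}: expected={r.expected_pass} actual={r.actual_pass} ({r.diagnosis})")
```

In this sketch a row where a failure was expected and a failure occurred counts as an agreement, so only the disagreement list needs review.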