evals

openai / evals

https://github.com/openai/evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

11.5k stargazers

70 open issues

2.2k forks
