Code Generation
When To Use Code Generation Eval Template
This Eval checks the correctness and readability of the code produced by a code generation process. The template variables are:
query: the coding question being asked
code: the code that was returned in response to the query
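To make the two variables concrete, here is a minimal sketch of how they might be substituted into a prompt. The template text below is a hypothetical stand-in, not the actual template (which is maintained on GitHub), and `fill_template` is an illustrative helper name.

```python
# Hypothetical template text for illustration only; the real template
# is maintained on GitHub and will differ in wording.
EXAMPLE_TEMPLATE = (
    "You are evaluating generated code.\n"
    "[Question]: {query}\n"
    "[Code]: {code}\n"
    "Answer with a single word."
)


def fill_template(template: str, query: str, code: str) -> str:
    """Substitute the two template variables into the prompt text."""
    return template.format(query=query, code=code)


prompt = fill_template(
    EXAMPLE_TEMPLATE,
    query="Write a function that adds two numbers.",
    code="def add(a, b):\n    return a + b",
)
print(prompt)
```

Each row of an evaluation dataset would supply one `query` and one `code` value in this way.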
Code Generation Eval Template
We are continually iterating on our templates; view the most up-to-date template on GitHub. Last updated on 10/12/2023.
Benchmark Results
GPT-4 Results
GPT-3.5 Results
GPT-4 Turbo
How To Run the Eval
The above shows how to use the code readability template.
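As a rough sketch of that flow, the example below fills a template for each record, sends the prompt to a model, and snaps the free-form answer onto a fixed set of labels. Everything here is an assumption for illustration: the template text, the label names `readable` and `unreadable`, the helpers `snap_to_rails` and `run_eval`, and `stub_model`, which stands in for a real LLM call so the example runs offline.

```python
from typing import Callable, Dict, List

# Assumed output labels; the real template may use different words.
RAILS = ["readable", "unreadable"]

# Hypothetical template text, not the library's actual template.
TEMPLATE = "Question: {query}\nCode:\n{code}\nAnswer readable or unreadable."


def snap_to_rails(raw: str, rails: List[str]) -> str:
    """Map free-form model output onto one allowed label, else NOT_PARSABLE."""
    text = raw.strip().lower()
    # Check longer labels first so "unreadable" is not matched as "readable".
    for label in sorted(rails, key=len, reverse=True):
        if label in text:
            return label
    return "NOT_PARSABLE"


def run_eval(
    records: List[Dict[str, str]],
    template: str,
    model: Callable[[str], str],
) -> List[str]:
    """Fill the template per record, query the model, and collect labels."""
    labels = []
    for rec in records:
        prompt = template.format(query=rec["query"], code=rec["code"])
        labels.append(snap_to_rails(model(prompt), RAILS))
    return labels


def stub_model(prompt: str) -> str:
    """Stand-in for an LLM: treats semicolon-chained code as unreadable."""
    return "Unreadable." if ";" in prompt else "Readable."


records = [
    {"query": "Add two numbers.", "code": "def add(a, b):\n    return a + b"},
    {"query": "Add two numbers.", "code": "x=1;y=2;print(x+y)"},
]
labels = run_eval(records, TEMPLATE, stub_model)
print(labels)
```

In practice `stub_model` would be replaced by a call to the chosen LLM, and the resulting labels would be compared against ground-truth labels to produce metrics.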
Code Eval | GPT-4 Turbo | GPT-4 | Gemini Pro | GPT-3.5 | Palm | Llama 7b (soon) |
---|---|---|---|---|---|---|
Precision | 1.0 | 0.93 | 0.79 | 0.78 | 0.77 | |
Recall | 0.71 | 0.78 | 0.81 | 0.93 | 0.94 | |
F1 | 0.83 | 0.85 | 0.80 | 0.85 | 0.85 | |
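The F1 row follows from the precision and recall rows as their harmonic mean, F1 = 2PR / (P + R). The short check below recomputes it from the table values; `f1_score` is an illustrative helper, not a library function.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table above.
results = {
    "GPT-4 Turbo": (1.0, 0.71),
    "GPT-4": (0.93, 0.78),
    "Gemini Pro": (0.79, 0.81),
    "GPT-3.5": (0.78, 0.93),
    "Palm": (0.77, 0.94),
}

f1_by_model = {
    name: round(f1_score(p, r), 2) for name, (p, r) in results.items()
}
for name, f1 in f1_by_model.items():
    print(f"{name}: {f1:.2f}")
```

The recomputed values match the F1 row. Similar F1 scores can hide different trade-offs: GPT-4 Turbo reaches 0.83 with perfect precision but lower recall, while Palm reaches 0.85 with the opposite balance.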