Code Generation Eval
When To Use Code Generation Eval Template
This eval checks the correctness and readability of the code produced by a code generation process. The template variables are:
query: the coding question being asked
code: the code that was returned
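To make the two variables concrete, here is a minimal sketch of how a template with `query` and `code` placeholders could be filled in. The template text, the label words, and the `build_prompt` helper are all assumptions for illustration; they are not the published template, which lives on GitHub.

```python
# Hypothetical template text for illustration only; NOT the published template.
EXAMPLE_TEMPLATE = (
    "You are evaluating generated code.\n"
    "Question: {query}\n"
    "Code:\n{code}\n"
    "Answer with a single word: readable or unreadable."
)


def build_prompt(query: str, code: str, template: str = EXAMPLE_TEMPLATE) -> str:
    """Substitute the two template variables into the template string."""
    # Braces inside the `code` value are safe: only the template itself is parsed.
    return template.format(query=query, code=code)


prompt = build_prompt(
    query="Write a function that returns the square of a number.",
    code="def square(x):\n    return x * x",
)
print(prompt)
```

Each row of an evaluation dataset would supply one `query` and one `code` value in this way.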
Code Generation Eval Template
We are continually iterating on our templates; view the most up-to-date template on GitHub. Last updated on 10/12/2023.
Benchmark Results
GPT-4 Results
GPT-3.5 Results
How To Run the Eval
The above shows how to use the code readability template.
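As a rough illustration of the general flow of such an eval, the sketch below builds a prompt per row, asks a judge for a label, and maps the free-form answer onto a fixed label set. It does not use the library's API: `run_eval`, `snap_to_rails`, `stub_judge`, and the label names are hypothetical, and the stub stands in for a real LLM call.

```python
from typing import Callable, Dict, List

# Assumed label set; the real template may use different labels.
RAILS = ["readable", "unreadable"]


def snap_to_rails(raw: str, rails: List[str] = RAILS) -> str:
    """Map free-form model output onto one allowed label, else 'NOT_PARSABLE'."""
    text = raw.strip().lower()
    # Check longer labels first so "unreadable" is not matched as "readable".
    for label in sorted(rails, key=len, reverse=True):
        if label in text:
            return label
    return "NOT_PARSABLE"


def run_eval(rows: List[Dict[str, str]], judge: Callable[[str], str]) -> List[str]:
    """Build one prompt per row, ask the judge, and normalize its answer."""
    labels = []
    for row in rows:
        prompt = f"Question: {row['query']}\nCode:\n{row['code']}\nLabel:"
        labels.append(snap_to_rails(judge(prompt)))
    return labels


def stub_judge(prompt: str) -> str:
    """Stand-in for an LLM call (hypothetical rule: dislikes lambdas)."""
    return "Unreadable." if "lambda" in prompt else "Readable"


rows = [
    {"query": "Square a number", "code": "def square(x):\n    return x * x"},
    {"query": "Square a number", "code": "s=lambda x:x*x"},
]
labels = run_eval(rows, stub_judge)
print(labels)
```

Swapping `stub_judge` for a function that calls a real model is the only change needed to turn the sketch into a working loop.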
Code Eval | GPT-4 | GPT-3.5 | GPT-3.5-Instruct | Palm 2 (Text Bison) | Llama 7b (soon) |
---|---|---|---|---|---|
Precision | 0.93 | 0.76 | 0.67 | 0.77 | |
Recall | 0.78 | 0.93 | 1 | 0.94 | |
F1 | 0.85 | 0.85 | 0.81 | 0.85 | |
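For readers who want to see how scores like those in the table are derived, here is a small sketch that computes precision, recall, and F1 from ground-truth and predicted labels. The toy labels and the choice of `"readable"` as the positive class are assumptions; the source does not state which class the benchmark treats as positive.

```python
from typing import List, Tuple


def precision_recall_f1(
    y_true: List[str], y_pred: List[str], positive: str = "readable"
) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data, made up for illustration; not the benchmark dataset.
y_true = ["readable", "readable", "unreadable", "unreadable", "readable"]
y_pred = ["readable", "unreadable", "unreadable", "readable", "readable"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

F1 is the harmonic mean of precision and recall, so a model with very high recall but lower precision (or the reverse) can land on a similar F1 to a more balanced one, as several columns in the table do.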