PersonaGym Leaderboard

All official submissions to the PersonaGym leaderboard are maintained at PersonaGym/evaluations.

Model	Date	Action Justification	Expected Action	Linguistic Habits	Persona Consistency	Toxicity Control	PersonaScore
Claude 3.5 Sonnet	2024-07-10	4.52	4.37	3.98	4.81	4.88	4.51
LLaMA-3 (8b)	2024-07-10	4.55	4.43	3.97	4.77	4.74	4.49
LLaMA-2 (70b)	2024-07-10	4.44	4.32	3.85	4.67	4.68	4.39
GPT-3.5	2024-07-10	4.31	4.28	3.63	4.70	4.96	4.38
LLaMA-2 (13b)	2024-07-10	3.96	3.87	3.77	4.12	4.18	3.98
Claude 3 Haiku	2024-07-10	2.47	4.28	3.04	4.47	4.94	3.64

Click on any column header to sort the leaderboard by that task. The default ranking is by PersonaScore.
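For readers working from the raw numbers rather than the interactive page, the same sorting can be sketched offline. The rows below are copied verbatim from the table above; the names `COLUMNS`, `ROWS`, and `rank_by` are hypothetical and not part of the PersonaGym code.

```python
# Score columns in the same order as the leaderboard table (after Model and Date).
COLUMNS = [
    "Action Justification",
    "Expected Action",
    "Linguistic Habits",
    "Persona Consistency",
    "Toxicity Control",
    "PersonaScore",
]

# (model, scores...) copied from the leaderboard table; the date column is omitted.
ROWS = [
    ("Claude 3.5 Sonnet", 4.52, 4.37, 3.98, 4.81, 4.88, 4.51),
    ("LLaMA-3 (8b)", 4.55, 4.43, 3.97, 4.77, 4.74, 4.49),
    ("LLaMA-2 (70b)", 4.44, 4.32, 3.85, 4.67, 4.68, 4.39),
    ("GPT-3.5", 4.31, 4.28, 3.63, 4.70, 4.96, 4.38),
    ("LLaMA-2 (13b)", 3.96, 3.87, 3.77, 4.12, 4.18, 3.98),
    ("Claude 3 Haiku", 2.47, 4.28, 3.04, 4.47, 4.94, 3.64),
]


def rank_by(column="PersonaScore"):
    """Return model names ordered from highest to lowest score in `column`."""
    idx = COLUMNS.index(column) + 1  # +1 skips the model-name field
    ordered = sorted(ROWS, key=lambda row: row[idx], reverse=True)
    return [row[0] for row in ordered]


# Default ranking, then a ranking by a single task.
print(rank_by())
print(rank_by("Toxicity Control"))
```

Passing any other column name from `COLUMNS` mirrors clicking that header on the page.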

Submit to PersonaGym Leaderboard

To submit your model to the PersonaGym Leaderboard, follow these steps:

1. Fork this repository.
2. Under the evaluations directory, create a new folder named with the submission date and the model name (e.g. 20240415_llama_2_70b).
3. Within the folder, include the following files:
- scores.json: The JSON file containing the average scores for all PersonaGym tasks. This file is generated automatically by our code and saved under scores/{save_name}, where save_name is a flag passed to our run script.
- README.md: (Recommended) Any information you would like to share about your model.
4. Create a pull request to this repository with the new folder.
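The folder-preparation steps above can be sketched as a small helper. This is a minimal illustration, not part of the PersonaGym code: the function names, the slug rule (lowercase, non-alphanumeric runs replaced by underscores), and the placeholder README content are assumptions inferred from the example folder name 20240415_llama_2_70b. The path to the generated scores.json is passed in explicitly because its exact location under scores/{save_name} depends on your run.

```python
import re
import shutil
from datetime import date
from pathlib import Path


def submission_folder_name(model_name: str, submitted: date) -> str:
    """Build a folder name such as 20240415_llama_2_70b (assumed convention)."""
    # Lowercase the model name and collapse anything non-alphanumeric to "_".
    slug = re.sub(r"[^a-z0-9]+", "_", model_name.lower()).strip("_")
    return f"{submitted:%Y%m%d}_{slug}"


def prepare_submission(
    scores_json: Path,
    model_name: str,
    submitted: date,
    evaluations_dir: Path = Path("evaluations"),
) -> Path:
    """Create the submission folder and copy the generated scores file into it."""
    folder = evaluations_dir / submission_folder_name(model_name, submitted)
    folder.mkdir(parents=True, exist_ok=True)

    # scores.json is required; copy the file produced by the run script.
    shutil.copy(scores_json, folder / "scores.json")

    # README.md is recommended; write a placeholder only if none exists yet.
    readme = folder / "README.md"
    if not readme.exists():
        readme.write_text(f"# {model_name}\n", encoding="utf-8")
    return folder
```

For example, `submission_folder_name("LLaMA-2 (70b)", date(2024, 4, 15))` produces the folder name used in the example above. After running `prepare_submission` inside your fork, commit the new folder and open the pull request.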

The leaderboard will be updated every Monday.