MMOU Evaluator

Upload a .json or .jsonl file where each entry contains question_id and answer.

Upload a prediction file and click Evaluate.

Summary

Run an evaluation to populate the aggregate summary.

Submission Format

Each entry must contain:

  • question_id
  • answer

answer must be a single letter from A to J. Matching is case-insensitive, so "c" and "C" are treated the same. Extra keys are ignored. Entries with an empty or null answer are ignored.
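The rules above can be checked locally before uploading. The following is a minimal sketch, not the evaluator's own code: the function name normalize_entries is hypothetical, and two behaviors are assumptions that the text above does not confirm, namely that whitespace-only answers count as empty and that an out-of-range answer should raise an error.

```python
import string

# Valid answer letters: A through J.
VALID_LETTERS = set(string.ascii_uppercase[:10])


def normalize_entries(entries):
    """Return cleaned entries that follow the stated submission rules.

    Hypothetical helper for local checking; the real evaluator may differ.
    """
    cleaned = []
    for index, entry in enumerate(entries):
        if "question_id" not in entry or "answer" not in entry:
            raise ValueError(f"entry {index} is missing question_id or answer")

        answer = entry["answer"]
        # Empty or null answers are ignored.
        # Assumption: whitespace-only strings are treated as empty.
        if answer is None or (isinstance(answer, str) and not answer.strip()):
            continue

        if not isinstance(answer, str):
            raise ValueError(f"entry {index}: answer must be a string")

        # Letter matching is case-insensitive, so normalize to uppercase.
        letter = answer.strip().upper()
        # Assumption: reject out-of-range answers instead of dropping them.
        if letter not in VALID_LETTERS:
            raise ValueError(f"entry {index}: answer {answer!r} is not A to J")

        # Extra keys are ignored, so keep only the two required fields.
        cleaned.append({"question_id": entry["question_id"], "answer": letter})
    return cleaned
```

Raising on an invalid letter is a deliberate choice for a pre-upload check: it surfaces formatting mistakes early instead of letting them pass silently.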

Example JSON:

[
  {"question_id": "54aaef4d-2c22-476e-a7e7-37efabde2520", "answer": "C"},
  {"question_id": "a7f8790d-7828-4ece-a63a-a5d13edf9026", "answer": "B"}
]

Example JSONL:

{"question_id": "54aaef4d-2c22-476e-a7e7-37efabde2520", "answer": "C"}
{"question_id": "a7f8790d-7828-4ece-a63a-a5d13edf9026", "answer": "B"}