2024 Gaokao Math: LLM Evaluation Special Report
The 2024 Chinese National College Entrance Examination (高考, Gaokao) math papers offer a rare opportunity for LLM evaluation. The questions are newly written and kept strictly confidential until exam day, so models trained before the exam cannot have seen these exact questions during pre-training. This makes the papers an exceptionally fair test set.
Why Gaokao Math?
- Zero contamination risk — brand-new questions released under strict security
- Standardized difficulty — designed by professional exam committees
- Comprehensive coverage — tests algebra, geometry, probability, calculus, and logical reasoning
- National significance — taken by millions of students, with well-established scoring rubrics
Evaluation Design
We evaluated leading LLMs on both New Paper I (新I卷) and New Paper II (新II卷), using two different prompt formats:
LaTeX Format
Mathematical expressions rendered in standard LaTeX notation (e.g., \(\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1\)).
Escape Character Format
Mathematical expressions using text-based representations with escape characters.
This dual-format design reveals how sensitive models are to prompt formatting — an often-overlooked factor in mathematical evaluation.
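As a rough illustration of the dual-format idea, the sketch below builds two prompt variants from a single question. The report does not define the escape character format precisely, so `to_escaped_prompt` assumes it means the LaTeX source with every backslash doubled, as it would appear inside a JSON or Python string literal. The function names and the prompt template are hypothetical, not taken from the evaluation code.

```python
# Hypothetical sketch: one question rendered in two prompt formats.
# ASSUMPTION: "escape character format" means LaTeX source with each
# backslash doubled (as in a JSON/Python string literal). The report
# itself does not specify the encoding.

PROMPT_TEMPLATE = "Solve the following problem and state the final answer.\n\n{question}"


def to_latex_prompt(question_latex: str) -> str:
    """Return the prompt with the math left in standard LaTeX notation."""
    return PROMPT_TEMPLATE.format(question=question_latex)


def to_escaped_prompt(question_latex: str) -> str:
    """Return the prompt with every backslash escaped (doubled)."""
    escaped = question_latex.replace("\\", "\\\\")
    return PROMPT_TEMPLATE.format(question=escaped)


# Illustrative text built from the ellipse expression quoted above,
# not an actual exam question.
question = r"Consider the ellipse \(\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1\)."

latex_prompt = to_latex_prompt(question)
escaped_prompt = to_escaped_prompt(question)

print(latex_prompt)
print(escaped_prompt)
```

Sending both variants of every question to the same model, with everything else held fixed, isolates the effect of formatting from the effect of the question itself.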
Key Findings
- Model performance varies significantly between the two prompt formats
- Some models show strong LaTeX comprehension but struggle with text-based math notation (and vice versa)
- The results highlight the importance of standardized input formatting for fair mathematical evaluation
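One way to quantify the format sensitivity described in these findings is to compare a model's accuracy across the two formats. The sketch below is a minimal version of that comparison. It assumes each answer has already been graded as simply correct or incorrect, whereas rubric-based Gaokao scoring may award partial credit. The record keys, format labels, and model name are hypothetical, and the sample records are placeholders, not results from this evaluation.

```python
from collections import defaultdict


def accuracy_by_format(records):
    """Compute accuracy for each (model, format) pair from graded records.

    Each record is a dict with the hypothetical keys
    "model", "format", and "correct" (a bool).
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for record in records:
        key = (record["model"], record["format"])
        totals[key] += 1
        hits[key] += int(record["correct"])
    return {key: hits[key] / totals[key] for key in totals}


def format_gap(acc, model, fmt_a="latex", fmt_b="escaped"):
    """Accuracy difference between two formats (positive means fmt_a is better)."""
    return acc[(model, fmt_a)] - acc[(model, fmt_b)]


# Placeholder records for illustration only; NOT results from the report.
records = [
    {"model": "model_x", "format": "latex", "correct": True},
    {"model": "model_x", "format": "latex", "correct": False},
    {"model": "model_x", "format": "escaped", "correct": False},
    {"model": "model_x", "format": "escaped", "correct": False},
]

acc = accuracy_by_format(records)
gap = format_gap(acc, "model_x")
print(gap)
```

A gap near zero would suggest the model is robust to formatting, while a large gap in either direction would indicate the kind of format dependence noted above.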
Detailed results and leaderboard rankings are available in our GitHub repository.