Knowledge Index of Noah's Ark

A High-Density Benchmark Systematically Mapping 261 Disciplines

Overview

KINA is a high-density knowledge benchmark covering 261 fine-grained disciplines, and the first to treat disciplinary representativeness as a core metric. Its data collection pipeline is reusable and game-theoretic, designed to mitigate annotation vulnerabilities.

261 Disciplines

899 Questions

10 Options

Benchmark Comparison

[Figure: bubble chart comparing KINA (ours) with other benchmarks. Bubble size = question count; a lower score means the benchmark is more challenging for SOTA models.]

Leaderboard

We evaluate 45 models from 13 major AI labs on KINA. Scores are reported as avg@4 accuracy.
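As a rough illustration of the avg@4 metric, the sketch below assumes it means the following: each question is answered 4 times independently, per-question accuracy is the fraction of correct samples, and the reported score is the mean over questions. The function name `avg_at_k` and the toy data are hypothetical and are not taken from the KINA codebase or results.

```python
from statistics import mean


def avg_at_k(samples: list[list[bool]]) -> float:
    """Mean accuracy over questions, where each question has k sampled answers.

    samples[q][i] is True if the i-th sampled answer to question q is correct.
    Assumption: avg@k averages correctness over the k samples of each question,
    then averages over questions.
    """
    per_question = [mean(1.0 if ok else 0.0 for ok in runs) for runs in samples]
    return mean(per_question)


# Toy data for three questions with k = 4 samples each (not KINA results).
toy_samples = [
    [True, True, False, True],     # 3 of 4 correct
    [False, False, False, False],  # 0 of 4 correct
    [True, True, True, True],      # 4 of 4 correct
]
score = avg_at_k(toy_samples)

# With 10 options and (assumed) exactly one correct answer per question,
# uniform random guessing has an expected accuracy of 1/10.
random_baseline = 1 / 10
```

Averaging over several samples reduces the variance that a single sampled run would introduce, which matters on a benchmark with only 899 questions.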

Rank	Model	Type	ALL	Agr.	Econ.	Edu.	Eng.	Hist.	Law	Arts	Mgt.	Med.	Phil.	Sci.	Soc.

Model types: Closed-Source, Open-Source. The top three ranks are marked Gold, Silver, and Bronze. Bold = best in column.

Data Sample

Data Collection Pipeline

Score Distribution

[Figure: violin plots of per-model score distributions.]

Discipline Coverage

We curate a hierarchical taxonomy of disciplines grounded in the U.S. Classification of Instructional Programs (CIP).
The finalized dataset comprises 899 instances, distributed across 12 disciplines, 70 fields, and 261 fine-grained subfields.
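The three-level hierarchy can be pictured as each question carrying a (discipline, field, subfield) label. The sketch below counts distinct entries at each level and computes the average number of questions per subfield from the totals stated above. The record layout, the function name `level_counts`, and the field and subfield names are hypothetical placeholders, not actual KINA data; only the discipline abbreviations come from the leaderboard columns.

```python
from collections import Counter

# Hypothetical labelled instances: (discipline, field, subfield).
# Field and subfield names are illustrative placeholders only.
records = [
    ("Sci.", "Physics", "Optics"),
    ("Sci.", "Physics", "Optics"),
    ("Sci.", "Physics", "Acoustics"),
    ("Sci.", "Chemistry", "Organic Chemistry"),
    ("Law", "Legal Studies", "Tax Law"),
]


def level_counts(rows: list[tuple[str, str, str]]) -> tuple[int, int, int]:
    """Count distinct disciplines, fields, and subfields in the records."""
    disciplines = {d for d, _, _ in rows}
    fields = {(d, f) for d, f, _ in rows}        # a field is scoped to its discipline
    subfields = {(d, f, s) for d, f, s in rows}  # a subfield is scoped to its field
    return len(disciplines), len(fields), len(subfields)


counts = level_counts(records)
questions_per_subfield = Counter(records)

# Using the totals reported for the full dataset: 899 questions, 261 subfields.
avg_questions_per_subfield = 899 / 261  # roughly 3.44
```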

Model Scores Over Time

Inference Cost Distribution

Panels: Qwen3 · Qwen3.5

Token Length vs. Performance

BibTeX

If you find KINA useful in your research, please cite our paper:

@misc{anonymous2026kina,
  title  = {KINA: Knowledge Index of Noah's Ark},
  author = {Anonymous Authors},
  year   = {2026},
  note   = {Under review}
}