[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"blog":3},{"title":4,"desc":5,"bannerImg":6,"date":7,"links":8,"description":5,"content":9,"tag1":292},"Google Gemini 3 Sets New SOTA on OmniDocBench: The New Standard for Document AI","Google DeepMind’s latest Gemini 3 technical report selects 2077AI’s OmniDocBench 1.5 as a core benchmark for OCR. See how the new model performed and why top labs are choosing our datasets.","https:\u002F\u002Fdoxhub.s3.us-east-1.amazonaws.com\u002F2077ai\u002FBanner_blog\u002Fbanner_omindoc_gemini.png","2025-11-19","{\"github\":\"https:\u002F\u002Fgithub.com\u002Fopendatalab\u002FOmniDocBench\",\"huggingface\":\"https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fopendatalab\u002FOmniDocBench\", \"arxiv\":\"https:\u002F\u002Farxiv.org\u002Fabs\u002F2412.07626\",\"homepage\":\"https:\u002F\u002Fwww.2077ai.com\u002Fdataset\u002Fdataset-omnidocbench\"}",{"data":10,"body":12,"toc":283},{"title":4,"description":11},"In the race towards Artificial General Intelligence (AGI), benchmarks are the compass. 
They tell us where we are and how far we have to go.",{"type":13,"children":14},"root",[15,23,28,45,50,55,62,67,76,89,94,99,105,110,119,132,137,148,153,164,169,175,199,204,209,215,232,268,274],{"type":16,"tag":17,"props":18,"children":20},"element","h1",{"id":19},"google-gemini-3-sets-new-sota-on-omnidocbench-the-new-standard-for-document-ai",[21],{"type":22,"value":4},"text",{"type":16,"tag":24,"props":25,"children":26},"p",{},[27],{"type":22,"value":11},{"type":16,"tag":29,"props":30,"children":35},"div",{"className":31,"style":34},[32,33],"img-wrap","center","width: 100%; position: relative",[36,38],{"type":22,"value":37},"\n  ",{"type":16,"tag":39,"props":40,"children":44},"img",{"src":41,"alt":42,"style":43},"https:\u002F\u002Fdoxhub.s3.us-east-1.amazonaws.com\u002F2077ai\u002F20251119\u002Fgemini01.webp","gemini3","width: 100%; max-height: 60vh; object-fit: contain; background: #141414; border-radius: 8px",[],{"type":16,"tag":24,"props":46,"children":47},{},[48],{"type":22,"value":49},"Yesterday, Google DeepMind unveiled Gemini 3, their most capable multimodal AI model to date. 
In their comprehensive technical report, we were thrilled to see a familiar name: OmniDocBench 1.5, developed by 2077AI, was selected as a core benchmark to evaluate the model's Optical Character Recognition (OCR) and document understanding capabilities.",{"type":16,"tag":24,"props":51,"children":52},{},[53],{"type":22,"value":54},"This follows a recent citation by DeepSeek-OCR, marking a significant trend: OmniDocBench has effectively become the industry standard for evaluating complex, real-world Document AI.",{"type":16,"tag":56,"props":57,"children":59},"h2",{"id":58},"the-results-a-new-state-of-the-art",[60],{"type":22,"value":61},"The Results: A New State-of-the-Art",{"type":16,"tag":24,"props":63,"children":64},{},[65],{"type":22,"value":66},"Gemini 3's performance on our benchmark is nothing short of impressive.",{"type":16,"tag":29,"props":68,"children":70},{"className":69,"style":34},[32,33],[71,72],{"type":22,"value":37},{"type":16,"tag":39,"props":73,"children":75},{"src":74,"alt":42,"style":43},"https:\u002F\u002Fdoxhub.s3.us-east-1.amazonaws.com\u002F2077ai\u002F20251119\u002Fgemini02.webp",[],{"type":16,"tag":24,"props":77,"children":78},{},[79,81,87],{"type":22,"value":80},"According to the technical report, ",{"type":16,"tag":82,"props":83,"children":84},"strong",{},[85],{"type":22,"value":86},"Gemini 3 Pro achieved an Overall Edit Distance of 0.115",{"type":22,"value":88}," on OmniDocBench 1.5 (where lower is better). This score sets a new State-of-the-Art (SOTA), surpassing other frontier models listed in the comparison, such as Claude Sonnet 4.5 (0.145) and GPT-5.1 (0.147).",{"type":16,"tag":24,"props":90,"children":91},{},[92],{"type":22,"value":93},"Table from the Gemini 3 Technical Report showing performance across key benchmarks, including OmniDocBench 1.5.",{"type":16,"tag":24,"props":95,"children":96},{},[97],{"type":22,"value":98},"What does an Edit Distance of 0.115 mean? 
It means that even when faced with OmniDocBench’s most challenging samples (handwritten notes, multi-column newspapers, and complex financial tables), Gemini 3 Pro's output stays close to the ground truth: the score is a normalized edit distance, where 0 is an exact match and lower values mean fewer corrections are needed. It demonstrates a significant leap in the model's ability to \"see\" and \"read\" documents just as a human would.",{"type":16,"tag":56,"props":100,"children":102},{"id":101},"why-top-labs-choose-omnidocbench",[103],{"type":22,"value":104},"Why Top Labs Choose OmniDocBench",{"type":16,"tag":24,"props":106,"children":107},{},[108],{"type":22,"value":109},"Why are industry giants like Google DeepMind and DeepSeek turning to OmniDocBench instead of older, traditional datasets?",{"type":16,"tag":29,"props":111,"children":113},{"className":112,"style":34},[32,33],[114,115],{"type":22,"value":37},{"type":16,"tag":39,"props":116,"children":118},{"src":117,"alt":42,"style":43},"https:\u002F\u002Fdoxhub.s3.us-east-1.amazonaws.com\u002F2077ai\u002F20251119\u002Fgemini03.webp",[],{"type":16,"tag":120,"props":121,"children":122},{},[123],{"type":16,"tag":124,"props":125,"children":126},{},[127],{"type":16,"tag":82,"props":128,"children":129},{},[130],{"type":22,"value":131},"Real-World Diversity",{"type":16,"tag":24,"props":133,"children":134},{},[135],{"type":22,"value":136},"Traditional OCR benchmarks often focus on clean, single-column academic papers. OmniDocBench was built to break models. It covers 9 distinct document types, including slide decks, textbooks, magazines, and distorted scans. If a model can't handle a messy, rotated receipt or a dense financial table, it will fail our test.",{"type":16,"tag":120,"props":138,"children":139},{},[140],{"type":16,"tag":124,"props":141,"children":142},{},[143],{"type":16,"tag":82,"props":144,"children":145},{},[146],{"type":22,"value":147},"Fine-Grained Evaluation",{"type":16,"tag":24,"props":149,"children":150},{},[151],{"type":22,"value":152},"We don't just give a pass\u002Ffail. 
OmniDocBench offers evaluation across 19 layout categories and 15 attribute labels. This allows researchers at labs like Google to pinpoint exactly where their model excels (e.g., formula recognition) and where it might still struggle (e.g., complex table structures).",{"type":16,"tag":120,"props":154,"children":155},{},[156],{"type":16,"tag":124,"props":157,"children":158},{},[159],{"type":16,"tag":82,"props":160,"children":161},{},[162],{"type":22,"value":163},"The \"Acid Test\" for Multimodality",{"type":16,"tag":24,"props":165,"children":166},{},[167],{"type":22,"value":168},"As models evolve from simple LLMs to true Multimodal LLMs (MLLMs), the ability to process visual text is paramount. OmniDocBench provides the rigorous, high-fidelity ground truth needed to train and verify these next-generation capabilities.",{"type":16,"tag":56,"props":170,"children":172},{"id":171},"building-the-infrastructure-for-agi",[173],{"type":22,"value":174},"Building the Infrastructure for AGI",{"type":16,"tag":24,"props":176,"children":177},{},[178,180,185,187,192,194],{"type":22,"value":179},"The adoption of OmniDocBench by both ",{"type":16,"tag":82,"props":181,"children":182},{},[183],{"type":22,"value":184},"DeepSeek",{"type":22,"value":186}," and ",{"type":16,"tag":82,"props":188,"children":189},{},[190],{"type":22,"value":191},"Google",{"type":22,"value":193}," within the same quarter validates our core mission at 2077AI: ",{"type":16,"tag":82,"props":195,"children":196},{},[197],{"type":22,"value":198},"Great AI is built on great data.",{"type":16,"tag":24,"props":200,"children":201},{},[202],{"type":22,"value":203},"We believe that open, high-quality, and challenging benchmarks are essential for the transparency and progress of the AI field. 
By providing the tools to measure \"intelligence\" accurately, we help the entire ecosystem move forward faster and more safely.",{"type":16,"tag":24,"props":205,"children":206},{},[207],{"type":22,"value":208},"We congratulate the Google DeepMind team on this remarkable achievement with Gemini 3. We are proud to provide the ruler by which the world's smartest models are measured.",{"type":16,"tag":56,"props":210,"children":212},{"id":211},"explore-the-benchmark",[213],{"type":22,"value":214},"Explore the Benchmark",{"type":16,"tag":24,"props":216,"children":217},{},[218,220,225,226,230],{"type":22,"value":219},"Ready to test your models against the same rigorous standard trusted by ",{"type":16,"tag":82,"props":221,"children":222},{},[223],{"type":22,"value":224},"Google DeepMind",{"type":22,"value":186},{"type":16,"tag":82,"props":227,"children":228},{},[229],{"type":22,"value":184},{"type":22,"value":231},"?",{"type":16,"tag":120,"props":233,"children":234},{},[235,253],{"type":16,"tag":124,"props":236,"children":237},{},[238,243,245],{"type":16,"tag":82,"props":239,"children":240},{},[241],{"type":22,"value":242},"🌐 OmniDocBench Homepage:",{"type":22,"value":244}," ",{"type":16,"tag":246,"props":247,"children":251},"a",{"href":248,"rel":249},"https:\u002F\u002Fwww.2077ai.com\u002Fdataset\u002Fdataset-omnidocbench",[250],"nofollow",[252],{"type":22,"value":248},{"type":16,"tag":124,"props":254,"children":255},{},[256,261,262],{"type":16,"tag":82,"props":257,"children":258},{},[259],{"type":22,"value":260},"📝 Technical Deep 
Dive:",{"type":22,"value":244},{"type":16,"tag":246,"props":263,"children":266},{"href":264,"rel":265},"https:\u002F\u002Fwww.2077ai.com\u002Fblog\u002FOmniDocBench",[250],[267],{"type":22,"value":264},{"type":16,"tag":56,"props":269,"children":271},{"id":270},"acknowledgments",[272],{"type":22,"value":273},"Acknowledgments",{"type":16,"tag":24,"props":275,"children":276},{},[277],{"type":16,"tag":278,"props":279,"children":280},"em",{},[281],{"type":22,"value":282},"OmniDocBench is a collaborative effort involving researchers from 2077AI, Shanghai AI Laboratory, and our core data partner, Abaka AI, whose advanced data construction pipeline made this high-fidelity benchmark possible.",{"title":284,"searchDepth":285,"depth":285,"links":286},"",2,[287,288,289,290,291],{"id":58,"depth":285,"text":61},{"id":101,"depth":285,"text":104},{"id":171,"depth":285,"text":174},{"id":211,"depth":285,"text":214},{"id":270,"depth":285,"text":273},"news"]