[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"blog":3},{"title":4,"desc":5,"bannerImg":6,"date":7,"orgImgLinks":8,"bannerLinks":9,"blogCategory":10,"category":11,"weight":12,"externalUrl":11,"links":13,"description":5,"content":14,"tag1":608,"tag2":609,"resLinks":612},"Data Curation Beats Scaling: Why 20K High-Quality Samples Outperform 46K Noisy Ones in AI Image Editing","Break the \"scaling trap\" in generative AI. This article details how the 2077AI team used the EditReward reward model and a meticulous multi-dimensional scoring rubric to curate high-fidelity data. Learn how this \"Digital Data Curator\" enables automatic synthesis pipelines, proving that quality is the new scale for building state-of-the-art, open-source image editing models.","https:\u002F\u002Fdoxhub.s3.us-east-1.amazonaws.com\u002F2077ai\u002F20260114\u002Feditreward1%20banner%20image","2026-01-14","[]","{}","Research","",0,"{\"homepage\":\"\",\"github\":\"\",\"huggingface\":\"\",\"x\":\"\",\"discord\":\"\",\"arxiv\":\"\"}",{"data":15,"body":18,"toc":600},{"title":16,"description":17},"Data Curation in Image Editing: Why 20K High-Quality Samples Outperform 46K Noisy Ones","In the rapidly evolving landscape of generative AI, the prevailing mantra has long been \"scaling is all you need.\" We have scaled parameters, compute, and data to unprecedented levels. 
However, in specialized domains like Instruction-Guided Image Editing, we are discovering a counter-intuitive truth: more data often leads to diminishing returns if it isn't the right data.",{"type":19,"children":20},"root",[21,37,47,56,65,91,103,112,177,186,196,205,214,222,237,279,288,298,307,316,383,390,402,411,421,430,507,516,526,535,544,593],{"type":22,"tag":23,"props":24,"children":29},"element","h1",{"className":25,"style":27,"id":28},[26],"heading__h1","text-align: left;","data-curation-in-image-editing-why-20k-high-quality-samples-outperform-46k-noisy-ones",[30],{"type":22,"tag":31,"props":32,"children":34},"span",{"style":33},"white-space: pre-wrap;",[35],{"type":36,"value":16},"text",{"type":22,"tag":38,"props":39,"children":42},"p",{"className":40},[41],"doxhub-editor-paragraph",[43],{"type":22,"tag":31,"props":44,"children":45},{"style":33},[46],{"type":36,"value":17},{"type":22,"tag":38,"props":48,"children":50},{"className":49},[41],[51],{"type":22,"tag":31,"props":52,"children":53},{"style":33},[54],{"type":36,"value":55},"The open-source community currently faces a significant bottleneck. While large closed-source models like GPT-Image-1 and Seedream set new benchmarks, open-source models struggle to keep pace.",{"type":22,"tag":38,"props":57,"children":59},{"className":58},[41],[60],{"type":22,"tag":31,"props":61,"children":62},{"style":33},[63],{"type":36,"value":64},"The core issue? A lack of high-fidelity training data.",{"type":22,"tag":38,"props":66,"children":68},{"className":67},[41],[69,74,86],{"type":22,"tag":31,"props":70,"children":71},{"style":33},[72],{"type":36,"value":73},"To bridge this gap, the 2077AI research team released EditReward, a human-aligned reward model designed to serve as a fair, consistent, and highly accurate critic for instruction-guided image editing. 
More importantly, our work proposes a paradigm shift toward rigorous",{"type":22,"tag":75,"props":76,"children":77},"b",{},[78],{"type":22,"tag":79,"props":80,"children":83},"strong",{"className":81,"style":33},[82],"text__bold",[84],{"type":36,"value":85}," ",{"type":22,"tag":31,"props":87,"children":88},{"style":33},[89],{"type":36,"value":90},"data curation—demonstrating that a \"Data Diet\" can be more powerful than a \"Data Feast\".",{"type":22,"tag":92,"props":93,"children":97},"h2",{"className":94,"id":96},[95],"heading__h2","the-scaling-trap-the-hidden-challenge-of-data-noise",[98],{"type":22,"tag":31,"props":99,"children":100},{"style":33},[101],{"type":36,"value":102},"The Scaling Trap: The Hidden Challenge of Data Noise",{"type":22,"tag":38,"props":104,"children":106},{"className":105},[41],[107],{"type":22,"tag":31,"props":108,"children":109},{"style":33},[110],{"type":36,"value":111},"The easiest way to expand an image editing dataset is through automated synthesis or harvesting web-crawled pairs. 
However, these methods are inherently \"noisy.\" When we talk about noise in image editing, we aren't just referring to pixel grain; we are talking about semantic and physical misalignment:",{"type":22,"tag":113,"props":114,"children":117},"ol",{"className":115},[116],"doxhub-editor-ol",[118,139,158],{"type":22,"tag":119,"props":120,"children":124},"li",{"value":121,"className":122},"1",[123],"doxhub-editor-list-item",[125,134],{"type":22,"tag":75,"props":126,"children":127},{},[128],{"type":22,"tag":79,"props":129,"children":131},{"className":130,"style":33},[82],[132],{"type":36,"value":133},"Instruction Mismatch",{"type":22,"tag":31,"props":135,"children":136},{"style":33},[137],{"type":36,"value":138},": Failure to capture semantic alignment with user instructions.",{"type":22,"tag":119,"props":140,"children":143},{"value":141,"className":142},"2",[123],[144,153],{"type":22,"tag":75,"props":145,"children":146},{},[147],{"type":22,"tag":79,"props":148,"children":150},{"className":149,"style":33},[82],[151],{"type":36,"value":152},"Over-Editing (Lack of Exclusivity)",{"type":22,"tag":31,"props":154,"children":155},{"style":33},[156],{"type":36,"value":157},": Making unprompted or unrelated changes to the rest of the image.",{"type":22,"tag":119,"props":159,"children":162},{"value":160,"className":161},"3",[123],[163,172],{"type":22,"tag":75,"props":164,"children":165},{},[166],{"type":22,"tag":79,"props":167,"children":169},{"className":168,"style":33},[82],[170],{"type":36,"value":171},"Physical Hallucinations",{"type":22,"tag":31,"props":173,"children":174},{"style":33},[175],{"type":36,"value":176},": Issues with plausibility, such as a lack of consistency with real-world physics like lighting and shadows.",{"type":22,"tag":38,"props":178,"children":180},{"className":179},[41],[181],{"type":22,"tag":31,"props":182,"children":183},{"style":33},[184],{"type":36,"value":185},"When a model is trained on tens of thousands of these noisy samples, it fails to learn 
the boundary between a successful edit and a failure. It learns to be \"vaguely correct\" but never \"precisely aligned.\" This is where data curation becomes the primary lever for performance.",{"type":22,"tag":92,"props":187,"children":190},{"className":188,"id":189},[95],"the-editreward-curation-protocol-meticulous-expert-alignment",[191],{"type":22,"tag":31,"props":192,"children":193},{"style":33},[194],{"type":36,"value":195},"The EditReward Curation Protocol: Meticulous Expert Alignment",{"type":22,"tag":38,"props":197,"children":199},{"className":198},[41],[200],{"type":22,"tag":31,"props":201,"children":202},{"style":33},[203],{"type":36,"value":204},"To solve the noise problem, the 2077AI research team didn't just collect more data; we redefined how data is curated. We built EDITREWARD-DATA, a large-scale human preference dataset containing over 200K pairs.",{"type":22,"tag":38,"props":206,"children":208},{"className":207},[41],[209],{"type":22,"tag":31,"props":210,"children":211},{"style":33},[212],{"type":36,"value":213},"The secret sauce lies in our multi-dimensional scoring criteria. 
Instead of a simple binary \"good or bad\" label, we employed trained experts to act as data curators, evaluating samples on a granular scale across two primary dimensions:",{"type":22,"tag":38,"props":215,"children":217},{"className":216},[41],[218],{"type":22,"tag":219,"props":220,"children":221},"br",{},[],{"type":22,"tag":223,"props":224,"children":225},"figure",{},[226,232],{"type":22,"tag":227,"props":228,"children":231},"img",{"src":229,"alt":230},"https:\u002F\u002Fdoxhub.s3.us-east-1.amazonaws.com\u002F2077ai\u002F20260114\u002FA%20look%20into%20the%20meticulous%20annotation%20process%20on%20the%20platform%20provided%20by%20Abaka%20AI%2C%20a%20key%20contributor%20from%202077AI.","A look into the meticulous annotation process on the platform provided by Abaka AI, a key contributor from 2077AI.",[],{"type":22,"tag":233,"props":234,"children":235},"figcaption",{},[236],{"type":36,"value":230},{"type":22,"tag":238,"props":239,"children":242},"ul",{"className":240},[241],"doxhub-editor-ul",[243,261],{"type":22,"tag":119,"props":244,"children":246},{"value":121,"className":245},[123],[247,256],{"type":22,"tag":75,"props":248,"children":249},{},[250],{"type":22,"tag":79,"props":251,"children":253},{"className":252,"style":33},[82],[254],{"type":36,"value":255},"Instruction Following",{"type":22,"tag":31,"props":257,"children":258},{"style":33},[259],{"type":36,"value":260},": Assessing the accuracy, completeness, and exclusivity of the edit relative to the text prompt.",{"type":22,"tag":119,"props":262,"children":264},{"value":141,"className":263},[123],[265,274],{"type":22,"tag":75,"props":266,"children":267},{},[268],{"type":22,"tag":79,"props":269,"children":271},{"className":270,"style":33},[82],[272],{"type":36,"value":273},"Visual Quality",{"type":22,"tag":31,"props":275,"children":276},{"style":33},[277],{"type":36,"value":278},": Evaluating plausibility, physical consistency (lighting\u002Fshadows), and the absence of technical 
artifacts.",{"type":22,"tag":38,"props":280,"children":282},{"className":281},[41],[283],{"type":22,"tag":31,"props":284,"children":285},{"style":33},[286],{"type":36,"value":287},"Detailed scoring rubrics ranging from 1 (Very Poor) to 4 (Very Good) ensure high alignment with human judgment and minimize label noise. This meticulous data curation ensures that the reward signal is exceptionally clean.",{"type":22,"tag":92,"props":289,"children":292},{"className":290,"id":291},[95],"proof-of-concept-the-20k-vs-46k-experiment",[293],{"type":22,"tag":31,"props":294,"children":295},{"style":33},[296],{"type":36,"value":297},"Proof of Concept: The 20K vs. 46K Experiment",{"type":22,"tag":38,"props":299,"children":301},{"className":300},[41],[302],{"type":22,"tag":31,"props":303,"children":304},{"style":33},[305],{"type":36,"value":306},"The most compelling evidence for the \"Data Diet\" comes from a direct experiment conducted by the 2077AI research team. We used the existing ShareGPT-4o-Image dataset, a collection of 46,000 noisy image editing pairs.",{"type":22,"tag":38,"props":308,"children":310},{"className":309},[41],[311],{"type":22,"tag":31,"props":312,"children":313},{"style":33},[314],{"type":36,"value":315},"Using EditReward as an automated data curator, we ranked these 46K samples and selected only the 20,000 highest-scoring pairs. We then fine-tuned Step1X-Edit on this curated subset and compared it with fine-tuning on the full dataset. 
The results on the GEdit-Bench Overall score were stark:",{"type":22,"tag":238,"props":317,"children":319},{"className":318},[241],[320,352],{"type":22,"tag":119,"props":321,"children":323},{"value":121,"className":322},[123],[324,333,338,347],{"type":22,"tag":75,"props":325,"children":326},{},[327],{"type":22,"tag":79,"props":328,"children":330},{"className":329,"style":33},[82],[331],{"type":36,"value":332},"Training on the full noisy set (46K)",{"type":22,"tag":31,"props":334,"children":335},{"style":33},[336],{"type":36,"value":337},": Yielded a GEdit-Bench score of ",{"type":22,"tag":75,"props":339,"children":340},{},[341],{"type":22,"tag":79,"props":342,"children":344},{"className":343,"style":33},[82],[345],{"type":36,"value":346},"6.8",{"type":22,"tag":31,"props":348,"children":349},{"style":33},[350],{"type":36,"value":351},".",{"type":22,"tag":119,"props":353,"children":355},{"value":141,"className":354},[123],[356,365,370,379],{"type":22,"tag":75,"props":357,"children":358},{},[359],{"type":22,"tag":79,"props":360,"children":362},{"className":361,"style":33},[82],[363],{"type":36,"value":364},"Training on the curated subset (20K)",{"type":22,"tag":31,"props":366,"children":367},{"style":33},[368],{"type":36,"value":369},": Yielded a significantly higher score of ",{"type":22,"tag":75,"props":371,"children":372},{},[373],{"type":22,"tag":79,"props":374,"children":376},{"className":375,"style":33},[82],[377],{"type":36,"value":378},"7.1",{"type":22,"tag":31,"props":380,"children":381},{"style":33},[382],{"type":36,"value":351},{"type":22,"tag":38,"props":384,"children":386},{"className":385},[41],[387],{"type":22,"tag":219,"props":388,"children":389},{},[],{"type":22,"tag":223,"props":391,"children":392},{},[393,398],{"type":22,"tag":227,"props":394,"children":397},{"src":395,"alt":396},"https:\u002F\u002Fdoxhub.s3.us-east-1.amazonaws.com\u002F2077ai\u002F20260114\u002FComprehensive%20comparison%20of%20state-of-the-art%20models%20on%20different%20versions%20of%20the%20GEdit-Bench%20benchmark%20shows%20that%20our%20model%20significantly%20improve%20the%20base%20model","Comprehensive comparison of state-of-the-art models on different versions of the GEdit-Bench benchmark shows that our model significantly improves on the base model.",[],{"type":22,"tag":233,"props":399,"children":400},{},[401],{"type":36,"value":396},{"type":22,"tag":38,"props":403,"children":405},{"className":404},[41],[406],{"type":22,"tag":31,"props":407,"children":408},{"style":33},[409],{"type":36,"value":410},"By discarding more than half of the data, the model's performance improved significantly. This suggests that low-quality data acts as \"noise interference,\" preventing the model from converging on high-quality features. 
A high-quality reward signal is a key ingredient for training powerful editing models.",{"type":22,"tag":92,"props":412,"children":415},{"className":413,"id":414},[95],"the-future-building-automatic-data-synthesis-pipelines",[416],{"type":22,"tag":31,"props":417,"children":418},{"style":33},[419],{"type":36,"value":420},"🚀The Future: Building Automatic Data Synthesis Pipelines",{"type":22,"tag":38,"props":422,"children":424},{"className":423},[41],[425],{"type":22,"tag":31,"props":426,"children":427},{"style":33},[428],{"type":36,"value":429},"The real-world utility of EditReward extends beyond being a benchmark—it is an engine for automatic data curation. For labs looking to scale their models, EditReward enables a high-throughput synthesis pipeline:",{"type":22,"tag":113,"props":431,"children":433},{"className":432},[116],[434,452,470,488],{"type":22,"tag":119,"props":435,"children":437},{"value":121,"className":436},[123],[438,447],{"type":22,"tag":75,"props":439,"children":440},{},[441],{"type":22,"tag":79,"props":442,"children":444},{"className":443,"style":33},[82],[445],{"type":36,"value":446},"Synthesis",{"type":22,"tag":31,"props":448,"children":449},{"style":33},[450],{"type":36,"value":451},": Generate diverse candidate pools from multiple state-of-the-art models.",{"type":22,"tag":119,"props":453,"children":455},{"value":141,"className":454},[123],[456,465],{"type":22,"tag":75,"props":457,"children":458},{},[459],{"type":22,"tag":79,"props":460,"children":462},{"className":461,"style":33},[82],[463],{"type":36,"value":464},"Automated Data Curation",{"type":22,"tag":31,"props":466,"children":467},{"style":33},[468],{"type":36,"value":469},": Use EditReward to score and rank these candidates based on human 
alignment.",{"type":22,"tag":119,"props":471,"children":473},{"value":160,"className":472},[123],[474,483],{"type":22,"tag":75,"props":475,"children":476},{},[477],{"type":22,"tag":79,"props":478,"children":480},{"className":479,"style":33},[82],[481],{"type":36,"value":482},"High-Fidelity Distillation",{"type":22,"tag":31,"props":484,"children":485},{"style":33},[486],{"type":36,"value":487},": Set a high-quality threshold.",{"type":22,"tag":119,"props":489,"children":492},{"value":490,"className":491},"4",[123],[493,502],{"type":22,"tag":75,"props":494,"children":495},{},[496],{"type":22,"tag":79,"props":497,"children":499},{"className":498,"style":33},[82],[500],{"type":36,"value":501},"Iterative Training",{"type":22,"tag":31,"props":503,"children":504},{"style":33},[505],{"type":36,"value":506},": Select high-quality subsets for training next-generation models.",{"type":22,"tag":38,"props":508,"children":510},{"className":509},[41],[511],{"type":22,"tag":31,"props":512,"children":513},{"style":33},[514],{"type":36,"value":515},"This pipeline transforms the reward model into a \"Digital Data Curator,\" allowing the community to build more high-quality image editing training datasets.",{"type":22,"tag":92,"props":517,"children":520},{"className":518,"id":519},[95],"conclusion-quality-is-the-new-scale",[521],{"type":22,"tag":31,"props":522,"children":523},{"style":33},[524],{"type":36,"value":525},"Conclusion: Quality is the New Scale",{"type":22,"tag":38,"props":527,"children":529},{"className":528},[41],[530],{"type":22,"tag":31,"props":531,"children":532},{"style":33},[533],{"type":36,"value":534},"The findings from the EditReward research serve as a wake-up call for the \"more is better\" crowd. As we approach the limits of raw data scraping, the focus must shift to the refinement of that data. 
Data curation is not a secondary task; it is the primary architecture of model intelligence.",{"type":22,"tag":38,"props":536,"children":538},{"className":537},[41],[539],{"type":22,"tag":31,"props":540,"children":541},{"style":33},[542],{"type":36,"value":543},"By adopting a \"Data Diet\" and prioritizing expert-aligned samples, we can build models that don't just \"read\" our instructions but truly understand the nuance of our creative intent.",{"type":22,"tag":38,"props":545,"children":547},{"className":546},[41],[548,553,566,585],{"type":22,"tag":31,"props":549,"children":550},{"style":33},[551],{"type":36,"value":552},"If you want to learn more about EDITREWARD, read our previous blog post,",{"type":22,"tag":554,"props":555,"children":561},"a",{"href":556,"rel":557,"className":559},"https:\u002F\u002Fwww.2077ai.com\u002Fblog\u002Fintroducing-editreward-human-aligned-ai-for-image-editing?utm_source=officialwebsite&utm_medium=blog&utm_campaign=editreward1",[558],"noreferrer",[560],"text__link",[562],{"type":22,"tag":31,"props":563,"children":564},{"style":33},[565],{"type":36,"value":85},{"type":22,"tag":554,"props":567,"children":570},{"href":556,"rel":568,"className":569},[558],[560],[571],{"type":22,"tag":572,"props":573,"children":574},"u",{},[575],{"type":22,"tag":75,"props":576,"children":577},{},[578],{"type":22,"tag":79,"props":579,"children":582},{"className":580,"style":33},[82,581],"text__underline",[583],{"type":36,"value":584},"the general introduction to 
EDITREWARD",{"type":22,"tag":75,"props":586,"children":587},{},[588],{"type":22,"tag":79,"props":589,"children":591},{"className":590,"style":33},[82],[592],{"type":36,"value":351},{"type":22,"tag":38,"props":594,"children":596},{"className":595},[41],[597],{"type":22,"tag":219,"props":598,"children":599},{},[],{"title":11,"searchDepth":601,"depth":601,"links":602},2,[603,604,605,606,607],{"id":96,"depth":601,"text":102},{"id":189,"depth":601,"text":195},{"id":291,"depth":601,"text":297},{"id":414,"depth":601,"text":420},{"id":519,"depth":601,"text":525},"dataset",[610,611],"image","multimodal",{"homepage":613,"arxiv":614,"github":615,"huggingface":616},"https:\u002F\u002Ftiger-ai-lab.github.io\u002FEditReward\u002F","https:\u002F\u002Farxiv.org\u002Fabs\u002F2509.26346","https:\u002F\u002Fgithub.com\u002FTIGER-AI-Lab\u002FEditReward","https:\u002F\u002Fhuggingface.co\u002Fcollections\u002FTIGER-Lab\u002Feditreward"]