Redefining AI Collaboration: Introducing Chain-of-Agents and Agent Foundation Models
The field of artificial intelligence is moving at an incredible pace, with AI agents capable of tackling complex, multi-step tasks becoming a reality. At 2077AI, we are thrilled to have contributed to a significant leap forward in this domain. We're excited to spotlight the latest paper and open-source release from our collaborators at OPPO's Personalized AI Lab and other leading researchers: Chain-of-Agents (CoA), a new paradigm that trains Agent Foundation Models (AFM) to deliver state-of-the-art performance and efficiency.
While multi-agent systems (MAS) have shown promise, they often face critical limitations:
- High Computational Costs: Frequent, often redundant communication between agents makes these systems slow and expensive to run.
- Limited Generalization: Adapting to new tasks requires extensive and costly manual prompt engineering and workflow design.
- Lack of Learnability: Most systems can't learn and improve from data, hitting a performance ceiling defined by their initial design.
The Chain-of-Agents paradigm was created to solve these challenges head-on.
Chain-of-Agents: A New Paradigm for Native Collaboration
Instead of relying on multiple, separate models governed by complex external frameworks, Chain-of-Agents (CoA) enables a single, end-to-end model to simulate multi-agent collaboration internally.
CoA employs a hierarchical agent architecture that can be dynamically activated within the model:
- Role-playing Agents: These agents handle the reasoning process. They include a Thinking Agent for analysis, a Plan Agent for strategy, a Reflection Agent for self-correction, and a Verification Agent for confirming results.
- Tool Agents: These agents execute specific actions, such as a Search Agent for finding information, a Crawl Agent for accessing web content, and a Code Agent for writing and executing code.

By integrating these roles into one model, AFM eliminates the need for complex prompt engineering and drastically reduces communication overhead, leading to a more efficient and powerful system.
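To make this concrete, here is a minimal sketch of what dynamic agent activation could look like: a single model generation is parsed into role-playing steps and tool calls, and only the latter are dispatched to external executors. The tag vocabulary and the `run_coa_step` helper are illustrative assumptions for this post, not the paper's actual interface.

```python
import re

# Hypothetical tag vocabulary: the real set of role and tool tags is defined
# by AFM's training data, so treat these names as illustrative placeholders.
ROLE_TAGS = {"think", "plan", "reflection", "verification"}  # role-playing agents
TOOL_TAGS = {"search", "crawl", "code"}                      # tool agents

TAG_RE = re.compile(r"<(\w+)>(.*?)</\1>", re.DOTALL)

def run_coa_step(model_output: str, tools: dict) -> list:
    """Parse one generation into agent activations.

    Role-playing blocks stay internal to the model; tool blocks are
    dispatched to external executors, whose results would be appended
    to the context before the next generation.
    """
    trace = []
    for tag, body in TAG_RE.findall(model_output):
        if tag in ROLE_TAGS:
            trace.append((tag, body.strip()))              # internal reasoning step
        elif tag in TOOL_TAGS:
            trace.append((tag, tools[tag](body.strip())))  # external tool call
    return trace

# One generation that activates a plan agent, a tool agent, and a reflection agent.
output = ("<plan>Find the 2024 population of Reykjavik.</plan>"
          "<search>Reykjavik population 2024</search>"
          "<reflection>The snippet gives a figure; verify the year.</reflection>")
tools = {"search": lambda q: f"(stub search result for: {q})"}
print(run_coa_step(output, tools))
```

Because every "agent" here is just a tagged span in one model's output, there is no inter-model message passing to pay for, which is where the efficiency gains discussed below come from.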
How We Built AFM: A Three-Stage Training Framework
Creating a model with these native agentic capabilities required a novel training framework that combines multi-agent distillation with reinforcement learning.

1. Trajectory Acquisition & Distillation
The process starts by collecting a diverse set of tasks across web navigation, math, and coding. An advanced multi-agent system, OAgents, is used to solve these tasks. The successful solution paths, or "trajectories," are then distilled and converted into a format compatible with the CoA paradigm. This creates a high-quality dataset for the next stage.
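As a rough illustration, distillation can be pictured as flattening each successful OAgents run into a single tagged sequence that one model can learn to generate. The schema below (the field names, tags, and `distill_trajectory` helper) is a hypothetical simplification, not the released data format.

```python
# Field names, tags, and the output schema here are illustrative assumptions;
# the released dataset format may differ.

def distill_trajectory(task: str, steps: list, answer: str) -> dict:
    """Flatten one successful multi-agent run into a single CoA-style SFT example."""
    parts = []
    for step in steps:
        agent, content = step["agent"], step["content"]
        parts.append(f"<{agent}>{content}</{agent}>")      # serialize each agent turn
        if "observation" in step:                          # keep tool results inline
            parts.append(f"<result>{step['observation']}</result>")
    parts.append(f"<answer>{answer}</answer>")
    return {"prompt": task, "completion": "".join(parts)}

example = distill_trajectory(
    task="What is 17 * 24?",
    steps=[
        {"agent": "plan", "content": "Delegate the arithmetic to the code agent."},
        {"agent": "code", "content": "print(17 * 24)", "observation": "408"},
        {"agent": "verification", "content": "17 * 24 = 408, consistent."},
    ],
    answer="408",
)
print(example["completion"])
```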
2. Supervised Fine-Tuning (SFT)
The base Large Language Model (LLM) is fine-tuned on these distilled trajectories. This SFT stage teaches the model the foundational patterns of Chain-of-Agents reasoning, embedding the ability to plan, act, and reflect.
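In practice, this stage is standard supervised fine-tuning with next-token prediction over the distilled trajectories. The sketch below shows the usual loss masking, assuming a Hugging Face-style stack; the base model name is a placeholder, and the trajectory string is a toy example.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE = "Qwen/Qwen2.5-7B"  # placeholder; the paper's choice of base LLM may differ
tok = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE)

def sft_loss(prompt: str, completion: str) -> torch.Tensor:
    """Next-token loss on the trajectory, masked so the prompt is not trained on."""
    prompt_len = tok(prompt, return_tensors="pt").input_ids.shape[1]
    ids = tok(prompt + completion, return_tensors="pt").input_ids
    labels = ids.clone()
    labels[:, :prompt_len] = -100          # -100 tokens are ignored by the loss
    return model(input_ids=ids, labels=labels).loss

loss = sft_loss("What is 17 * 24?",
                "<plan>Use the code agent.</plan><code>print(17*24)</code>"
                "<result>408</result><answer>408</answer>")
loss.backward()  # an optimizer step would follow in a real training loop
```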
3. Agent Reinforcement Learning (RL)
To elevate the model from simply mimicking trajectories to developing its own optimal strategies, we employ reinforcement learning. The model performs tasks end-to-end, and its actions are scored by a multi-faceted reward system: rule-based checks for verifiable tasks (such as passing test cases in code) and an "LLM-as-a-Judge" that assesses reasoning coherence, tool efficiency, and answer precision. The policy is continually updated, sharpening the model's problem-solving strategies, especially on the most challenging tasks.
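The following sketch shows the shape of such a two-track reward: a rule-based score for verifiable tasks and an LLM-judge score for open-ended ones. The sandbox runner, judge prompt, and routing logic are assumptions for illustration; the paper's actual reward implementation is more elaborate.

```python
import subprocess, sys

def run_code(code: str, stdin: str = "") -> str:
    """Run a candidate solution in a subprocess (a stand-in for a real sandbox)."""
    proc = subprocess.run([sys.executable, "-c", code], input=stdin,
                          capture_output=True, text=True, timeout=10)
    return proc.stdout

def rule_based_reward(code: str, tests: list) -> float:
    """Verifiable track: fraction of test cases the solution passes."""
    passed = sum(1 for stdin, want in tests
                 if run_code(code, stdin).strip() == want.strip())
    return passed / max(len(tests), 1)

def judge_reward(trace: str, answer: str, ask_llm) -> float:
    """Open-ended track: LLM-as-a-Judge score in [0, 1]."""
    prompt = ("Score this agent trace from 0 to 1 on reasoning coherence, "
              f"tool efficiency, and answer precision.\nTrace: {trace}\nAnswer: {answer}")
    return float(ask_llm(prompt))

def reward(trace: str, answer: str, code=None, tests=None, ask_llm=None) -> float:
    """Route verifiable tasks to rule-based checks, everything else to the judge."""
    if code is not None and tests:
        return rule_based_reward(code, tests)
    return judge_reward(trace, answer, ask_llm)

# Verifiable example: the code agent's solution must pass its test case.
print(reward(trace="", answer="408", code="print(17 * 24)", tests=[("", "408")]))  # -> 1.0
```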
Unprecedented Performance Across the Board
The results speak for themselves. Agent Foundation Models have set a new state-of-the-art across nearly 20 complex agent benchmarks.

Highlights from the benchmark results:
- On GAIA, a general AI assistant benchmark, AFM achieves a score of 55.3, surpassing previous leading models.
- On complex web search and research tasks, AFM scores 11.1 on BrowseComp and 18.0 on HLE, demonstrating superior tool use and planning.
- In the domain of advanced mathematical reasoning, AFM reaches a commanding 59.8 on AIME25.
Beyond raw performance, AFM is also remarkably efficient. In our analysis, it reduced the number of tokens required for inference by up to 85.5% compared to traditional multi-agent frameworks, delivering top-tier results at a fraction of the computational cost.
The Future is Agentic, and It's Here
We are standing at the edge of a new frontier in artificial intelligence. The introduction of the Chain-of-Agents paradigm and Agent Foundation Models is more than just an academic breakthrough; it represents a fundamental shift in how we build intelligent systems. We are moving beyond the era of rigid, manually coded agents and stepping into a future of dynamic, learning entities that can reason, adapt, and solve problems with unprecedented autonomy.
At 2077AI, we are incredibly proud to have played a part in this visionary project. We believe this work doesn't just lay the groundwork — it unlocks the door to a future where AI agents become true partners in discovery, creation, and human progress. By making this entire framework fully open-source, we are extending an invitation to every developer, researcher, and dreamer to join us on this journey. Let's build the future of AI, together.