12 Progressive Sessions: Building Claude Code's Agent Harness from Scratch

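Claude Code is Anthropic's official AI coding assistant. Its architecture is elegant and capable, yet the core idea is simple: the model is the agent, and the code is only the harness.

This article follows the shareAI-lab/learn-claude-code repository. Across 12 progressive coding sessions, it reimplements the core mechanisms of Claude Code from scratch and walks through the essentials of Harness Engineering.

Core idea: an agent is a trained model, not hand-written code. Our job as engineers is not to "develop an agent" but to "build a harness": to give the agent the tools, knowledge, context, and permissions it needs to act effectively in a specific domain.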

Why 12 Sessions?

Each session focuses on one core mechanism, and the sessions are ordered by increasing difficulty:

Phase 1: THE LOOP                    Phase 2: PLANNING & KNOWLEDGE
==================                   =============================
s01  The Agent Loop          [1]     s03  TodoWrite               [5]
     while + stop_reason                  TodoManager + nag reminder
     |                                    |
     +-> s02  Tool Use            [4]     s04  Subagents            [5]
              dispatch map: name->handler     fresh messages[] per child
                                              |
                                         s05  Skills               [5]
                                              SKILL.md via tool_result
                                              |
                                         s06  Context Compact      [5]
                                              3-layer compression

Phase 3: PERSISTENCE                 Phase 4: TEAMS
==================                   =====================
s07  Tasks                   [8]     s09  Agent Teams             [9]
     file-based CRUD + deps graph         teammates + JSONL mailboxes
     |                                    |
s08  Background Tasks        [6]     s10  Team Protocols          [12]
     daemon threads + notify queue        shutdown + plan approval FSM
                                          |
                                     s11  Autonomous Agents       [14]
                                          idle cycle + auto-claim
                                          |
                                     s12  Worktree Isolation      [16]
                                          task coordination + optional isolated execution lanes

Phase 1: THE LOOP (the core of the agent)

Session 01: The Agent Loop

Motto: "One loop & Bash is all you need"

Everything starts with a simple loop:

# client, MODEL, TOOLS, and TOOL_HANDLERS are assumed to be defined elsewhere
def agent_loop(messages):
    while True:
        response = client.messages.create(
            model=MODEL,
            max_tokens=8000,  # required by the Messages API
            messages=messages,
            tools=TOOLS,
        )
        messages.append({"role": "assistant", "content": response.content})

        # Check whether the agent wants to use a tool
        if response.stop_reason != "tool_use":
            return  # the agent has finished the task

        # Execute the tool calls the agent requested
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = TOOL_HANDLERS[block.name](**block.input)
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": output,
                })

        # Send the tool results back to the agent and loop again
        messages.append({"role": "user", "content": results})

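Key insights:

The agent decides when to call a tool and which tool to call
The code only executes the agent's requests
This loop pattern runs through every later session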

Session 02: Tool Use

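Motto: "Adding a tool means adding one handler"

Adding a new tool does not change the loop logic. You register a handler (and add the tool's schema to TOOLS so that the model knows the tool exists):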

TOOL_HANDLERS = {
    "bash": bash_handler,
    "read_file": read_handler,
    "write_file": write_handler,
}

# Adding a new tool only takes one registration
TOOL_HANDLERS["search"] = search_handler

Design principle: tools should be atomic, composable, and clearly described.
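
To make the design principle concrete, here is a minimal sketch of one tool defined as a schema plus a handler. The `read_file` schema, the `limit` parameter, and the `dispatch` helper are illustrative assumptions, not code taken from the repository; the `name` / `description` / `input_schema` keys follow the tool format of the Anthropic Messages API.

```python
from pathlib import Path

def read_handler(path: str, limit: int = 2000) -> str:
    """Return up to `limit` characters of a text file."""
    return Path(path).read_text()[:limit]

# One schema per tool: the model only sees name, description, and input_schema.
TOOLS = [{
    "name": "read_file",
    "description": "Read a text file and return its first `limit` characters.",
    "input_schema": {
        "type": "object",
        "properties": {
            "path": {"type": "string"},
            "limit": {"type": "integer"},
        },
        "required": ["path"],
    },
}]

TOOL_HANDLERS = {"read_file": read_handler}

def dispatch(name: str, tool_input: dict) -> str:
    """Look up the handler by name; report failures as text instead of raising."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'"
    try:
        return handler(**tool_input)
    except Exception as exc:
        return f"Error: {exc}"
```

Returning errors as text rather than raising lets the agent read the failure in the tool result and decide how to recover, which keeps the loop itself free of special cases.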

Phase 2: PLANNING & KNOWLEDGE

Session 03: TodoWrite

Motto: "An agent without a plan drifts"

An agent without a plan drifts. List the steps first, then execute them:

class TodoManager:
    def __init__(self):
        self.todos = []

    def add_todos(self, items):
        # Store each item as a dict so that its status can be tracked
        self.todos.extend({"content": item, "status": "pending"} for item in items)

    def complete_todo(self, index):
        if 0 <= index < len(self.todos):
            self.todos[index]["status"] = "completed"

    def get_progress(self):
        completed = sum(1 for t in self.todos if t["status"] == "completed")
        return f"{completed}/{len(self.todos)}"

# The agent plans first
todo_manager = TodoManager()
todos = ["Read README", "Install dependencies", "Run tests"]
todo_manager.add_todos(todos)

# Then it works through the steps in order
# (execute_step stands in for whatever runs a single step)
for i, todo in enumerate(todos):
    execute_step(todo)
    todo_manager.complete_todo(i)
    print(f"Progress: {todo_manager.get_progress()}")
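
The session overview also mentions a "nag reminder". One way to read that is: when the agent goes several loop rounds without touching its todo list, the harness appends a short reminder to the next batch of tool results. The sketch below assumes that reading; the `NagReminder` class, the threshold of 3 rounds, and the reminder wording are all hypothetical.

```python
class NagReminder:
    """Counts loop rounds since the last todo update (hypothetical helper)."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.rounds_since_update = 0

    def record_round(self, used_todo_tool: bool) -> None:
        """Call once per loop round; resets the counter when the todo tool was used."""
        if used_todo_tool:
            self.rounds_since_update = 0
        else:
            self.rounds_since_update += 1

    def maybe_remind(self, results: list) -> list:
        """Append a reminder text block after the tool results when the plan looks stale."""
        if self.rounds_since_update >= self.threshold:
            results.append({
                "type": "text",
                "text": "<reminder>Update your todos.</reminder>",
            })
            self.rounds_since_update = 0
        return results
```

In the agent loop, `record_round` would be called after each batch of tool calls and `maybe_remind(results)` just before the results are appended as the next user turn.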

Session 04: Subagents

Motto: "Break big tasks down; each subtask gets a clean context"

Large tasks are handed to subagents, and each subagent gets its own context:

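def spawn_subagent(task_description):
    # A separate messages[] keeps the subtask out of the main conversation
    sub_messages = [
        {"role": "user", "content": task_description}
    ]
    agent_loop(sub_messages)  # appends to sub_messages in place and returns None

    # Hand back only the text of the final assistant turn
    final_blocks = sub_messages[-1]["content"]
    return "".join(b.text for b in final_blocks if b.type == "text")

# The main agent's context stays small: it only sees the subagent's final report
# Complex tasks go to a subagent
result = spawn_subagent("Analyze this codebase and report bugs")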

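Benefits:

The main conversation stays clear and is not buried in detail
Each subagent can focus on one specific task
Failure isolation: one subagent failing does not affect the others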
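
The isolation described above can be checked without calling a model. In this sketch, `fake_loop` is a stand-in for `agent_loop` that simulates three tool rounds, and `run_subagent` is a hypothetical variant of `spawn_subagent` that accepts the loop as a parameter; for simplicity the report is added to the parent as a plain user message rather than a tool result.

```python
from typing import Callable

def run_subagent(task: str, run_loop: Callable[[list], None]) -> str:
    """Run a child loop on a fresh history and return only its final text."""
    sub_messages = [{"role": "user", "content": task}]
    run_loop(sub_messages)  # mutates sub_messages in place, like agent_loop
    final = sub_messages[-1]["content"]
    return final if isinstance(final, str) else str(final)

def fake_loop(messages: list) -> None:
    """Stand-in for agent_loop: pretends to do three tool rounds, then answers."""
    for i in range(3):
        messages.append({"role": "assistant", "content": f"(tool call {i})"})
        messages.append({"role": "user", "content": f"(tool result {i})"})
    messages.append({"role": "assistant", "content": "Found 2 bugs in parser.py"})

parent_messages = [{"role": "user", "content": "Audit the repo"}]
summary = run_subagent("Analyze this codebase and report bugs", fake_loop)
parent_messages.append({"role": "user", "content": f"Subagent report: {summary}"})

# The child produced 8 messages; the parent grew by exactly one.
print(len(parent_messages))
```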

Session 05: Skills

Motto: "Load knowledge when you need it, not upfront"

Knowledge is loaded on demand instead of being stuffed into the system prompt:

def load_skill(skill_name, tool_use_id):
    skill_path = f"skills/{skill_name}/SKILL.md"
    skill_content = read_file(skill_path)

    # Inject the skill through a tool_result so the system prompt stays small
    return {
        "type": "tool_result",
        "tool_use_id": tool_use_id,
        "content": f"[Loaded skill: {skill_name}]\n\n{skill_content}"
    }

# The skill is injected only when the agent asks for it
# (inside the tool-dispatch loop, where block is the current tool_use block)
if block.name == "load_skill":
    result = load_skill(block.input["skill_name"], block.id)

How Claude Code does it: a skill is a SKILL.md file that contains instructions and metadata, and the agent loads it dynamically when needed.
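
A sketch of how the metadata and the instructions might be split: only a one-line index built from the metadata is always visible, and the body is returned when the agent asks for it. This assumes the metadata sits in a simple `key: value` frontmatter block delimited by `---` lines with `name` and `description` fields; the function names are hypothetical.

```python
from pathlib import Path

def parse_skill(text: str) -> tuple[dict, str]:
    """Split a SKILL.md into (metadata, body). Assumes 'key: value' frontmatter."""
    meta, body = {}, text
    if text.startswith("---\n"):
        header, _, body = text[4:].partition("\n---\n")
        for line in header.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                meta[key.strip()] = value.strip()
    return meta, body.strip()

def list_skills(root: Path) -> str:
    """Cheap index for the system prompt: one line per skill, no bodies."""
    lines = []
    for path in sorted(root.glob("*/SKILL.md")):
        meta, _ = parse_skill(path.read_text())
        name = meta.get("name", path.parent.name)
        lines.append(f"- {name}: {meta.get('description', '')}")
    return "\n".join(lines)

def load_skill_body(root: Path, name: str) -> str:
    """Full instructions, returned only when the agent asks for them."""
    path = root / name / "SKILL.md"
    if not path.exists():
        return f"Error: unknown skill '{name}'"
    return parse_skill(path.read_text())[1]
```

The index costs a few tokens per skill, while the body of each skill is paid for only in the sessions that actually use it.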

Session 06: Context Compact

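Motto: "Context will fill up; you need a way to make room"

A three-layer compression strategy lets a session keep going after the context window would otherwise fill up: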

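def compress_context(messages):
    # Layer 1: keep the most recent N messages verbatim
    recent = messages[-100:]

    # Layer 2: compress everything older into a summary
    old_messages = messages[:-100]
    if not old_messages:
        return messages  # nothing to compress yet
    summary = summarize_messages(old_messages)

    # Layer 3: selectively keep key information from the older messages
    key_info = extract_key_info(old_messages)

    # messages[] accepts only "user" and "assistant" roles in the Messages API,
    # so the summary goes in as a user turn rather than a "system" message
    return [
        {"role": "user", "content": f"Previous context summary: {summary}"},
        *key_info,
        *recent
    ]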

What Claude Code does: when the context approaches the token limit, it automatically compresses the earlier conversation and keeps the key information.
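
A sketch of the "approaches the token limit" trigger. The 4-characters-per-token estimate, the `limit` and `keep_recent` defaults, and the `summarize` callback are assumptions for illustration; a real harness would use the token counts reported by the API. The cut point is moved back to a plain-text user turn so that a `tool_result` is never separated from the `tool_use` it answers.

```python
import json

def estimate_tokens(messages: list) -> int:
    """Rough heuristic: about 4 characters per token. Not a real tokenizer."""
    return len(json.dumps(messages, default=str)) // 4

def compress_if_needed(messages, summarize, limit=50_000, keep_recent=6):
    """Replace older turns with a summary once the estimate exceeds `limit`."""
    if estimate_tokens(messages) <= limit or len(messages) <= keep_recent:
        return messages

    # Move the cut back until the kept tail starts with a plain-text user turn.
    cut = len(messages) - keep_recent
    while cut > 0 and not (
        messages[cut]["role"] == "user"
        and isinstance(messages[cut]["content"], str)
    ):
        cut -= 1
    if cut == 0:
        return messages  # no safe place to cut

    summary = summarize(messages[:cut])
    head = [
        {"role": "user", "content": f"Summary of earlier conversation: {summary}"},
        {"role": "assistant", "content": "Understood. Continuing from that summary."},
    ]
    return head + messages[cut:]
```

The short assistant acknowledgement keeps user and assistant turns alternating after the summary is spliced in.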

Phase 3: PERSISTENCE

Session 07: Tasks

Motto: "Break big goals into small tasks, order them, persist to disk"

Tasks are persisted to a file and can depend on one another:

import json
from pathlib import Path

class TaskGraph:
    def __init__(self, filepath):
        self.filepath = Path(filepath)
        self.tasks = self.load()

    def load(self):
        if self.filepath.exists():
            return json.loads(self.filepath.read_text())
        return []

    def save(self):
        self.filepath.write_text(json.dumps(self.tasks, indent=2))

    def add_task(self, description, depends_on=None):
        task = {
            "id": f"task-{len(self.tasks) + 1}",
            "description": description,
            "status": "pending",
            "depends_on": depends_on or []  # list of task IDs, e.g. ["task-1"]
        }
        self.tasks.append(task)
        self.save()
        return task["id"]

    def get_ready_tasks(self):
        """Return the pending tasks whose dependencies are all completed."""
        # depends_on holds task IDs, so look statuses up by ID, not by list index
        status_by_id = {t["id"]: t["status"] for t in self.tasks}
        ready = []
        for task in self.tasks:
            if task["status"] == "pending":
                deps = task.get("depends_on", [])
                if all(status_by_id.get(d) == "completed" for d in deps):
                    ready.append(task)
        return ready

This lays the groundwork for multi-agent collaboration: a task graph lets several agents work in parallel.
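
To see why a dependency graph enables parallel work, the sketch below groups tasks into "waves": every task in a wave has all of its prerequisites finished, so the tasks in one wave could be handed to different agents at the same time. The in-memory `tasks` dict and the function names are illustrative assumptions, separate from the `TaskGraph` class above.

```python
def ready_tasks(tasks: dict) -> list:
    """IDs of pending tasks whose dependencies are all completed."""
    return [
        tid for tid, t in tasks.items()
        if t["status"] == "pending"
        and all(tasks[d]["status"] == "completed" for d in t["depends_on"])
    ]

def execution_waves(tasks: dict) -> list:
    """Group tasks into waves; tasks in one wave could run in parallel."""
    state = {tid: dict(t) for tid, t in tasks.items()}  # work on a copy
    waves = []
    while any(t["status"] == "pending" for t in state.values()):
        wave = ready_tasks(state)
        if not wave:
            raise ValueError("dependency cycle: no task can make progress")
        for tid in wave:
            state[tid]["status"] = "completed"
        waves.append(wave)
    return waves

tasks = {
    "schema": {"status": "pending", "depends_on": []},
    "api":    {"status": "pending", "depends_on": ["schema"]},
    "ui":     {"status": "pending", "depends_on": ["schema"]},
    "e2e":    {"status": "pending", "depends_on": ["api", "ui"]},
}
print(execution_waves(tasks))  # [['schema'], ['api', 'ui'], ['e2e']]
```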

Session 08: Background Tasks

Motto: "Run slow operations in the background; the agent keeps thinking"

Slow commands run in the background so the agent is not blocked:

import queue
import subprocess
import threading

class BackgroundTaskManager:
    def __init__(self):
        self.notification_queue = queue.Queue()

    def run_background(self, command):
        def worker():
            result = subprocess.run(command, capture_output=True, text=True)
            # Queue a notification when the command finishes
            self.notification_queue.put({
                "type": "task_complete",
                "command": command,
                "output": result.stdout
            })

        # Daemon thread: it will not keep the process alive at exit
        thread = threading.Thread(target=worker, daemon=True)
        thread.start()

    def check_notifications(self):
        """Called on every agent loop iteration to collect finished tasks."""
        notifications = []
        while True:
            try:
                notifications.append(self.notification_queue.get_nowait())
            except queue.Empty:
                return notifications

Typical uses: long-running tests, file downloads, data synchronization, and similar slow operations.
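
The other half of the pattern is getting finished results back into the conversation. A minimal sketch, assuming the results are injected as one user turn before the next model call; the `<background-results>` wrapper and both function names are hypothetical, and a plain Python callable stands in for the subprocess so the example runs anywhere.

```python
import queue
import threading

notifications = queue.Queue()

def run_in_background(name, fn):
    """Start fn() on a daemon thread and queue its result when it finishes."""
    def worker():
        try:
            notifications.put({"task": name, "status": "done", "output": fn()})
        except Exception as exc:
            notifications.put({"task": name, "status": "error", "output": str(exc)})

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread

def drain_into(messages):
    """Move all pending notifications into one user turn."""
    items = []
    while True:
        try:
            items.append(notifications.get_nowait())
        except queue.Empty:
            break
    if items:
        lines = [f"[{n['task']}] {n['status']}: {n['output']}" for n in items]
        body = "<background-results>\n" + "\n".join(lines) + "\n</background-results>"
        messages.append({"role": "user", "content": body})
    return messages
```

Calling `drain_into(messages)` at the top of each loop iteration means the agent learns about finished work at its next turn without ever waiting on it.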

Phase 4: TEAMS

Session 09: Agent Teams

Motto: "When the task is too big for one, delegate to teammates"

Persistent teammates plus asynchronous mailboxes:

import json
from pathlib import Path

class Teammate:
    def __init__(self, name, skill):
        self.name = name
        self.skill = skill
        self.mailbox_path = Path(f"mailboxes/{name}.jsonl")
        # Make sure the mailboxes/ directory exists before the first write
        self.mailbox_path.parent.mkdir(parents=True, exist_ok=True)

    def send_message(self, message):
        """Append a message to this teammate's mailbox."""
        with open(self.mailbox_path, "a") as f:
            f.write(json.dumps(message) + "\n")

    def check_mailbox(self):
        """Read all waiting messages, then empty the mailbox."""
        if not self.mailbox_path.exists():
            return []

        with open(self.mailbox_path) as f:
            messages = [json.loads(line) for line in f if line.strip()]
        # Drain the file so the same messages are not processed twice
        self.mailbox_path.write_text("")
        return messages

# Create specialized teammates
coder = Teammate("coder", "writes code")
tester = Teammate("tester", "runs tests")
reviewer = Teammate("reviewer", "reviews code")

Session 10: Team Protocols

Motto: "Teammates need shared communication rules"

A uniform request-response protocol:

from datetime import datetime

# Standard message format
def create_request(from_agent, to_agent, task, context=None):
    return {
        "type": "request",
        "from": from_agent,
        "to": to_agent,
        "task": task,
        "context": context or {},
        "timestamp": datetime.now().isoformat()
    }

def create_response(request, status, result=None):
    return {
        "type": "response",
        "from": request["to"],
        "to": request["from"],
        "in_reply_to": request.get("timestamp"),
        "status": status,  # "accepted", "rejected", "completed"
        "result": result,
        "timestamp": datetime.now().isoformat()
    }

# Usage example (Teammate and coder come from Session 09)
lead = Teammate("lead", "coordinates work")
request = create_request(
    "lead", "coder",
    "Implement authentication",
    {"framework": "fastapi"}
)

coder.send_message(request)

# The coder handles the request and replies
messages = coder.check_mailbox()
for msg in messages:
    if msg["type"] == "request":
        # Do the work (implement_auth is a placeholder for the real task)
        result = implement_auth(msg["task"])
        # Send the reply to the lead's mailbox
        response = create_response(msg, "completed", result)
        lead.send_message(response)
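
The session overview lists a "plan approval FSM" for this session. One possible shape, sketched under assumptions: each request gets a short ID, and a reply may only move it from `pending` to `approved` or `rejected`. The state names, the `RequestTracker` class, and the use of `uuid` for IDs are illustrative choices, not taken from the repository.

```python
import uuid

# Allowed transitions for a request (assumed state names).
TRANSITIONS = {
    "pending": {"approved", "rejected"},
    "approved": set(),
    "rejected": set(),
}

class RequestTracker:
    """Tracks plan-approval requests by ID so replies can be matched to them."""

    def __init__(self):
        self.requests = {}

    def submit(self, sender: str, plan: str) -> str:
        """Register a new request in the 'pending' state and return its ID."""
        request_id = uuid.uuid4().hex[:8]
        self.requests[request_id] = {"from": sender, "plan": plan, "status": "pending"}
        return request_id

    def resolve(self, request_id: str, approve: bool, feedback: str = "") -> dict:
        """Apply an approve/reject decision if the transition is allowed."""
        req = self.requests.get(request_id)
        if req is None:
            raise KeyError(f"unknown request_id {request_id}")
        new_status = "approved" if approve else "rejected"
        if new_status not in TRANSITIONS[req["status"]]:
            raise ValueError(f"cannot go from {req['status']} to {new_status}")
        req["status"] = new_status
        req["feedback"] = feedback
        return req
```

Matching replies by an explicit ID is more robust than the timestamp used in `in_reply_to` above, since two requests created in the same instant would otherwise be indistinguishable.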

Session 11: Autonomous Agents

Motto: "Teammates scan the board and claim tasks themselves"

Teammates claim tasks on their own, with no central dispatcher:

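import threading
import time

# Guards claiming and saving so two teammates never take the same task
claim_lock = threading.Lock()

def autonomous_cycle(teammate, task_graph):
    while True:
        # Scan the task board
        for task in task_graph.get_ready_tasks():
            if not can_handle(teammate, task):
                continue

            # Claim the task; re-check under the lock in case another
            # teammate claimed it after the board was scanned
            with claim_lock:
                if task["status"] != "pending":
                    continue
                task["status"] = "in_progress"
                task["assigned_to"] = teammate.name
                task_graph.save()

            # Execute the task
            try:
                result = execute_task(teammate, task)
                task["status"] = "completed"
                task["result"] = result
            except Exception as e:
                task["status"] = "failed"
                task["error"] = str(e)
            finally:
                with claim_lock:
                    task_graph.save()

        # Heartbeat interval
        time.sleep(30)

# Each teammate runs its own autonomous cycle
for teammate in team:
    threading.Thread(
        target=autonomous_cycle,
        args=(teammate, task_graph)
    ).start()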
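
The session overview pairs auto-claim with an "idle cycle". A sketch of one interpretation: when a teammate has nothing to do, it polls for work at a short interval and gives up after a timeout, so idle teammates eventually shut down instead of looping forever. The function name and both default durations are assumptions.

```python
import time

def idle_until_work(find_work, poll_interval=5.0, idle_timeout=60.0):
    """Poll find_work() until it returns something or idle_timeout passes.

    Returns the work item, or None if the teammate stayed idle too long.
    """
    deadline = time.monotonic() + idle_timeout
    while time.monotonic() < deadline:
        work = find_work()
        if work is not None:
            return work
        time.sleep(poll_interval)
    return None  # the caller should shut this teammate down
```

Here `find_work` would check the mailbox first and then the task board, returning the first message or claimable task it sees; `time.monotonic` is used so that the timeout is unaffected by system clock changes.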

Session 12: Worktree Isolation

Motto: "Each works in its own directory, no interference"

Worktree isolation: tasks manage goals, worktrees manage directories:

import shutil
import subprocess
from pathlib import Path

class Worktree:
    def __init__(self, task_id, base_path="worktrees"):
        self.task_id = task_id
        self.path = Path(base_path) / task_id
        # A plain directory per task; it starts out empty
        self.path.mkdir(parents=True, exist_ok=True)

    def execute_in_isolation(self, command):
        """Run a command with the worktree directory as its working directory."""
        return subprocess.run(
            command,
            cwd=self.path,
            capture_output=True,
            text=True
        )

    def cleanup(self):
        """Delete the working directory."""
        if self.path.exists():
            shutil.rmtree(self.path)

# A task and its worktree are bound by ID
task = {
    "id": "task-123",
    "description": "Build feature X",
    "worktree_id": "worktree-123"
}

# Create the isolated working environment
worktree = Worktree(task["worktree_id"])

# Run commands inside it
# (assumes the project files have already been placed in the directory)
result = worktree.execute_in_isolation(["npm", "install"])
result = worktree.execute_in_isolation(["npm", "run", "build"])

# Clean up when done
worktree.cleanup()
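
The class above isolates work in plain directories. If each task should instead get a real checkout on its own branch, `git worktree` is a natural fit. The sketch below only builds the command lines and does not run them; the `.worktrees` base directory and the `wt/` branch prefix are hypothetical naming choices.

```python
from pathlib import Path

def worktree_commands(task_id: str, base: str = ".worktrees") -> dict:
    """Build the git commands that create and remove a per-task worktree."""
    path = Path(base) / task_id
    branch = f"wt/{task_id}"
    return {
        "path": path,
        # git worktree add -b <new-branch> <path>
        "create": ["git", "worktree", "add", "-b", branch, str(path)],
        # git worktree remove <path>
        "remove": ["git", "worktree", "remove", str(path)],
    }

cmds = worktree_commands("task-123")
print(" ".join(cmds["create"]))
```

Each command list can be passed to `subprocess.run(..., cwd=repo_root)` from inside a git repository; building the commands separately from running them keeps this part easy to test.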

Putting It Together: The Complete Agent Harness

All of the mechanisms combined:

# SkillLoader, ContextManager, WorktreeManager, setup_team, client, MODEL,
# and TOOLS are assumed to be defined elsewhere
class AgentHarness:
    def __init__(self):
        # Phase 2: planning and knowledge
        # (created before the tools, because setup_tools() refers to them)
        self.todo_manager = TodoManager()
        self.skill_loader = SkillLoader()
        self.context_manager = ContextManager()

        # Phase 1: core loop
        self.tools = self.setup_tools()

        # Phase 3: persistence
        self.task_graph = TaskGraph("tasks.json")
        self.background_tasks = BackgroundTaskManager()

        # Phase 4: teams
        self.team = self.setup_team()
        self.worktrees = WorktreeManager()

    def setup_tools(self):
        # Maps tool names to handlers; the matching JSON schemas live in TOOLS
        return {
            "bash": bash_handler,
            "read_file": read_handler,
            "write_file": write_handler,
            "todo_write": self.todo_manager.add_todos,
            "load_skill": self.skill_loader.load,
            "spawn_subagent": spawn_subagent,
        }

    def run(self, user_message):
        messages = [{"role": "user", "content": user_message}]

        while True:
            # Check for background task notifications
            notifications = self.background_tasks.check_notifications()
            if notifications:
                # messages[] accepts only "user" and "assistant" roles
                messages.append({
                    "role": "user",
                    "content": f"Background tasks: {notifications}"
                })

            # Compress the context if needed
            messages = self.context_manager.compress_if_needed(messages)

            # Call the LLM
            response = client.messages.create(
                model=MODEL,
                max_tokens=8000,  # required by the Messages API
                messages=messages,
                tools=TOOLS,  # schemas, not the handler map
            )

            messages.append({"role": "assistant", "content": response.content})

            # Check whether the agent is done
            if response.stop_reason != "tool_use":
                return messages

            # Execute the tool calls and return all results in a single user turn
            results = []
            for block in response.content:
                if block.type == "tool_use":
                    output = self.tools[block.name](**block.input)
                    results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": str(output)
                    })
            messages.append({"role": "user", "content": results})

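Key Design Principles

From these 12 sessions we can distill the core design principles of Claude Code's architecture:

1. The model is the agent; the code is the harness

The agent's intelligence comes from training, not from code
The code's job is to provide an environment, not to simulate intelligence

2. Trust the model, simplify the harness

Do not constrain the model with complex decision trees
Give the model tools and let it decide when to use them

3. Load on demand, stay lean

Knowledge is loaded on demand rather than crammed into the system prompt
Context is managed dynamically and compressed before it overflows

4. Isolation and collaboration

Subagents have independent contexts, which avoids pollution
Teammates communicate asynchronously and work in parallel
Worktrees isolate execution so that tasks do not interfere with each other

5. Persist everything

Tasks are persisted to files
Mailboxes persist communication
Interrupted work can be resumed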
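
Resuming after an interruption needs two things from the task file: a write that cannot leave it half-written, and a rule for tasks that were mid-flight when the process died. A minimal sketch under those assumptions; `save_atomic`, `recover`, and the "reset `in_progress` to `pending`" policy are illustrative choices, not the repository's implementation.

```python
import json
import os
from pathlib import Path

def save_atomic(path: Path, tasks: list) -> None:
    """Write to a temp file, then rename, so a crash never leaves a partial file."""
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(json.dumps(tasks, indent=2), encoding="utf-8")
    os.replace(tmp, path)  # replaces the old file in a single step

def recover(path: Path) -> list:
    """Reload tasks after a restart; work that was mid-flight goes back to pending."""
    if not path.exists():
        return []
    tasks = json.loads(path.read_text(encoding="utf-8"))
    for task in tasks:
        if task["status"] == "in_progress":
            task["status"] = "pending"
            task.pop("assigned_to", None)
    return tasks
```

Resetting interrupted tasks to `pending` is the simplest policy, and it assumes tasks are safe to run again; a task with side effects that must not repeat would need its own checkpointing.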

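In Practice: Build Your Own Agent Harness

Once you understand the principles, you can:

1. Study Claude Code

Read the complete implementation in shareAI-lab/learn-claude-code
Run the 12 reference implementations and watch how each mechanism works
Modify and extend them to understand the trade-offs

2. Use the Kode Agent CLI

npm i -g @shareai-lab/kode

An open-source coding agent CLI that supports multiple models; its harness design is worth studying.

3. Integrate it into your application

Use the Kode Agent SDK to embed agent capabilities in your backend, browser extension, or any other application.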

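Going Further: From On-Demand to Always-On

The harness discussed in this article follows a "use once and discard" model: every session starts from scratch.

If you need an "always-on" assistant, take a look at the claw0 repository. It is built on the same agent core and adds:

Heartbeat: wakes up periodically to check whether there is work to do
Cron: the agent can schedule its own future tasks
IM Channels: multi-channel instant messaging (WhatsApp, Telegram, Slack, and others)
Memory: persistent context memory
Soul: a personalized persona system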

learn-claude-code                   claw0
(agent harness core)                (proactive always-on harness)

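Summary

Over 12 progressive sessions, we built a complete agent harness from scratch and implemented the core mechanisms of Claude Code:

Core loop (s01-s02): Agent Loop + Tool Use
Planning and knowledge (s03-s06): TodoWrite, Subagents, Skills, Context Compact
Persistence (s07-s08): Tasks, Background Tasks
Team collaboration (s09-s12): Agent Teams, Protocols, Autonomous Agents, Worktree Isolation

The most important takeaway: an agent is a trained model, not hand-written code. As engineers, our job is to build a good harness: give the model tools, knowledge, context, and permissions, then trust it to reason, decide, and act.

That is the soul of Claude Code. That is the essence of Harness Engineering.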

Resources:

shareAI-lab/learn-claude-code - complete implementation of the 12 progressive sessions
Anthropic Claude Code official documentation
shareAI-lab/Kode-cli - open-source coding agent CLI
shareAI-lab/Kode-agent-sdk - embeddable agent SDK
shareAI-lab/claw0 - always-on agent harness