12 Progressive Sessions: Building Claude Code's Agent Harness from Scratch
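Claude Code is Anthropic's official AI coding assistant. Its architecture is elegant and capable, yet the core idea is simple: the Model is the Agent, and the Code is only the Harness.
This article is based on the shareAI-lab/learn-claude-code repository. Across 12 progressive programming Sessions it reimplements Claude Code's core mechanisms from scratch and shows what Harness Engineering involves in practice.
Core idea: an Agent is a trained Model, not hand-written code. Our job as engineers is not to "develop the Agent" but to "build the Harness": give the Agent tools, knowledge, context, and permissions so that it can act effectively in a specific domain.
Why 12 Sessions?
Each Session focuses on one core mechanism, and the Sessions are ordered by increasing difficulty: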

    Phase 1: THE LOOP                       Phase 2: PLANNING & KNOWLEDGE
    ==================                      =============================
    s01  The Agent Loop            [1]      s03  TodoWrite               [5]
         while + stop_reason                     TodoManager + nag reminder
         |                                       |
         +-> s02  Tool Use         [4]      s04  Subagents               [5]
             dispatch map: name->handler         fresh messages[] per child
                                                 |
                                            s05  Skills                  [5]
                                                 SKILL.md via tool_result
                                                 |
                                            s06  Context Compact         [5]
                                                 3-layer compression

    Phase 3: PERSISTENCE                    Phase 4: TEAMS
    ==================                      =====================
    s07  Tasks                     [8]      s09  Agent Teams             [9]
         file-based CRUD + deps graph            teammates + JSONL mailboxes
         |                                       |
    s08  Background Tasks          [6]      s10  Team Protocols          [12]
         daemon threads + notify queue           shutdown + plan approval FSM
                                                 |
                                            s11  Autonomous Agents       [14]
                                                 idle cycle + auto-claim
                                                 |
                                            s12  Worktree Isolation      [16]
                                                 task coordination + optional isolated execution lanes

Phase 1: THE LOOP (the core of the Agent)
Session 01: The Agent Loop
Motto: "One loop & Bash is all you need"
Everything starts with a simple loop:

    def agent_loop(messages):
        while True:
            response = client.messages.create(
                model=MODEL,
                max_tokens=8000,  # required by the Messages API; the value is illustrative
                messages=messages,
                tools=TOOLS,
            )
            messages.append({"role": "assistant", "content": response.content})
            # Check whether the Agent wants to use a tool
            if response.stop_reason != "tool_use":
                return  # the Agent has finished the task
            # Execute the tool calls the Agent requested
            results = []
            for block in response.content:
                if block.type == "tool_use":
                    output = TOOL_HANDLERS[block.name](**block.input)
                    results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": output,
                    })
            # Return the tool results to the Agent and continue the loop
            messages.append({"role": "user", "content": results})

Key insights:
- The Agent decides when to call a tool and which tool to call
- The Code only executes the Agent's requests
- This loop pattern runs through every later Session
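To watch the loop run without an API key, the sketch below swaps the real client for a scripted stand-in. `FakeClient`, `run_echo`, and the `echo` tool are hypothetical names invented for this illustration; only the shape of the loop mirrors the code above.

```python
from types import SimpleNamespace


def run_echo(text):
    """Hypothetical tool handler: returns its input in upper case."""
    return text.upper()


TOOL_HANDLERS = {"echo": run_echo}


class FakeClient:
    """Scripted stand-in for the model: asks for one tool call, then stops."""

    def __init__(self):
        self.calls = 0

    def create(self, messages):
        self.calls += 1
        if self.calls == 1:
            block = SimpleNamespace(type="tool_use", id="call_1",
                                    name="echo", input={"text": "hi"})
            return SimpleNamespace(stop_reason="tool_use", content=[block])
        block = SimpleNamespace(type="text", text="done")
        return SimpleNamespace(stop_reason="end_turn", content=[block])


def agent_loop(client, messages):
    while True:
        response = client.create(messages)
        messages.append({"role": "assistant", "content": response.content})
        if response.stop_reason != "tool_use":
            return messages
        results = []
        for block in response.content:
            if block.type == "tool_use":
                output = TOOL_HANDLERS[block.name](**block.input)
                results.append({"type": "tool_result",
                                "tool_use_id": block.id,
                                "content": output})
        messages.append({"role": "user", "content": results})


history = agent_loop(FakeClient(), [{"role": "user", "content": "say hi"}])
print(len(history))  # user, assistant (tool_use), user (tool_result), assistant
```

The harness never decides what happens next. It only relays what the scripted "model" asked for, and that division of labor is the point of this Session.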
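Session 02: Tool Use
Motto: "Adding a tool means adding one handler"
Adding a new tool does not change the loop logic; you only register a handler:

    TOOL_HANDLERS = {
        "bash": bash_handler,
        "read_file": read_handler,
        "write_file": write_handler,
    }
    # Adding a new tool only requires registering it
    # (its schema must also be listed in TOOLS so the Model knows it exists)
    TOOL_HANDLERS["search"] = search_handler

Design principle: tools should be atomic, composable, and clearly described.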
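A handler is only half of a tool: the Model also needs a schema describing it. The sketch below pairs one handler with a schema in the `name` / `description` / `input_schema` shape that the Anthropic Messages API uses for tools, and adds a `dispatch` helper that turns failures into text the Agent can read. `dispatch` and the error wording are assumptions made for this example, not code from the repository.

```python
def read_handler(path):
    """Return the contents of a UTF-8 text file."""
    with open(path, encoding="utf-8") as f:
        return f.read()


# name -> handler, as in the dispatch map above
TOOL_HANDLERS = {"read_file": read_handler}

# What the Model sees: one JSON-schema description per tool
TOOLS = [{
    "name": "read_file",
    "description": "Read a UTF-8 text file and return its contents.",
    "input_schema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}]


def dispatch(name, tool_input):
    """Run a tool call; report problems as text instead of crashing the loop."""
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"Error: unknown tool '{name}'"
    try:
        return handler(**tool_input)
    except Exception as exc:  # the Agent can read the error and retry
        return f"Error: {type(exc).__name__}: {exc}"


print(dispatch("search", {"query": "todo"}))
```

Returning the error as a string keeps the loop alive and lets the Model decide how to recover, which fits the "Code only executes" division of labor.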
Phase 2: PLANNING & KNOWLEDGE
Session 03: TodoWrite
Motto: "An agent without a plan drifts"
An Agent without a plan drifts. List the steps first, then execute them:

    class TodoManager:
        def __init__(self):
            self.todos = []

        def add_todos(self, items):
            # Store each item as a dict so that its status can be tracked
            self.todos.extend({"content": item, "status": "pending"} for item in items)

        def complete_todo(self, index):
            if 0 <= index < len(self.todos):
                self.todos[index]["status"] = "completed"

        def get_progress(self):
            completed = sum(1 for t in self.todos if t["status"] == "completed")
            return f"{completed}/{len(self.todos)}"

    # The Agent plans first
    todo_manager = TodoManager()
    todos = ["Read README", "Install dependencies", "Run tests"]
    todo_manager.add_todos(todos)
    # Then it executes the steps in order
    for i, todo in enumerate(todos):
        execute_step(todo)  # placeholder for the real work
        todo_manager.complete_todo(i)
        print(f"Progress: {todo_manager.get_progress()}")

Session 04: Subagents
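Motto: "Break big tasks down; each subtask gets a clean context"
Hand large tasks to Subagents; each Subagent has its own independent context:

    def spawn_subagent(task_description):
        # A separate messages[] that does not pollute the main conversation
        sub_messages = [
            {"role": "user", "content": task_description}
        ]
        agent_loop(sub_messages)  # agent_loop appends to sub_messages in place
        # Hand back only the final text; the intermediate turns are discarded
        final = sub_messages[-1]["content"]
        return "".join(b.text for b in final if b.type == "text")

    # The main Agent's context stays small: it receives only the Subagent's report
    result = spawn_subagent("Analyze this codebase and report bugs")

Benefits:
- The main conversation stays clear and is not buried in detail
- Each Subagent can focus on one specific task
- Failure isolation: one failing Subagent does not affect the others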
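The first benefit can be made concrete with a small offline sketch. `fake_agent_loop` is a hypothetical stand-in for the real loop that produces several noisy turns; the `max_chars` cap is an assumption added here to show one way of bounding what flows back to the parent.

```python
def fake_agent_loop(messages):
    """Stand-in for agent_loop: appends noisy turns, then a final answer."""
    for step in range(3):
        messages.append({"role": "assistant", "content": f"(working, step {step})"})
        messages.append({"role": "user", "content": f"(tool output {step})"})
    messages.append({"role": "assistant", "content": "Found 2 bugs in parser.py"})


def spawn_subagent(task_description, loop=fake_agent_loop, max_chars=2000):
    """Run a child with a fresh messages[] and return only its final report."""
    sub_messages = [{"role": "user", "content": task_description}]
    loop(sub_messages)
    return sub_messages[-1]["content"][:max_chars]


parent_messages = [{"role": "user", "content": "Audit the repo"}]
summary = spawn_subagent("Analyze parser.py and report bugs")
# In the full loop this text would travel back inside a tool_result block
parent_messages.append({"role": "user", "content": f"Subagent report: {summary}"})

print(len(parent_messages), summary)
```

The child produced eight messages, but the parent's history grew by one. That asymmetry is what keeps the main context from filling up with exploration detail.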
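Session 05: Skills
Motto: "Load knowledge when you need it, not upfront"
Knowledge is loaded on demand instead of being stuffed into the system prompt:

    from pathlib import Path

    def load_skill(skill_name, tool_use_id):
        skill_path = Path("skills") / skill_name / "SKILL.md"
        skill_content = skill_path.read_text(encoding="utf-8")
        # Inject via tool_result so that the system prompt stays small
        return {
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": f"[Loaded skill: {skill_name}]\n\n{skill_content}",
        }

    # Inside the agent loop: inject only when the Agent asks for a Skill
    if block.name == "load_skill":
        result = load_skill(block.input["skill_name"], block.id)

How Claude Code does it: a Skill is a SKILL.md file containing instructions and metadata, which the Agent loads dynamically when it needs it.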
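One way to exploit the "instructions and metadata" split is to expose only a one-line index of skills up front and load the body later. The sketch below assumes the metadata is a `---`-delimited block of `key: value` lines with `name` and `description` fields; that layout, and the helper names `parse_skill` and `skill_index`, are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def parse_skill(text):
    """Split SKILL.md text into (metadata dict, body).

    Assumes a '---' delimited header of simple 'key: value' lines.
    """
    meta, body = {}, text
    if text.startswith("---\n"):
        header, sep, rest = text[4:].partition("\n---\n")
        if sep:  # only treat it as a header when the closing '---' exists
            for line in header.splitlines():
                key, colon, value = line.partition(":")
                if colon:
                    meta[key.strip()] = value.strip()
            body = rest
    return meta, body.strip()


def skill_index(skills_dir):
    """One short line per skill: cheap enough to keep in the system prompt."""
    lines = []
    for path in sorted(Path(skills_dir).glob("*/SKILL.md")):
        meta, _ = parse_skill(path.read_text(encoding="utf-8"))
        name = meta.get("name", path.parent.name)
        lines.append(f"- {name}: {meta.get('description', '')}")
    return "\n".join(lines)


with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "pdf"
    skill_dir.mkdir()
    (skill_dir / "SKILL.md").write_text(
        "---\nname: pdf\ndescription: Extract text from PDF files\n---\n\n"
        "# Steps\n1. Run the extractor\n",
        encoding="utf-8",
    )
    index = skill_index(tmp)
    meta, body = parse_skill((skill_dir / "SKILL.md").read_text(encoding="utf-8"))

print(index)
```

The index costs a few tokens per skill, while the full body is paid for only when the Agent actually calls `load_skill`.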
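Session 06: Context Compact
Motto: "Context will fill up; you need a way to make room"
A three-layer compression strategy lets a session keep going well past the point where the raw history would overflow the context window:

    def compress_context(messages, keep_recent=100):
        if len(messages) <= keep_recent:
            return messages  # nothing old enough to compress
        # Layer 1: keep the most recent N messages
        recent = messages[-keep_recent:]
        # Layer 2: compress earlier messages into a summary
        old_messages = messages[:-keep_recent]
        summary = summarize_messages(old_messages)  # placeholder helper
        # Layer 3: selectively keep key information
        key_info = extract_key_info(old_messages)  # placeholder helper
        return [
            # messages[] has no "system" role in the Messages API, so use "user"
            {"role": "user", "content": f"Previous context summary: {summary}"},
            *key_info,
            *recent,
        ]

What Claude Code does: when the context approaches the token limit, it automatically compresses earlier conversation while keeping key information.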
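The "approaches the token limit" trigger can be sketched as a threshold check in front of the compression step. Everything numeric here is an assumption: the four-characters-per-token estimate is a rough heuristic rather than a real tokenizer, and `limit`, `threshold`, and `keep_recent` are illustrative defaults.

```python
def estimate_tokens(messages):
    """Very rough size estimate: about 4 characters per token (heuristic)."""
    return sum(len(str(m["content"])) for m in messages) // 4


def compress_if_needed(messages, summarize, limit=50_000,
                       keep_recent=20, threshold=0.8):
    """Compress only when the estimate crosses threshold * limit."""
    if len(messages) <= keep_recent:
        return messages
    if estimate_tokens(messages) < limit * threshold:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    summary = {"role": "user",
               "content": f"Previous context summary: {summarize(old)}"}
    return [summary, *recent]


# 100 messages of 400 characters each: roughly 10,000 estimated tokens
history = [{"role": "user", "content": "x" * 400} for _ in range(100)]
compacted = compress_if_needed(
    history,
    summarize=lambda old: f"{len(old)} earlier messages",  # stand-in summarizer
    limit=10_000,
)
print(len(history), "->", len(compacted))
```

A production version would also need to cut at a safe boundary, so that a `tool_result` in the kept tail is never separated from the `tool_use` that produced it.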
Phase 3: PERSISTENCE
Session 07: Tasks
Motto: "Break big goals into small tasks, order them, persist to disk"
Tasks are persisted to a file and can declare dependencies:

    import json
    from pathlib import Path

    class TaskGraph:
        def __init__(self, filepath):
            self.filepath = Path(filepath)
            self.tasks = self.load()

        def load(self):
            if self.filepath.exists():
                return json.loads(self.filepath.read_text())
            return []

        def save(self):
            self.filepath.write_text(json.dumps(self.tasks, indent=2))

        def add_task(self, description, depends_on=None):
            task = {
                "id": f"task-{len(self.tasks) + 1}",
                "description": description,
                "status": "pending",
                "depends_on": depends_on or [],  # list of task ids
            }
            self.tasks.append(task)
            self.save()
            return task

        def get_ready_tasks(self):
            """Return the tasks that can run now (all dependencies completed)."""
            # depends_on holds task ids, so look tasks up by id, not by list index
            status_by_id = {t["id"]: t["status"] for t in self.tasks}
            ready = []
            for task in self.tasks:
                if task["status"] == "pending":
                    deps = task.get("depends_on", [])
                    if all(status_by_id.get(d) == "completed" for d in deps):
                        ready.append(task)
            return ready

This lays the groundwork for multi-Agent collaboration: a task graph lets several Agents work in parallel.
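A dependency graph is only useful if it has no cycles and no dangling references. As a sketch, the standard library's `graphlib.TopologicalSorter` (Python 3.9+) can validate a task list and produce one valid execution order; `execution_order` is a hypothetical helper, and the task dicts reuse the `id` / `depends_on` fields from the class above.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(tasks):
    """Return task ids in an order that respects depends_on.

    Raises ValueError for unknown dependency ids and CycleError for cycles.
    """
    graph = {t["id"]: set(t.get("depends_on", [])) for t in tasks}
    unknown = {d for deps in graph.values() for d in deps} - set(graph)
    if unknown:
        raise ValueError(f"unknown dependencies: {sorted(unknown)}")
    # TopologicalSorter takes {node: predecessors} and yields predecessors first
    return list(TopologicalSorter(graph).static_order())


tasks = [
    {"id": "task-1", "depends_on": []},
    {"id": "task-2", "depends_on": ["task-1"]},
    {"id": "task-3", "depends_on": ["task-1", "task-2"]},
]
order = execution_order(tasks)
print(order)

try:
    execution_order([{"id": "a", "depends_on": ["b"]},
                     {"id": "b", "depends_on": ["a"]}])
except CycleError:
    print("cycle detected")
```

Running a check like this when tasks are added would catch a graph in which `get_ready_tasks` could never return anything.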
Session 08: Background Tasks
Motto: "Run slow operations in the background; the agent keeps thinking"
Slow commands run in the background so that the Agent is not blocked:

    import queue
    import subprocess
    import threading

    class BackgroundTaskManager:
        def __init__(self):
            self.notification_queue = queue.Queue()

        def run_background(self, command):
            def worker():
                result = subprocess.run(command, capture_output=True, text=True)
                # Post a notification when the command finishes
                self.notification_queue.put({
                    "type": "task_complete",
                    "command": command,
                    "output": result.stdout,
                })
            # Daemon thread: it does not keep the process alive on exit
            thread = threading.Thread(target=worker, daemon=True)
            thread.start()

        def check_notifications(self):
            """Called on every loop iteration to collect finished background tasks."""
            notifications = []
            while True:
                try:
                    notifications.append(self.notification_queue.get_nowait())
                except queue.Empty:
                    return notifications

Use cases: long-running tests, file downloads, data synchronization, and so on.
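The same queue pattern works for any slow callable, and it is worth reporting failures as well as successes. The sketch below is a simplified variant that runs Python functions rather than shell commands, so it stays self-contained; `run_in_background`, `drain_notifications`, and `as_user_message` are hypothetical names.

```python
import queue
import threading

notifications = queue.Queue()


def run_in_background(label, func, *args):
    """Run func(*args) on a daemon thread and queue its outcome."""
    def worker():
        try:
            notifications.put({"label": label, "status": "ok",
                               "output": func(*args)})
        except Exception as exc:  # failures are reported, not lost
            notifications.put({"label": label, "status": "error",
                               "output": str(exc)})
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def drain_notifications():
    """Collect everything currently queued without blocking."""
    items = []
    while True:
        try:
            items.append(notifications.get_nowait())
        except queue.Empty:
            return items


def as_user_message(items):
    """Format finished jobs as one message the agent loop can append."""
    lines = [f"[background:{n['label']}] {n['status']}: {n['output']}"
             for n in items]
    return {"role": "user", "content": "\n".join(lines)}


worker_thread = run_in_background("sum", sum, [1, 2, 3])
worker_thread.join()  # only for the demo; the agent loop would not wait
message = as_user_message(drain_notifications())
print(message["content"])
```

In the harness the `join()` disappears: the loop simply drains the queue at the top of each iteration and carries on when it is empty.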
Phase 4: TEAMS (team collaboration)
Session 09: Agent Teams
Motto: "When the task is too big for one, delegate to teammates"
Persistent teammates plus asynchronous mailboxes:

    import json
    from pathlib import Path

    class Teammate:
        def __init__(self, name, skill):
            self.name = name
            self.skill = skill
            self.mailbox_path = Path(f"mailboxes/{name}.jsonl")
            self.mailbox_path.parent.mkdir(parents=True, exist_ok=True)

        def send_message(self, message):
            """Append a message to this teammate's mailbox."""
            with open(self.mailbox_path, "a", encoding="utf-8") as f:
                f.write(json.dumps(message) + "\n")

        def check_mailbox(self):
            """Read pending messages, then clear the mailbox so they are not re-read."""
            if not self.mailbox_path.exists():
                return []
            with open(self.mailbox_path, encoding="utf-8") as f:
                messages = [json.loads(line) for line in f if line.strip()]
            # Not safe against a concurrent writer; a real harness would lock here
            self.mailbox_path.write_text("", encoding="utf-8")
            return messages

    # Create specialized teammates
    coder = Teammate("coder", "writes code")
    tester = Teammate("tester", "runs tests")
    reviewer = Teammate("reviewer", "reviews code")

Session 10: Team Protocols
Motto: "Teammates need shared communication rules"
A uniform request-response protocol:

    from datetime import datetime

    # Standard message format
    def create_request(from_agent, to_agent, task, context=None):
        return {
            "type": "request",
            "from": from_agent,
            "to": to_agent,
            "task": task,
            "context": context or {},
            "timestamp": datetime.now().isoformat()
        }

    def create_response(request, status, result=None):
        return {
            "type": "response",
            "from": request["to"],
            "to": request["from"],
            "in_reply_to": request.get("timestamp"),
            "status": status,  # "accepted", "rejected", "completed"
            "result": result,
            "timestamp": datetime.now().isoformat()
        }

    # Usage example (coder is the Teammate created in Session 09)
    lead = Teammate("lead", "coordinates work")
    request = create_request(
        "lead", "coder",
        "Implement authentication",
        {"framework": "fastapi"}
    )
    coder.send_message(request)
    # The coder processes the request and replies
    messages = coder.check_mailbox()
    for msg in messages:
        if msg["type"] == "request":
            # Do the work (implement_auth is a placeholder)
            result = implement_auth(msg["task"])
            # Send the reply to the lead's mailbox
            response = create_response(msg, "completed", result)
            lead.send_message(response)

Session 11: Autonomous Agents
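Motto: "Teammates scan the board and claim tasks themselves"
Teammates claim tasks on their own, with no central dispatcher:

    import threading
    import time

    # Serializes claims so that two teammates cannot take the same task
    claim_lock = threading.Lock()

    def autonomous_cycle(teammate, task_graph):
        while True:
            # Scan the task board and claim one matching task under the lock
            task = None
            with claim_lock:
                for candidate in task_graph.get_ready_tasks():
                    if can_handle(teammate, candidate):
                        candidate["status"] = "in_progress"
                        candidate["assigned_to"] = teammate.name
                        task_graph.save()
                        task = candidate
                        break
            if task is not None:
                # Execute the task outside the lock
                try:
                    result = execute_task(teammate, task)
                    task["status"] = "completed"
                    task["result"] = result
                except Exception as e:
                    task["status"] = "failed"
                    task["error"] = str(e)
                finally:
                    with claim_lock:
                        task_graph.save()
                continue  # look for more work right away
            # Heartbeat interval while idle
            time.sleep(30)

    # Each teammate runs its own autonomous cycle
    for teammate in team:
        threading.Thread(
            target=autonomous_cycle,
            args=(teammate, task_graph)
        ).start()

Session 12: Worktree Isolation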
Motto: "Each works in its own directory, no interference"
Worktree isolation separates two concerns. Tasks manage goals, and Worktrees manage directories:

    import shutil
    import subprocess
    from pathlib import Path

    class Worktree:
        def __init__(self, task_id, base_path="worktrees"):
            self.task_id = task_id
            self.path = Path(base_path) / task_id
            self.path.mkdir(parents=True, exist_ok=True)

        def execute_in_isolation(self, command):
            """Run a command with this worktree as its working directory."""
            return subprocess.run(
                command,
                cwd=self.path,
                capture_output=True,
                text=True
            )

        def cleanup(self):
            """Delete the working directory."""
            if self.path.exists():
                shutil.rmtree(self.path)

    # A Task and its Worktree are bound by ID
    task = {
        "id": "task-123",
        "description": "Build feature X",
        "worktree_id": "worktree-123"
    }
    # Create the isolated working environment
    worktree = Worktree(task["worktree_id"])
    # Execute inside it
    result = worktree.execute_in_isolation(["npm", "install"])
    result = worktree.execute_in_isolation(["npm", "run", "build"])
    # Clean up when finished
    worktree.cleanup()

Putting It Together: The Complete Agent Harness
Combining all of the mechanisms:

    class AgentHarness:
        # SkillLoader, ContextManager, WorktreeManager and setup_team are assumed
        # to be thin wrappers around the code from the earlier Sessions.
        def __init__(self):
            # Phase 2: planning and knowledge (created first; the tool map uses them)
            self.todo_manager = TodoManager()
            self.skill_loader = SkillLoader()
            self.context_manager = ContextManager()
            # Phase 1: core loop
            self.tools = self.setup_tools()
            # Phase 3: persistence
            self.task_graph = TaskGraph("tasks.json")
            self.background_tasks = BackgroundTaskManager()
            # Phase 4: teams
            self.team = self.setup_team()
            self.worktrees = WorktreeManager()

        def setup_tools(self):
            return {
                "bash": bash_handler,
                "read_file": read_handler,
                "write_file": write_handler,
                "todo_write": self.todo_manager.add_todos,
                "load_skill": self.skill_loader.load,
                "spawn_subagent": spawn_subagent,
            }

        def run(self, user_message):
            messages = [{"role": "user", "content": user_message}]
            while True:
                # Check for background task notifications
                notifications = self.background_tasks.check_notifications()
                if notifications:
                    # messages[] accepts only the "user" and "assistant" roles
                    messages.append({
                        "role": "user",
                        "content": f"Background tasks: {notifications}"
                    })
                # Compress the context if needed
                messages = self.context_manager.compress_if_needed(messages)
                # Call the LLM
                response = client.messages.create(
                    model=MODEL,
                    max_tokens=8000,  # required by the Messages API; value is illustrative
                    messages=messages,
                    tools=TOOLS,  # tool schemas; self.tools maps names to handlers
                )
                messages.append({"role": "assistant", "content": response.content})
                # Check whether the Agent is done
                if response.stop_reason != "tool_use":
                    return messages
                # Execute tool calls; all results go back in one user message
                results = []
                for block in response.content:
                    if block.type == "tool_use":
                        output = self.tools[block.name](**block.input)
                        results.append({
                            "type": "tool_result",
                            "tool_use_id": block.id,
                            "content": str(output)
                        })
                messages.append({"role": "user", "content": results})

Key Design Principles
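From these 12 Sessions we can distill the core design principles behind Claude Code's architecture:
1. The Model is the Agent; the Code is the Harness
- The Agent's intelligence comes from training, not from code
- The Code's job is to provide an environment, not to simulate intelligence
2. Trust the Model, simplify the Harness
- Do not constrain the Model with complex decision trees
- Give the Model tools and let it decide when to use them
3. Load on demand, stay lean
- Load knowledge on demand instead of filling the system prompt
- Manage context dynamically; compress and clean up in time
4. Isolation and collaboration
- Subagents have independent contexts, which avoids pollution
- Teammates communicate asynchronously and work in parallel
- Worktrees isolate execution so that work does not interfere
5. Persist everything
- Tasks are persisted to files
- Mailboxes persist communication
- Work can resume after an interruption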
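Resuming after an interruption only works if the files on disk are never left half-written. A common technique, sketched here as an assumption rather than as something the repository is known to do, is to write to a temporary file in the same directory and then swap it into place with `os.replace`. `atomic_write_json` is a hypothetical helper.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path, data):
    """Write JSON so that readers see either the old file or the new one."""
    path = Path(path)
    # The temp file must be in the same directory for the rename to be atomic
    fd, tmp_name = tempfile.mkstemp(dir=str(path.parent),
                                    prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(data, f, indent=2)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)  # replaces the target in one step
    except BaseException:
        os.unlink(tmp_name)  # do not leave stray temp files behind
        raise


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "tasks.json"
    atomic_write_json(target, [{"id": "task-1", "status": "completed"}])
    restored = json.loads(target.read_text(encoding="utf-8"))
    leftovers = sorted(p.name for p in Path(tmp).iterdir())

print(restored, leftovers)
```

A crash in the middle of a plain `write_text` can leave a truncated `tasks.json`. With the swap approach the previous version stays intact until the new one is complete.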
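In Practice: Build Your Own Agent Harness
Once you understand the principles, you can:
1. Study Claude Code
- Read the full implementation in shareAI-lab/learn-claude-code
- Run the 12 reference implementations and watch how each mechanism works
- Modify and extend them to understand the trade-offs
2. Use the Kode Agent CLI

    npm i -g @shareai-lab/kode

An open-source coding Agent CLI that supports multiple Models; its Harness design is worth studying.
3. Integrate it into your application
Use the Kode Agent SDK to embed Agent capabilities in your backend, browser extension, or any other application.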
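Going Further: From On-Demand to Always-On
The Harness discussed in this article follows a "use once and discard" model: every session starts from scratch.
If you need an "always-on" Assistant, see the claw0 repository. It builds on the same Agent core and adds:
- Heartbeat: wakes up periodically to check whether there is work
- Cron: the Agent can schedule its own future tasks
- IM Channels: multi-channel instant messaging (WhatsApp, Telegram, Slack, and others)
- Memory: persistent context memory
- Soul: a personalized persona system

    learn-claude-code              claw0
    (agent harness core)           (proactive always-on harness)

Summary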
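Over 12 progressive Sessions we built a complete Agent Harness from scratch and implemented Claude Code's core mechanisms:
- Core loop (s01-s02): Agent Loop + Tool Use
- Planning and knowledge (s03-s06): TodoWrite, Subagents, Skills, Context Compact
- Persistence (s07-s08): Tasks, Background Tasks
- Team collaboration (s09-s12): Agent Teams, Protocols, Autonomous Agents, Worktree Isolation
The most important takeaway: an Agent is a trained Model, not hand-written code. As engineers, our job is to build a good Harness: give the Model tools, knowledge, context, and permissions, then trust it to reason, decide, and act.
That is the soul of Claude Code. That is the essence of Harness Engineering.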
Resources:
- shareAI-lab/learn-claude-code - full implementation of the 12 progressive Sessions
- Anthropic Claude Code official documentation
- shareAI-lab/Kode-cli - open-source coding Agent CLI
- shareAI-lab/Kode-agent-sdk - embeddable Agent SDK
- shareAI-lab/claw0 - Always-On Agent Harness