v100 is my engine for building, running, studying, and evolving autonomous coding agents under real constraints.
It is not a framework in the abstract: it is a concrete Go-based agent runtime with a CLI, a Bubble Tea TUI, tool safety controls, durable memory, trace replay, benchmarking, evaluation, policy evolution, and long-running execution paths.
I built v100 to close the loop between idea, execution, observation, and iteration.
v100 operates on a fundamental principle: autonomy requires visibility, not hidden magic. Every agent action is trackable, replayable, and inspectable.
Instead of a rigid prompt loop, v100 implements multiple reasoning strategies (Solvers) that can be swapped or combined:
- `react`: The classic reasoning loop, enhanced with watchdogs for tool denial and stall recovery.
- `plan_execute`: A two-phase strategy where the agent previews a plan and executes it, with automatic replanning on failure.
- `smartrouter`: Cost-performance escalation. It routes "trivial" idempotent tool calls (like reading files) to cheap models (e.g., Gemini Flash or local Ollama) and escalates to frontier models (e.g., MiniMax, Claude Opus) when a dangerous or complex mutation is required.
- `rlm`: DSPy-style Recursive Language Model pattern with sub-model invocation.
- `miniglm`: Intelligent provider switching between tool-focused and reasoning-focused models.
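The swappable-solver idea can be pictured as a small interface. The sketch below is illustrative only: the `Solver`, `reactSolver`, and `smartRouter` names are assumptions, not v100's actual API.

```go
package main

import "fmt"

// Solver is a hypothetical sketch of the strategy abstraction
// described above; v100's real interface may differ.
type Solver interface {
	Name() string
	Step(task string) (action string, done bool)
}

// reactSolver: one reason-act iteration per Step call.
type reactSolver struct{}

func (reactSolver) Name() string { return "react" }
func (reactSolver) Step(task string) (string, bool) {
	return "reason+act on: " + task, true
}

// smartRouter escalates to a frontier model only when the pending
// action mutates state; cheap models handle idempotent reads.
type smartRouter struct{ mutating bool }

func (smartRouter) Name() string { return "smartrouter" }
func (s smartRouter) Step(task string) (string, bool) {
	model := "cheap"
	if s.mutating {
		model = "frontier"
	}
	return model + " model handles: " + task, true
}

func main() {
	// Strategies are interchangeable behind the same interface.
	for _, s := range []Solver{reactSolver{}, smartRouter{mutating: true}} {
		action, _ := s.Step("edit file")
		fmt.Printf("%s -> %s\n", s.Name(), action)
	}
}
```

Because every strategy satisfies the same interface, a `--solver` flag can select one at startup without touching the rest of the loop.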
v100 is designed for long-running, unattended execution:
- Wake Daemon (`v100 wake`): Runs on a recurring schedule. It can act as a `goal_generator` (mining TODOs and failures for next steps) or an `issue_worker` (autonomously picking open GitHub issues, implementing fixes, running local tests, pushing, and closing the issue).
- Research Loop (`v100 research`): A fully autonomous experiment loop. It proposes code changes, runs remote (Modal) or local experiments, parses a metric, and automatically applies keep/discard logic (git commit vs. revert).
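The research loop's keep/discard step reduces to a single decision: commit when the metric improves, revert otherwise. A minimal sketch, with a hypothetical `keepOrDiscard` function and the git actions stubbed out as callbacks:

```go
package main

import "fmt"

// keepOrDiscard sketches the experiment loop's decision step described
// above. The name and signature are illustrative, not v100's API; in the
// real loop, commit/revert would shell out to git.
func keepOrDiscard(baseline, candidate float64, commit, revert func()) string {
	if candidate > baseline {
		commit()
		return "keep"
	}
	revert()
	return "discard"
}

func main() {
	decision := keepOrDiscard(0.71, 0.74,
		func() { fmt.Println("stub: git commit -am 'experiment: keep'") },
		func() { fmt.Println("stub: git checkout -- .") },
	)
	fmt.Println("decision:", decision)
}
```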
Tools are a first-class part of the runtime. The model interacts with the world through explicitly registered, schema-bound tools (40+ currently available):
- Safety boundaries: Tools are marked `Safe` or `Dangerous`. Dangerous tools can require explicit operator confirmation, trigger mandatory "Reflection" turns (`Policy.ReflectOnDangerous`) to assess confidence, or be blocked entirely.
- Sandboxing: Runs can be executed inside isolated Docker containers with strict Network Tiers to prevent unauthorized data exfiltration.
- Semantic analysis: Includes tools like `sem_diff`, `sem_impact`, and `sem_blame` that understand code entities (functions, classes) instead of just text lines.
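The Safe/Dangerous boundary amounts to a confirmation gate in front of tool dispatch. A minimal sketch, assuming hypothetical `Tool` and `Invoke` names rather than v100's real types:

```go
package main

import (
	"errors"
	"fmt"
)

// Tool is an assumed shape for a registered tool; v100's real
// registration is schema-bound and richer than this.
type Tool struct {
	Name      string
	Dangerous bool
	Run       func() string
}

// Invoke refuses to run a Dangerous tool without explicit
// operator confirmation, mirroring the safety boundary above.
func Invoke(t Tool, confirmed bool) (string, error) {
	if t.Dangerous && !confirmed {
		return "", errors.New(t.Name + ": dangerous tool requires operator confirmation")
	}
	return t.Run(), nil
}

func main() {
	rm := Tool{Name: "fs_delete", Dangerous: true, Run: func() string { return "deleted" }}
	if _, err := Invoke(rm, false); err != nil {
		fmt.Println("blocked:", err)
	}
	out, _ := Invoke(rm, true)
	fmt.Println("confirmed run:", out)
}
```

A reflection turn or a full block would slot in at the same gate, before `t.Run()` is ever reached.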
v100 treats memory as runtime infrastructure, not just prompt stuffing:
- Durable Blackboard: A shared workspace for agents to read and write ongoing findings.
- ATProto Integration: Deep Bluesky integration (`atproto_index`, `atproto_recall`, `atproto_vibe_check`, `atproto_daily_digest`, `atproto_graph_explorer`). It indexes social feeds and profiles into vector embeddings for semantic RAG, summarizes feed activity, and explores follow graphs for real-time external context.
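The blackboard idea is essentially a shared, concurrency-safe key/value workspace. A minimal in-memory sketch (the `Blackboard` type and its methods are assumptions; v100's version is durable, not in-memory):

```go
package main

import (
	"fmt"
	"sync"
)

// Blackboard sketches the shared workspace described above: agents
// write findings under keys and read them back later. Names are
// illustrative, and persistence is omitted for brevity.
type Blackboard struct {
	mu    sync.RWMutex
	notes map[string]string
}

func NewBlackboard() *Blackboard {
	return &Blackboard{notes: map[string]string{}}
}

func (b *Blackboard) Write(key, finding string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.notes[key] = finding
}

func (b *Blackboard) Read(key string) (string, bool) {
	b.mu.RLock()
	defer b.mu.RUnlock()
	v, ok := b.notes[key]
	return v, ok
}

func main() {
	bb := NewBlackboard()
	bb.Write("bug#42", "root cause in parser")
	if v, ok := bb.Read("bug#42"); ok {
		fmt.Println(v)
	}
}
```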
If an agent does something surprising, you shouldn't have to guess why.
- Every run emits a structured `trace.jsonl`. `v100 replay <run_id>` lets you step through an agent's reasoning turn by turn after the fact.
- Checkpoints allow you to resume interrupted runs seamlessly.
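Because a trace is JSON Lines (one event object per line), stepping through it is a short loop. The sketch below assumes illustrative `turn`/`action` field names, not v100's actual trace schema:

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// TraceEvent is an assumed event shape for one line of trace.jsonl;
// the real schema may carry tool calls, tokens, costs, and more.
type TraceEvent struct {
	Turn   int    `json:"turn"`
	Action string `json:"action"`
}

// readTrace parses a JSONL stream into ordered events, one per line.
func readTrace(data string) ([]TraceEvent, error) {
	var events []TraceEvent
	sc := bufio.NewScanner(strings.NewReader(data))
	for sc.Scan() {
		var ev TraceEvent
		if err := json.Unmarshal(sc.Bytes(), &ev); err != nil {
			return nil, err
		}
		events = append(events, ev)
	}
	return events, sc.Err()
}

func main() {
	trace := "{\"turn\":1,\"action\":\"read file\"}\n{\"turn\":2,\"action\":\"edit file\"}"
	events, _ := readTrace(trace)
	for _, ev := range events {
		fmt.Printf("turn %d: %s\n", ev.Turn, ev.Action)
	}
}
```

Replay and blame tooling are then just different views over the same ordered event list.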
Prebuilt releases are published on GitHub. Alternatively, build from source:
```sh
./scripts/build.sh
```

This rebuilds `./v100` and updates the shell `v100` link. The underlying Go command is:

```sh
go build -o v100 ./cmd/v100
```

Bootstrap your configuration and check your environment:

```sh
./v100 config init
./v100 doctor
```

Start a standard interactive run using MiniMax (the preferred provider):

```sh
./v100 run --provider minimax --workspace .
```

Enable the Bubble Tea TUI for a richer visual experience:

```sh
./v100 run --provider minimax --tui --workspace .
```

Leverage advanced solvers for planning or cost routing:

```sh
./v100 run --solver plan_execute --plan --workspace .
./v100 run --solver smartrouter --workspace .
```

Run fully unattended (the agent executes until completion or budget exhaustion):

```sh
./v100 run --continuous --workspace .
```

Browse recent runs, resume them, or replay their traces:

```sh
./v100 runs
./v100 resume <run_id>
./v100 replay <run_id>
./v100 blame <run_id> <file_path>   # See exactly which reasoning turn modified a file
```

```
cmd/v100/            CLI commands
internal/core/       loop, solvers (react, plan, router), budgets, tracing, research, hooks
internal/tools/      40+ tool implementations (fs, git, web, atproto, semantic)
internal/providers/  provider adapters (MiniMax, GLM, Anthropic, Gemini, OpenAI, etc.)
internal/eval/       scoring, benchmarks, experiments, analysis
internal/memory/     durable memory and vector stores
internal/ui/         terminal UI components
docs/                architecture notes
research/            research configs and artifacts
```
This is not meant to be a polished, general-purpose framework. It is my working engine for agentic research: I use it to try ideas quickly, keep the sharp edges visible, and evolve the system in public through actual use.
That means the repo carries a mix of serious runtime infrastructure, rough-edged experimental features, and bespoke tooling. It is built for researchers and power users who want deep control over their autonomous systems.
MIT