# Architecture
Clarissa is built on the ReAct (Reasoning + Acting) agent pattern, enabling an LLM to reason about tasks and take actions through tool execution. This document provides a technical overview of how the system is designed and how components interact.
## System Overview
At its core, Clarissa connects user input to large language models through a provider abstraction layer, supporting both cloud APIs (OpenRouter, OpenAI, Anthropic) and local inference (LM Studio, Apple Intelligence, GGUF models). The architecture supports both interactive terminal sessions and one-shot command execution.
```mermaid
flowchart TB
    subgraph Entry["Entry Points"]
        CLI["CLI Parser<br/>src/index.tsx"]
        OneShot["One-Shot Mode"]
        Interactive["Interactive Mode"]
    end
    subgraph Core["Core Engine"]
        Agent["ReAct Agent<br/>src/agent.ts"]
        LLM["LLM Client<br/>src/llm/client.ts"]
        Providers["Provider Registry<br/>src/llm/providers/"]
        Context["Context Manager<br/>src/llm/context.ts"]
    end
    subgraph Tools["Tool System"]
        Registry["Tool Registry<br/>src/tools/index.ts"]
        BuiltIn["Built-in Tools<br/>file, git, bash, web"]
        MCP["MCP Client<br/>src/mcp/client.ts"]
    end
    subgraph Persistence["Persistence Layer"]
        Session["Session Manager<br/>src/session/"]
        Memory["Memory Manager<br/>src/memory/"]
    end
    subgraph UI["User Interface"]
        App["Ink App<br/>src/ui/App.tsx"]
        MD["Markdown Renderer"]
    end
    CLI --> OneShot
    CLI --> Interactive
    OneShot --> Agent
    Interactive --> App
    App --> Agent
    Agent --> LLM
    LLM --> Providers
    Agent --> Context
    Agent --> Registry
    Registry --> BuiltIn
    Registry --> MCP
    Providers -.->|Cloud APIs| External["OpenRouter/OpenAI/Anthropic"]
    Providers -.->|Local| Local["LM Studio/Apple AI/GGUF"]
    MCP -.->|stdio| ExtMCP["MCP Servers"]
    Agent --> Session
    Agent --> Memory
```
## Core Components
### Entry Point (`src/index.tsx`)
The CLI entry point handles argument parsing and determines the execution mode, as sketched below:
- **Interactive Mode** - Launches the Ink-based terminal UI for ongoing conversations
- **One-Shot Mode** - Processes a single message and exits (supports piped input)
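As a rough illustration of this dispatch, the sketch below chooses a mode from argv and stdin; `runOneShot` and `renderApp` are hypothetical stand-ins, not Clarissa's actual exports.

```typescript
// Hypothetical stubs standing in for Clarissa's real one-shot path and Ink launch.
async function runOneShot(message: string): Promise<void> {
  console.log(`(one-shot) would answer: ${message}`);
}
function renderApp(): void {
  console.log("(interactive) would launch the Ink UI");
}

const args = process.argv.slice(2);
const piped = !process.stdin.isTTY; // true when input is piped, not a terminal

if (args.length > 0 || piped) {
  // One-shot mode: combine argv and any piped stdin into one message, then exit.
  const stdinText = piped ? await Bun.stdin.text() : "";
  const message = [args.join(" "), stdinText].filter(Boolean).join("\n");
  await runOneShot(message);
} else {
  // Interactive mode: hand control to the terminal UI.
  renderApp();
}
```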
### ReAct Agent (`src/agent.ts`)
The Agent class implements the ReAct loop pattern. It maintains conversation history, coordinates with the LLM, and orchestrates tool execution. The loop continues until the LLM provides a final response without tool calls, or the maximum iteration limit is reached.
### LLM Client (`src/llm/client.ts`)
Provides a unified interface for streaming chat completions with tool support across all providers. Includes retry logic with exponential backoff and jitter for transient failures.
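The sketch below shows one common way to implement retry with exponential backoff and full jitter; `withRetry` is an illustrative helper, not Clarissa's actual code.

```typescript
// Illustrative retry helper with exponential backoff and "full jitter":
// each delay is drawn uniformly from [0, base * 2^(attempt - 1)].
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxAttempts) throw err;
      const cap = baseDelayMs * 2 ** (attempt - 1);
      const delayMs = Math.random() * cap; // jitter avoids retry stampedes
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}

// Hypothetical usage around a streaming chat call:
// const response = await withRetry(() => client.chatStreamComplete(messages));
```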
### Provider Registry (`src/llm/providers/`)
Manages multiple LLM providers with a common interface. Each provider implements:
- `checkAvailability()` - Verify the provider is configured and accessible
- `chat()` - Send messages and receive streaming responses
- `initialize()` / `shutdown()` - Lifecycle management
Supported providers: OpenRouter, OpenAI, Anthropic, Apple Intelligence, LM Studio, Local Llama (GGUF).
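Based on the method list above, the common interface plausibly looks something like the following; the placeholder types stand in for the real ones in `src/llm/types.ts`.

```typescript
// Placeholder types; the real definitions live in src/llm/types.ts.
type ChatMessage = { role: "system" | "user" | "assistant" | "tool"; content: string };
type ToolDefinition = { name: string; description: string; parameters: unknown };
type ChatResponse = { content: string; toolCalls?: { name: string; arguments: string }[] };

// Plausible shape of the common provider interface, inferred from the methods above.
interface LLMProvider {
  readonly name: string;
  /** Verify the provider is configured and accessible (API key, local server, etc.). */
  checkAvailability(): Promise<boolean>;
  /** Send messages and stream the response back chunk by chunk. */
  chat(
    messages: ChatMessage[],
    tools: ToolDefinition[],
    onChunk: (text: string) => void,
  ): Promise<ChatResponse>;
  /** Lifecycle hooks, e.g. loading or unloading a local GGUF model. */
  initialize(): Promise<void>;
  shutdown(): Promise<void>;
}
```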
### Context Manager (`src/llm/context.ts`)
Tracks token usage and manages the context window. When conversations exceed the model's context limit, it intelligently truncates older messages while preserving atomic groups (tool calls must stay with their results).
### Tool Registry (`src/tools/index.ts`)
Centralized registry for all available tools. As sketched below, each tool is defined with:
- Zod schema for parameter validation
- Automatic JSON Schema generation for the LLM API
- Confirmation flag for potentially dangerous operations
- Category for organization (file, git, system, mcp, utility)
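A hypothetical tool definition under these conventions might look like the following; the field names mirror the list above, but the real Tool type in `src/tools/base.ts` may differ, and `zod-to-json-schema` is just one common way to derive the JSON Schema.

```typescript
import { z } from "zod";
import { zodToJsonSchema } from "zod-to-json-schema";

// Illustrative tool definition, not Clarissa's actual Tool type.
const readFileTool = {
  name: "read_file",
  description: "Read a file from disk and return its contents",
  category: "file" as const,
  requiresConfirmation: false, // reads are safe; a write tool would set this to true
  schema: z.object({
    path: z.string().describe("Path of the file to read"),
  }),
  async execute(args: { path: string }): Promise<string> {
    return await Bun.file(args.path).text();
  },
};

// The JSON Schema sent to the LLM API can be derived from the Zod schema.
const parametersJsonSchema = zodToJsonSchema(readFileTool.schema);
```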
### MCP Client (`src/mcp/client.ts`)
Connects to external Model Context Protocol servers via stdio transport. Discovered tools are automatically converted to Clarissa's tool format and registered with the Tool Registry.
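For reference, this is roughly what a stdio connection looks like with the official MCP TypeScript SDK; the server command here is just an example, and Clarissa's bridging code in `src/mcp/client.ts` will differ in detail.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn an example MCP server as a child process and talk to it over stdio.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
});
const client = new Client({ name: "clarissa", version: "1.0.0" });
await client.connect(transport);

// Each discovered tool carries a name, description, and JSON Schema, which
// can then be converted into Clarissa's tool format and registered.
const { tools } = await client.listTools();
for (const tool of tools) {
  console.log(tool.name, tool.inputSchema);
}
```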
### Session Manager (`src/session/index.ts`)
Persists conversation history to `~/.clarissa/sessions/` as JSON files. Sessions can be saved, listed, loaded, and deleted via slash commands.
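A minimal sketch of this kind of persistence using Bun's file APIs; the real SessionManager also handles listing, deletion, and metadata.

```typescript
import { mkdir } from "node:fs/promises";
import { homedir } from "node:os";
import { join } from "node:path";

const sessionsDir = join(homedir(), ".clarissa", "sessions");

// Save a conversation as pretty-printed JSON, one file per session.
async function saveSession(name: string, messages: unknown[]): Promise<void> {
  await mkdir(sessionsDir, { recursive: true });
  await Bun.write(join(sessionsDir, `${name}.json`), JSON.stringify(messages, null, 2));
}

// Load a previously saved session back into memory.
async function loadSession(name: string): Promise<unknown[]> {
  return await Bun.file(join(sessionsDir, `${name}.json`)).json();
}
```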
### Memory Manager (`src/memory/index.ts`)
Long-term memory persistence stored in `~/.clarissa/memories.json`. Memories are injected into the system prompt, allowing the agent to recall facts across sessions.
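Conceptually, the injection might look like the sketch below; the memory shape and prompt wording are assumptions, not the actual format of `memories.json`.

```typescript
// Assumed memory shape; the real format in memories.json may differ.
type Memory = { text: string; createdAt: string };

// Fold saved memories into the system prompt so the model can recall them.
function buildSystemPrompt(basePrompt: string, memories: Memory[]): string {
  if (memories.length === 0) return basePrompt;
  const recalled = memories.map((m) => `- ${m.text}`).join("\n");
  return `${basePrompt}\n\nFacts remembered from earlier sessions:\n${recalled}`;
}
```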
## The ReAct Loop
The ReAct pattern enables the agent to reason about tasks and take actions iteratively:
```mermaid
flowchart LR
    A["User Message"] --> B["Add to History"]
    B --> C["Truncate Context"]
    C --> D["LLM Request"]
    D --> E{{"Response Type?"}}
    E -->|"Tool Calls"| F["Execute Tools"]
    F --> G["Add Results"]
    G --> D
    E -->|"Final Answer"| H["Return Response"]
```
Key aspects of the loop (a condensed code sketch follows the list):
- User message is added to conversation history
- Context is truncated if approaching the model's token limit
- Request sent to LLM with available tool definitions
- If response contains tool calls, each tool is executed (with optional user confirmation)
- Tool results are appended to history and the loop continues
- When the LLM responds without tool calls, the final answer is returned
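The sketch below condenses these steps, assuming illustrative types and method names rather than Clarissa's real API.

```typescript
type Msg = { role: "user" | "assistant" | "tool"; content: string; toolCallId?: string };
type ToolCall = { id: string; name: string; arguments: string };
type LLMResponse = { content: string; toolCalls?: ToolCall[] };

interface Deps {
  truncate(history: Msg[]): Msg[];                           // context manager
  chat(messages: Msg[]): Promise<LLMResponse>;               // LLM client
  executeTool(name: string, args: string): Promise<string>;  // tool registry
}

const MAX_ITERATIONS = 25; // safety valve against runaway loops

async function runAgent(deps: Deps, history: Msg[], userMessage: string): Promise<string> {
  history.push({ role: "user", content: userMessage });

  for (let i = 0; i < MAX_ITERATIONS; i++) {
    const messages = deps.truncate(history); // stay under the token limit
    const response = await deps.chat(messages);

    // No tool calls means the model produced its final answer.
    if (!response.toolCalls?.length) return response.content;

    history.push({ role: "assistant", content: response.content });
    for (const call of response.toolCalls) {
      const result = await deps.executeTool(call.name, call.arguments);
      history.push({ role: "tool", content: result, toolCallId: call.id });
    }
  }
  throw new Error("Maximum iteration limit reached");
}
```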
## Request Lifecycle
This sequence diagram shows the complete flow of a user request that involves tool execution:
```mermaid
sequenceDiagram
    participant User
    participant UI as Ink UI
    participant Agent
    participant LLM as LLM Client
    participant Tools as Tool Registry
    participant API as OpenRouter
    User->>UI: Enter message
    UI->>Agent: run(message)
    Agent->>Agent: Update system prompt with memories
    Agent->>Agent: Truncate context if needed
    Agent->>LLM: chatStreamComplete()
    LLM->>API: POST /chat/completions
    API-->>LLM: Stream response chunks
    LLM-->>UI: onStreamChunk()
    LLM->>Agent: Response with tool_calls
    Agent->>UI: onToolCall(name, args)
    UI->>User: Confirm execution?
    User->>UI: Approve
    Agent->>Tools: execute(name, args)
    Tools->>Tools: Validate with Zod
    Tools->>Agent: Tool result
    Agent->>Agent: Add result to history
    Agent->>LLM: Continue conversation
    LLM->>API: POST /chat/completions
    API-->>LLM: Final response
    LLM->>Agent: Response (no tool calls)
    Agent->>UI: onResponse()
    UI->>User: Display answer
```
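The callbacks in the diagram suggest an event surface roughly like the one below; the names follow the diagram, but the actual signatures in `src/agent.ts` may differ.

```typescript
// Event surface implied by the sequence diagram above.
interface AgentCallbacks {
  /** Streamed text chunks, rendered progressively by the Ink UI. */
  onStreamChunk(text: string): void;
  /** Fired before a tool runs; resolves to false if the user rejects it. */
  onToolCall(name: string, args: unknown): Promise<boolean>;
  /** The final assistant answer once the loop ends. */
  onResponse(content: string): void;
}
```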
## Tool Execution Flow
Tools that could be destructive (file writes, shell commands, git commits) require user confirmation. This flow can be bypassed with "yolo mode" (`/yolo`) for trusted automation scenarios.
```mermaid
flowchart TD
    A["Tool Call Received"] --> B{{"Requires Confirmation?"}}
    B -->|No| E["Execute Tool"]
    B -->|Yes| C{{"Auto-Approve Enabled?"}}
    C -->|Yes| E
    C -->|No| D["Prompt User"]
    D --> F{{"User Response"}}
    F -->|Approve| E
    F -->|Reject| G["Return Rejection"]
    E --> H["Parse & Validate Args"]
    H --> I["Run Tool Function"]
    I --> J["Return Result"]
    G --> J
```
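In code, the confirmation gate might reduce to something like this sketch, where `autoApprove` plays the role of yolo mode; all names are illustrative.

```typescript
// Illustrative confirmation gate around tool execution.
interface GatedTool {
  name: string;
  requiresConfirmation: boolean;
  run(args: unknown): Promise<string>; // assumed to validate args (e.g. via Zod) internally
}

async function gatedExecute(
  tool: GatedTool,
  args: unknown,
  autoApprove: boolean,
  promptUser: (name: string, args: unknown) => Promise<boolean>,
): Promise<string> {
  if (tool.requiresConfirmation && !autoApprove) {
    const approved = await promptUser(tool.name, args);
    if (!approved) return `Tool call to ${tool.name} was rejected by the user.`;
  }
  return await tool.run(args);
}
```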
## Context Window Management
Different models have different context window sizes (e.g., 200K for Claude, 128K for GPT-4). The Context Manager tracks token usage and automatically truncates when approaching limits.
```mermaid
flowchart TD
    A["Estimate Conversation Tokens"] --> B{{"Exceeds Limit?"}}
    B -->|No| C["Return Messages As-Is"]
    B -->|Yes| D["Group Messages Atomically"]
    D --> E["Keep System Prompt"]
    E --> F["Add Recent Groups Until Limit"]
    F --> G["Return Truncated Messages"]
    subgraph Atomic["Atomic Groups"]
        H["User Message"]
        I["Assistant + Tool Results"]
    end
```
The truncation algorithm ensures tool calls always stay with their corresponding results, preventing the LLM from seeing orphaned tool calls or results without context.
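A simplified sketch of that algorithm, assuming a rough 4-characters-per-token estimate and pre-computed atomic groups (both are assumptions, not Clarissa's actual heuristics):

```typescript
// An atomic group: either a user message, or an assistant message
// bundled with its tool results.
type Group = { messages: { role: string; content: string }[] };

function estimateTokens(group: Group): number {
  // Rough heuristic: ~4 characters per token.
  return Math.ceil(group.messages.reduce((n, m) => n + m.content.length, 0) / 4);
}

function truncate(systemPrompt: Group, groups: Group[], budget: number): Group[] {
  const kept: Group[] = [];
  let used = estimateTokens(systemPrompt); // the system prompt is always kept

  // Walk backwards from the newest group, keeping whole groups until the
  // budget is spent; dropping whole groups keeps tool calls paired with results.
  for (let i = groups.length - 1; i >= 0; i--) {
    const cost = estimateTokens(groups[i]);
    if (used + cost > budget) break;
    used += cost;
    kept.unshift(groups[i]);
  }
  return [systemPrompt, ...kept];
}
```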
## Technical Design Decisions
| Decision | Rationale |
|---|---|
| Bun Runtime | Native TypeScript support, fast startup, built-in file APIs, single binary compilation |
| Ink (React for CLI) | Declarative UI with component model, familiar React patterns, rich ecosystem |
| Multi-Provider Architecture | Unified interface for cloud (OpenRouter, OpenAI, Anthropic) and local (Apple Intelligence, LM Studio, GGUF) providers |
| Apple Intelligence | On-device AI with full tool calling support, no API keys required, privacy-focused local inference |
| Zod Validation | Type-safe schemas, automatic JSON Schema generation, runtime validation |
| ReAct Pattern | Proven agentic architecture, iterative reasoning, clear action boundaries |
| MCP Protocol | Standardized tool interface, ecosystem compatibility, external extensibility |
| Streaming Responses | Immediate feedback, better UX for long responses, progressive display |
| Local Persistence | Privacy-first approach, no external dependencies, simple JSON format |
## File Structure
```
src/
  index.tsx          # CLI entry point, argument parsing
  agent.ts           # ReAct agent loop implementation
  update.ts          # Auto-update and upgrade functionality
  config/            # Environment validation (Zod), model config
  llm/
    client.ts        # Unified LLM client with streaming
    context.ts       # Token tracking, context truncation
    types.ts         # Message and tool type definitions
    providers/       # Multi-provider abstraction layer
      index.ts       # Provider registry and auto-detection
      openrouter.ts, openai.ts, anthropic.ts    # Cloud providers
      apple-ai.ts, lmstudio.ts, local-llama.ts  # Local providers
  mcp/
    client.ts        # MCP server connections, tool bridging
  memory/
    index.ts         # Long-term memory persistence
  models/
    download.ts      # GGUF model download from Hugging Face
  preferences/
    index.ts         # User preferences persistence
  session/
    index.ts         # Conversation session management
  tools/
    index.ts         # Tool registry, execution orchestration
    base.ts          # Tool interface, Zod-to-JSON conversion
    *.ts             # Individual tool implementations
  ui/
    App.tsx          # Main Ink application component
    markdown.ts      # Terminal markdown rendering
    components/      # Reusable UI components
```