Architecture

Clarissa is built on the ReAct (Reasoning + Acting) agent pattern, enabling an LLM to reason about tasks and take actions through tool execution. This document provides a technical overview of how the system is designed and how components interact.

System Overview

At its core, Clarissa connects user input to large language models through a provider abstraction layer, supporting both cloud APIs (OpenRouter, OpenAI, Anthropic) and local inference (LM Studio, Apple Intelligence, GGUF models). The architecture supports both interactive terminal sessions and one-shot command execution.

flowchart TB
    subgraph Entry["Entry Points"]
        CLI["CLI Parser
src/index.tsx"]
        OneShot["One-Shot Mode"]
        Interactive["Interactive Mode"]
    end
    subgraph Core["Core Engine"]
        Agent["ReAct Agent
src/agent.ts"]
        LLM["LLM Client
src/llm/client.ts"]
        Providers["Provider Registry
src/llm/providers/"]
        Context["Context Manager
src/llm/context.ts"]
    end
    subgraph Tools["Tool System"]
        Registry["Tool Registry
src/tools/index.ts"]
        BuiltIn["Built-in Tools
file, git, bash, web"]
        MCP["MCP Client
src/mcp/client.ts"]
    end
    subgraph Persistence["Persistence Layer"]
        Session["Session Manager
src/session/"]
        Memory["Memory Manager
src/memory/"]
    end
    subgraph UI["User Interface"]
        App["Ink App
src/ui/App.tsx"]
        MD["Markdown Renderer"]
    end
    CLI --> OneShot
    CLI --> Interactive
    OneShot --> Agent
    Interactive --> App
    App --> Agent
    Agent --> LLM
    LLM --> Providers
    Agent --> Context
    Agent --> Registry
    Registry --> BuiltIn
    Registry --> MCP
    Providers -.->|Cloud APIs| External["OpenRouter/OpenAI/Anthropic"]
    Providers -.->|Local| Local["LM Studio/Apple AI/GGUF"]
    MCP -.->|stdio| ExtMCP["MCP Servers"]
    Agent --> Session
    Agent --> Memory

Core Components

Entry Point (src/index.tsx)

The CLI entry point handles argument parsing and determines the execution mode:

  • Interactive Mode - Launches the Ink-based terminal UI for ongoing conversations
  • One-Shot Mode - Processes a single message and exits (supports piped input)
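
A rough sketch of how this mode selection could look, assuming Bun's stdin API; runOneShot, startInteractiveUI, and the example pipe invocation are hypothetical stand-ins rather than Clarissa's actual exports:

    // Hypothetical stand-ins for the real one-shot and interactive entry points.
    const runOneShot = async (message: string) => console.log(`(one-shot) ${message}`);
    const startInteractiveUI = () => console.log("(interactive UI would start here)");

    async function main(): Promise<void> {
      const args = process.argv.slice(2);

      // Piped input (e.g. `cat notes.md | clarissa "summarize this"`) implies one-shot mode.
      const piped = process.stdin.isTTY ? "" : await Bun.stdin.text();

      if (args.length > 0 || piped) {
        const message = [args.join(" "), piped].filter(Boolean).join("\n\n");
        await runOneShot(message); // single agent run, then exit
        process.exit(0);
      }

      startInteractiveUI(); // Ink-based terminal UI for an ongoing conversation
    }

    main();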

ReAct Agent (src/agent.ts)

The Agent class implements the ReAct loop pattern. It maintains conversation history, coordinates with the LLM, and orchestrates tool execution. The loop continues until the LLM provides a final response without tool calls, or the maximum iteration limit is reached.
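
A minimal sketch of that loop; the types and names below are illustrative, not the actual agent.ts API:

    interface ToolCall { id: string; name: string; arguments: string }
    interface Message { role: "user" | "assistant" | "tool"; content: string; tool_call_id?: string }
    interface LLMResponse { message: Message; toolCalls?: ToolCall[] }

    const MAX_ITERATIONS = 10; // assumed limit; the real value may differ

    async function reactLoop(
      history: Message[],
      chat: (history: Message[]) => Promise<LLMResponse>,
      execute: (name: string, args: string) => Promise<string>,
    ): Promise<string> {
      for (let i = 0; i < MAX_ITERATIONS; i++) {
        const response = await chat(history);
        history.push(response.message);

        // A response without tool calls is the final answer.
        if (!response.toolCalls?.length) return response.message.content;

        // Otherwise run each requested tool and append its result to the history.
        for (const call of response.toolCalls) {
          const result = await execute(call.name, call.arguments);
          history.push({ role: "tool", tool_call_id: call.id, content: result });
        }
      }
      throw new Error("Maximum iteration limit reached without a final answer");
    }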

LLM Client (src/llm/client.ts)

Provides a unified interface for streaming chat completions with tool support across all providers. Includes retry logic with exponential backoff and jitter for transient failures.
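
The backoff values below are illustrative rather than the actual client.ts constants, but they show the shape of the retry path:

    // Retry a transient-failure-prone call with exponential backoff and jitter.
    async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 3): Promise<T> {
      let lastError: unknown;
      for (let attempt = 0; attempt < maxAttempts; attempt++) {
        try {
          return await fn();
        } catch (err) {
          lastError = err;
          // Exponential backoff: 1s, 2s, 4s, ... plus up to 250ms of random jitter
          // so concurrent retries do not all hit the API at the same instant.
          const delay = 1000 * 2 ** attempt + Math.random() * 250;
          await new Promise((resolve) => setTimeout(resolve, delay));
        }
      }
      throw lastError;
    }

    // Usage: wrap a streaming chat completion call.
    // const response = await withRetry(() => provider.chat(messages, { tools }));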

Provider Registry (src/llm/providers/)

Manages multiple LLM providers with a common interface. Each provider implements:

  • checkAvailability() - Verify the provider is configured and accessible
  • chat() - Send messages and receive streaming responses
  • initialize() / shutdown() - Lifecycle management

Supported providers: OpenRouter, OpenAI, Anthropic, Apple Intelligence, LM Studio, Local Llama (GGUF).
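
A sketch of what that common interface implies; the method signatures and option names are assumptions, and the message/tool types are stand-ins for the definitions in src/llm/types.ts:

    interface LLMProvider {
      // Verify the provider is usable: API key present, server reachable, model downloaded.
      checkAvailability(): Promise<boolean>;

      // Send messages and stream the response, invoking onChunk as tokens arrive.
      chat(
        messages: ChatMessage[],
        options: { tools?: ToolDefinition[]; onChunk?: (text: string) => void },
      ): Promise<ChatResponse>;

      // Lifecycle hooks, e.g. spawning or tearing down a local inference process.
      initialize(): Promise<void>;
      shutdown(): Promise<void>;
    }

    // Assumed stand-ins for the types in src/llm/types.ts.
    type ChatMessage = { role: string; content: string };
    type ToolDefinition = { name: string; description: string; parameters: unknown };
    type ChatResponse = { content: string; toolCalls?: { name: string; arguments: string }[] };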

Context Manager (src/llm/context.ts)

Tracks token usage and manages the context window. When a conversation approaches the model's context limit, it drops the oldest messages first while preserving atomic groups (a tool call must stay with its results).

Tool Registry (src/tools/index.ts)

Centralized registry for all available tools. Each tool is defined with:

  • Zod schema for parameter validation
  • Automatic JSON Schema generation for the LLM API
  • Confirmation flag for potentially dangerous operations
  • Category for organization (file, git, system, mcp, utility)
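
A hypothetical tool definition in this shape; field names such as requiresConfirmation are assumptions based on the list above:

    import { z } from "zod";

    // Hypothetical example of a built-in tool; the real definitions live in src/tools/*.ts.
    const readFileTool = {
      name: "read_file",
      description: "Read the contents of a file from disk",
      category: "file",            // file | git | system | mcp | utility
      requiresConfirmation: false, // reads are safe; writes or shell commands would set true
      parameters: z.object({
        path: z.string().describe("Path of the file to read"),
      }),
      execute: async ({ path }: { path: string }) => Bun.file(path).text(),
    };

    // base.ts converts the Zod schema to JSON Schema before the tool is advertised to the LLM.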

MCP Client (src/mcp/client.ts)

Connects to external Model Context Protocol servers via stdio transport. Discovered tools are automatically converted to Clarissa's tool format and registered with the Tool Registry.
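
A minimal sketch of such a stdio connection, assuming the official MCP TypeScript SDK (Clarissa's own wrapper in src/mcp/client.ts may differ; the example server is arbitrary):

    import { Client } from "@modelcontextprotocol/sdk/client/index.js";
    import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

    // Spawn an example MCP server as a child process and talk to it over stdio.
    const transport = new StdioClientTransport({
      command: "npx",
      args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"],
    });

    const client = new Client({ name: "clarissa", version: "1.0.0" }, { capabilities: {} });
    await client.connect(transport);

    // Each discovered tool would then be wrapped in Clarissa's tool format and registered.
    const { tools } = await client.listTools();
    console.log(tools.map((t) => t.name));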

Session Manager (src/session/index.ts)

Persists conversation history to ~/.clarissa/sessions/ as JSON files. Sessions can be saved, listed, loaded, and deleted via slash commands.
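
A sketch of what that persistence might look like with Bun's file APIs; the Session shape and function names are assumptions:

    import { join } from "node:path";
    import { homedir } from "node:os";

    // Assumed session shape; the real format is whatever the agent keeps as history.
    interface Session { id: string; createdAt: string; messages: unknown[] }

    const SESSIONS_DIR = join(homedir(), ".clarissa", "sessions");

    async function saveSession(session: Session): Promise<void> {
      // Serialize the session to ~/.clarissa/sessions/<id>.json.
      await Bun.write(join(SESSIONS_DIR, `${session.id}.json`), JSON.stringify(session, null, 2));
    }

    async function loadSession(id: string): Promise<Session> {
      return Bun.file(join(SESSIONS_DIR, `${id}.json`)).json();
    }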

Memory Manager (src/memory/index.ts)

Long-term memory persistence stored in ~/.clarissa/memories.json. Memories are injected into the system prompt, allowing the agent to recall facts across sessions.
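
A sketch of that injection step; the memory file format and prompt wording are assumptions:

    import { join } from "node:path";
    import { homedir } from "node:os";

    const MEMORIES_PATH = join(homedir(), ".clarissa", "memories.json");

    // Fold remembered facts into the system prompt so they survive across sessions.
    async function buildSystemPrompt(basePrompt: string): Promise<string> {
      const file = Bun.file(MEMORIES_PATH);
      if (!(await file.exists())) return basePrompt;

      const memories: string[] = await file.json();
      if (memories.length === 0) return basePrompt;

      return `${basePrompt}\n\nThings to remember about the user:\n${memories
        .map((m) => `- ${m}`)
        .join("\n")}`;
    }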

The ReAct Loop

The ReAct pattern enables the agent to reason about tasks and take actions iteratively:

flowchart LR
    A["User Message"] --> B["Add to History"]
    B --> C["Truncate Context"]
    C --> D["LLM Request"]
    D --> E{{"Response Type?"}}
    E -->|"Tool Calls"| F["Execute Tools"]
    F --> G["Add Results"]
    G --> D
    E -->|"Final Answer"| H["Return Response"]
    

Key aspects of the loop:

  1. User message is added to conversation history
  2. Context is truncated if approaching the model's token limit
  3. Request sent to LLM with available tool definitions
  4. If response contains tool calls, each tool is executed (with optional user confirmation)
  5. Tool results are appended to history and the loop continues
  6. When the LLM responds without tool calls, the final answer is returned

Request Lifecycle

This sequence diagram shows the complete flow of a user request that involves tool execution:

sequenceDiagram
    participant User
    participant UI as Ink UI
    participant Agent
    participant LLM as LLM Client
    participant Tools as Tool Registry
    participant API as OpenRouter

    User->>UI: Enter message
    UI->>Agent: run(message)
    Agent->>Agent: Update system prompt with memories
    Agent->>Agent: Truncate context if needed
    Agent->>LLM: chatStreamComplete()
    LLM->>API: POST /chat/completions
    API-->>LLM: Stream response chunks
    LLM-->>UI: onStreamChunk()
    LLM->>Agent: Response with tool_calls
    Agent->>UI: onToolCall(name, args)
    UI->>User: Confirm execution?
    User->>UI: Approve
    Agent->>Tools: execute(name, args)
    Tools->>Tools: Validate with Zod
    Tools->>Agent: Tool result
    Agent->>Agent: Add result to history
    Agent->>LLM: Continue conversation
    LLM->>API: POST /chat/completions
    API-->>LLM: Final response
    LLM->>Agent: Response (no tool calls)
    Agent->>UI: onResponse()
    UI->>User: Display answer
    

Tool Execution Flow

Tools that could be destructive (file writes, shell commands, git commits) require user confirmation. This flow can be bypassed with "yolo mode" (/yolo) for trusted automation scenarios.

flowchart TD
    A["Tool Call Received"] --> B{{"Requires Confirmation?"}}
    B -->|No| E["Execute Tool"]
    B -->|Yes| C{{"Auto-Approve Enabled?"}}
    C -->|Yes| E
    C -->|No| D["Prompt User"]
    D --> F{{"User Response"}}
    F -->|Approve| E
    F -->|Reject| G["Return Rejection"]
    E --> H["Parse & Validate Args"]
    H --> I["Run Tool Function"]
    I --> J["Return Result"]
    G --> J
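
A condensed sketch of this gate; names such as yoloMode and requiresConfirmation are illustrative rather than the actual identifiers:

    interface ToolLike {
      name: string;
      requiresConfirmation: boolean;
      run: (args: unknown) => Promise<string>;
    }

    async function runTool(
      tool: ToolLike,
      args: unknown,
      opts: { yoloMode: boolean; confirm: (name: string, args: unknown) => Promise<boolean> },
    ): Promise<string> {
      if (tool.requiresConfirmation && !opts.yoloMode) {
        const approved = await opts.confirm(tool.name, args);
        if (!approved) return "Tool execution rejected by user.";
      }
      // Argument validation and the tool function itself run only after approval.
      return tool.run(args);
    }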
    

Context Window Management

Different models have different context window sizes (e.g., 200K for Claude, 128K for GPT-4). The Context Manager tracks token usage and automatically truncates when approaching limits.

flowchart TD
    A["Estimate Conversation Tokens"] --> B{{"Exceeds Limit?"}}
    B -->|No| C["Return Messages As-Is"]
    B -->|Yes| D["Group Messages Atomically"]
    D --> E["Keep System Prompt"]
    E --> F["Add Recent Groups Until Limit"]
    F --> G["Return Truncated Messages"]

    subgraph Atomic["Atomic Groups"]
        H["User Message"]
        I["Assistant + Tool Results"]
    end
    

The truncation algorithm ensures tool calls always stay with their corresponding results, preventing the LLM from seeing orphaned tool calls or results without context.
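
A simplified sketch of that algorithm; the token estimator and grouping heuristic are stand-ins for the real context.ts logic:

    interface Msg { role: "system" | "user" | "assistant" | "tool"; content: string }

    // Crude estimate: roughly 4 characters per token.
    const estimateTokens = (messages: Msg[]) =>
      Math.ceil(messages.reduce((sum, m) => sum + m.content.length, 0) / 4);

    function truncate(messages: Msg[], limit: number): Msg[] {
      if (estimateTokens(messages) <= limit) return messages;

      const [system, ...rest] = messages;

      // Group messages atomically: a user turn, or an assistant message plus the
      // tool results that answer its tool calls, always travel together.
      const groups: Msg[][] = [];
      for (const msg of rest) {
        if (msg.role === "tool" && groups.length > 0) groups[groups.length - 1].push(msg);
        else groups.push([msg]);
      }

      // Walk backwards from the most recent group, keeping as many as fit.
      const kept: Msg[][] = [];
      let used = estimateTokens([system]);
      for (const group of groups.reverse()) {
        const cost = estimateTokens(group);
        if (used + cost > limit) break;
        kept.unshift(group);
        used += cost;
      }
      return [system, ...kept.flat()];
    }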

Technical Design Decisions

Decision                      Rationale
Bun Runtime                   Native TypeScript support, fast startup, built-in file APIs, single binary compilation
Ink (React for CLI)           Declarative UI with component model, familiar React patterns, rich ecosystem
Multi-Provider Architecture   Unified interface for cloud (OpenRouter, OpenAI, Anthropic) and local (Apple Intelligence, LM Studio, GGUF) providers
Apple Intelligence            On-device AI with full tool calling support, no API keys required, privacy-focused local inference
Zod Validation                Type-safe schemas, automatic JSON Schema generation, runtime validation
ReAct Pattern                 Proven agentic architecture, iterative reasoning, clear action boundaries
MCP Protocol                  Standardized tool interface, ecosystem compatibility, external extensibility
Streaming Responses           Immediate feedback, better UX for long responses, progressive display
Local Persistence             Privacy-first approach, no external dependencies, simple JSON format

File Structure

src/
  index.tsx        # CLI entry point, argument parsing
  agent.ts         # ReAct agent loop implementation
  update.ts        # Auto-update and upgrade functionality
  config/          # Environment validation (Zod), model config
  llm/
    client.ts      # Unified LLM client with streaming
    context.ts     # Token tracking, context truncation
    types.ts       # Message and tool type definitions
    providers/     # Multi-provider abstraction layer
      index.ts     # Provider registry and auto-detection
      openrouter.ts, openai.ts, anthropic.ts  # Cloud providers
      apple-ai.ts, lmstudio.ts, local-llama.ts  # Local providers
  mcp/
    client.ts      # MCP server connections, tool bridging
  memory/
    index.ts       # Long-term memory persistence
  models/
    download.ts    # GGUF model download from Hugging Face
  preferences/
    index.ts       # User preferences persistence
  session/
    index.ts       # Conversation session management
  tools/
    index.ts       # Tool registry, execution orchestration
    base.ts        # Tool interface, Zod-to-JSON conversion
    *.ts           # Individual tool implementations
  ui/
    App.tsx        # Main Ink application component
    markdown.ts    # Terminal markdown rendering
    components/    # Reusable UI components