Sector F Labs


Reservoir

Abstract

Reservoir is a stateful proxy server for OpenAI-compatible Chat Completions APIs. It maintains conversation history in a Neo4j graph database and automatically injects relevant context into requests based on semantic similarity and recency.


Problem Statement

OpenAI-compatible Chat Completions APIs are stateless. Each request must include the complete conversation history for the model to maintain context. This creates several problems:

  1. Manual conversation state management
  2. Token limit constraints as conversations grow
  3. Inability to reference semantically related conversations
  4. No persistent storage of conversation data

Solution

Reservoir acts as an intermediary that stores every request and response message in Neo4j, enriches each outgoing request with semantically similar and recent messages from earlier conversations, and enforces token limits before forwarding the request to the upstream provider.

Architecture

sequenceDiagram
    participant App
    participant Reservoir
    participant Neo4j
    participant LLM as OpenAI/Ollama

    App->>Reservoir: Request (e.g. /v1/chat/completions/$USER/my-application)
    Reservoir->>Reservoir: Check if last message exceeds token limit (Return error if true)
    Reservoir->>Reservoir: Tag with Trace ID + Partition
    Reservoir->>Neo4j: Store original request message(s)

    %% --- Context Enrichment Steps ---
    Reservoir->>Neo4j: Query for similar & recent messages
    Neo4j-->>Reservoir: Return relevant context messages
    Reservoir->>Reservoir: Inject context messages into request payload
    %% --- End Enrichment Steps ---

    Reservoir->>Reservoir: Check total token count & truncate if needed (preserving system/last messages)

    Reservoir->>LLM: Forward enriched & potentially truncated request
    LLM->>Reservoir: Return LLM response
    Reservoir->>Neo4j: Store LLM response message
    Reservoir->>App: Return LLM response
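
The enrichment and truncation steps in the middle of this flow can be pictured roughly as follows. This is an illustrative Python sketch, not Reservoir's Rust implementation: retrieval of similar and recent messages from Neo4j is assumed to have already happened, and count_tokens is a crude stand-in for a real tokenizer.

# Illustrative sketch of the enrichment and truncation steps shown above.
def count_tokens(messages):
    # Crude stand-in for a real tokenizer: whitespace-separated word count.
    return sum(len(m["content"].split()) for m in messages)

def enrich(request_messages, similar_messages, recent_messages, token_limit):
    # 1. Context = semantically similar messages plus recent messages from
    #    the same partition/instance (already retrieved from Neo4j).
    context = similar_messages + recent_messages

    # 2. Inject the context ahead of the caller's own messages.
    enriched = context + list(request_messages)

    # 3. If the total exceeds the token limit, drop the oldest non-system
    #    messages first, always preserving the last (newest) message.
    while count_tokens(enriched) > token_limit:
        removable = [i for i, m in enumerate(enriched[:-1]) if m["role"] != "system"]
        if not removable:
            break
        del enriched[removable[0]]
    return enriched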

Supported Providers

Reservoir forwards requests to OpenAI-compatible Chat Completions providers; OpenAI and Ollama are the upstream targets shown in the architecture above.

Data Model

Conversations are stored as a graph: each message is persisted as a node, tagged with its trace ID and partition, and related messages are connected by synapse relationships (see Semantic Relationships below).

Semantic Relationships

Reservoir creates synapses between messages when the cosine similarity of their embeddings exceeds 0.85. This enables semantically related messages from other conversations to be surfaced during context enrichment.
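
A sketch of the rule (only the threshold check is shown; the embedding source and the Neo4j write are omitted):

# Two messages are linked by a synapse when the cosine similarity of
# their embedding vectors exceeds 0.85. Embeddings are assumed to be
# plain lists of floats.
import math

SYNAPSE_THRESHOLD = 0.85

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def should_link(embedding_a, embedding_b):
    # True when a synapse relationship should be created between the messages.
    return cosine_similarity(embedding_a, embedding_b) > SYNAPSE_THRESHOLD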

Conversation Graph View

Usage

Replace the OpenAI API endpoint:

https://api.openai.com/v1/chat/completions

With the Reservoir endpoint:

http://127.0.0.1:3017/partition/$USER/instance/reservoir/v1/chat/completions

The system organizes conversations using a partition/instance hierarchy for multi-tenant isolation.
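
For example, an existing Chat Completions client only needs its URL changed. A minimal Python sketch, where the partition (alice), instance (reservoir), model name, and prompt are illustrative values:

# Sketch: sending a standard Chat Completions request through Reservoir.
# Add an Authorization header if the upstream provider requires one.
import requests

url = "http://127.0.0.1:3017/partition/alice/instance/reservoir/v1/chat/completions"
payload = {
    "model": "llama3",  # example model name for an Ollama backend
    "messages": [
        {"role": "user", "content": "What did we decide about the schema yesterday?"}
    ],
}
response = requests.post(url, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])

Because Reservoir injects relevant history from earlier exchanges, the client only needs to send the new message rather than the full conversation.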

Implementation

Start server:

cargo run -- start

The server initializes a vector index in Neo4j and listens on port 3017.

Documentation

Technical documentation is available at sectorflabs.com/reservoir.

Local documentation can be built with:

make book

Reference Implementation

A reference talk demonstrating the system architecture: Rust Relationships and Reservoir

License

BSD 3-Clause License