LEANN Integration Discussion
I'd like to open a discussion about potentially integrating LEANN (Learning-Enhanced Augmented Neural Network) as a storage backend option for mcp-memory-service.
What is LEANN?
LEANN is a graph-based RAG system that achieves 97% storage savings through selective recomputation. Instead of storing all embedding vectors, it:
Stores a compressed knowledge graph (CSR format)
Computes embeddings on demand during search using graph traversal
Uses HNSW or DiskANN indexing backends
Already supports MCP servers for live data integration
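To make the recomputation idea concrete, here is a minimal sketch of a graph search that re-embeds only the nodes it visits instead of reading a pre-computed vector index. This is not LEANN's actual API: the adjacency-dict layout, the hashing-trick `embed()` stand-in, and all function names are placeholders for illustration.

```python
import heapq
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Stand-in embedding (hashing trick) so the sketch runs without a model;
    a real system would call a sentence-transformer or similar here."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

def graph_search(query: str,
                 neighbors: dict[int, list[int]],
                 texts: dict[int, str],
                 entry: int,
                 beam: int = 8,
                 hops: int = 3) -> list[tuple[float, int]]:
    """Best-first traversal over a stored graph (LEANN proper uses a CSR layout).

    Only the raw texts and the adjacency structure are persisted; embeddings for
    visited nodes are recomputed on the fly, which is where the storage savings
    (and the extra query-time compute) come from.
    """
    q = embed(query)
    visited = {entry}
    frontier = [(-float(np.dot(q, embed(texts[entry]))), entry)]  # max-heap via negation
    results: list[tuple[float, int]] = []
    for _ in range(hops):
        next_frontier = []
        for neg_sim, node in heapq.nsmallest(beam, frontier):
            results.append((-neg_sim, node))
            for nb in neighbors[node]:
                if nb not in visited:
                    visited.add(nb)
                    sim = float(np.dot(q, embed(texts[nb])))  # on-demand recomputation
                    next_frontier.append((-sim, nb))
        if not next_frontier:
            break
        frontier = next_frontier
    return sorted(results, reverse=True)[:beam]

# Example: three memories linked in a tiny graph, entry point at node 0.
texts = {0: "grocery list", 1: "meeting notes from standup", 2: "standup action items"}
neighbors = {0: [1], 1: [0, 2], 2: [1]}
print(graph_search("what came out of the standup?", neighbors, texts, entry=0))
```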
Potential Benefits
✅ Storage Efficiency: 97% reduction in storage requirements
✅ Privacy-First: Local data processing (aligns with our philosophy)
✅ Graph-Based Queries: Enable relationship-based memory retrieval
✅ Portability: Smaller data transfers for multi-device sync
✅ MCP Compatible: Designed to work with MCP protocol
Trade-offs & Concerns
❌ Performance Impact: On-demand computation vs our current 5ms pre-computed reads
❌ Query Latency: Trades storage for compute time during queries
❌ Complexity: Graph maintenance overhead vs current vector storage
❌ Feature Parity: Need to support tags, time-based search, and metadata (see the interface sketch below)
❌ Migration: 2,530+ production memories, backward compatibility
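On the feature-parity point: any LEANN backend would have to cover the same operations the existing backends expose. The interface below is only an illustration of that surface (it is not mcp-memory-service's actual storage abstraction, and the method names are guesses), but it shows where a graph index alone falls short: tag and time filters likely need side indexes, and deletes force graph maintenance.

```python
from abc import ABC, abstractmethod
from datetime import datetime
from typing import Any

class MemoryBackend(ABC):
    """Illustrative backend surface (hypothetical, not the project's real class)."""

    @abstractmethod
    async def store(self, content: str, tags: list[str],
                    metadata: dict[str, Any]) -> str:
        """Persist a memory and return its id."""

    @abstractmethod
    async def search(self, query: str, limit: int = 10) -> list[dict[str, Any]]:
        """Semantic search; a LEANN backend would recompute embeddings here."""

    @abstractmethod
    async def search_by_tags(self, tags: list[str]) -> list[dict[str, Any]]:
        """Exact tag filtering -- the graph only encodes semantic neighborhoods,
        so this likely needs a separate side index."""

    @abstractmethod
    async def search_by_time(self, start: datetime, end: datetime) -> list[dict[str, Any]]:
        """Time-range retrieval; same side-index concern as tags."""

    @abstractmethod
    async def delete(self, memory_id: str) -> bool:
        """Deletion implies graph maintenance (edge repair) in a LEANN-style index."""
```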
Current Architecture Context
Our current backends optimize for query performance:
SQLite-vec: 5ms reads, single-file, <150MB memory
Cloudflare: Edge-deployed vector search
Hybrid: Fast local + background cloud sync (recommended)
LEANN optimizes for storage efficiency, which is a different trade-off.
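One way to weigh that trade-off at our scale: the raw vectors for the current memory count are already small, so the 97% figure may matter less here than for large document corpora. A back-of-envelope calculation, assuming 384-dimensional float32 embeddings (a typical small sentence-transformer size; substitute the actual model's dimension if different):

```python
# Back-of-envelope: raw vector storage for ~2,530 memories.
# Assumes 384-dim float32 embeddings -- an assumption, not a measured value.
memories = 2_530
dim = 384
bytes_per_float = 4

vector_bytes = memories * dim * bytes_per_float
print(f"raw vectors: {vector_bytes / 1_048_576:.1f} MiB")    # ~3.7 MiB

leann_bytes = vector_bytes * 0.03  # applying the claimed 97% savings
print(f"LEANN-style: {leann_bytes / 1_048_576:.2f} MiB")     # ~0.11 MiB
```

At this scale the absolute savings are a few megabytes, which suggests LEANN is more relevant to compressing large document collections (Option 3 below) than to the core memory store.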
Possible Integration Approaches
Option 1: Experimental 4th Backend
Option 2: Tiered Storage Architecture (sketched below)
Option 3: Document Ingestion Layer
Use LEANN specifically for compressing large document collections while keeping current backends for active memory.
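To ground Option 2, here is a rough sketch of what a tiered store could look like: new and recent memories stay in the fast pre-computed backend, and a background job archives older ones into a LEANN-compressed tier. Class and method names are hypothetical and assume the `MemoryBackend` surface sketched earlier.

```python
from datetime import datetime, timedelta, timezone

class TieredMemoryStore:
    """Sketch of Option 2 (hypothetical): fast hot tier + compressed cold tier."""

    def __init__(self, hot, cold, hot_window_days: int = 30):
        self.hot = hot      # e.g. the existing sqlite-vec backend
        self.cold = cold    # e.g. a LEANN-backed adapter
        self.hot_window = timedelta(days=hot_window_days)

    async def store(self, content, tags, metadata):
        # New memories always land in the fast tier, preserving the ~5ms path.
        return await self.hot.store(content, tags, metadata)

    async def search(self, query, limit=10):
        # Query the fast tier first; only fall through to the compressed tier
        # (which pays the recomputation cost) when more results are needed.
        results = await self.hot.search(query, limit=limit)
        if len(results) < limit:
            results += await self.cold.search(query, limit=limit - len(results))
        return results

    async def archive_cold(self, now=None):
        # Background job: migrate memories older than the hot window.
        now = now or datetime.now(timezone.utc)
        epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
        for mem in await self.hot.search_by_time(epoch, now - self.hot_window):
            await self.cold.store(mem["content"], mem["tags"], mem["metadata"])
            await self.hot.delete(mem["id"])
```

The open question with this layout is whether cold-tier latency is acceptable when a query does spill over to the compressed tier.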
Questions for Discussion
My Initial Assessment
This seems like a research prototype rather than an immediate priority.
Community Input Needed
What do you think? Should we pursue LEANN integration? If so, which approach makes the most sense for your use cases?
Reference: https://github.com/yichuan-w/LEANN