Transform your documents into searchable knowledge bases with FAISS vector embeddings
🚀 Quick Start • 📚 Documentation • 🐛 Report Bug • 💡 Request Feature
If you find MCPower useful, help us grow the community!
⭐ Star this repo to show your support!
- ✅ Phase 1-5: Complete (All user stories implemented)
- 🚧 Phase 6: Polish & documentation (in progress)
MCPower is a Model Context Protocol (MCP) server that provides powerful semantic search over your document collections. Drop in any folder of .txt or .md files, and get instant AI-powered search capabilities through a beautiful web interface or programmatic API.
Perfect for:
- 📚 Documentation sites
- 🗂️ Knowledge bases
- 💬 Chatbot context
- 🔍 Research papers
- 📝 Note collections
- Just drop folders into the web console to create searchable datasets. No CLI commands needed!
- FAISS-powered vector search with <500ms response times. Search thousands of documents instantly.
- Uses sentence transformers for intelligent matching beyond keyword search.
- Works with Claude Desktop, VS Code, Cherry Studio, and any MCP client.
- One-click launcher automatically sets up everything. Just run ./launch.sh.
- Modern, responsive web console with real-time stats and visual feedback.
🐧 Linux / 🍎 macOS
# Clone the repository
git clone https://github.com/wspotter/mcpower.git
cd mcpower
# Run the launcher - it does everything!
./launch.sh
The web console opens automatically at http://127.0.0.1:4173 🎉
🪟 Windows
# Clone the repository
git clone https://github.com/wspotter/mcpower.git
cd mcpower
# Double-click launch.bat or run:
launch.bat
Your browser opens automatically to http://127.0.0.1:4173 🎉
- Semantic Search: Search knowledge datasets using natural language queries
- Interactive Web Console: Manage datasets with drag-and-drop interface
- Multiple Datasets: Manage and search across multiple knowledge bases
- MCP Compatible: Works with any MCP client (VS Code, Cherry Studio, etc.)
- Fast & Reliable: FAISS-powered vector search with <500ms p95 latency
- Graceful Degradation: Continues working even with invalid datasets
- Comprehensive Logging: Structured JSON logs with detailed diagnostics
graph TD
A[📄 Your Documents] -->|Python Indexer| B[🧮 Embeddings]
B -->|FAISS| C[💾 Vector Database]
C -->|TypeScript MCP Server| D[🔌 MCP Protocol]
D --> E1[VS Code Copilot]
D --> E2[Cherry Studio]
D --> E3[Any MCP Client]
style A fill:#e3f2fd
style B fill:#fff3e0
style C fill:#f3e5f5
style D fill:#e8f5e9
style E1 fill:#fce4ec
style E2 fill:#fce4ec
style E3 fill:#fce4ec
- 📚 Document Processing
  - Python reads your documents (txt, md, pdf)
  - Splits them into semantic chunks
  - Generates embeddings using sentence-transformers (see the sketch after this list)
- ⚡ Fast Vector Search
  - FAISS indexes embeddings for lightning-fast similarity search
  - Sub-500ms query latency even on large datasets
  - Efficient memory usage with optimized index structures
- 🔌 MCP Integration
  - TypeScript server exposes MCP tools
  - Clients send queries via stdio protocol
  - Python bridge handles FAISS operations
  - Results returned as JSON with relevance scores
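The sketch below shows the indexing idea in isolation, using sentence-transformers and FAISS directly. It is illustrative only; the chunking, file handling, and index type in python/src/index.py may differ.

```python
# Illustrative sketch: chunk documents, embed them, and build a FAISS index.
# The real indexer (python/src/index.py) may chunk, store metadata, and choose
# an index type differently; .txt/.pdf handling is omitted here.
from pathlib import Path

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Read documents and split them into fixed-size character chunks.
chunks = []
for path in Path("./my-documents").rglob("*.md"):
    text = path.read_text(encoding="utf-8")
    chunks += [text[i : i + 512] for i in range(0, len(text), 512)]

# Encode chunks and add them to a flat (exact) FAISS index.
embeddings = model.encode(chunks, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])  # inner product ~ cosine on normalized vectors
index.add(np.asarray(embeddings, dtype="float32"))

# The real indexer writes this into datasets/<name>/ alongside metadata.json.
faiss.write_index(index, "index.faiss")
```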
- Node.js 18+ and npm
- Python 3.10+
- Git
git clone https://github.com/wspotter/mcpower.git
cd mcpower
./launch.sh # Does everything automatically!
The launcher will:
- ✅ Create virtual environment
- ✅ Install Python dependencies
- ✅ Install Node.js dependencies
- ✅ Configure environment variables
- ✅ Start the web console
- ✅ Open your browser
Click to expand manual installation steps
1. Clone the repository
git clone https://github.com/wspotter/mcpower.git
cd mcpower
2. Install Node.js dependencies
npm install
3. Create Python virtual environment
python3 -m venv .venv
4. Install Python dependencies
.venv/bin/pip install typer faiss-cpu sentence-transformers
5. Configure environment
cat > .env << EOF
MCPOWER_PYTHON=$(pwd)/.venv/bin/python
EOF
6. Build and run
npm run build
npm run dev -- --datasets ./datasets

npm run dev -- [options]

Options:
- --datasets <path>: Path to datasets directory (default: ./datasets)
- --log-level <level>: Log level: debug, info, warn, error (default: info)
- --version: Show version information
Create a .env file in the project root:
# Datasets directory path
DATASETS_PATH=./datasets
# Log level (debug, info, warn, error)
LOG_LEVEL=info
The easiest way to create datasets is through the web console:
- Start the console: ./launch.sh
- Add a dataset:
  - Click Browse to open directory picker
  - Or drag & drop a folder into the input field
  - Or type the path manually
- Submit: Click "Create Dataset"
- Monitor: Watch real-time indexing progress
Each dataset has three components stored in datasets/<name>/:
datasets/
└── my-docs/
├── config.json # Dataset configuration
├── index.faiss # FAISS vector index
└── metadata.json # Chunk metadata and text
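For orientation, config.json might contain something like the example below. The field names are hypothetical and shown only for illustration; the authoritative schema is whatever the indexer writes.

```json
{
  "name": "my-docs",
  "description": "Internal docs",
  "model": "sentence-transformers/all-MiniLM-L6-v2",
  "chunkSize": 512,
  "chunkOverlap": 50,
  "defaultTopK": 5
}
```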
Advanced: Create datasets via Python CLI
# Index a directory of documents
.venv/bin/python python/src/index.py index \
--source-path ./my-documents \
--dataset-name my-docs \
--output-dir ./datasets/my-docs
# Supported file types: .txt, .md, .pdf
Configuration options:
--chunk-size 512 # Characters per chunk
--chunk-overlap 50 # Overlap between chunks
--model sentence-transformers/all-MiniLM-L6-v2
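To illustrate what --chunk-size and --chunk-overlap control, the sketch below splits text into overlapping character windows. It is a generic example of the technique, not the exact logic in python/src/index.py.

```python
# Illustrative only: fixed-size character chunks with overlap between neighbours.
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 50) -> list[str]:
    step = chunk_size - overlap  # each new chunk starts `overlap` chars before the previous one ends
    return [text[i : i + chunk_size] for i in range(0, max(len(text) - overlap, 1), step)]

print(len(chunk_text("x" * 2000)))  # 5 overlapping chunks for 2000 characters
```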
# List all datasets
GET /api/datasets
# Get dataset details
GET /api/datasets/:name
# Delete dataset
DELETE /api/datasets/:name
# Create dataset (via web console or API)
POST /api/datasets
{
"name": "my-docs",
"sourcePath": "/absolute/path/to/documents"
}
MCPower works with any MCP-compatible client. Here's how to connect it:
Add to your VS Code settings.json:
{
"github.copilot.chat.codeGeneration.instructions": [
{
"text": "Use the mcpower MCP server for knowledge search"
}
],
"mcp.servers": {
"mcpower": {
"command": "node",
"args": ["/absolute/path/to/mcpower/dist/cli.js", "--datasets", "./datasets"],
"env": {
"MCPOWER_PYTHON": "/absolute/path/to/mcpower/.venv/bin/python"
}
}
}
}
Add to Cherry Studio's MCP configuration:
{
"mcpServers": {
"mcpower": {
"command": "node",
"args": ["/absolute/path/to/mcpower/dist/cli.js", "--datasets", "./datasets"]
}
}
}
Search your knowledge bases using natural language.
{
dataset: string; // Dataset name (required)
query: string; // Your search query (required)
topK?: number; // Number of results (default: 5)
}
Example:
{
"tool": "knowledge.search",
"arguments": {
"dataset": "my-docs",
"query": "How do I configure authentication?",
"topK": 3
}
}
Response:
{
"results": [
{
"score": 0.89,
"title": "Authentication Guide",
"path": "docs/auth.md",
"snippet": "To configure authentication, set the AUTH_ENABLED=true..."
}
]
}
List all available datasets.
{} // No parameters
Response:
{
"datasets": [
{
"id": "my-docs",
"name": "My Documentation",
"description": "Internal docs",
"chunkCount": 1264,
"defaultTopK": 5
}
],
"metadata": {
"total": 1,
"ready": 1,
"errors": 0
}
}
mcpower/
├── src/ # TypeScript MCP server
│ ├── cli.ts # Entry point
│ ├── server.ts # MCP protocol handler
│ ├── bridge/ # Python FAISS bridge
│ ├── config/ # Dataset registry
│ ├── store/ # Knowledge store cache
│ └── tools/ # MCP tool implementations
├── python/src/ # Python indexer & search
│ ├── index.py # CLI for indexing
│ └── search.py # FAISS search operations
├── webapp/ # Web console
│ ├── index.html # SPA interface
│ ├── app.js # Frontend logic
│ └── styles.css # Styling
├── tests/ # Test suites
│ ├── unit/ # Unit tests
│ └── integration/ # Integration tests
└── datasets/ # Your knowledge bases
└── sample-docs/ # Example dataset
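The following is a rough sketch of the query path that python/src/search.py handles: load index.faiss and metadata.json, embed the query, and return ranked chunks as JSON. The CLI arguments and metadata layout shown here are assumptions, not the real implementation.

```python
# Illustrative sketch of the query path; the real python/src/search.py may differ.
import json

import faiss
from sentence_transformers import SentenceTransformer


def search(dataset_dir: str, query: str, top_k: int = 5) -> list[dict]:
    index = faiss.read_index(f"{dataset_dir}/index.faiss")
    with open(f"{dataset_dir}/metadata.json", encoding="utf-8") as f:
        chunks = json.load(f)  # assumed: a list of {"title", "path", "snippet"} records

    model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
    query_vec = model.encode([query], normalize_embeddings=True).astype("float32")
    scores, ids = index.search(query_vec, top_k)

    return [
        {"score": float(s), **chunks[i]}
        for s, i in zip(scores[0], ids[0])
        if i != -1  # FAISS pads missing results with -1
    ]


print(json.dumps(search("./datasets/my-docs", "How do I configure authentication?")))
```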
# Development mode (auto-reload)
npm run dev -- --datasets ./datasets
# Build TypeScript
npm run build
# Start web console
npm run web
# Run tests
npm test
# Run with coverage
npm run test:coverage
# Type checking & linting
npm run lint
- Define the tool in src/tools/yourTool.ts:
export const yourTool: Tool = {
name: "knowledge.yourTool",
description: "What your tool does",
inputSchema: {
type: "object",
properties: {
param: { type: "string", description: "Parameter description" }
},
required: ["param"]
}
};
- Implement the handler in src/tools/handlers/yourTool.ts
- Register it in src/server.ts
- Add tests in tests/unit/tools/yourTool.test.ts
# Run all 86 tests
npm test
# Run with coverage report
npm run test:coverage
✅ 86 tests passing across:
- 🔍 18 search edge cases (empty queries, special chars, large results)
- 🛠️ 15 search tool validations
- 📚 11 dataset registry operations
- 📋 9 listDatasets tool tests
- 🚀 9 startup integration tests
- 💾 8 knowledge store caching
- ⚡ 6 performance benchmarks (<500ms p95)
- 🔗 5+5 integration tests (search + listDatasets)
# Test with real datasets
./test-search.sh
# Test web console API
./test-web.sh
❌ Dataset Not Found
Error: Dataset not found: your-dataset
Solutions:
- ✅ Verify the dataset exists in the datasets/ directory
- ✅ Check config.json has the correct name field
- ✅ Restart server to reload dataset registry
- ✅ Use web console to verify dataset list
🐍 Python Bridge Failures
Error: Python bridge command failed
Solutions:
- ✅ Verify Python 3.10+ is installed: python3 --version
- ✅ Check virtual environment: .venv/bin/python --version
- ✅ Reinstall dependencies: .venv/bin/pip install -r python/requirements.txt
- ✅ Test FAISS: .venv/bin/python -c "import faiss; print('OK')"
- ✅ Check .env file has correct MCPOWER_PYTHON path
🐌 Slow Search Performance
Issue: Queries taking >500ms
Solutions:
- ✅ Check dataset size (>10k chunks may need optimization)
- ✅ Verify FAISS index is properly trained
- ✅ Reduce the topK parameter (try 3-5 instead of 10+)
- ✅ Consider using a faster embedding model
- ✅ Use GPU-accelerated FAISS for large datasets (see the sketch below)
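For datasets well beyond ~10k chunks, a common FAISS technique is to swap the flat (exact) index for a trained approximate one such as IVF. The sketch below shows the general idea; MCPower does not necessarily expose this switch, so treat it as an assumption that would require changes to the Python indexer.

```python
# General FAISS technique (illustrative): an IVF index trades a little recall for speed.
import faiss
import numpy as np

embeddings = np.random.rand(50_000, 384).astype("float32")  # stand-in for real chunk vectors
d, nlist = embeddings.shape[1], 256                          # dimension, number of cluster lists

quantizer = faiss.IndexFlatL2(d)
index = faiss.IndexIVFFlat(quantizer, d, nlist)
index.train(embeddings)   # IVF indexes must be trained before adding vectors
index.add(embeddings)
index.nprobe = 8          # search more lists for better recall, fewer for more speed

scores, ids = index.search(embeddings[:1], 5)
```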
🌐 Web Console Connection Issues
Error: ERR_CONNECTION_REFUSED
Solutions:
- ✅ Ensure web server is running: npm run web
- ✅ Check port 4173 isn't blocked by firewall
- ✅ Try accessing http://127.0.0.1:4173 directly
- ✅ Check console logs for startup errors
📝 Enable Debug Logging
Get detailed diagnostics:
npm run dev -- --log-level=debug --datasets ./datasets
This shows:
- Dataset loading details
- Python bridge communication
- FAISS index operations
- Search query execution
- Error stack traces
🚨 We're actively looking for contributors! Check out our good first issues and help wanted labels.
We welcome contributions! Here's how to get started:
# Fork and clone
git clone https://github.com/YOUR_USERNAME/mcpower.git
cd mcpower
# Create feature branch
git checkout -b feature/amazing-feature
# Install dependencies
npm install
.venv/bin/pip install -r python/requirements.txt
# Make changes and test
npm run build
npm test
# Commit with clear message
git commit -m "feat: add amazing feature"
# Push and create PR
git push origin feature/amazing-feature
We're especially looking for contributors in these areas:
- 🎨 UI/UX: Improve web console design
- 📚 Documentation: Tutorials, examples, guides
- 🧪 Testing: More test coverage, edge cases
- 🚀 Performance: Optimization, caching strategies
- 🔌 Integrations: New MCP clients, data sources
- 🐛 Bug Fixes: See issues
- Write tests for new features
- Follow TypeScript/Python best practices
- Update documentation for API changes
- Use conventional commit messages
- Keep PRs focused and atomic
MIT License - see LICENSE for details
Built with amazing open-source tools:
- FAISS - Vector similarity search by Facebook Research
- sentence-transformers - State-of-the-art text embeddings
- MCP - Model Context Protocol by Anthropic
- TypeScript - Type-safe JavaScript
- Express - Fast web framework
Need assistance? We're here to help!
- 🐛 Bug Reports: Open an issue
- 💡 Feature Requests: Request a feature
- ❓ Questions: Search existing issues or open a new one
- 📚 Documentation: Check our Quick Start Guide
⭐ Star this repo if you find it useful!
Made with ❤️ by the MCPower team
