Codebase MCP
Turn any LLM into your personal coding assistant
Codebase MCP is an open-source, privacy-first AI development assistant that connects your LLM (like Claude) to your codebase through the Model Context Protocol (MCP). Build production-ready applications without expensive coding assistant subscriptions.
Why Codebase MCP?
The Problem
Modern AI coding assistants like Cursor and Windsurf are powerful but come with significant costs:
- Double subscriptions - Pay for Claude AND a separate coding assistant ($20-40/month)
- Cloud-dependent - Your code is processed on remote servers
- Vendor lock-in - Limited to specific LLM providers
- Limited control - Can't customize or extend functionality
The Solution
Codebase MCP changes the game by turning YOUR existing LLM into a full-featured coding assistant:
- Use what you already pay for - You only need a Claude subscription (or any MCP-compatible LLM)
- Privacy-first architecture - Processing happens locally with local embeddings
- Open source & extensible - Modify, enhance, and contribute
- LLM-agnostic design - Connect any LLM via Model Context Protocol
- Production-ready - Quality scoring, auto-formatting, dependency checking
Perfect For
- Solo developers who want AI assistance without breaking the bank
- Small teams who need privacy-conscious development tools
- Privacy-focused developers working on sensitive codebases
- Python & React developers building production applications
- Projects under 20,000 lines - optimal performance range
Quick Start
Prerequisites
- Python 3.11 or higher
- Claude Desktop (or any MCP-compatible LLM client)
- Git installed and configured
- Gemini API key (free tier), available from Google AI Studio
5-Minute Setup
1. Clone & Install
# Clone repository
git clone https://github.com/danyQe/codebase-mcp.git
cd codebase-mcp
# Install using uv (recommended)
pip install uv
uv venv
uv pip install -r requirements.txt
# Install code formatters globally
pip install black ruff
2. Configure Environment
# Create .env file
cp .env.example .env
# Edit .env and add your Gemini API key
GEMINI_API_KEY=your_api_key_here
3. Configure Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "codebase-manager": {
      "command": "C:\\path\\to\\codebase-mcp\\.venv\\Scripts\\python.exe",
      "args": [
        "C:\\path\\to\\codebase-mcp\\mcp_server.py"
      ]
    }
  }
}
4. Start FastAPI Server
# Start server with your project directory
python main.py /path/to/your/project
# With auto-reload (development)
python main.py /path/to/your/project --reload
Server runs on http://localhost:6789
5. Restart Claude Desktop
Restart Claude Desktop to load the MCP server. You should now see Codebase MCP tools available!
Installation
System Requirements
Requirement | Minimum | Recommended |
---|---|---|
Python | 3.11 | 3.11+ |
RAM | 4 GB | 8 GB+ |
Storage | 500 MB | 1 GB+ |
OS | Windows 10 | Windows 11 / macOS / Linux |
Installation Methods
Method 1: Using uv (Recommended)
# Install uv package manager
pip install uv
# Clone repository
git clone https://github.com/danyQe/codebase-mcp.git
cd codebase-mcp
# Create virtual environment and install
uv venv
uv pip install -r requirements.txt
# Install formatters globally
pip install black ruff
Method 2: Using pip
# Clone repository
git clone https://github.com/danyQe/codebase-mcp.git
cd codebase-mcp
# Create virtual environment
python -m venv .venv
# Activate virtual environment
# Windows:
.venv\Scripts\activate
# macOS/Linux:
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
# Install formatters globally
pip install black ruff
Method 3: Docker (Coming Soon)
Docker support is planned for a future release.
Verify Installation
# Test FastAPI server
python main.py /path/to/test/project
# Check if server is running
curl http://localhost:6789/health
Configuration
Environment Variables
Create a .env file in the project root:
# Gemini API Configuration (Required for edit_tool)
GEMINI_API_KEY=your_gemini_api_key_here
# Server Configuration (Optional)
HOST=0.0.0.0
PORT=6789
# Working Directory (Optional - can be set via command line)
WORKING_DIR=/path/to/your/project
# Logging (Optional)
LOG_LEVEL=INFO
Claude Desktop Configuration
Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "codebase-manager": {
      "command": "/absolute/path/to/.venv/Scripts/python.exe",
      "args": [
        "/absolute/path/to/mcp_server.py"
      ]
    }
  }
}
- Always use absolute paths
- Windows: Use double backslashes (\\) or forward slashes (/)
- macOS/Linux: Use forward slashes (/)
- Point to the Python executable in your virtual environment
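The path rules above can be applied programmatically. The following is a minimal sketch, not part of Codebase MCP: the helper name build_claude_config is hypothetical, and it assumes the standard venv layout (Scripts\python.exe on Windows, bin/python on macOS/Linux). It rejects relative paths and lets the JSON encoder produce the doubled backslashes.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def build_claude_config(repo_dir: str, windows: bool) -> str:
    """Render the mcpServers entry for a codebase-mcp checkout at repo_dir.

    Hypothetical helper: it only mirrors the JSON structure shown above.
    """
    path_cls = PureWindowsPath if windows else PurePosixPath
    repo = path_cls(repo_dir)
    if not repo.is_absolute():
        raise ValueError(f"repo_dir must be an absolute path: {repo_dir}")
    # Assumed standard venv layout for each platform.
    bin_dir = "Scripts" if windows else "bin"
    exe = "python.exe" if windows else "python"
    config = {
        "mcpServers": {
            "codebase-manager": {
                "command": str(repo / ".venv" / bin_dir / exe),
                "args": [str(repo / "mcp_server.py")],
            }
        }
    }
    # json.dumps escapes each backslash, giving the \\ form JSON requires.
    return json.dumps(config, indent=2)


print(build_claude_config("/home/me/codebase-mcp", windows=False))
```

Generating the file this way avoids hand-escaping Windows paths, which is a common source of invalid JSON in this config.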
Git Configuration
Codebase MCP uses two separate git directories:
- .git/ - Your personal git repository (user-managed)
- .codebase/ - AI-tracked changes (managed by Codebase MCP)
Rate Limits (Gemini API)
The free Gemini API tier has the following limits:
- 15 requests per minute (RPM)
- 250,000 tokens per minute (TPM)
- 1,000 requests per day (RPD)
Codebase MCP automatically handles rate limiting with exponential backoff.
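As an illustration of what exponential backoff means here, the sketch below retries a call with doubling delays. It is not the project's implementation: RateLimitError, backoff_delays, and call_with_backoff are hypothetical names, and the base delay, cap, and jitter are assumed values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a rate-limit response from the Gemini API."""


def backoff_delays(max_retries: int, base: float = 1.0, cap: float = 60.0) -> list[float]:
    """Return delays of base * 2**attempt seconds, each capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def call_with_backoff(
    fn: Callable[[], T],
    max_retries: int = 5,
    base: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, sleeping with growing delays whenever it is rate limited."""
    for delay in backoff_delays(max_retries, base):
        try:
            return fn()
        except RateLimitError:
            # Small random jitter so concurrent callers do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    # Final attempt: any remaining error propagates to the caller.
    return fn()
```

With the defaults, backoff_delays(5) yields waits of 1, 2, 4, 8, and 16 seconds, which stays well inside a per-minute request window.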
Tools Overview
Codebase MCP provides 13+ specialized tools accessible via the Model Context Protocol. Each tool is designed for specific development tasks.
Tool | Purpose | Key Features |
---|---|---|
session_tool | Session management | Branch creation, isolation, merging |
memory_tool | Knowledge persistence | Store learnings, search context |
git_tool | Git operations | Status, commit, diff, log, branches |
write_tool | Intelligent writing | Format, quality check, auto-commit |
edit_tool | AI-assisted editing | Gemini-powered, contextual edits |
search_tool | Code search | Semantic, fuzzy, text, symbol search |
read_code_tool | Read files/symbols | Line ranges, symbols, whole files |
project_structure_tool | Project analysis | Structure, dependencies, stats |
Tool Categories
Development Workflow
- session_tool - Manage isolated development sessions
- git_tool - Full git integration
- project_structure_tool - Understand project organization
Code Operations
- write_tool - Write new code with quality checks
- edit_tool - AI-assisted code editing
- read_code_tool - Read files and symbols
Code Discovery
- search_tool - Multi-mode code search
- list_file_symbols_tool - List all symbols in files
- read_symbol_from_database - Find symbols across codebase
Intelligence
- memory_tool - Persistent knowledge storage
- code_analysis_tool - Syntax and lint analysis
session_tool
Purpose: Manage isolated development sessions with automatic branching
Operations
- start - Create new session branch
- end - End session and return to main
- switch - Switch to existing session
- list - List all session branches
- merge - Merge session into main
- current - Show current session status
- delete - Delete a session branch
Example Usage
# Start a new session
session_tool(operation="start", session_name="feature-user-auth")
# Check current session
session_tool(operation="current")
# End session with auto-merge
session_tool(operation="end", auto_merge=True)
# Delete old session
session_tool(operation="delete", session_name="old-feature")
Best Practices
- Use descriptive session names: feature-X, fix-Y, refactor-Z
- Start a session before making changes to isolate work
- Use auto-merge only for well-tested changes
- Check current session status regularly
memory_tool
Purpose: Store and retrieve project knowledge across chat sessions
Operations
- store - Store new memory
- search - Search memories semantically
- context - Get session startup context
- list - List memories by category
- update - Update existing memory
- stats - Get memory statistics
Memory Categories
- progress - Project milestones
- learning - Technical insights
- preference - User preferences
- mistake - Errors and corrections
- solution - Working solutions
- architecture - Design decisions
- integration - Component interactions
- debug - Debugging experiences
Importance Levels
- 5 (CRITICAL) - Must always remember
- 4 (HIGH) - Very important
- 3 (MEDIUM) - Standard importance
- 2 (LOW) - Nice to remember
- 1 (MINIMAL) - Archive level
Example Usage
# Store a learning
memory_tool(
operation="store",
category="learning",
content="Always validate email format before database queries",
importance=4
)
# Store a mistake
memory_tool(
operation="store",
category="mistake",
content="MISTAKE: Used sync DB calls in async endpoint. CORRECTION: Use async DB session. PREVENTION: Always check if endpoint is async.",
importance=5
)
# Search memories
memory_tool(
operation="search",
query="database optimization techniques",
max_results=5
)
# Get startup context
memory_tool(operation="context")
Best Practices
- Store memories immediately after learning
- Use context at start of new sessions
- Search before starting similar work
- Always record mistakes with prevention strategies
- Update importance as patterns prove useful
write_tool
Purpose: Write new code with automatic formatting, dependency checking, and quality analysis
Features
- Automatic code formatting (Black/Ruff for Python, Prettier for JS/TS)
- Dependency conflict detection
- Quality scoring (0-1 scale)
- Auto-commit when quality ≥ 0.8
- Language auto-detection
Parameters
- file_path - Path to write
- content - Code content
- purpose - What this code does
- language - python/javascript/typescript (auto-detected)
- save_to_file - Whether to save (default: True)
Example Usage
# Write a new Python file
write_tool(
    file_path="api/routes/auth.py",
    content='''
from fastapi import APIRouter, Depends
from pydantic import EmailStr

router = APIRouter()

@router.post("/login")
async def login(email: EmailStr, password: str):
    # Login logic here
    return {"status": "success"}
''',
    purpose="User authentication endpoint"
)
Quality Scoring
The quality score is calculated based on:
- Formatting success (0.3)
- No missing dependencies (0.3)
- No duplicate definitions (0.2)
- No syntax errors (0.2)
Score ≥ 0.8 enables auto-commit
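To make the arithmetic concrete, here is a minimal sketch that assumes the four weights above are simply summed; the actual scoring code may combine them differently. QualityChecks, quality_score, and should_auto_commit are hypothetical names.

```python
from dataclasses import dataclass

AUTO_COMMIT_THRESHOLD = 0.8  # threshold stated in the text above


@dataclass
class QualityChecks:
    formatted: bool        # formatting succeeded (weight 0.3)
    dependencies_ok: bool  # no missing dependencies (weight 0.3)
    no_duplicates: bool    # no duplicate definitions (weight 0.2)
    syntax_ok: bool        # no syntax errors (weight 0.2)


def quality_score(checks: QualityChecks) -> float:
    """Sum the weight of every passing check (assumed to be additive)."""
    score = (
        0.3 * checks.formatted
        + 0.3 * checks.dependencies_ok
        + 0.2 * checks.no_duplicates
        + 0.2 * checks.syntax_ok
    )
    # Round to avoid float noise such as 0.7999999999999999.
    return round(score, 2)


def should_auto_commit(checks: QualityChecks) -> bool:
    return quality_score(checks) >= AUTO_COMMIT_THRESHOLD
```

Under this additive reading, failing one 0.2-weight check still leaves a score of exactly 0.8, while failing formatting or the dependency check alone drops the score to 0.7 and blocks auto-commit.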
edit_tool
Purpose: AI-assisted code editing using Gemini API
Features
- Gemini 2.5 Flash-Lite powered editing
- Intelligent diff generation
- Automatic error correction
- Rate limiting with exponential backoff
- Quality validation after edit
Parameters
- target_file - File to edit
- instructions - What to change (single sentence)
- code_edit - Minimal edit specification
- language - Auto-detected if not provided
- save_to_file - Whether to save (default: True)
Example Usage
# Edit existing file
edit_tool(
    target_file="api/routes/auth.py",
    instructions="Add rate limiting to login endpoint",
    code_edit='''
// ... existing code ...
from slowapi import Limiter
// ... existing code ...
@router.post("/login")
@limiter.limit("5/minute")
async def login(email: EmailStr, password: str):
    // ... existing code ...
'''
)
Best Practices
- Use // ... existing code ... to skip unchanged sections
- Provide clear, single-sentence instructions
- Include enough context around edits
- Don't specify entire file, only changed parts
Timeout Handling
If editing takes >30 seconds, you'll receive a timeout message. This prevents Claude Desktop from timing out. Options:
- Wait 30-60 seconds and check if edit completed
- Check for partial changes with git diff
- Break complex edits into smaller parts
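One way such a timeout can be structured is sketched below. This is an assumption about the behavior, not the project's code: it supposes the edit keeps running in the background after the timeout message is returned, which is why asyncio.shield is used. run_with_timeout and the returned dictionary shape are hypothetical.

```python
import asyncio
from typing import Any, Awaitable


async def run_with_timeout(work: Awaitable[Any], timeout_s: float = 30.0) -> dict:
    """Await `work`, but return a timeout notice instead of blocking the caller."""
    task = asyncio.ensure_future(work)
    try:
        # shield() means the timeout cancels only our wait, not the edit itself.
        result = await asyncio.wait_for(asyncio.shield(task), timeout=timeout_s)
        return {"status": "done", "result": result}
    except asyncio.TimeoutError:
        return {
            "status": "timeout",
            "message": "Edit is still running; check git diff in 30-60 seconds.",
        }
```

Because the underlying task is shielded, a caller that receives the timeout notice can check later for the completed change, matching the first two options listed above.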
search_tool
Purpose: Multi-mode code search (semantic, fuzzy, text, symbol)
Search Modes
Mode | Use Case | Speed |
---|---|---|
semantic | Find code by meaning/intent | Fast (<1s) |
fuzzy_symbol | Find functions/classes by approximate name | Very Fast |
text | Find exact text/regex patterns | Fast |
symbol_exact | Find symbols by exact name | Very Fast |
Example Usage
# Semantic search
search_tool(
query="functions that handle user authentication",
search_type="semantic",
max_results=10
)
# Fuzzy symbol search
search_tool(
query="usr_auth",
search_type="fuzzy_symbol",
symbol_type="function"
)
# Text search with regex
search_tool(
query="TODO:|FIXME:",
search_type="text",
use_regex=True,
file_pattern="*.py"
)
Performance
- Semantic search: <1 second for typical codebase
- Uses local all-MiniLM-L6-v2 embeddings
- FAISS vector indexing
- Memory footprint: ~100MB for medium projects
API Reference
Codebase MCP FastAPI server provides 40+ REST endpoints for direct integration.
Base URL
http://localhost:6789
API Categories
- Session Management: /session/*
- Memory System: /memory/*
- Git Operations: /git/*
- File Operations: /file/*, /write, /edit
- Search: /search/*
- Project Analysis: /project/*
- Health & Monitoring: /health, /logs
Authentication
Currently, the API runs locally without authentication. Future versions may add API key authentication for remote access.
Example API Calls
Health Check
curl http://localhost:6789/health
Search Code
curl -X POST http://localhost:6789/search \
-H "Content-Type: application/json" \
-d '{
"query": "user authentication",
"search_type": "semantic",
"max_results": 5
}'
Write File
curl -X POST http://localhost:6789/write \
-H "Content-Type: application/json" \
-d '{
"file_path": "test.py",
"content": "print(\"Hello World\")",
"purpose": "Test file"
}'
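The same calls can be made from Python with only the standard library. The sketch below mirrors the curl examples above; build_post and post_json are hypothetical helper names, and the shape of the JSON response is not documented here, so the result is returned as a plain dictionary without interpretation.

```python
import json
import urllib.request

BASE_URL = "http://localhost:6789"  # default server address from this README


def build_post(path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request to the local server."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: dict, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON body; requires a running server."""
    with urllib.request.urlopen(build_post(path, payload), timeout=timeout) as resp:
        return json.load(resp)


search_request = build_post(
    "/search",
    {"query": "user authentication", "search_type": "semantic", "max_results": 5},
)
print(search_request.get_method(), search_request.full_url)
```

Splitting request construction from sending keeps the payload easy to inspect or log before anything reaches the server.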
Key Endpoints
Session Endpoints
Endpoint | Method | Description |
---|---|---|
/session | POST | Create/manage sessions |
/session/current | GET | Get current session status |
/session/auto-commit | POST | Auto-commit changes |
Memory Endpoints
Endpoint | Method | Description |
---|---|---|
/memory/store | POST | Store new memory |
/memory/search | POST | Search memories |
/memory/context | GET | Get startup context |
/memory/stats | GET | Get statistics |
/memory/{id} | PUT | Update memory |
Search Endpoints
Endpoint | Method | Description |
---|---|---|
/search | POST | Semantic search |
/search/text | POST | Text/regex search |
/search/symbols | POST | Symbol search |
Architecture
System Overview
Codebase MCP uses a three-tier architecture designed for performance and separation of concerns:
┌─────────────────────┐
│   Claude Desktop    │  (MCP Client)
│  (or any MCP LLM)   │
└──────────┬──────────┘
           │ MCP Protocol
           ▼
┌─────────────────────┐
│     MCP Server      │  (Lightweight Proxy)
│   (mcp_server.py)   │
└──────────┬──────────┘
           │ HTTP/REST
           ▼
┌─────────────────────┐
│   FastAPI Server    │  (Processing Engine)
│      (main.py)      │
│                     │
│  ┌───────────────┐  │
│  │  API Routers  │  │  ← 40+ endpoints
│  ├───────────────┤  │
│  │  Code Tools   │  │  ← Write, Edit, Format
│  ├───────────────┤  │
│  │  Semantic     │  │  ← FAISS + Embeddings
│  │  Search       │  │
│  ├───────────────┤  │
│  │ Memory System │  │  ← SQLite + Vectors
│  ├───────────────┤  │
│  │  Git Manager  │  │  ← .codebase/ tracking
│  └───────────────┘  │
└─────────────────────┘
Component Breakdown
1. MCP Server (Proxy Layer)
- Lightweight HTTP client
- Converts MCP tool calls to FastAPI requests
- Handles error formatting for LLM consumption
- Keeps Claude Desktop responsive
- ~3,000 lines of Python
2. FastAPI Server (Processing Engine)
- 40+ REST API endpoints
- Async request handling
- Modular router architecture
- Comprehensive error handling
- Real-time logging
3. Code Tools
- Write Pipeline: Format → Dependency Check → Quality Score
- Edit Pipeline: Gemini API → Error Correction → Format
- Git Manager: Dual directory (.git + .codebase)
- Formatter: Black, Ruff (Python) / Prettier (JS/TS)
- Dependency Checker: AST parsing, conflict detection
4. Semantic Search Engine
- Embeddings: Local all-MiniLM-L6-v2 model
- Vector Store: FAISS indexing
- Metadata: SQLite database
- Chunking: AST-aware code splitting
- Performance: <1s search for typical projects
5. Memory System
- Storage: SQLite + FAISS vectors
- Categories: 8 knowledge types
- Persistence: Survives restarts
- Search: Semantic retrieval
- Importance: 1-5 priority levels
Data Flow
Write Operation Flow
1. Claude: "Create auth endpoint"
   ↓
2. MCP Server: Convert to write_tool call
   ↓
3. FastAPI: POST /write
   ↓
4. Write Pipeline:
   - Detect language
   - Format code (Black/Ruff/Prettier)
   - Check dependencies
   - Calculate quality score
   ↓
5. Quality ≥ 0.8? Auto-commit to .codebase
   ↓
6. Return: Results + quality metrics
   ↓
7. Claude: Show success + details
Semantic Search Flow
1. Claude: "Find auth functions"
   ↓
2. MCP Server: search_tool(query="auth functions")
   ↓
3. FastAPI: POST /search
   ↓
4. Semantic Search Engine:
   - Generate query embedding
   - FAISS vector search
   - Retrieve top K results
   - Enrich with metadata
   ↓
5. Return: Ranked results with context
   ↓
6. Claude: Format and present to user
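Step 4 of the search flow can be illustrated with a toy example. This is a simplified sketch, not the engine itself: a brute-force cosine scan stands in for the FAISS index, and the three-dimensional vectors are hand-written placeholders for real model embeddings. The names cosine, top_k, and the sample symbols are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], index: list[tuple[dict, list[float]]], k: int = 3):
    """Score every (metadata, vector) entry and return the k best matches."""
    scored = [(cosine(query_vec, vec), meta) for meta, vec in index]
    # Sort on the score only, so metadata dicts are never compared.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Placeholder index: metadata plus a made-up embedding per symbol.
index = [
    ({"symbol": "login", "file": "auth.py"}, [1.0, 0.0, 0.0]),
    ({"symbol": "parse_csv", "file": "io.py"}, [0.0, 1.0, 0.0]),
    ({"symbol": "verify_token", "file": "auth.py"}, [0.9, 0.1, 0.0]),
]

for score, meta in top_k([1.0, 0.0, 0.0], index, k=2):
    print(f"{meta['symbol']} ({meta['file']}): {score:.3f}")
```

A vector index such as FAISS replaces the linear scan with a faster lookup, but the inputs and outputs are the same: a query vector in, ranked entries with their metadata out.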
Storage Structure
Project Directory
your-project/
├── .git/                # Your personal git repo
├── .codebase/           # AI-tracked changes
│   ├── HEAD
│   ├── objects/
│   └── refs/
├── .data/               # Codebase MCP data
│   ├── metadata.db      # SQLite: symbols, files
│   └── vectors.faiss    # FAISS: embeddings
├── src/                 # Your source code
└── ...
Why Separate .codebase/?
- Clean separation of AI vs human changes
- Prevents accidental commits of AI experiments
- Easy rollback of AI-generated code
- You maintain full control of .git
- Transparent tracking of AI contributions
Performance Characteristics
Operation | Typical Time | Notes |
---|---|---|
Semantic Search | <1 second | For <20K lines |
Write Tool | 1-3 seconds | Includes formatting |
Edit Tool | 5-15 seconds | Gemini API call |
Memory Search | <0.5 seconds | Fast vector lookup |
Initial Indexing | ~30 seconds | For 10K lines |
Scalability
- Optimal: Projects under 20,000 lines
- Storage: ~100MB for medium project (embeddings + metadata)
- Memory: ~500MB RAM during operation
- Indexing: Linear scaling with codebase size
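To check whether a project falls inside the optimal range, a rough line count is enough. The sketch below is a standalone helper, not a Codebase MCP feature: the file suffixes and skipped directories are illustrative assumptions, and only the 20,000-line figure comes from the text above.

```python
from pathlib import Path

OPTIMAL_MAX_LINES = 20_000  # figure stated in the Scalability list
# Assumed choices for illustration; adjust to match your project.
SOURCE_SUFFIXES = {".py", ".js", ".jsx", ".ts", ".tsx"}
SKIP_DIRS = {".git", ".codebase", ".data", ".venv", "node_modules"}


def count_source_lines(root: str) -> int:
    """Count lines in source files under root, skipping non-project folders."""
    base = Path(root)
    total = 0
    for path in base.rglob("*"):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        # Only look at path components below root when deciding what to skip.
        if SKIP_DIRS.intersection(path.relative_to(base).parts):
            continue
        with path.open(encoding="utf-8", errors="ignore") as handle:
            total += sum(1 for _ in handle)
    return total


def within_optimal_range(root: str) -> bool:
    return count_source_lines(root) < OPTIMAL_MAX_LINES


if __name__ == "__main__":
    print(count_source_lines("."), within_optimal_range("."))
```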
Prompt Engineering
Codebase MCP includes an optimized system prompt for development workflows. The prompt is exposed as an MCP prompt accessible to Claude.
System Prompt Philosophy
The system prompt configures Claude as an "Elite Solo Software Developer" with complete autonomy over standard development tasks. Key principles:
- Autonomous Execution - Make decisions without asking for permission on standard tasks
- Context-Driven - Always gather context before acting (memory, git, codebase)
- Architecture-First - Plan before building, following SOLID principles
- Quality-Obsessed - Production-ready code on first attempt
- Memory-Driven - Learn from mistakes, never repeat them
Workflow Loop
The prompt enforces a 5-phase workflow:
GATHER → PLAN → BUILD → VERIFY → FINALIZE
Phase 1: GATHER Intelligence
- Load project memory (memory_tool operation="context")
- Check git status (git_tool operation="status")
- Search for related code (search_tool)
- Review past mistakes to avoid repetition
Phase 2: PLAN Architecture
- Design data flow: Request → Controller → Service → Repository
- Define DTOs for type-safe boundaries
- Consider scaling (10x, 100x traffic)
- Plan error handling and validation
- Start development session (session_tool operation="start")
Phase 3: BUILD Systematically
- Write production-quality code with types
- Follow project patterns and conventions
- Use write_tool for new files
- Use edit_tool for modifications
- Add comprehensive error handling
Phase 4: VERIFY Rigorously
- Quality score must be ≥ 0.8
- All dependencies resolved
- Type safety complete
- Error handling present
- Tests pass (if test suite exists)
Phase 5: FINALIZE & Learn
- Commit work (auto-commit if quality ≥ 0.8)
- Store learnings in memory
- Record mistakes with prevention strategies
- Provide clear summary of changes
Effective Prompts
Good Prompts (Action-Oriented)
# Clear goal, let Claude handle details
"Create a user authentication system with JWT tokens"
# Specific but not micromanaging
"Add rate limiting to the login endpoint using SlowAPI"
# Leverages context
"Refactor the database queries to use async properly"
# Trusts autonomy
"Fix the bugs in the payment processing module"
Avoid (Too Prescriptive)
# Don't micromanage implementation
"First read auth.py lines 10-50, then search for JWT,
then write a new function called validate_token_v2..."
# Don't ask for permission on standard tasks
"Can you please maybe consider possibly adding
some error handling if that's okay?"
# Don't specify tool usage
"Use the write_tool to create a file and then
use the git_tool to commit it..."
Domain-Specific Tips
Backend Development (FastAPI/Python)
- "Create a REST API endpoint for [feature]"
- "Add Pydantic validation to [model]"
- "Implement repository pattern for [entity]"
- "Add async database operations to [service]"
Frontend Development (React/TypeScript)
- "Build a React component for [feature] with TypeScript"
- "Add form validation using React Hook Form"
- "Create a custom hook for [functionality]"
- "Implement responsive design for [component]"
Refactoring
- "Refactor [file] to follow SOLID principles"
- "Extract reusable logic from [component]"
- "Improve type safety in [module]"
- "Add error boundaries to [components]"
Debugging
- "Fix the [specific error] in [module]"
- "Debug why [feature] isn't working correctly"
- "Investigate performance issues in [component]"
Accessing the System Prompt
The system prompt is available as an MCP prompt:
# In Claude, use the prompt
@system-prompt
# Or ask Claude to use it
"Use the system prompt to guide your development approach"
Best Practices
Session Management
- Start a session before making changes
- Use descriptive session names: feature-X, fix-Y
- Keep sessions focused on one logical unit of work
- Check session status regularly
- Don't leave sessions open indefinitely
- Don't mix unrelated changes in one session
Memory System
- Load context at the start of new chat sessions
- Store learnings immediately after discovering them
- Always record mistakes with prevention strategies
- Search memories before starting similar work
- Update importance levels as patterns prove useful
- Don't store trivial information (importance 1-2)
- Don't forget to use operation="context" in new sessions
Code Quality
- Let quality score guide whether to commit (≥ 0.8)
- Fix formatting/dependency issues before proceeding
- Add types everywhere (no any in TypeScript)
- Include error handling in all new code
- Write descriptive commit messages
- Don't ignore quality warnings
- Don't skip dependency checking
Search & Discovery
- Use semantic search for conceptual queries
- Use fuzzy symbol search for approximate names
- Use text search for exact patterns/TODOs
- Limit results to relevant files with file_pattern
- Don't search without purpose
- Don't ignore search results before writing similar code
AI-Assisted Editing
- Read file first to understand context
- Use clear, single-sentence instructions
- Mark unchanged code with // ... existing code ...
- Break complex edits into smaller parts
- Include context around edited sections
- Don't specify the entire file content
- Don't make multiple unrelated edits at once
- Don't omit existing code markers
Project Organization
- Keep projects under 20,000 lines for optimal performance
- Use meaningful file and directory names
- Follow consistent code organization patterns
- Separate concerns (API, business logic, data)
- Don't mix Python and JavaScript in same directory
- Don't create deeply nested structures
Performance Tips
- Reindex periodically if codebase changes significantly
- Use file_pattern to limit search scope
- Keep semantic search queries focused
- Monitor .data/ directory size
- Don't index generated/node_modules folders
- Don't run edit_tool on very large files
Examples
Example 1: Create New Feature
Scenario: Building User Authentication
# 1. Start session
"Start a development session for user authentication feature"
# 2. Search for existing patterns
"Find all authentication-related code in the codebase"
# 3. Create backend endpoint
"Create a FastAPI endpoint for user login with email/password validation"
# 4. Create frontend component
"Create a React login form component with TypeScript and form validation"
# 5. Test and commit
"Check the current session status and show me what's been changed"
# 6. Store learning
"Store this learning: User auth requires rate limiting on login attempts"
# 7. End session
"End the session and merge changes to main"
Example 2: Debug Existing Code
Scenario: Fixing Async Database Bug
# 1. Search for problem area
"Find all database query functions in the user service"
# 2. Analyze the issue
"Read the get_user_by_email function from user_service.py"
# 3. Fix the bug
"Edit user_service.py to make the database queries properly async"
# 4. Store mistake
"Store this mistake: Used sync DB calls in async endpoint.
Always use AsyncSession for database operations."
# 5. Verify fix
"Show git diff for user_service.py"
Example 3: Refactor Code
Scenario: Extract Reusable Validation Logic
# 1. Find duplicate code
"Search for email validation logic across the codebase"
# 2. Plan refactoring
"I want to extract email validation into a reusable utility.
Search for existing validation utilities."
# 3. Create utility
"Create a validation utility module with email, phone,
and password validation functions"
# 4. Update existing code
"Update the user registration endpoint to use the new
validation utility"
# 5. Store solution
"Store this solution: Centralized validation utilities
prevent code duplication and ensure consistency"
Example 4: Add Feature to Existing File
Scenario: Adding Rate Limiting
# 1. Read current implementation
"Show me the login endpoint in auth_routes.py"
# 2. Edit with AI assistance
"Edit auth_routes.py to add rate limiting using SlowAPI.
Limit login attempts to 5 per minute per IP address."
# 3. Check quality
"Show me the quality score and any warnings for auth_routes.py"
# 4. Commit if good
"If quality is above 0.8, commit the changes"
Example 5: Start New Project
Scenario: Create FastAPI Boilerplate
# 1. Create project structure
"Create a FastAPI project structure with:
- main.py (entry point)
- api/routes/ (routers)
- services/ (business logic)
- models/ (Pydantic models)
- config.py (configuration)
Include proper async patterns and dependency injection"
# 2. Store architecture decision
"Store this architecture: Using Clean Architecture with
FastAPI - separate routers, services, and data layers
for maintainability"
# 3. Create initial endpoint
"Create a health check endpoint at /health"
# 4. Set up session tracking
"Start a session called 'project-init' for this work"
Example 6: Complex Multi-Step Task
Scenario: Add Payment Processing
# 1. Start session
"Start a session for payment-processing feature"
# 2. Load context
"Load memory context about payment requirements"
# 3. Search existing integrations
"Search for any existing payment or API integration code"
# 4. Create service layer
"Create a payment service that integrates with Stripe:
- Initialize Stripe client
- Create payment intent
- Confirm payment
- Handle webhooks
Include proper error handling and logging"
# 5. Create API endpoints
"Create FastAPI endpoints for:
- POST /payments/create (create payment intent)
- POST /payments/confirm (confirm payment)
- POST /webhooks/stripe (handle Stripe webhooks)
Include Pydantic models and proper status codes"
# 6. Add frontend
"Create a React payment form component that:
- Collects card details
- Calls the payment API
- Shows loading and error states
- Handles success/failure"
# 7. Store learnings
"Store these learnings:
1. Always validate Stripe webhook signatures
2. Use idempotency keys for payment retries
3. Log all payment events for audit"
# 8. Review and commit
"Show session status and commit high-quality changes"
Troubleshooting
Common Issues
Issue: MCP Server Not Loading in Claude Desktop
Symptoms: Tools don't appear in Claude Desktop after restart
Solutions:
- Verify claude_desktop_config.json paths are absolute
- Check Python path points to virtual environment python.exe
- Ensure mcp_server.py path is correct
- Try restarting Claude Desktop completely (quit from system tray)
- Check Claude Desktop logs for errors
Issue: FastAPI Server Won't Start
Symptoms: Error when running python main.py
Solutions:
- Ensure virtual environment is activated
- Verify all dependencies installed: pip install -r requirements.txt
- Check if port 6789 is already in use
- Try a different port: python main.py /path/to/project --port 8000
- Check Python version: python --version (need 3.11+)
Issue: Gemini API Errors
Symptoms: Edit tool fails with API errors
Solutions:
- Verify GEMINI_API_KEY in .env file
- Check API key is valid at Google AI Studio
- Verify rate limits not exceeded (15 RPM, 250K TPM, 1K RPD)
- Wait if rate limited (tool has exponential backoff)
- For persistent issues, replace with local LLM
Issue: Edit Tool Times Out
Symptoms: Edit operations take >30 seconds
Solutions:
- Wait 30-60 seconds and check if edit completed
- Check git diff to see partial changes
- Break large edits into smaller chunks
- Edit smaller files (<500 lines)
- Use write_tool instead for major rewrites
Issue: Semantic Search Returns No Results
Symptoms: Search finds nothing despite code existing
Solutions:
- Check if .data/ directory exists and has files
- Reindex the codebase (restart FastAPI server)
- Verify file patterns in .gitignore don't exclude code
- Try different search query phrasing
- Use text search instead of semantic for exact matches
Issue: Quality Score Always Low
Symptoms: Write tool reports quality < 0.8
Solutions:
- Check formatting errors (Black/Ruff for Python)
- Ensure all imports are available
- Fix syntax errors
- Remove duplicate definitions
- Add missing dependencies to project
Issue: Session Branch Conflicts
Symptoms: Can't switch or merge session branches
Solutions:
- Check uncommitted changes: git_tool operation="status"
- Commit or stash changes before switching
- Use session_tool operation="list" to see all sessions
- Delete problematic session: session_tool operation="delete"
- Manually resolve conflicts in .codebase/ directory
Issue: Memory Not Persisting
Symptoms: Memories disappear after restart
Solutions:
- Check if .data/metadata.db exists and has data
- Verify file permissions on .data/ directory
- Ensure no errors when storing memories
- Check SQLite database integrity
Performance Issues
Slow Semantic Search
- Reduce max_results parameter
- Use file_pattern to limit scope
- Ensure project is under 20,000 lines
- Check system RAM availability
High Memory Usage
- Restart FastAPI server periodically
- Reduce embedding model size (if customized)
- Clear old memory entries
- Limit concurrent operations
Getting Help
- Check GitHub Issues
- Open a new issue with:
  - Operating system and Python version
  - Full error message
  - Steps to reproduce
  - Relevant logs from FastAPI server
- Include FastAPI server logs: http://localhost:6789/logs
Contributing
Codebase MCP is open source and welcomes contributions! Whether you're fixing bugs, adding features, or improving documentation, your help is appreciated.
Ways to Contribute
Report Bugs
- Check existing issues first
- Include Python version, OS, and full error messages
- Provide steps to reproduce
- Share relevant logs
Suggest Features
- Open an issue with [Feature Request] tag
- Explain the use case and benefits
- Discuss implementation approach
Improve Documentation
- Fix typos and unclear explanations
- Add examples and tutorials
- Translate documentation
- Create video guides
Submit Code
- Fork the repository
- Create a feature branch
- Write clean, well-documented code
- Add tests if applicable
- Submit pull request
Priority Areas
We're especially interested in contributions for:
1. Language Support
- Add Java, Go, Rust chunkers
- Improve JavaScript/TypeScript parsing
- Add more formatter integrations
2. Local LLM Integration
- Replace Gemini API with local models
- Add LLM provider options
- Optimize for CPU/GPU inference
3. Semantic Search Improvements
- Better chunking algorithms
- Improved embedding models
- Faster indexing
- Multi-language support
4. UI/UX Enhancements
- Web dashboard improvements
- Real-time progress indicators
- Better visualizations
- Mobile responsiveness
5. Testing & Quality
- Increase test coverage
- Add integration tests
- Performance benchmarks
- CI/CD improvements
Development Setup
# Fork and clone
git clone https://github.com/YOUR_USERNAME/codebase-mcp.git
cd codebase-mcp
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # or .venv\Scripts\activate on Windows
# Install dependencies + dev dependencies
pip install -r requirements.txt
pip install pytest pytest-asyncio black ruff mypy
# Run tests
pytest
# Run linters
black .
ruff check .
mypy .
Code Guidelines
- Follow existing code style (Black formatting)
- Add type hints to all functions
- Write docstrings for public APIs
- Keep functions focused and testable
- Add comments for complex logic
- Update documentation for new features
Pull Request Process
- Create a descriptive branch name: feature/add-java-support
- Write clear commit messages
- Ensure all tests pass
- Update README/docs if needed
- Submit PR with detailed description
- Respond to review feedback
Code of Conduct
We're committed to providing a welcoming and inclusive environment. Please be respectful and constructive in all interactions.
See CONTRIBUTING.md for full guidelines.
Development Setup
Detailed development environment setup for contributors.
Prerequisites
- Python 3.11+
- Git
- Virtual environment tool (venv or uv)
Full Development Environment
# Clone your fork
git clone https://github.com/YOUR_USERNAME/codebase-mcp.git
cd codebase-mcp
# Add upstream remote
git remote add upstream https://github.com/danyQe/codebase-mcp.git
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install all dependencies
pip install -r requirements.txt
# Install development tools
pip install pytest pytest-asyncio pytest-cov black ruff mypy pre-commit
# Install pre-commit hooks (optional)
pre-commit install
# Run full test suite
pytest --cov=. --cov-report=html
# Start dev server with auto-reload
python main.py /path/to/test/project --reload
Running Tests
# Run all tests
pytest
# Run specific test file
pytest tests/test_write_tool.py
# Run with coverage
pytest --cov=. --cov-report=html
# Run specific test
pytest tests/test_write_tool.py::test_quality_score
Code Quality Checks
# Format code
black .
# Check linting
ruff check .
# Type checking
mypy .
# Fix auto-fixable issues
ruff check --fix .
Roadmap
Current Version: v1.0.0-beta
Near Term (v1.1.0)
- Docker support for easier deployment
- More language support (Java, Go, Rust)
- Web dashboard improvements
- Performance optimizations
- Video tutorials and guides
Mid Term (v1.2.0 - v1.5.0)
- Local LLM integration (Ollama, LLaMA)
- Improved semantic search algorithms
- Comprehensive test suite
- Analytics and insights dashboard
- Plugin system for extensibility
Long Term (v2.0.0+)
- Optional cloud sync for memory
- Team collaboration features
- Enterprise support
- Multi-LLM orchestration
- Mobile app support
Want to contribute to the roadmap? Open an issue or join development!