
qdrant-loader

Description

Enterprise-ready vector database toolkit for building searchable knowledge bases from multiple data sources. Supports multi-project management, automatic ingestion from Confluence/JIRA/Git, intelligent file conversion (PDF/Office/images), and semantic search. Includes MCP server for seamless AI assistant integration.

README

QDrant Loader


📝 Changelog v1.0.0 - Latest improvements and bug fixes

A comprehensive toolkit for loading data into Qdrant vector database with advanced MCP server support for AI-powered development workflows.

🎯 What is QDrant Loader?

QDrant Loader is a data ingestion and retrieval system that collects content from multiple sources, processes and vectorizes it, then provides intelligent search capabilities through a Model Context Protocol (MCP) server for AI development tools.

Perfect for:

  • 🤖 AI-powered development with Cursor, Windsurf, and other MCP-compatible tools
  • 📚 Knowledge base creation from technical documentation
  • 🔍 Intelligent code assistance with contextual information
  • 🏢 Enterprise content integration from multiple data sources

📦 Packages

This monorepo contains three complementary packages:

qdrant-loader: Data ingestion and processing engine

Collects and vectorizes content from multiple sources into QDrant vector database.

Key Features:

  • Multi-source connectors: Git, Confluence (Cloud & Data Center), JIRA (Cloud & Data Center), Public Docs, Local Files
  • File conversion: PDF, Office docs (Word, Excel, PowerPoint), images, audio, EPUB, ZIP, and more using MarkItDown
  • Smart chunking: Modular chunking strategies with intelligent document processing and hierarchical context
  • Incremental updates: Change detection and efficient synchronization
  • Multi-project support: Organize sources into projects with shared collections
  • Provider-agnostic LLM: OpenAI, Azure OpenAI, Ollama, and custom endpoints with unified configuration
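Incremental updates generally work by comparing content fingerprints between runs. As a minimal illustration of hash-based change detection (the function and field names here are assumptions for the sketch, not qdrant-loader's actual internals):

```python
import hashlib

def fingerprint(content: str) -> str:
    """Return a stable SHA-256 fingerprint for a document's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()

def detect_changes(previous: dict, current: dict) -> dict:
    """Compare two {doc_id: content} snapshots and classify each document."""
    prev_hashes = {doc_id: fingerprint(text) for doc_id, text in previous.items()}
    changes = {"added": [], "updated": [], "unchanged": [], "deleted": []}
    for doc_id, text in current.items():
        if doc_id not in prev_hashes:
            changes["added"].append(doc_id)
        elif prev_hashes[doc_id] != fingerprint(text):
            changes["updated"].append(doc_id)
        else:
            changes["unchanged"].append(doc_id)
    # Documents present last run but missing now are treated as deleted.
    changes["deleted"] = [d for d in previous if d not in current]
    return changes
```

Only documents classified as added or updated would need re-chunking and re-embedding, which is what makes synchronization efficient.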

qdrant-loader-core: Core library and LLM abstraction layer

Provides the foundational components and provider-agnostic LLM interface used by other packages.

Key Features:

  • LLM Provider Abstraction: Unified interface for OpenAI, Azure OpenAI, Ollama, and custom endpoints
  • Configuration Management: Centralized settings and validation for LLM providers
  • Rate Limiting: Built-in rate limiting and request management
  • Error Handling: Robust error handling and retry mechanisms
  • Logging: Structured logging with configurable levels
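The idea behind such an abstraction layer is a small common interface that concrete providers implement, with cross-cutting concerns like retries wrapped around it. A rough sketch under that assumption (class and method names are hypothetical, not qdrant-loader-core's real API):

```python
from abc import ABC, abstractmethod

class LLMProvider(ABC):
    """Unified interface that each concrete provider implements."""

    @abstractmethod
    def embed(self, texts: list) -> list:
        """Return one embedding vector per input text."""

    @abstractmethod
    def chat(self, prompt: str) -> str:
        """Return a chat completion for the prompt."""

class EchoProvider(LLMProvider):
    """Stand-in provider so the sketch runs without network access."""

    def embed(self, texts):
        return [[float(len(t))] for t in texts]

    def chat(self, prompt):
        return f"echo: {prompt}"

def with_retry(fn, attempts=3):
    """Naive retry wrapper mirroring the error-handling bullet above."""
    last = None
    for _ in range(attempts):
        try:
            return fn()
        except Exception as exc:
            last = exc
    raise last
```

Calling code depends only on `LLMProvider`, so switching between OpenAI, Azure OpenAI, or Ollama becomes a configuration change rather than a code change.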

qdrant-loader-mcp-server: AI development integration layer

Model Context Protocol server providing search capabilities to AI development tools.

Key Features:

  • MCP Protocol 2025-06-18: Latest protocol compliance with dual transport support (stdio + HTTP)
  • Advanced search tools: Semantic search, hierarchy-aware search, attachment discovery, and conflict detection
  • Cross-document intelligence: Document similarity, clustering, relationship analysis, and knowledge graphs
  • Streaming capabilities: Server-Sent Events (SSE) for real-time search results
  • Production-ready: HTTP transport with security, session management, and health checks
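Over the stdio transport, MCP exchanges newline-delimited JSON-RPC 2.0 messages. As a rough illustration of what a `tools/call` request to a search tool could look like (the tool name and argument schema here are assumptions, not this server's documented contract):

```python
import json

def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 tools/call request, newline-delimited for stdio."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message) + "\n"

# Hypothetical search request an MCP client might write to the server's stdin.
request = build_tool_call(1, "search", {"query": "authentication", "limit": 5})
```

In practice the MCP client library inside Cursor or Windsurf handles this framing; the sketch only shows the wire shape.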

🚀 Quick Start

Installation

# Install both packages
pip install qdrant-loader qdrant-loader-mcp-server

# Or install individually
pip install qdrant-loader          # Data ingestion only
pip install qdrant-loader-mcp-server  # MCP server only

5-Minute Setup

  1. Create a workspace

    mkdir my-workspace && cd my-workspace
  2. Initialize workspace with templates

    qdrant-loader init --workspace .
  3. Configure your environment (edit .env)

    # Qdrant connection
    QDRANT_URL=http://localhost:6333
    QDRANT_COLLECTION_NAME=my_docs
    
    # LLM provider (new unified configuration)
    OPENAI_API_KEY=your_openai_key
    LLM_PROVIDER=openai
    LLM_BASE_URL=https://api.openai.com/v1
    LLM_EMBEDDING_MODEL=text-embedding-3-small
    LLM_CHAT_MODEL=gpt-4o-mini
  4. Configure data sources (edit config.yaml)

    global:
      qdrant:
        url: "http://localhost:6333"
        collection_name: "my_docs"
      llm:
        provider: "openai"
        base_url: "https://api.openai.com/v1"
        api_key: "${OPENAI_API_KEY}"
        models:
          embeddings: "text-embedding-3-small"
          chat: "gpt-4o-mini"
        embeddings:
          vector_size: 1536
    
    projects:
      my-project:
        project_id: "my-project"
        sources:
          git:
            docs-repo:
              base_url: "https://github.com/your-org/your-repo.git"
              branch: "main"
              file_types: ["*.md", "*.rst"]
  5. Load your data

    qdrant-loader ingest --workspace .
  6. Start the MCP server

    mcp-qdrant-loader --env /path/to/your/.env
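The `${OPENAI_API_KEY}` placeholder in the config.yaml of step 4 is expanded from the environment. A minimal sketch of that substitution pattern (qdrant-loader's exact expansion rules may differ):

```python
import os
import re

def expand_env(value, env=None):
    """Replace ${VAR} placeholders in a string with environment values."""
    env = dict(os.environ) if env is None else env

    def replace(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name!r} is not set")
        return env[name]

    return re.sub(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}", replace, value)
```

Keeping secrets in .env and referencing them via placeholders keeps config.yaml safe to commit to version control.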

🔧 Integration with Cursor

Add to your Cursor settings (.cursor/mcp.json):

{
  "mcpServers": {
    "qdrant-loader": {
      "command": "/path/to/venv/bin/mcp-qdrant-loader",
      "env": {
        "QDRANT_URL": "http://localhost:6333",
        "QDRANT_COLLECTION_NAME": "my_docs",
        "OPENAI_API_KEY": "your_key"
      }
    }
  }
}

Alternative: Use configuration file (recommended for complex setups):

{
  "mcpServers": {
    "qdrant-loader": {
      "command": "/path/to/venv/bin/mcp-qdrant-loader",
      "args": [
        "--config",
        "/path/to/your/config.yaml",
        "--env",
        "/path/to/your/.env"
      ]
    }
  }
}

Example queries in Cursor:

  • "Find documentation about authentication in our API"
  • "Show me examples of error handling patterns"
  • "What are the deployment requirements for this service?"
  • "Find all attachments related to database schema"

📚 Documentation

Getting Started

User Guides

⚠️ Migration Guide (v0.7.1+)

LLM Configuration Migration Required

  • New unified configuration: global.llm.* replaces legacy global.embedding.* and file_conversion.markitdown.*
  • Provider-agnostic: Now supports OpenAI, Azure OpenAI, Ollama, and custom endpoints
  • Legacy support: Old configuration still works but shows deprecation warnings
  • Action required: Update your config.yaml to use the new syntax (see examples above)
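Conceptually, the migration moves legacy settings under the unified key. A rough sketch of that mapping (top-level key names come from the bullets above; the nested legacy field names, such as `model`, are assumptions, and the real migration is handled by qdrant-loader itself):

```python
def migrate_config(config: dict) -> dict:
    """Move legacy global.embedding.* settings under the unified global.llm.* key."""
    migrated = dict(config)
    global_cfg = dict(migrated.get("global", {}))
    legacy = global_cfg.pop("embedding", None)
    if legacy is not None and "llm" not in global_cfg:
        global_cfg["llm"] = {
            "provider": legacy.get("provider", "openai"),
            "models": {"embeddings": legacy.get("model")},
        }
    migrated["global"] = global_cfg
    return migrated
```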

Migration Resources

Developer Resources

🤝 Contributing

We welcome contributions! See our Contributing Guide for:

  • Development environment setup
  • Code style and standards
  • Pull request process

Quick Development Setup

# Clone and setup
git clone https://github.com/martin-papy/qdrant-loader.git
cd qdrant-loader

# Sync workspace environment (recommended)
uv sync --all-packages --all-extras

# Add a new dependency during development
uv add fastapi
uv sync

📄 License

This project is licensed under the GNU GPLv3 - see the LICENSE file for details.


Ready to get started? Check out our Quick Start Guide or browse the complete documentation.

Release History

qdrant-loader-v1.0.0 (urgency: High, 4/14/2026)

Added (qdrant-loader):

  • Contextual embeddings for enriched chunk context during ingestion [#221]

Fixed (qdrant-loader):

  • Jira Cloud connection failure due to deprecated search API endpoints [#215]
  • Duplicate chunks for Python files by rewriting AST parser [#217]
  • Duplicate document IDs causing missing chunks in metric tracking [#222]
  • Ingestion metrics: aligned size metrics and aggregated project results [#222]
  • JQL injection and query breaking when configuration values in

qdrant-loader-v0.9.0 (urgency: Medium, 3/27/2026)

Added (qdrant-loader):

  • `enable_semantic_analysis` global NLP kill switch in `chunking` config to skip spaCy/LDA processing entirely for faster ingestion [#189]
  • `enable_enhanced_semantic_analysis` opt-in flag in `chunking` config to gate advanced NLP fields (`pos_tags`, `dependencies`, `document_similarity`) [#195]

Added (qdrant-loader-mcp-server):

  • `expand_chunk_context` MCP tool to retrieve surrounding chunks for richer context around a specific chunk [#185]
  • `cluster_session_id` re


Similar Packages

  • YAML-Multi-Agent-Orchestrator 🤖 Define and execute multi-agent AI workflows declaratively using YAML, simplifying orchestration and enhancing collaboration through automatic context handling. (main@2026-04-21)
  • mnemos-mcp 🧠 Transform documentation chaos into a structured memory system with Mnemos, your self-hosted, multi-context knowledge server for developers. (main@2026-04-21)
  • git-notes-memory 🧠 Store and search your notes effectively with Git-native memory storage, enhancing productivity for Claude Code users. (main@2026-04-21)
  • a-mem-mcp-server 🧠 Enhance LLM agents with an agentic memory system, featuring automatic note construction, dynamic memory updates, and intelligent semantic retrieval. (main@2026-04-21)
  • Code2MCP 🚀 Transform existing codebases into MCP services with ease using Code2MCP's intelligent automation and minimal intrusion design. (main@2026-04-21)