
kg-rag



Description

This project implements a comprehensive framework for Knowledge Graph Retrieval Augmented Generation (KG-RAG). It focuses on financial data from SEC 10-Q filings and explores how knowledge graphs can improve information retrieval and question answering compared to baseline approaches.

README

KG-RAG: Knowledge Graph-based Retrieval Augmented Generation

This repository contains a collection of implementations for Knowledge Graph-based RAG (Retrieval Augmented Generation) approaches and baseline methods for comparison. The code is structured as a Python package with modular components.

Overview

The repository implements several RAG approaches:

  1. Baseline approaches:
    • Standard RAG: Traditional retrieval-based approach using vector similarity
    • Chain-of-Thought RAG: Enhanced retrieval with explicit reasoning steps

  2. KG-RAG approaches:
    • Entity-based approach: Uses embedding-based entity matching and beam search to find relevant information in the knowledge graph
    • Cypher-based approach: Uses Cypher queries to retrieve information from a Neo4j graph database
    • GraphRAG-based approach: Implements a community detection and hierarchical search strategy
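
The "vector similarity" retrieval behind the Standard RAG baseline can be sketched in a few lines. This is only an illustration under stated assumptions, not the repository's code: the function names, the chunk ids, and the three-dimensional vectors are made up, and a real pipeline would use an embedding model and a vector store instead of hand-written lists.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve_top_k(query_vec, chunk_vecs, k=2):
    """Return the ids of the k chunks most similar to the query, best first."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id)
              for chunk_id, vec in chunk_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk_id for _, chunk_id in scored[:k]]

# Toy "embeddings" (hypothetical values, for illustration only).
chunks = {
    "revenue_note": [0.9, 0.1, 0.0],
    "risk_factors": [0.1, 0.9, 0.2],
    "legal_proceedings": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

top = retrieve_top_k(query, chunks, k=2)
print(top)
```

The retrieved chunks would then be placed in the prompt sent to the language model; the KG-RAG approaches replace or complement this step with graph traversal.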

Installation

Using uv (Recommended)

This project uses uv for dependency management.

# Clone the repository
git clone https://github.com/yourusername/kg-rag.git
cd kg-rag

# Install uv if you don't have it
curl -LsSf https://astral.sh/uv/install.sh | sh

# Create the virtual environment and install dependencies
uv sync
source .venv/bin/activate

For development, you can install the dev dependencies:

uv sync --dev
source .venv/bin/activate

Environment Variables

Export the following environment variables:

export OPENAI_API_KEY=your_openai_api_key

For the Cypher-based approach, also add:

export NEO4J_URI=bolt://localhost:7687
export NEO4J_USER=neo4j
export NEO4J_PASSWORD=your_password
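
Before running any of the scripts, it can help to confirm that the variables are actually visible to Python. The check below is a minimal sketch, not part of the package: `missing_variables` and the two tuples are hypothetical helpers, and only the variable names come from this README.

```python
import os

# Variable names taken from this README; the helper itself is hypothetical.
REQUIRED_ALWAYS = ("OPENAI_API_KEY",)
REQUIRED_FOR_CYPHER = ("NEO4J_URI", "NEO4J_USER", "NEO4J_PASSWORD")

def missing_variables(env, use_cypher=False):
    """Return the required variable names that are unset or empty in `env`."""
    required = REQUIRED_ALWAYS + (REQUIRED_FOR_CYPHER if use_cypher else ())
    return [name for name in required if not env.get(name)]

# Check the current process environment, including the Neo4j settings.
missing = missing_variables(os.environ, use_cypher=True)
print("Missing variables:", ", ".join(missing) if missing else "none")
```

Passing a plain dictionary instead of `os.environ` makes the same function easy to exercise in a test.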

Usage

1. Building Vector Store for Baseline Methods

First, build a vector store for the baseline RAG methods:

python -m scripts.build_baseline_vectordb \
    --docs-dir data/sec-10-q/docs \
    --collection-name sec_10q \
    --persist-dir chroma_db \
    --verbose

2. Building Knowledge Graphs

Build a knowledge graph for KG-RAG methods:

python -m scripts.build_entity_graph \
    --docs-dir data/sec-10-q/docs \
    --output-dir data/graphs \
    --graph-name sec10q_entity_graph \
    --verbose
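
To give a feel for what a pickled entity graph could contain, here is a rough sketch that turns (subject, relation, object) triples into an adjacency map and round-trips it through `pickle`. The structure of the file written by `build_entity_graph` is not described in this README, so the dictionary layout, the `build_graph` helper, and the example triples are all assumptions made for illustration.

```python
import pickle
from collections import defaultdict

def build_graph(triples):
    """Build an adjacency map: subject -> list of (relation, object) edges."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
        graph.setdefault(obj, [])  # leaf entities should still appear as nodes
    return dict(graph)

# Made-up triples standing in for facts extracted from a 10-Q filing.
triples = [
    ("ExampleCorp", "FILED", "10-Q 2023-Q3"),
    ("10-Q 2023-Q3", "REPORTS", "Total revenue"),
    ("ExampleCorp", "HAS_SEGMENT", "Cloud services"),
]

graph = build_graph(triples)

# Serialize and restore in memory, as a stand-in for writing a .pkl file.
restored = pickle.loads(pickle.dumps(graph))
print(sorted(restored))
print(restored["ExampleCorp"])
```

Only load pickle files that you built yourself or obtained from a trusted source, since unpickling can execute arbitrary code.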

3. Running Interactive Query Mode

To interactively query using baseline methods:

python -m scripts.run_baseline_rag \
    --collection-name sec_10q \
    --persist-dir chroma_db \
    --model gpt-4o \
    --verbose

To interactively query using the entity-based KG-RAG method:

python -m scripts.run_entity_rag \
    --graph-path data/graphs/sec10q_entity_graph.pkl \
    --beam-width 10 \
    --max-depth 8 \
    --top-k 100 \
    --verbose
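
The flags above suggest a beam search over the graph. The sketch below shows one plausible reading of them and is not the package's implementation: `beam_width` caps how many paths survive each hop, `max_depth` caps the number of hops, and `top_k` caps how many results are returned. The toy graph, the relevance scores, and the averaging rule are assumptions.

```python
def beam_search(graph, scores, seeds, beam_width=2, max_depth=2, top_k=3):
    """Explore `graph` from `seeds`, keeping the `beam_width` best paths per hop.

    graph:  dict mapping node -> list of neighbour nodes
    scores: dict mapping node -> relevance to the question (higher is better)
    Returns up to `top_k` (path_score, path) pairs, best first.
    """
    def path_score(path):
        # Average node relevance, so longer paths are not favoured automatically.
        return sum(scores.get(node, 0.0) for node in path) / len(path)

    beam = [[seed] for seed in seeds]
    candidates = list(beam)
    for _ in range(max_depth):
        expanded = [
            path + [nxt]
            for path in beam
            for nxt in graph.get(path[-1], [])
            if nxt not in path  # avoid cycles
        ]
        if not expanded:
            break
        expanded.sort(key=path_score, reverse=True)
        beam = expanded[:beam_width]  # prune to the beam width
        candidates.extend(beam)
    candidates.sort(key=path_score, reverse=True)
    return [(round(path_score(p), 3), p) for p in candidates[:top_k]]

# Toy graph and hypothetical per-node relevance scores.
graph = {
    "ExampleCorp": ["10-Q", "Board"],
    "10-Q": ["Revenue", "Risk factors"],
    "Board": ["Chair"],
}
scores = {"ExampleCorp": 0.5, "10-Q": 0.6, "Board": 0.1,
          "Revenue": 0.9, "Risk factors": 0.2, "Chair": 0.1}

results = beam_search(graph, scores, seeds=["ExampleCorp"])
for score, path in results:
    print(score, " -> ".join(path))
```

In an embedding-based setup, the per-node scores would come from the similarity between the question and each entity rather than from a fixed table. Larger beam widths and depths explore more of the graph at the cost of more scoring work per query.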

4. Running Evaluation

To evaluate the performance of various RAG methods on a test dataset:

python -m kg_rag.evaluation.run_evaluation \
    --data-path data/test_questions.csv \
    --graph-path data/graphs/sec10q_entity_graph.pkl \
    --method all \
    --output-dir evaluation_results \
    --collection-name sec_10q \
    --persist-dir chroma_db \
    --max-samples 50 \
    --verbose
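
The metrics computed by the evaluation module are not listed here. As a rough illustration of how answers might be scored against a question file, the sketch below computes a normalized exact-match accuracy. The `question` and `answer` column names, the normalization rules, and the lookup table standing in for a RAG pipeline are all assumptions.

```python
import csv
import io
import re

def normalize(text):
    """Lowercase, drop punctuation other than '.', '%' and '-', collapse whitespace."""
    text = re.sub(r"[^\w\s.%-]", "", text.lower())
    return " ".join(text.split())

def exact_match_accuracy(rows, predict):
    """Fraction of rows whose normalized prediction equals the normalized answer."""
    rows = list(rows)
    if not rows:
        return 0.0
    hits = sum(
        normalize(predict(row["question"])) == normalize(row["answer"])
        for row in rows
    )
    return hits / len(rows)

# Hypothetical CSV layout with `question` and `answer` columns.
sample_csv = io.StringIO(
    "question,answer\n"
    "What was Q3 revenue?,$1.2 billion\n"
    "Who is the CFO?,Jane Doe\n"
)
rows = list(csv.DictReader(sample_csv))

# Stand-in for a RAG pipeline: a lookup table with one wrong answer.
canned = {"What was Q3 revenue?": "1.2 Billion", "Who is the CFO?": "John Roe"}

accuracy = exact_match_accuracy(rows, canned.get)
print(accuracy)
```

Exact match is strict for free-text answers, so a real evaluation of financial questions may also need numeric tolerance or a model-based judge.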

5. Running Hyperparameter Search

To find the optimal hyperparameters for a method:

python -m kg_rag.evaluation.hyperparameter_search \
    --data-path data/test_questions.csv \
    --graph-path data/graphs/sec10q_entity_graph.pkl \
    --method entity \
    --configs-path kg_rag/evaluation/hyperparameter_configs.json \
    --output-dir hyperparameter_search \
    --max-samples 10 \
    --verbose
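
The schema of `hyperparameter_configs.json` is not shown in this README. Assuming it maps each parameter name to a list of candidate values, a grid search would enumerate every combination, roughly as sketched below. The `expand_grid` helper and the example values are hypothetical; the parameter names simply mirror the `run_entity_rag` flags.

```python
import itertools
import json

def expand_grid(search_space):
    """Yield one dict per combination of the candidate values in `search_space`."""
    names = sorted(search_space)
    for values in itertools.product(*(search_space[name] for name in names)):
        yield dict(zip(names, values))

# Hypothetical config: parameter name -> list of values to try.
config_json = '{"beam_width": [5, 10], "max_depth": [4, 8], "top_k": [100]}'

grid = list(expand_grid(json.loads(config_json)))
print(len(grid))
for params in grid:
    print(params)
```

Each combination costs one full evaluation run, which is why a small `--max-samples` value keeps the search affordable.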

Development

Pre-commit hooks

This project uses pre-commit hooks to ensure code quality:

# Run pre-commit hooks on all files
pre-commit run --all-files

Running tests

# Run tests
pytest

# Run tests with coverage
pytest --cov=kg_rag tests/

Release History

Version          Changes                            Urgency  Date
main@2026-06-03  Latest activity on main branch     High     6/3/2026
0.0.0            No release found; using repo HEAD  High     4/9/2026
