fastRAG

Efficient Retrieval Augmentation and Generation Framework

README

THIS PROJECT IS ARCHIVED

Intel will not provide or guarantee development of or support for this project, including but not limited to, maintenance, bug fixes, new releases or updates.
Patches to this project are no longer accepted by Intel.
If you have an ongoing need to use this project, are interested in independently developing it, or would like to maintain patches for the community, please create your own fork of the project.


Build and explore efficient retrieval-augmented generative models and applications

📍 Installation • 🚀 Components • 📚 Examples • 🚗 Getting Started • 💊 Demos • ✏️ Scripts • 📊 Benchmarks

fastRAG is a research framework for efficient and optimized retrieval augmented generative pipelines, incorporating state-of-the-art LLMs and Information Retrieval. fastRAG is designed to empower researchers and developers with a comprehensive tool-set for advancing retrieval augmented generation.

Comments, suggestions, issues and pull requests are welcome! ❤️

Important

Now compatible with Haystack v2+. Please report any possible issues you find.

📣 Updates

  • 2024-05: fastRAG V3 is Haystack 2.0 compatible 🔥
  • 2023-12: Gaudi2 and ONNX runtime support; Optimized Embedding models; Multi-modality and Chat demos; REPLUG text generation.
  • 2023-06: ColBERT index modification: adding/removing documents; see IndexUpdater.
  • 2023-05: RAG with LLM and dynamic prompt synthesis example.
  • 2023-04: Qdrant DocumentStore support.

Key Features

  • Optimized RAG: Build RAG pipelines with SOTA efficient components for greater compute efficiency.
  • Optimized for Intel Hardware: Leverage Intel Extension for PyTorch (IPEX), 🤗 Optimum Intel and 🤗 Optimum-Habana to run as efficiently as possible on Intel® Xeon® Processors and Intel® Gaudi® AI accelerators.
  • Customizable: fastRAG is built using Haystack and HuggingFace. All of fastRAG's components are 100% Haystack compatible.

🚀 Components

For a brief overview of the components unique to fastRAG, refer to the Components Overview page.

LLM Backends

  • Intel Gaudi Accelerators: Running LLMs on Gaudi 2
  • ONNX Runtime: Running LLMs with optimized ONNX-runtime
  • OpenVINO: Running quantized LLMs using OpenVINO
  • Llama-CPP: Running RAG Pipelines with LLMs on a Llama CPP backend

Optimized Components

  • Embedders: Optimized int8 bi-encoders
  • Rankers: Optimized/sparse cross-encoders

RAG-efficient Components

  • ColBERT: Token-based late interaction
  • Fusion-in-Decoder (FiD): Generative multi-document encoder-decoder
  • REPLUG: Improved multi-document decoder
  • PLAID: Incredibly efficient indexing engine
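
To make the "token-based late interaction" idea behind ColBERT concrete, here is a minimal pure-Python sketch of MaxSim scoring: each query token embedding is matched to its most similar document token embedding, and those maxima are summed. This is an illustration only, not fastRAG's or ColBERT's actual implementation; the function names and the tiny 2-dimensional vectors are hypothetical.

```python
import math


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0:
        return list(vec)
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def maxsim_score(query_tokens, doc_tokens):
    """Late interaction: sum, over query tokens, of the best match in the document."""
    q = [normalize(v) for v in query_tokens]
    d = [normalize(v) for v in doc_tokens]
    if not d:
        return 0.0
    return sum(max(dot(qv, dv) for dv in d) for qv in q)


def rank(query_tokens, docs):
    """Return (name, score) pairs sorted from best to worst score."""
    scored = [(name, maxsim_score(query_tokens, toks)) for name, toks in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings: one vector per token.
query = [[1, 0], [0, 1]]
docs = {
    "doc_a": [[1, 0], [0, 1], [1, 1]],  # covers both query tokens
    "doc_b": [[1, 0]],                  # covers only the first query token
}

if __name__ == "__main__":
    for name, score in rank(query, docs):
        print(name, round(score, 3))
```

Because documents are encoded token by token independently of the query, their embeddings can be computed and indexed ahead of time; only the cheap max-and-sum step runs at query time.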

📍 Installation

Preliminary requirements:

  • Python 3.8 or higher.
  • PyTorch 2.0 or higher.
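
The two requirements above can be checked before installing. The following is a small sketch, not part of fastRAG; the helper names are hypothetical, and it assumes PyTorch is installed under the distribution name `torch`. It compares only major and minor version numbers.

```python
import sys
from importlib import metadata

MIN_PYTHON = (3, 8)
MIN_TORCH = (2, 0)


def parse_version(text):
    """Turn '2.1.0+cpu' into (2, 1); only major and minor are compared."""
    parts = []
    for piece in text.split("+")[0].split(".")[:2]:
        digits = "".join(ch for ch in piece if ch.isdigit())
        parts.append(int(digits) if digits else 0)
    while len(parts) < 2:
        parts.append(0)
    return tuple(parts)


def check_requirements(python_version, torch_version):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if tuple(python_version[:2]) < MIN_PYTHON:
        problems.append("Python 3.8 or higher is required")
    if torch_version is None:
        problems.append("PyTorch is not installed")
    elif parse_version(torch_version) < MIN_TORCH:
        problems.append("PyTorch 2.0 or higher is required")
    return problems


def installed_torch_version():
    """Look up the installed 'torch' distribution without importing it."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    issues = check_requirements(sys.version_info, installed_torch_version())
    print("\n".join(issues) if issues else "Requirements satisfied")
```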

To set up the software, install from pip or clone the project for the bleeding-edge updates. Run the following, preferably in a newly created virtual environment:

pip install fastrag

Extra Packages

There are additional dependencies that you can install based on your specific usage of fastRAG:

# Additional engines/components
pip install fastrag[intel]               # Intel optimized backend [Optimum-intel, IPEX]
pip install fastrag[openvino]            # Intel optimized backend using OpenVINO
pip install fastrag[elastic]             # Support for ElasticSearch store
pip install fastrag[qdrant]              # Support for Qdrant store
pip install fastrag[colbert]             # Support for ColBERT+PLAID; requires FAISS
pip install fastrag[faiss-cpu]           # CPU-based Faiss library
pip install fastrag[faiss-gpu]           # GPU-based Faiss library
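
Several extras can be requested in one install by separating them with commas inside the brackets, which is standard pip syntax. The sketch below builds such a command from the extras listed above; the helper names are hypothetical and not part of fastRAG. Passing the requirement as a single list element also avoids shell quoting issues with the square brackets.

```python
import sys

# Extras named in the list above.
KNOWN_EXTRAS = (
    "intel", "openvino", "elastic", "qdrant",
    "colbert", "faiss-cpu", "faiss-gpu",
)


def pip_requirement(extras):
    """Build a requirement string such as 'fastrag[colbert,faiss-cpu]'."""
    unknown = [name for name in extras if name not in KNOWN_EXTRAS]
    if unknown:
        raise ValueError("Unknown extras: " + ", ".join(unknown))
    ordered = []
    for name in extras:  # keep caller order, drop duplicates
        if name not in ordered:
            ordered.append(name)
    if not ordered:
        return "fastrag"
    return "fastrag[" + ",".join(ordered) + "]"


def pip_command(extras):
    """Return an argument list suitable for subprocess.run."""
    return [sys.executable, "-m", "pip", "install", pip_requirement(extras)]


if __name__ == "__main__":
    # ColBERT requires FAISS, so the two extras are requested together.
    print(" ".join(pip_command(["colbert", "faiss-cpu"])))
```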

To work with the latest version of fastRAG, clone the project and run the following command from the root of the clone:

pip install .

Development tools

pip install .[dev]

License

The code is licensed under the Apache 2.0 License.

Disclaimer

This is not an official Intel product.

Release History

v3.1.2 (11/25/2024, urgency: Low)

  • Updated IPEX embedder to work with new Haystack version (2.7) by @gadmarkovits in https://github.com/IntelLabs/fastRAG/pull/74
  • New contributor: @gadmarkovits made their first contribution in https://github.com/IntelLabs/fastRAG/pull/74
  • Full Changelog: https://github.com/IntelLabs/fastRAG/compare/v3.1.1...v3.1.2

v3.1.1 (11/24/2024, urgency: Low)

  • Relax dependencies, add Streaming Callback by @dnoliver in https://github.com/IntelLabs/fastRAG/pull/71
  • OpenVINO Serialization fix by @danielfleischer in https://github.com/IntelLabs/fastRAG/pull/73
  • New contributor: @dnoliver made their first contribution in https://github.com/IntelLabs/fastRAG/pull/71
  • Full Changelog: https://github.com/IntelLabs/fastRAG/compare/v3.1.0...v3.1.1

v3.1.0 (11/7/2024, urgency: Low)

  • Update llava.py by @mosheber in https://github.com/IntelLabs/fastRAG/pull/54
  • Remove indexing function by @mosheber in https://github.com/IntelLabs/fastRAG/pull/55
  • IPEX benchmarking fix by @peteriz in https://github.com/IntelLabs/fastRAG/pull/58
  • Removing Handlers with Phi3.5 Suppport by @mosheber in https://github.com/IntelLabs/fastRAG/pull/59
  • replaced list[str] with List[str] by @mosheber in https://github.com/IntelLabs/fastRAG/pull/67
  • Adding files for multi m [...]

v3.0.2 (7/9/2024, urgency: Low)

  • Fix IPEX embedders performance by @peteriz in https://github.com/IntelLabs/fastRAG/pull/52
  • Fix support python versions by @peteriz in https://github.com/IntelLabs/fastRAG/pull/53
  • Full Changelog: https://github.com/IntelLabs/fastRAG/compare/v3.0.1...v3.0.2

v3.0.1 (7/2/2024, urgency: Low)

  • Gaudi Generator by @mosheber in https://github.com/IntelLabs/fastRAG/pull/50
  • Adding pypi packaging support by @peteriz in https://github.com/IntelLabs/fastRAG/pull/51
  • Full Changelog: https://github.com/IntelLabs/fastRAG/compare/v3.0...v3.0.1

v3.0 (5/22/2024, urgency: Low)

Compatibility with Haystack v2

  • ⚡ All our classes are now compatible with 🤖 Haystack v2, including the example notebooks and yaml pipeline configurations.
  • 💻 We based our demos on the Chainlit (https://github.com/Chainlit/chainlit) UI library; examples include RAG chat with multi-modality! 🖼️

❤️ Feel free to report any issue, bug or question!

v2.0 (12/24/2023, urgency: Low)

fastRAG 2.0: Let's do RAG Efficiently 🔥

fastRAG 2.0 includes new highly-anticipated efficiency-oriented components, an updated chat-like demo experience with multi-modality and improvements to existing components. The library now utilizes efficient Intel optimizations using Intel extensions for PyTorch (IPEX) (https://github.com/intel/intel-extension-for-pytorch), 🤗 Optimum Intel (https://github.com/huggingface/optimum-intel) and 🤗 Optimum-Habana (https://github.com/hugging [...]

v1.3.0 (6/20/2023, urgency: Low)

  • ColBERT Upstream Updates by @danielfleischer in https://github.com/IntelLabs/fastRAG/pull/19
  • Full Changelog: https://github.com/IntelLabs/fastRAG/compare/v1.2.1...v1.3.0

v1.2.1 (6/13/2023, urgency: Low)

  • Update plaid_colbert_pipeline.ipynb by @mosheber in https://github.com/IntelLabs/fastRAG/pull/17
  • Update colbert.py by @mosheber in https://github.com/IntelLabs/fastRAG/pull/18
  • Full Changelog: https://github.com/IntelLabs/fastRAG/compare/v1.2.0...v1.2.1

v1.2.0 (5/21/2023, urgency: Low)

  • Release v1.2.0
