Agentic-RAG-R1

Agentic RAG R1 Framework via Reinforcement Learning

🤖 Agentic RAG-R1: Enhance Agentic RAG Reasoning Capacity via Reinforcement Learning 🚀

Introduction 🌟

Agentic RAG‑R1 is an open‑source initiative to build an Agentic Retrieval‑Augmented Generation (RAG) system by endowing a base language model with autonomous search & reasoning skills through reinforcement learning (currently using the GRPO algorithm).

What is Agentic RAG? 💡

Agentic RAG combines two powerful concepts:

Retrieval‑Augmented Generation (RAG): Combines generative power with on‑the‑fly retrieval from external knowledge bases, which helps keep answers factual and up to date.
Agentic AI: Gives the model the ability to decide when to retrieve, what to retrieve, and how to weave the retrieved evidence into its reasoning.

Architecture 🏗️

Our architecture is inspired by TC‑RAG and features an agent memory stack that orchestrates the full deliberation loop, supporting the following actions:

Plan (❌)
Reasoning (✅)
Backtrack (✅)
Summary (✅)
Tool Observation – wiki/document/knowledge‑graph search, etc. (✅)
Conclusion (✅)
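
To make the action list above concrete, here is a minimal sketch of a stack-based agent memory. It is an illustration only, not the project's implementation: the `Action` enum, the `MemoryStack` class, and the exact semantics of Backtrack (pop the latest entry) and Summary (collapse the stack into one entry) are assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Action(Enum):
    REASONING = "reasoning"
    BACKTRACK = "backtrack"
    SUMMARY = "summary"
    TOOL_OBSERVATION = "tool_observation"
    CONCLUSION = "conclusion"


@dataclass
class MemoryStack:
    """Hypothetical agent memory: a stack of (action, text) entries."""

    entries: list = field(default_factory=list)

    def apply(self, action: Action, text: str = "") -> None:
        if action is Action.BACKTRACK:
            # Assumed semantics: discard the most recent entry, if any.
            if self.entries:
                self.entries.pop()
        elif action is Action.SUMMARY:
            # Assumed semantics: collapse the whole stack into one summary.
            self.entries = [(Action.SUMMARY, text)]
        else:
            self.entries.append((action, text))

    @property
    def finished(self) -> bool:
        # The loop ends once a Conclusion sits on top of the stack.
        return bool(self.entries) and self.entries[-1][0] is Action.CONCLUSION


stack = MemoryStack()
stack.apply(Action.REASONING, "Need the drug's mechanism of action.")
stack.apply(Action.TOOL_OBSERVATION, "wiki: irrelevant page")
stack.apply(Action.BACKTRACK)  # drop the unhelpful observation
stack.apply(Action.TOOL_OBSERVATION, "wiki: relevant page")
stack.apply(Action.SUMMARY, "Mechanism found in retrieved page.")
stack.apply(Action.CONCLUSION, "Answer: B")
print([a.value for a, _ in stack.entries])  # ['summary', 'conclusion']
print(stack.finished)  # True
```

A stack keeps Backtrack cheap, since undoing the last step is a single pop, and Summary keeps the context short before the final answer.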

Training Strategy 🧠

Motivated by DeepSeek-R1, we apply GRPO (Group Relative Policy Optimization) to reinforce the agent's choice of reasoning steps and retrieval actions, boosting both search depth and answer quality.
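
As a rough illustration of the core idea in GRPO, the sketch below computes group-relative advantages: several rollouts are sampled for the same question, and each rollout's reward is normalized against the mean and standard deviation of its own group. The function name, the `eps` term, the use of the population standard deviation, and the reward values are assumptions for this example; the project's training code may differ.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each rollout's reward against its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when all rewards in the group are equal.
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four rollouts sampled for the same question, with made-up rewards.
advantages = group_relative_advantages([2.0, 1.0, 0.0, 1.0])
print([round(a, 3) for a in advantages])  # [1.414, 0.0, -1.414, 0.0]
```

Rollouts that score above their group's average receive a positive advantage and are reinforced, while below-average rollouts are discouraged, so no separate value model is needed.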

Rollout Generation 🔄

Installation 🛠️

We use conda to manage the environment. Follow these steps to set up:

conda create -n AgenticRAG python=3.11 -y
conda activate AgenticRAG 
pip install -r requirements.txt

Tools Environment (Optional) 🧰

Our search tool repository, ArtSearch, serves as the search engine and supports retrieval from Wikipedia. Follow the instructions in that repository to deploy a local instance of the search system.

Folder Structure 📁

.
├── ArtSearch                 # Search tool integration
├── checkpoints               # Model checkpoints
├── examples                  # Example use cases
├── experiments
│   ├── evaluation            # Evaluation scripts and results
│   └── training              # Training configurations
├── README.md
├── requirements.txt
├── script
│   ├── evaluation            # Evaluation scripts
│   ├── run_server.sh         # Server deployment script
│   └── training              # Training scripts
├── service
│   ├── chat_client.py        # Client for interacting with the model
│   └── chat_server.py        # Server for hosting the model
├── src
│   ├── config                # Configuration files
│   ├── data                  # Data processing utilities
│   ├── evaluation            # Evaluation metrics and tools
│   ├── models                # Model definitions
│   ├── train.py              # Main training script
│   └── utils                 # Utility functions

Quick Start ⚡

Follow the steps below to get up and running with Agentic RAG‑R1.

Before you start, rename the file ".env_format" to ".env" and fill in the required environment variables.
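
If you want to check that no variable was left blank after filling in the file, a small standard-library sketch like the following may help. The key names `EXAMPLE_SEARCH_URL` and `EXAMPLE_API_KEY` are hypothetical placeholders; the real keys are the ones listed in ".env_format", and the project may load the file differently.

```python
import os


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Hypothetical contents; in practice read the text of your own .env file.
sample = """
# search service
EXAMPLE_SEARCH_URL="http://localhost:8000"
EXAMPLE_API_KEY=
"""
env = parse_env(sample)
missing = [k for k, v in env.items() if not v]
print(missing)  # ['EXAMPLE_API_KEY']

# Export only the variables that actually have a value.
os.environ.update({k: v for k, v in env.items() if v})
```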

Training

Zero‑2 Mode

./script/training/train_zero2.sh

Zero‑3 Mode

./script/training/train_zero3.sh

Inference

Example Mode

Coming soon.

Server Mode

Launch the chat server:

./script/run_server.sh

Features ✨

LoRA Tuning Support 🔧: Fine-tune efficiently with Low-Rank Adaptation
Model Quantization Support 💻: Quantize models to NF4 and ..
Custom Agent Tools 🛠️: Integrate your own tools and personal RAG datasets
Distributed Training 🌐: Support for DeepSpeed ZeRO Stage 2 and ZeRO Stage 3
Efficient Resource Usage 💻: Support for models up to 32B parameters using only 2 A100 GPUs
Tool Calling Reward 🎯: Enhanced reward model that includes:
- Accuracy reward
- Format reward
- RAG accuracy reward using the RAGAS framework
The total reward is calculated as:

$$r_{total} = r_{accuracy} + r_{format} + r_{rag}$$
TCRAG Integration 🔗: Use TCRAG as the rollout generator
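
The total reward in the Tool Calling Reward item is an unweighted sum of three terms, which can be sketched as follows. This is a minimal illustration, not the project's reward code: the `RewardParts` class, the `<answer>` tag checked by `format_reward`, and the numeric values are assumptions, and the RAG term is a placeholder number standing in for a RAGAS-based score.

```python
from dataclasses import dataclass


@dataclass
class RewardParts:
    accuracy: float
    format: float
    rag: float

    def total(self) -> float:
        # Unweighted sum, matching r_total = r_accuracy + r_format + r_rag.
        return self.accuracy + self.format + self.rag


def format_reward(completion: str) -> float:
    """Hypothetical check: 1.0 if an <answer>...</answer> block is present."""
    start = completion.find("<answer>")
    end = completion.find("</answer>")
    return 1.0 if 0 <= start < end else 0.0


completion = "reasoning ... <answer>B</answer>"
# rag=0.5 is a made-up placeholder for a RAGAS-style score.
parts = RewardParts(accuracy=1.0, format=format_reward(completion), rag=0.5)
print(parts.total())  # 2.5
```

Keeping the three terms separate until the final sum makes it easy to log each one during training and see which component drives a change in the total.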

Results 📊

Experiment Log on Qwen 2.5-7B-Instruct

We have made our training logs publicly available at: SwanLab Training Log

Results on MedQA Test Set 🏥

Our Qwen 2.5-7B-Instruct model was evaluated on the MedQA test set using Qwen‑2.5‑72B as the judge:

Configuration	Format Accuracy	Answer Accuracy
Before fine-tuning	39%	84%
Before fine-tuning + search	56%	79%
After fine-tuning (200 steps) + search	92%	87%

Roadmap 🗺️

Add more tools
[Additional planned features]

Acknowledgements 🙏

The concept of Agentic-RAG-R1 is inspired by DeepSeek-R1 and TC-RAG. We sincerely thank these teams for their contributions to open-source research and development. This work was developed concurrently with Search-R1 and ReSearch.

Contributors 📝

Supervisors: Junfeng Zhao, Xu Chu, Yasha Wang

Affiliation: Key Laboratory of High Confidence Software Technologies (Peking University), School of Computer Science, Peking University, China

Citation 📝

If you use this work in your research, please cite:

@misc{Agentic_RAG_R1,
  title       = {Agentic RAG-R1: Enhance Agentic RAG Reasoning Capacity via Reinforcement Learning},
  author      = {Xinke Jiang and Jiaran Gao and Rihong Qiu and Zhixin Zhang and Wentao Zhang and Yue Fang and Hongxin Ding},
  year        = {2025},
  howpublished= {\url{https://github.com/jiangxinke/Agentic-RAG-R1}},
  note        = {GitHub repository},
}

License 📄

This project is licensed under the Apache License. See the LICENSE file for details.

Release History

Version	Changes	Urgency	Date
dev@2026-06-29	Latest activity on dev branch	High	6/29/2026
0.0.0	No release found — using repo HEAD	Low	2/16/2026
dev@2026-02-16	Latest activity on dev branch	Low	2/16/2026
