freshcrate

Search results for "grpo"

5 results found (Python)
llm-rl-environments-lil-course · main@2026-04-17 · 🌿 Growing · ⭐ 57

🌱 A little course on Reinforcement Learning Environments for evaluating and training Language Models

Agentic-RAG-R1 · 0.0.0 · 🌿 Growing · ⭐ 412

Agentic RAG R1 Framework via Reinforcement Learning

unsloth-buddy · main@2026-04-15 · 🌿 Growing · ⭐ 212

Zero-friction LLM fine-tuning skill for Claude Code, Gemini CLI & any ACP agent. Unsloth on NVIDIA · TRL+MPS/MLX on Apple Silicon. Automates env setup, LoRA training (SFT, DPO, GRPO, vision), post-hoc

AReaL · v1.0.3 · 🌿 Growing · ⭐ 5,017

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

OpenRA-RL · v0.4.1 · 🌱 Seedling · ⭐ 118

Open Framework for AI Agents to play Red Alert through Reinforcement Learning