Search results for "grpo"
5 results found (Python)
A little course on Reinforcement Learning Environments for evaluating and training Language Models
Agentic RAG R1 Framework via Reinforcement Learning
Zero-friction LLM fine-tuning skill for Claude Code, Gemini CLI & any ACP agent. Unsloth on NVIDIA · TRL+MPS/MLX on Apple Silicon. Automates env setup, LoRA training (SFT, DPO, GRPO, vision), post-hoc
Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.
