Democratizing Reinforcement Learning for LLMs
Updated Jan 9, 2026 - Python
MiroThinker is an open-source search agent model, built for tool-augmented reasoning and real-world information seeking, aiming to match the deep research experience of OpenAI Deep Research and Gemini Deep Research.
[EMNLP'25] s3 - ⚡ Efficient & Effective Search Agent Training via RL for RAG (RLVR for Search with Minimal Data)
[COLM’25] DeepRetrieval — 🔥 Training Search Agent by RLVR with Retrieval Outcome
An open-source showcase of an LLM-based workflow that automatically collects daily news, powered by the Agently AI application development framework.
8-puzzle solver using uninformed and informed search algorithms such as DFS, BFS, and A*.
A RAG-based AI agent for breast cancer that uses Qdrant to retrieve answers, filter relevance, and grow its knowledge base.
Official code implementation for my Ready Tensor publication: an AI agent that retrieves data from a Turkish-Islamic website and uses that data as alignment criteria when answering the user.
8 Puzzle Game Solver using A*, BFS and DFS
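For readers unfamiliar with the informed search these 8-puzzle solvers mention, here is a minimal sketch of A* with a Manhattan-distance heuristic. It is not taken from either repository. The goal layout, the tuple state encoding, and the function names (`manhattan`, `neighbors`, `solve`) are assumptions made for illustration.

```python
import heapq

# Assumed goal layout; 0 marks the blank square.
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)


def manhattan(state):
    """Sum of row and column distances of each tile from its goal square."""
    total = 0
    for idx, tile in enumerate(state):
        if tile == 0:
            continue  # the blank does not count toward the heuristic
        goal_idx = tile - 1
        total += abs(idx // 3 - goal_idx // 3) + abs(idx % 3 - goal_idx % 3)
    return total


def neighbors(state):
    """Yield every state reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, 3)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < 3 and 0 <= c < 3:
            swap = r * 3 + c
            nxt = list(state)
            nxt[blank], nxt[swap] = nxt[swap], nxt[blank]
            yield tuple(nxt)


def solve(start):
    """Return the list of states from start to GOAL, or None if unsolvable."""
    frontier = [(manhattan(start), 0, start)]  # entries are (f = g + h, g, state)
    parent = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, state = heapq.heappop(frontier)
        if state == GOAL:
            path = []
            while state is not None:
                path.append(state)
                state = parent[state]
            return path[::-1]
        if g > cost[state]:
            continue  # stale queue entry; a cheaper route was found later
        for nxt in neighbors(state):
            new_g = g + 1
            if new_g < cost.get(nxt, float("inf")):
                cost[nxt] = new_g
                parent[nxt] = state
                heapq.heappush(frontier, (new_g + manhattan(nxt), new_g, nxt))
    return None
```

Calling `solve((1, 2, 3, 4, 5, 6, 0, 7, 8))` returns the sequence of board states, including the start and the goal. Replacing the priority with `new_g` alone turns the same loop into uniform-cost search, which on unit-cost moves finds solutions of the same length as BFS.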