FareedKhan-dev/training-ai-agents

Training architecture for self-improving AI agents.

Score: 47 / 100 (Emerging)

Implements a multi-agent training pipeline using LangGraph and distributed training algorithms (supervised fine-tuning, PPO, and contextual bandits), with real-time observability via LangSmith and Weights & Biases. Agents collaborate through shared hierarchical state, exchange knowledge in parallel, and self-improve through dynamic reward systems that adapt to performance and task alignment. Training progresses from supervised fine-tuning (SFT) to reinforcement learning phases, and tracing hooks and logging adapters capture every interaction and learning step.
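To make the contextual-bandit idea concrete, here is a minimal, generic sketch of an epsilon-greedy contextual bandit that learns which agent to route a task type to. It is not taken from the repository: the class, the agent names (`coder_agent`, `research_agent`), the task types, and the 0/1 reward are all hypothetical, chosen only to illustrate the technique.

```python
import random
from collections import defaultdict


class EpsilonGreedyBandit:
    """Tabular epsilon-greedy contextual bandit (illustrative only)."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # Running mean reward and pull count per (context, action) pair.
        self.value = defaultdict(float)
        self.count = defaultdict(int)

    def select(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.value[(context, a)])

    def update(self, context, action, reward):
        # Incremental mean: v <- v + (r - v) / n
        key = (context, action)
        self.count[key] += 1
        self.value[key] += (reward - self.value[key]) / self.count[key]


# Hypothetical setup: two task types, each best served by a different agent.
BEST = {"code_task": "coder_agent", "search_task": "research_agent"}
bandit = EpsilonGreedyBandit(actions=["coder_agent", "research_agent"])

for _ in range(500):
    ctx = bandit.rng.choice(list(BEST))
    act = bandit.select(ctx)
    # Assumed reward: 1.0 when the matching agent is chosen, else 0.0.
    bandit.update(ctx, act, reward=1.0 if act == BEST[ctx] else 0.0)

# Greedy (no exploration) choice per context after training.
greedy = {
    ctx: max(bandit.actions, key=lambda a: bandit.value[(ctx, a)])
    for ctx in BEST
}
print(greedy)
```

A tabular estimate like this only works for a small, discrete set of contexts; a real pipeline would more likely use learned features and a reward signal richer than a fixed 0/1.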

No Package, No Dependents
Maintenance 6 / 25
Adoption 8 / 25
Maturity 13 / 25
Community 20 / 25
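
The four sub-scores listed above add up to the overall score. The short check below assumes the total is a plain unweighted sum of the four categories, which is consistent with the numbers shown here but is not stated as the scoring formula on this page.

```python
# Sub-scores as listed above, each out of 25.
subscores = {"Maintenance": 6, "Adoption": 8, "Maturity": 13, "Community": 20}
MAX_PER_CATEGORY = 25

# Assumption: the overall score is the unweighted sum of the categories.
total = sum(subscores.values())
max_total = MAX_PER_CATEGORY * len(subscores)

print(f"{total} / {max_total}")  # prints "47 / 100"
```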


Stars: 54
Forks: 24
Language: Jupyter Notebook
License: MIT
Last pushed: Nov 04, 2025
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/agents/FareedKhan-dev/training-ai-agents"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
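
The same request can be made from Python with only the standard library. This is a rough sketch: the URL pattern is copied from the curl example above, while the function names, the reading of the `agents` path segment as a category, and the assumption that the response body is JSON are illustrative guesses, not documented behaviour.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category: str, owner: str, repo: str) -> str:
    """Build the endpoint URL following the path pattern in the curl example."""
    parts = (category, owner, repo)
    return BASE_URL + "/" + "/".join(quote(p, safe="") for p in parts)


def fetch_quality(category: str, owner: str, repo: str, timeout: float = 10.0):
    """GET the endpoint and parse the body as JSON (JSON is an assumption)."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_quality("agents", "FareedKhan-dev", "training-ai-agents")` issues the same GET as the curl command. With the keyless limit of 100 requests/day, it is worth caching the result rather than re-fetching on every run.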