AgentLab and HelloAgents

AgentLab provides benchmarking and testing infrastructure for web agents, while HelloAgents is a lightweight, tutorial-based framework for agent development. The two are complementary and could be used together: HelloAgents serves as a starting point for building an agent, and AgentLab validates the resulting agents.

| | AgentLab | HelloAgents |
|---|---|---|
| Score | 78 (Verified) | 63 (Established) |
| Maintenance | 13/25 | 13/25 |
| Adoption | 16/25 | 10/25 |
| Maturity | 25/25 | 15/25 |
| Community | 24/25 | 25/25 |
| Stars | 530 | 793 |
| Forks | 108 | 215 |
| Downloads | 460 | (not listed) |
| Commits (30d) | 1 | 2 |
| Language | Python | Python |
| License | (not listed) | (not listed) |
| Risk flags | No risk flags | No Package, No Dependents |
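
The overall scores (78 and 63) appear to be the plain sum of the four category scores, each out of 25. The page does not state the aggregation rule, so the `overall` function below is an assumption that merely happens to match the listed numbers:

```python
# Category scores copied from the listing above; each category is out of 25.
scores = {
    "AgentLab": {"Maintenance": 13, "Adoption": 16, "Maturity": 25, "Community": 24},
    "HelloAgents": {"Maintenance": 13, "Adoption": 10, "Maturity": 15, "Community": 25},
}


def overall(categories: dict[str, int]) -> int:
    """Sum the four category scores (assumed aggregation rule, maximum 100)."""
    return sum(categories.values())


totals = {name: overall(cats) for name, cats in scores.items()}
print(totals)  # {'AgentLab': 78, 'HelloAgents': 63}
```

Under this reading, most of the 15-point gap comes from Maturity (25 vs 15) and Adoption (16 vs 10).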

About AgentLab

ServiceNow/AgentLab

AgentLab: An open-source framework for developing, testing, and benchmarking web agents on diverse tasks, designed for scalability and reproducibility.

Built on **BrowserGym** for standardized web task environments, AgentLab provides large-scale parallel experiment execution via Ray with unified LLM API support across OpenAI, Azure, OpenRouter, and self-hosted TGI backends. It supports 12+ benchmarks including WebArena, WorkArena, and VisualWebArena with configurable reproducibility features like task seeding and deterministic execution, enabling systematic ablation studies and agent comparisons across thousands of tasks.
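
To illustrate why task seeding matters for parallel experiments, here is a minimal, generic sketch of deriving a stable seed from a task's identity so that results do not depend on scheduling order. It is not AgentLab's or BrowserGym's API: `task_seed`, `run_task`, and the task names are hypothetical, and the "task" is just a seeded random draw standing in for a real episode.

```python
import hashlib
import random


def task_seed(benchmark: str, task_id: str, repeat: int, base_seed: int = 0) -> int:
    """Derive a stable 32-bit seed from the task identity (hypothetical helper)."""
    key = f"{base_seed}:{benchmark}:{task_id}:{repeat}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big")


def run_task(benchmark: str, task_id: str, repeat: int) -> float:
    """Stand-in for one episode: all randomness flows from the derived seed."""
    rng = random.Random(task_seed(benchmark, task_id, repeat))
    return rng.random()


# Hypothetical task list; a parallel scheduler may execute it in any order.
tasks = [("webarena", f"task-{i}", 0) for i in range(5)]

in_order = {t: run_task(*t) for t in tasks}
reversed_order = {t: run_task(*t) for t in reversed(tasks)}

# Same per-task results regardless of execution order.
print(in_order == reversed_order)  # True
```

Hashing the task identity, rather than drawing seeds from a shared generator, keeps each task's seed independent of how many workers run or which task finishes first.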

About HelloAgents

jjyaoao/HelloAgents

An agent framework based on the tutorial hello-agents

Implements 16 production-grade capabilities including ToolResponse protocol, context engineering (HistoryManager/TokenCounter), session persistence, sub-agent mechanisms via TaskTool, circuit breakers, and observability through TraceLogger. Built on OpenAI-compatible APIs with multi-provider support (OpenAI, Anthropic, Gemini, DeepSeek, local vLLM/Ollama) through three adapter patterns, offering Function Calling architecture across multiple agent types (SimpleAgent, ReActAgent, ReflectionAgent, PlanAndSolveAgent). Provides complete engineering infrastructure for complex multi-agent applications including streaming output (SSE), async lifecycles, optimistic locking for file operations, and decision logging.
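
For readers unfamiliar with the circuit-breaker pattern mentioned above, the following is a minimal, generic sketch of the idea. It is not HelloAgents' implementation or API: the class name, parameters, and thresholds are all assumptions chosen for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is skipped because the circuit is open."""


class CircuitBreaker:
    """Stop calling a failing function until a cool-down period has passed."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds
        self._clock = clock               # injectable clock, useful for tests
        self._failures = 0
        self._opened_at = None            # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; call skipped")
            # Cool-down elapsed: allow one trial call (half-open state).
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success closes the circuit fully
        return result
```

A tool or LLM call would be wrapped as `breaker.call(some_function, arg)`. After `max_failures` consecutive errors, further calls fail fast with `CircuitOpenError` until the cool-down elapses, at which point a single trial call decides whether the circuit closes again.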

Scores updated daily from GitHub, PyPI, and npm data.