witanlabs/research-log
How we built Witan - four months of engineering an LLM spreadsheet agent
Uses a multi-agent architecture with a REPL-based execution layer instead of discrete tool calls, so agents can write and execute inline scripts that compose operations more efficiently than sequential API requests. Implements structured reasoning workflows (disambiguate → define end state → plan → execute → verify) paired with domain-specific financial knowledge packaged as reusable prompt skills, which proved more impactful than tool improvements alone. Relies on programmatic benchmarking with pixel, layout, and sequence comparison rather than LLM-as-judge evaluation to detect regressions reliably and guide development. The evaluation framework itself became critical to discovering architectural breakthroughs (74%→92% accuracy).
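The idea of programmatic, judge-free evaluation can be illustrated with a minimal Python sketch. It is not taken from the Witan codebase: the grid representation, the function names (`flatten`, `sequence_score`, `is_regression`), and the tolerance value are all assumptions made for illustration. It covers only a sequence-style comparison of cell values, not the pixel or layout comparisons mentioned in the description.

```python
from difflib import SequenceMatcher


def flatten(grid):
    """Turn a 2D grid of cell values into a flat list of (row, col, value) tuples.

    Hypothetical representation: a sheet is a list of rows, each a list of cells.
    Empty cells (None) are skipped.
    """
    return [
        (r, c, str(value))
        for r, row in enumerate(grid)
        for c, value in enumerate(row)
        if value is not None
    ]


def sequence_score(expected, actual):
    """Deterministic similarity in [0, 1] between two grids.

    Uses difflib.SequenceMatcher on the flattened cells, so the same inputs
    always produce the same score (unlike a sampled LLM judgment).
    """
    matcher = SequenceMatcher(None, flatten(expected), flatten(actual), autojunk=False)
    return matcher.ratio()


def is_regression(baseline_score, new_score, tolerance=0.01):
    """Flag a run whose score drops below the baseline by more than `tolerance`."""
    return new_score < baseline_score - tolerance


# Hypothetical reference output and two agent outputs.
expected = [["Revenue", 100], ["Cost", 40], ["Profit", 60]]
actual_good = [["Revenue", 100], ["Cost", 40], ["Profit", 60]]
actual_bad = [["Revenue", 100], ["Cost", 40], ["Profit", 50]]

score_good = sequence_score(expected, actual_good)
score_bad = sequence_score(expected, actual_bad)

print(f"good run: {score_good:.3f}")
print(f"bad run:  {score_bad:.3f}")
print("regression detected:", is_regression(score_good, score_bad))
```

Because the score is a pure function of the two grids, a drop between runs can be attributed to a change in the agent rather than to evaluator noise, which is the property the description credits for catching regressions.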
Stars: 52
Forks: 1
Language: not listed
License: not listed
Category: not listed
Last pushed: Mar 02, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/witanlabs/research-log"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000/day.
Higher-rated alternatives
unicodeveloper/globalthreatmap
Global threat map. Learn wars, conflicts, military bases and history of nations.
aiagenta2z/ai-agent-marketplace
AI Agent Marketplace | AI Agent Directory | AI Agent Index Repo for Public Available AI Agents Community
tum-ewk/lemlab
an open-source tool for the multi-agent-based development and testing of local energy market applications
AudtheiaOfficial/audtheia-environmental-monitoring
AI-powered environmental monitoring system for marine and terrestrial ecosystems using computer...
lhg96/smartEMS-MultiAgent-Demo
Multi-AI Agent Energy Management System with HILS simulation, Hybrid AI (ML+LLM), and MCP...