witanlabs/research-log

How we built Witan - four months of engineering an LLM spreadsheet agent

Experimental

Uses a multi-agent architecture with a REPL-based execution layer instead of discrete tool calls, so agents write and execute inline scripts that compose operations more efficiently than sequential API requests. Implements a structured reasoning workflow (disambiguate → define end state → plan → execute → verify) paired with domain-specific financial knowledge packaged as reusable prompt skills, which proved more impactful than tool improvements alone. Uses programmatic benchmarking with pixel, layout, and sequence comparison rather than LLM-as-judge evaluation to detect regressions reliably and guide development. The evaluation framework itself became critical to discovering architectural breakthroughs (74% → 92% accuracy).
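To make the REPL-versus-tool-call idea concrete, here is a minimal Python sketch, not Witan's actual implementation. It assumes a spreadsheet can be modeled as a plain dict of cell addresses. `ReplSession` and `cell_accuracy` are hypothetical names invented for illustration. The scoring function is an exact cell-by-cell comparison, a simpler stand-in for the pixel, layout, and sequence comparison described above.

```python
import contextlib
import io


class ReplSession:
    """Hypothetical REPL layer: one persistent namespace shared across scripts."""

    def __init__(self, sheet):
        # Assumption: the sheet is a dict mapping cell addresses to values.
        self.namespace = {"sheet": sheet}

    def run(self, script):
        """Execute an agent-written script and return whatever it printed."""
        buffer = io.StringIO()
        with contextlib.redirect_stdout(buffer):
            exec(script, self.namespace)
        return buffer.getvalue()


def cell_accuracy(actual, expected):
    """Fraction of expected cells whose value matches exactly (no LLM judge)."""
    if not expected:
        return 1.0
    hits = sum(1 for addr, value in expected.items() if actual.get(addr) == value)
    return hits / len(expected)


sheet = {"A1": 10, "A2": 20, "A3": 30}
session = ReplSession(sheet)

# One script composes the read, compute, and write steps that would
# otherwise each be a separate tool call.
script = """
total = sum(sheet[f"A{row}"] for row in range(1, 4))
sheet["A4"] = total
print(total)
"""
output = session.run(script)

# Deterministic check against a known-good end state.
expected = {"A1": 10, "A2": 20, "A3": 30, "A4": 60}
score = cell_accuracy(sheet, expected)
```

Because the namespace persists between `run` calls, a later script can reuse `total` without recomputing it. A real system would presumably run such scripts in a sandbox rather than a bare `exec`.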

No License No Package No Dependents

Maintenance 10 / 25

Adoption 8 / 25

Maturity 3 / 25

Community 3 / 25
