memvid/maw
Crawl any website into a single searchable file. Query it forever, offline.
The crawler picks its strategy automatically, without manual configuration: it starts with fast HTTP fetches, falls back to Playwright browser rendering, and then switches to stealth mode for protected sites. Crawled content is stored in `.mv2` files (memvid's document database format), which support both BM25 keyword search and optional semantic embeddings via OpenAI. The CLI includes commands for searching, AI-powered Q&A, and exporting to markdown/JSON. It can crawl documentation sites, blogs, local codebases, and git repositories, with configurable depth and rate limiting.
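The escalation from cheap fetches to heavier ones can be pictured as an ordered fallback chain. The following is a minimal sketch of that general pattern, not the project's actual implementation: the `Strategy` type, `crawlWithFallback`, the `isUsable` check, and the stub strategies are all hypothetical names introduced for illustration, and the stubs stand in for real HTTP, Playwright, and stealth fetchers.

```typescript
// Hypothetical types for illustration; not taken from the project's source.
type FetchResult = { html: string; strategy: string };
type Strategy = { name: string; fetch: (url: string) => Promise<string> };

// Try each strategy in order, escalating only when the cheaper one throws
// or returns content that the isUsable check rejects.
async function crawlWithFallback(
  url: string,
  strategies: Strategy[],
  isUsable: (html: string) => boolean = (html) => html.trim().length > 0,
): Promise<FetchResult> {
  const errors: string[] = [];
  for (const strategy of strategies) {
    try {
      const html = await strategy.fetch(url);
      if (isUsable(html)) {
        return { html, strategy: strategy.name };
      }
      errors.push(`${strategy.name}: unusable content`);
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      errors.push(`${strategy.name}: ${message}`);
    }
  }
  throw new Error(`All strategies failed for ${url}: ${errors.join("; ")}`);
}

// Stub strategies standing in for plain HTTP, browser rendering, and stealth mode.
// Real fetchers would perform network or browser work here.
const stubStrategies: Strategy[] = [
  { name: "http", fetch: async () => { throw new Error("403 Forbidden"); } },
  { name: "browser", fetch: async () => "" },
  { name: "stealth", fetch: async () => "<html><body>ok</body></html>" },
];
```

With these stubs, `crawlWithFallback("https://example.com", stubStrategies)` resolves with `strategy` set to `"stealth"`, because the first stub throws and the second returns empty content. Keeping the strategies in a plain ordered array makes the escalation order explicit and easy to change.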
Available on npm.
Stars: 27
Forks: 4
Language: TypeScript
License: not listed
Category: not listed
Last pushed: Jan 19, 2026
Monthly downloads: 20
Commits (30d): 0
Dependencies: 10
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/agents/memvid/maw"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
Higher-rated alternatives
vakra-dev/reader
Open-source, production-grade web scraping engine built for LLMs. Scrape and crawl the entire...
joaobenedetmachado/scrapit
A (really) easy way to web scrape
firecrawl/open-scouts
🔥 AI-powered web monitoring platform. Create automated scouts that search the web and send email...
BrowserCash/teracrawl
High-performance web crawler API optimized for LLMs. Turn any search or website into clean...
poneoneo/Alibaba-CLI-Scraper
Create your own Alibaba dataset and interact with it in plain English.