arunjeyaprasad/mcp-rag-web-scraper
Customizable web scraper for building a knowledge base that can be integrated with a RAG system for search. Also supports MCP integration for querying.
This tool helps businesses, researchers, or anyone needing to create a private, searchable knowledge base from public websites. You provide a list of website URLs, and it automatically scrapes their content to build an offline database. This database can then be queried using natural language, acting like a private search engine for the information you've gathered.
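The scrape, index, and query flow described above can be illustrated with a minimal sketch. It is not taken from the project's code: the names `html_to_text`, `build_index`, and `search` are hypothetical, pages are passed in as already-fetched HTML strings, and plain term-frequency cosine similarity stands in for the embedding-based retrieval a RAG system would normally use.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Hypothetical helper: reduce an HTML document to its visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Map each URL to a term-frequency vector of its page text.

    `pages` is a dict of URL -> raw HTML (assumed to be fetched already).
    """
    return {url: Counter(tokenize(html_to_text(html))) for url, html in pages.items()}


def cosine(a, b):
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(index, query, top_k=3):
    """Return up to top_k URLs whose text overlaps the query, best match first."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine(query_vec, vec), url) for url, vec in index.items()]
    scored = [(score, url) for score, url in scored if score > 0]
    return [url for score, url in sorted(scored, reverse=True)[:top_k]]


if __name__ == "__main__":
    # Illustrative in-memory pages; a real run would download these first.
    sample_pages = {
        "https://example.com/pricing": "<h1>Pricing</h1><p>The pro plan costs ten dollars per month.</p>",
        "https://example.com/support": "<h1>Support</h1><p>Contact support by email for help.</p>",
    }
    print(search(build_index(sample_pages), "how much does the pro plan cost"))
```

Keeping the index as a plain dictionary makes the offline aspect visible: once the pages are indexed, queries need no network access. Swapping `cosine` over word counts for similarity over embeddings would move this closer to a typical RAG retriever.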
No commits in the last 6 months.
Use this if you need to gather specific information from websites regularly and want a private, AI-searchable reference system for your team or personal use, especially if you plan to integrate it with AI assistants like Claude Desktop or with local LLMs served through Ollama.
Not ideal if you need a general-purpose web crawler for broad data collection across the entire internet, or if you don't require the natural language search and AI integration capabilities.
Stars: 2
Forks: —
Language: Python
License: MIT
Category:
Last pushed: Jul 03, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/mcp/arunjeyaprasad/mcp-rag-web-scraper"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
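The same request can be made from Python using only the standard library. This is a sketch under assumptions: the helper names `quality_url` and `fetch_quality` are hypothetical, the `owner/repo` path pattern is generalized from the single URL shown above, and the response format is not documented here, so the body is returned as raw text rather than parsed.

```python
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality/mcp"


def quality_url(owner, repo):
    """Build the endpoint URL for a repository (path pattern is an assumption)."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_quality(owner, repo, timeout=10.0):
    """Fetch the raw response body as text; no key is sent."""
    with urllib.request.urlopen(quality_url(owner, repo), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)
```

Calling `fetch_quality("arunjeyaprasad", "mcp-rag-web-scraper")` issues the same GET request as the curl command. Anonymous calls count against the 100 requests/day limit, so caching the result locally is advisable for repeated lookups.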
Higher-rated alternatives
zilliztech/claude-context
Code search MCP for Claude Code. Make entire codebase the context for any coding agent.
Wildcard-Official/deepcontext-mcp
DeepContext is an MCP server that adds symbol-aware semantic search to Claude Code, Codex CLI,...
dan6684/smart-connections-mcp
MCP server that exposes Obsidian Smart Connections vector database to Claude Code via semantic search
baptiste-mnh/bigrack.dev
BigRack.dev - Intelligent MCP server for task/project & context management in AI development tools
Moxnyyy/smart-coding-mcp
🔍 Enhance code search accuracy with Smart Coding MCP, an AI-driven server that uses intelligent...