ChenTaHung/HTML-Text-Parser

This project extracts text from documents and prepares it for processing by Large Language Models (LLMs). It stores and uses text style information, which lets the program identify and segment content based on potential headers and titles.
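As a rough illustration of the idea described above, the sketch below records whether each text run sits inside a header-like tag and starts a new segment at every such run. It is not taken from the repository: the names `StyledTextParser`, `segment`, and `HEADER_TAGS` are hypothetical, and treating `<title>` and `<h1>` to `<h6>` as the only header signals is an assumption. It uses only Python's standard `html.parser` module.

```python
from html.parser import HTMLParser

# Assumption: these tags are treated as "potential headers and titles".
HEADER_TAGS = {"title", "h1", "h2", "h3", "h4", "h5", "h6"}
# Text inside these tags is not visible content, so it is skipped.
SKIP_TAGS = {"script", "style"}


class StyledTextParser(HTMLParser):
    """Collect text runs together with a simple style flag (header or not)."""

    def __init__(self):
        super().__init__()
        self._stack = []  # currently open tags, innermost last
        self.runs = []    # list of (text, is_header) pairs

    def handle_starttag(self, tag, attrs):
        self._stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; this tolerates unclosed tags.
        if tag in self._stack:
            while self._stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text or any(t in SKIP_TAGS for t in self._stack):
            return
        is_header = any(t in HEADER_TAGS for t in self._stack)
        self.runs.append((text, is_header))


def segment(html):
    """Group text runs into sections, starting a new one at each header run."""
    parser = StyledTextParser()
    parser.feed(html)
    parser.close()
    sections = []
    for text, is_header in parser.runs:
        if is_header:
            sections.append({"header": text, "body": []})
        elif not sections:
            # Text that appears before any header gets a header-less section.
            sections.append({"header": None, "body": [text]})
        else:
            sections[-1]["body"].append(text)
    return sections


if __name__ == "__main__":
    sample = "<h1>Intro</h1><p>Hello <b>world</b></p><h2>Next</h2><p>More</p>"
    for section in segment(sample):
        print(section["header"], "->", " ".join(section["body"]))
```

A fuller version would probably also look at font size, weight, or CSS classes, since many pages mark headings without using heading tags.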

/ 100

Experimental

No commits in the last 6 months.

No License · Stale 6m · No Package · No Dependents

Maintenance 0 / 25

Adoption 5 / 25

Maturity 1 / 25

Community 7 / 25


Stars

Forks

Language: HTML

License: —

Category: llm-web-scraping

Last pushed: Nov 17, 2024

Commits (30d)


LLM Web Scraping · 33 tools

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/llm-tools/ChenTaHung/HTML-Text-Parser"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
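The same request can be made from a script. The sketch below is a minimal example using only Python's standard library; it assumes the endpoint returns JSON, which the page does not state, and the helper names `build_url` and `fetch_quality` are hypothetical. No response fields are assumed, so the result is printed as-is.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/llm-tools"


def build_url(repo):
    """Build the quality endpoint URL for an 'owner/name' repository slug."""
    owner, name = repo.split("/", 1)
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_quality(repo, timeout=10.0):
    """Fetch the quality record for a repository.

    Assumption: the response body is JSON; json.load raises an error if not.
    """
    with urllib.request.urlopen(build_url(repo), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    data = fetch_quality("ChenTaHung/HTML-Text-Parser")
    print(json.dumps(data, indent=2))
```

With the keyless limit of 100 requests/day, it is worth caching results locally when checking many repositories.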

Higher-rated alternatives

carlosplanchon/spidercreator

Automated web scraping spider generation using Browser Use and LLMs. Streamline the creation of...

raznem/parsera

Lightweight library for scraping web-sites with LLMs

Riddhish1/CogniScrape

Intelligent Web Scraping Library with LLMs

yeahhe365/JustSearch

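An autonomous AI search agent built on Playwright. Supports iterative task planning, deep web crawling, and multi-source knowledge synthesis with cited sources.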

poodle64/supacrawl

Zero-infrastructure web scraping for the terminal
