lampard08/Listed-Company-Annual-Report-Web-Scraper
This project builds a Selenium-based web scraper that collects Chinese listed companies' annual reports from cninfo.com. The code downloads the PDF files and compiles them, then analyzes the MD&A (Management Discussion and Analysis) section to extract and count the frequency of digital-transformation-related keywords.
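The keyword-counting step described above can be sketched roughly as follows. This is a minimal illustration and not the repository's actual code: the keyword list, the section markers, and the helper names `extract_section` and `count_keywords` are all assumptions, and the text is assumed to have already been extracted from the PDF.

```python
from collections import Counter

# Hypothetical keyword list; the repository's real list is not shown on this page.
KEYWORDS = ["数字化", "人工智能", "大数据", "云计算"]


def extract_section(text, start_marker, end_marker):
    """Return the text from start_marker up to (not including) end_marker.

    Returns an empty string if start_marker is missing, and everything
    from start_marker onward if end_marker is missing.
    """
    start = text.find(start_marker)
    if start == -1:
        return ""
    end = text.find(end_marker, start + len(start_marker))
    return text[start:end] if end != -1 else text[start:]


def count_keywords(text, keywords=KEYWORDS):
    """Count non-overlapping occurrences of each keyword in text."""
    return Counter({kw: text.count(kw) for kw in keywords})


# Toy report text standing in for the text extracted from a PDF.
report = (
    "第二节 公司简介\n"
    "第三节 管理层讨论与分析\n"
    "公司持续推进数字化转型，利用大数据和人工智能提升效率，数字化投入增加。\n"
    "第四节 公司治理\n"
    "数字化"
)

# Assumed markers: the MD&A heading and the heading of the section after it.
mda = extract_section(report, "管理层讨论与分析", "公司治理")
counts = count_keywords(mda)
```

Plain substring matching is used here because Chinese text has no whitespace word boundaries. Real reports vary in their section headings, so the markers would likely need adjusting for each report.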
No commits in the last 6 months.
Stars: 1
Forks: 1
Language: Jupyter Notebook
License: Apache-2.0
Category:
Last pushed: May 02, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/perception/lampard08/Listed-Company-Annual-Report-Web-Scraper"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
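The same request can be made from Python with the standard library. The sketch below assumes only the URL pattern shown in the curl command. The helper names `perception_url` and `fetch_perception` are hypothetical, and the response format is not documented on this page, so the body is returned as raw text.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base path taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/quality/perception"


def perception_url(owner, repo):
    """Build the endpoint URL for a repository (hypothetical helper)."""
    return f"{BASE_URL}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_perception(owner, repo, timeout=10):
    """Fetch the raw response body as text; the schema is not assumed."""
    with urlopen(perception_url(owner, repo), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


url = perception_url("lampard08", "Listed-Company-Annual-Report-Web-Scraper")
```

Calling `fetch_perception("lampard08", "Listed-Company-Annual-Report-Web-Scraper")` performs the network request, which counts against the daily request limit.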
Higher-rated alternatives
seleniumbase/SeleniumBase
APIs for browser automation, testing, and bypassing bot-detection.
intoli/user-agents
A JavaScript library for generating random user agents with data that's updated daily.
apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In...
Kaliiiiiiiiii-Vinyzu/patchright
Undetected version of the Playwright testing and automation library.
Kaliiiiiiiiii-Vinyzu/patchright-python
Undetected Python version of the Playwright testing and automation library.