zjukg/SKA-Bench
[Paper][EMNLP 2025] SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs
This project helps AI researchers and engineers evaluate how well Large Language Models (LLMs) understand structured information such as knowledge graphs and tables. Given a chosen LLM and the structured datasets the tool generates, it outputs fine-grained performance metrics on the model's ability to process and reason over that information.
No commits in the last 6 months.
Use this if you are an AI researcher or engineer developing or deploying LLMs and need a rigorous, fine-grained benchmark to understand their capabilities and limitations in handling structured knowledge from tables or knowledge graphs.
Not ideal if you are a general user looking to apply an LLM to unstructured text tasks, or if you need to build custom datasets outside the scope of structured knowledge evaluation.
Stars: 11
Forks: —
Language: Python
License: —
Category: —
Last pushed: Aug 27, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/zjukg/SKA-Bench"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
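The curl command above can also be issued from Python. A minimal sketch, assuming only the endpoint path shown in the listing; the JSON fields in the response are not documented here, so the actual network call is left commented out:

```python
# Sketch of querying the per-repository quality endpoint.
# Only the URL pattern is taken from the listing; the response
# schema is an assumption and is not parsed here.
BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings"

def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository endpoint URL from owner and repo name."""
    return f"{BASE}/{owner}/{repo}"

url = quality_url("zjukg", "SKA-Bench")

# To actually fetch the data (requires network access and the
# third-party `requests` package):
# import requests
# data = requests.get(url, timeout=10).json()
```

Passing an API key (for the 1,000 requests/day tier) would depend on the service's auth scheme, which is not specified in the listing.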
Higher-rated alternatives
SemBench/SemBench
Benchmarking Semantic Query Processing Engines
mangopy/tool-retrieval-benchmark
Official code for ACL2025 "🔍 Retrieval Models Aren’t Tool-Savvy: Benchmarking Tool Retrieval for...
DIA-Bench/DIA-Bench
The DIA Benchmark Dataset is a benchmarking tool consisting of 150 dynamic question generators...