zjukg/SKA-Bench
[Paper][EMNLP 2025] SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs
This project helps AI researchers and engineers evaluate how well Large Language Models (LLMs) understand structured information such as knowledge graphs and tables. Given a chosen LLM and the structured datasets the tool generates, it outputs fine-grained performance metrics on the model's ability to process and reason over that information.
No commits in the last 6 months.
Use this if you are an AI researcher or engineer developing or deploying LLMs and need a rigorous, fine-grained benchmark to understand their capabilities and limitations in handling structured knowledge from tables or knowledge graphs.
Not ideal if you are a general user looking to apply an LLM to unstructured text tasks, or if you need to build custom datasets outside the scope of structured knowledge evaluation.
Stars: 11
Forks: —
Language: Python
License: —
Category: —
Last pushed: Aug 27, 2025
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/embeddings/zjukg/SKA-Bench"
Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
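The curl command above can also be issued from Python. A minimal sketch, assuming only the endpoint path shown in the listing; the JSON fields in the response are not documented here, so the actual network call is left commented out:

```python
# Sketch of querying the per-repository quality endpoint.
# Only the URL pattern is taken from the listing; the response
# schema is an assumption and is not parsed here.
BASE = "https://pt-edge.onrender.com/api/v1/quality/embeddings"

def quality_url(owner: str, repo: str) -> str:
    """Build the per-repository endpoint URL from owner and repo name."""
    return f"{BASE}/{owner}/{repo}"

url = quality_url("zjukg", "SKA-Bench")

# To actually fetch the data (requires network access and the
# third-party `requests` package):
# import requests
# data = requests.get(url, timeout=10).json()
```

Passing an API key (for the 1,000 requests/day tier) would depend on the service's auth scheme, which is not specified in the listing.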
Higher-rated alternatives
SemBench/SemBench
Benchmarking Semantic Query Processing Engines
mangopy/tool-retrieval-benchmark
Official code for ACL2025 "🔍 Retrieval Models Aren’t Tool-Savvy: Benchmarking Tool Retrieval for...
DIA-Bench/DIA-Bench
The DIA Benchmark Dataset is a benchmarking tool consisting of 150 dynamic question generators...