Ziechoes/reasoning-invariance-benchmark
Experiments testing whether LLM reasoning trajectories remain invariant when constraint layers are applied. If reasoning paths diverge on otherwise identical logical problems, the divergence suggests architectural coupling between inference state and constraint enforcement.
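The repository's exact protocol isn't reproduced here, but the core measurement might look like the following Python sketch. The query_model callable and the string-similarity metric are illustrative assumptions, not the benchmark's actual method:

import difflib
from typing import Callable, Optional

def trace_divergence(
    query_model: Callable[[str, Optional[str]], str],
    problem: str,
    constraint: str,
) -> float:
    """Divergence (0.0 = identical) between reasoning traces for the same
    logical problem, with and without a constraint layer applied."""
    baseline = query_model(problem, None)           # unconstrained trajectory
    constrained = query_model(problem, constraint)  # constrained trajectory
    similarity = difflib.SequenceMatcher(None, baseline, constrained).ratio()
    return 1.0 - similarity

if __name__ == "__main__":
    # Toy stand-in model whose trace changes under a constraint, so the
    # divergence is nonzero; invariance would predict a value near 0.0.
    fake = lambda p, c: f"step1 -> step2 -> answer({p})" + (f" [{c}]" if c else "")
    print(trace_divergence(fake, "A implies B; A; therefore?", "answer in JSON"))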
Stars: 1
Forks: —
Language: Python
License: —
Category: —
Last pushed: Mar 04, 2026
Commits (30d): 0
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/prompt-engineering/Ziechoes/reasoning-invariance-benchmark"
Open to everyone: 100 requests/day with no key needed. Get a free key for 1,000/day.
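The same data can also be fetched from Python; a minimal sketch using requests, assuming the endpoint returns JSON (not verified here):

import requests

URL = ("https://pt-edge.onrender.com/api/v1/quality/"
       "prompt-engineering/Ziechoes/reasoning-invariance-benchmark")

resp = requests.get(URL, timeout=10)
resp.raise_for_status()  # free tier: 100 requests/day without a key
print(resp.json())       # assumes a JSON payload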
Higher-rated alternatives
microsoft/promptbench
A unified evaluation framework for large language models
uptrain-ai/uptrain
UpTrain is an open-source unified platform to evaluate and improve Generative AI applications....
gabe-mousa/Apolien
AI Safety Evaluation Library
babelcloud/LLM-RGB
LLM Reasoning and Generation Benchmark. Evaluate LLMs in complex scenarios systematically.