Mathematical Reasoning Transformer Models

Tools for training transformers to solve mathematical and symbolic reasoning problems through techniques like pretraining, reinforcement learning, and neuro-symbolic methods. Does NOT include general question-answering, commonsense reasoning without mathematical focus, or pure symbolic solvers without neural components.

84 mathematical reasoning transformer models are tracked. One scores 50 or higher (Established tier). The highest-rated is UKPLab/gpl at 50/100 with 340 stars and 175 monthly downloads.

Get all 84 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=mathematical-reasoning-transformers&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
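The same request can be made from Python. The following is a minimal sketch using only the standard library. The endpoint and query parameters are copied from the curl command above; the helper names `build_url` and `fetch_projects` are hypothetical, and the shape of the JSON response is not documented on this page, so the sketch returns the parsed body without assuming any field names. The curl example uses `limit=20`; whether a larger value is accepted is not stated here.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl example on this page.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit=20):
    """Build the query URL for this category (hypothetical helper)."""
    params = {
        "domain": "transformers",
        "subcategory": "mathematical-reasoning-transformers",
        "limit": limit,
    }
    return f"{BASE_URL}?{urlencode(params)}"


def fetch_projects(limit=20, timeout=30):
    """Fetch and parse the JSON body (hypothetical helper).

    The response structure is an unknown here, so the parsed object
    is returned unchanged for the caller to inspect.
    """
    with urlopen(build_url(limit), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (performs a network request and counts against the daily quota):
#   data = fetch_projects()
#   print(json.dumps(data, indent=2)[:500])
```

Keeping URL construction separate from the network call makes it easy to check the query string without spending one of the 100 keyless requests per day.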

#	Model	Score	Tier	Stars	Language
1	UKPLab/gpl Powerful unsupervised domain adaptation method for dense retrieval. Requires...	50	Established	340	Python
2	galilai-group/stable-pretraining Reliable, minimal and scalable library for pretraining foundation and world models	49	Emerging	133	Python
3	svdrecbd/mhc-mlx MLX + Metal implementation of mHC: Manifold-Constrained Hyper-Connections by...	48	Emerging	3	Python
4	CognitiveAISystems/MAPF-GPT [AAAI-2025] This repository contains MAPF-GPT, a deep learning-based model...	46	Emerging	119	C++
5	larslorch/avici Amortized Inference for Causal Structure Learning, NeurIPS 2022	42	Emerging	72	Python
6	kyegomez/MHMoE Community Implementation of the paper: "Multi-Head Mixture-of-Experts" In PyTorch	40	Emerging	29	Python
7	Cognitive-AI-Systems/MAPF-GPT-DDG [IROS-2025] MAPF-GPT-DDG is a scalable decentralized multi-agent pathfinding...	39	Emerging	61	Python
8	ai4co/routefinder [TMLR 2025 + ICML 2024 FM-Wild Oral] RouteFinder: Towards Foundation Models...	39	Emerging	111	Python
9	chaitjo/learning-tsp Code for the paper 'Learning TSP Requires Rethinking Generalization' (CP 2021)	39	Emerging	241	Jupyter Notebook
10	eloialonso/iris Transformers are Sample-Efficient World Models. ICLR 2023, notable top 5%.	38	Emerging	870	Python
11	softengg-manoj/dreamer4 🌟 Implement Dreamer 4 for training agents within scalable world models,...	37	Emerging	4	Python
12	deep-symbolic-mathematics/TPSR [NeurIPS 2023] This is the official code for the paper "TPSR:...	37	Emerging	81	Python
13	IntelLabs/causality-lab Causal discovery algorithms and tools for implementing new ones	37	Emerging	247	Jupyter Notebook
14	pjlab-sys4nlp/llama-moe ⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual...	34	Emerging	1,002	Python
15	RobertCsordas/modules The official repository for our paper "Are Neural Nets Modular? Inspecting...	34	Emerging	46	Python
16	ai4co/parco [NeurIPS 2025] PARCO: Parallel AutoRegressive Combinatorial Optimization	33	Emerging	44	Python
17	vmicheli/delta-iris Efficient World Models with Context-Aware Tokenization. ICML 2024	33	Emerging	119	Python
18	levashi/reprobe Phase-aware LLM activation steering and linear probing. A memory-efficient,...	33	Emerging	2	Python
19	IDSIA/lmtool-fwp PyTorch Language Modeling Toolkit for Fast Weight Programmers	32	Emerging	19	Python
20	microsoft/COCO-LM [NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for...	32	Emerging	118	Python
21	IDSIA/automated-cl Official repository for the paper "Automating Continual Learning"	32	Emerging	18	Python
22	deep-symbolic-mathematics/Multimodal-Symbolic-Regression [ICLR 2024 Spotlight] SNIP on Symbolic Regression: Deep Symbolic Regression...	31	Emerging	21	Python
23	IDSIA/fpainter Official repository for the paper "Images as Weight Matrices: Sequential...	30	Emerging	12	Python
24	deep-symbolic-mathematics/Multimodal-Math-Pretraining [ICLR 2024 Spotlight] This is the official code for the paper "SNIP:...	29	Experimental	58	Python
25	alexliap/greek_gpt MoE Decoder Transformer implementation with MLX	28	Experimental	6	Python
26	srvCodes/continual_learning_with_vit Code for our CVPR 2022 workshop paper "Towards Exemplar-Free Continual...	28	Experimental	24	Python
27	IDSIA/modern-srwm Official repository for the paper "A Modern Self-Referential Weight Matrix...	27	Experimental	176	Python
28	czg1225/CoDe [CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive...	27	Experimental	108	Python
29	cifkao/context-probing Black-box language model explanation by context length probing	27	Experimental	9	Jupyter Notebook
30	softsys4ai/differentiable-proving Code and data for the paper "Pretrained Language Models are Symbolic...	25	Experimental	12	Python
31	AIRI-Institute/Probing_framework Framework for probing tasks	25	Experimental	31	Python
32	elijahnzeli1/CausalTorch CausalTorch is a PyTorch library for building generative models with...	24	Experimental	5	Python
33	Shekswess/tiny-reasoning-language-model Code repository dedicated to experimenting and research with tiny reasoning...	24	Experimental	49	Python
34	ashimmortallp/mHC-manifold-constrained-hyper-connections 🔍 Explore mHC for manifold-constrained hyper-connections in PyTorch,...	24	Experimental	—	Python
35	yyDing1/GNER [ACL 2024 Findings] Code implementation of Paper "Rethinking Negative...	24	Experimental	60	Python
36	NellyW8/VeriReason This is the Github Repo for the paper: VeriReason: Reinforcement Learning...	24	Experimental	21	Python
37	OrigamiDream/CoRT CoRT: Contrastive Rhetorical Tagging - KISTI 2022 AI/ML Competition	23	Experimental	6	Python
38	Ultron09/Mirror_mind A production-ready adaptive meta-learning framework for continuous...	23	Experimental	5	Python
39	microsoft/AMOS [ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training...	23	Experimental	26	Python
40	RitoCryo/DeepRWKV-Reasoning 🔍 Enhance reasoning in Large Language Models with DeepRWKV-Reasoning, using...	23	Experimental	1	Python
41	relign-ai/relign post train language models on multi-step reasoning with reinforcement learning	23	Experimental	20	Python
42	DataArcTech/ChartMoE [ICLR2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for...	22	Experimental	94	Jupyter Notebook
43	ianchute/generative-reflections A two-model system for reasonable text generation	22	Experimental	1	Jupyter Notebook
44	gpt-reasoning/ReasoningCombinatorials [NeurIPS'25] Teaching Transformers to Solve Combinatorial Problems through...	22	Experimental	—	C
45	aliuyar1234/proberoute Research code for ProbeRoute, a probe-initialized sparse routing method for...	22	Experimental	—	Python
46	IDSIA/recurrent-fwp Official repository for the paper "Going Beyond Linear Transformers with...	21	Experimental	51	Python
47	anastadimi/Contra-Sformer Code for 'Keep Your Eye on the Best: Contrastive Regression Transformer for...	21	Experimental	12	Python
48	cpuheater/cause-life-is-a-game Solving games with reinforcement learning	21	Experimental	7	Python
49	The-Swarm-Corporation/MoF This work introduces Flow Matching Mixture of Experts (FM-MoE), a framework...	21	Experimental	2	Python
50	cattolatte/reflective-reasoning-transformer 🧠 R2T Prototype: An LLM pre-trained on causal graphs (not just text) to...	21	Experimental	2	Python
51	cui-shaobo/causal-strength evaluating the causal strength between cause and effect	20	Experimental	2	Python
52	Pomilon-Intelligence-Lab/ALSI Early baby steps towards a long-term vision regarding Mamba-2's state...	20	Experimental	1	Python
53	ImMohammadHosseini/MKP-RL ✨ Solve multi_dimensional multiple knapsack problem using...	20	Experimental	13	Python
54	axonura/axonura-X1 The First AI Model Of Axonura	19	Experimental	—	Python
55	discover-Austin/Architectural-Emergence-of-Synchronization Modular Recursive Workspace (MRW) - Complete Phase Transition Detection...	19	Experimental	—	Python
56	AndreaCossu/continual-pretraining-nlp-vision Code to reproduce experiments from the paper "Continual Pre-Training...	19	Experimental	22	Jupyter Notebook
57	matlok-ai/bampe-weights This repository is for profiling, extracting, visualizing and reusing...	19	Experimental	9	Python
58	The-Swarm-Corporation/ClusterMoE A novel neural network architecture that extends Mixture of Experts (MoE)...	18	Experimental	4	Python
59	Francesco-Sovrano/PROBE-SWE Replication package for PROBE-SWE: a dynamic benchmark to generate,...	17	Experimental	—	Jupyter Notebook
60	capybara-brain346/moe-router A small Mixture-of-Experts (MoE) Transformer trained from scratch to learn...	17	Experimental	2	Python
61	mduffster/self-referent-test Testing role-based pathways on small LLMs	16	Experimental	1	Python
62	Eran-BA/MoP Mixture of Products (MoP) for Transformers — research prototype	15	Experimental	6	Python
63	TheAeryan/strips-transformer Code for work "From Next Token Prediction to (STRIPS) World Models --...	15	Experimental	—	PDDL
64	torotoki/reasoning-minimal Minimal code to train reasoning model with reinforcement learning.	14	Experimental	3	Python
65	nlx-group/Shortcutted-Commonsense-Reasoning Code for the article "Shortcutted Commonsense: Data Spuriousness in Deep...	14	Experimental	10	Jupyter Notebook
66	CheongWoong/impact_of_cooccurrence A repository for analyzing the impact of co-occurrence statistics on factual...	14	Experimental	10	Jupyter Notebook
67	AndrewBoessen/neural-game-engine Neural network approach for modeling interactive game environments using...	13	Experimental	5	Python
68	NISL-MSU/MultiSetSR Decomposable Neuro Symbolic Regression	13	Experimental	2	Python
69	cyan-ide/nn_models Neural network / AI models / LLM models - implementations from scratch in pytorch	12	Experimental	1	Jupyter Notebook
70	The-Swarm-Corporation/awesome-humanoid-papers A list of awesome research papers for humanoids	12	Experimental	4	—
71	AdamG012/moe-paper-models A summary of MoE experimental setups across a number of different papers.	12	Experimental	16	—
72	nlx-group/Commonsense-Reasoning-Neuro-only-vs-Neuro-Symbolic-Methods Code for the article "Commonsense Reasoning: how do Neuro-only and hybrid...	12	Experimental	4	Python
73	UIC-Liu-Lab/CPT [EMNLP 2022] Continual Training of Language Models for Few-Shot Learning	12	Experimental	44	Python
74	omron-sinicx/transformer4sr [NeurIPS 2023 AI4Science] "A Transformer Model for Symbolic Regression...	12	Experimental	18	Python
75	kreasof-ai/stable-latent-reasoning Stable Latent Reasoning --- Enhancing Inference in Large Language Models...	11	Experimental	2	—
76	bihani-g/rel-paradox This repository contains code and experiments for the paper 'The Reliability...	11	Experimental	—	Jupyter Notebook
77	chaowei312/dsan6650_final Recursive reasoning with tiny transformers (<1M params): TRM + MoE + MCTS...	11	Experimental	—	Jupyter Notebook
78	CheongWoong/knowledge_probing A repository for factual knowledge probing with large language models.	11	Experimental	—	Python
79	bassrehab/steering-vectors-agents Runtime control of LLM agent behaviors through activation steering vectors....	10	Experimental	3	Python
80	UKPLab/starsem2023-arithmetic-based-pretraining Code and data for the StarSem 2023 paper "Arithmetic-Based Pretraining --...	10	Experimental	1	Julia
81	neuro-symbolic-ai/latent_mathematical_reasoning Multi-Operational Mathematical Derivations in Latent Space	10	Experimental	1	Python
82	eljandoubi/PaliGemma Coding PaliGemma from scratch using pytorch for inference.	10	Experimental	1	Python
83	alessoh/ssi1 Developing neural-symbolic transformer models for superintelligence method	10	Experimental	1	Python
84	moxin-org/CC-MoE Collaborative Compression for Large-Scale MoE Deployment on Edge	10	Experimental	4	Python
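The tier labels in the table above are consistent with fixed score cutoffs: the only score of 50 is Established, scores from 30 to 49 are Emerging, and scores of 29 and below are Experimental. The following sketch encodes those cutoffs. They are inferred from the listed rows only, not taken from any published scoring rule, and the function name `tier_for` is hypothetical.

```python
def tier_for(score):
    """Map a 0-100 quality score to a tier label.

    Cutoffs are an assumption inferred from the rows on this page:
    50 -> Established, 30-49 -> Emerging, 29 and below -> Experimental.
    """
    if score >= 50:
        return "Established"
    if score >= 30:
        return "Emerging"
    return "Experimental"


# A few (model, score) pairs copied from the table above.
rows = [
    ("UKPLab/gpl", 50),
    ("galilai-group/stable-pretraining", 49),
    ("IDSIA/fpainter", 30),
    ("deep-symbolic-mathematics/Multimodal-Math-Pretraining", 29),
]

for name, score in rows:
    print(f"{name}: {score} -> {tier_for(score)}")
```

The boundary rows (50, 49, 30, 29) are the ones that pin down the cutoffs; no row on this page scores above 50, so the behavior of the real scoring system there is unknown.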

Comparisons in this category

MAPF-GPT and MAPF-GPT-DDG (46 vs 39)