Mathematical Reasoning Transformer Models
Tools for training transformers to solve mathematical and symbolic reasoning problems through techniques like pretraining, reinforcement learning, and neuro-symbolic methods. Does NOT include general question-answering, commonsense reasoning without mathematical focus, or pure symbolic solvers without neural components.
There are 84 mathematical reasoning transformer models tracked. One scores 50 or higher (Established tier). The highest-rated is UKPLab/gpl at 50/100 with 340 stars and 175 monthly downloads.
Get all 84 projects as JSON
curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=transformers&subcategory=mathematical-reasoning-transformers&limit=20"
Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
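The same request can be made from a script. The following is a minimal Python sketch, assuming only the endpoint and the three query parameters shown in the curl command above (`domain`, `subcategory`, `limit`). The helper names `build_url` and `fetch_projects` are hypothetical, the structure of the returned JSON is not documented here and so is not assumed, and whether the API accepts a `limit` larger than 20 is untested.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl example; everything else here is illustrative.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(limit: int = 20) -> str:
    """Build the query URL used in the curl example, with a configurable limit."""
    params = {
        "domain": "transformers",
        "subcategory": "mathematical-reasoning-transformers",
        "limit": limit,
    }
    return f"{BASE_URL}?{urllib.parse.urlencode(params)}"


def fetch_projects(limit: int = 20, timeout: float = 30.0):
    """Fetch the URL and decode the body as JSON.

    The payload is returned as-is because its schema is an unknown here.
    """
    with urllib.request.urlopen(build_url(limit), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example usage (performs a network request, so it is left commented out):
# data = fetch_projects()
# print(json.dumps(data, indent=2)[:1000])
```

Keeping URL construction separate from the network call makes it easy to check the query string without spending any of the daily request quota.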
| # | Model | Score | Tier |
|---|---|---|---|
| 1 | UKPLab/gpl<br>Powerful unsupervised domain adaptation method for dense retrieval. Requires... | | Established |
| 2 | galilai-group/stable-pretraining<br>Reliable, minimal and scalable library for pretraining foundation and world models | | Emerging |
| 3 | svdrecbd/mhc-mlx<br>MLX + Metal implementation of mHC: Manifold-Constrained Hyper-Connections by... | | Emerging |
| 4 | CognitiveAISystems/MAPF-GPT<br>[AAAI-2025] This repository contains MAPF-GPT, a deep learning-based model... | | Emerging |
| 5 | larslorch/avici<br>Amortized Inference for Causal Structure Learning, NeurIPS 2022 | | Emerging |
| 6 | kyegomez/MHMoE<br>Community Implementation of the paper: "Multi-Head Mixture-of-Experts" In PyTorch | | Emerging |
| 7 | Cognitive-AI-Systems/MAPF-GPT-DDG<br>[IROS-2025] MAPF-GPT-DDG is a scalable decentralized multi-agent pathfinding... | | Emerging |
| 8 | ai4co/routefinder<br>[TMLR 2025 + ICML 2024 FM-Wild Oral] RouteFinder: Towards Foundation Models... | | Emerging |
| 9 | chaitjo/learning-tsp<br>Code for the paper 'Learning TSP Requires Rethinking Generalization' (CP 2021) | | Emerging |
| 10 | eloialonso/iris<br>Transformers are Sample-Efficient World Models. ICLR 2023, notable top 5%. | | Emerging |
| 11 | softengg-manoj/dreamer4<br>🌟 Implement Dreamer 4 for training agents within scalable world models,... | | Emerging |
| 12 | deep-symbolic-mathematics/TPSR<br>[NeurIPS 2023] This is the official code for the paper "TPSR:... | | Emerging |
| 13 | IntelLabs/causality-lab<br>Causal discovery algorithms and tools for implementing new ones | | Emerging |
| 14 | pjlab-sys4nlp/llama-moe<br>⛷️ LLaMA-MoE: Building Mixture-of-Experts from LLaMA with Continual... | | Emerging |
| 15 | RobertCsordas/modules<br>The official repository for our paper "Are Neural Nets Modular? Inspecting... | | Emerging |
| 16 | ai4co/parco<br>[NeurIPS 2025] PARCO: Parallel AutoRegressive Combinatorial Optimization | | Emerging |
| 17 | vmicheli/delta-iris<br>Efficient World Models with Context-Aware Tokenization. ICML 2024 | | Emerging |
| 18 | levashi/reprobe<br>Phase-aware LLM activation steering and linear probing. A memory-efficient,... | | Emerging |
| 19 | IDSIA/lmtool-fwp<br>PyTorch Language Modeling Toolkit for Fast Weight Programmers | | Emerging |
| 20 | microsoft/COCO-LM<br>[NeurIPS 2021] COCO-LM: Correcting and Contrasting Text Sequences for... | | Emerging |
| 21 | IDSIA/automated-cl<br>Official repository for the paper "Automating Continual Learning" | | Emerging |
| 22 | deep-symbolic-mathematics/Multimodal-Symbolic-Regression<br>[ICLR 2024 Spotlight] SNIP on Symbolic Regression: Deep Symbolic Regression... | | Emerging |
| 23 | IDSIA/fpainter<br>Official repository for the paper "Images as Weight Matrices: Sequential... | | Emerging |
| 24 | deep-symbolic-mathematics/Multimodal-Math-Pretraining<br>[ICLR 2024 Spotlight] This is the official code for the paper "SNIP:... | | Experimental |
| 25 | alexliap/greek_gpt<br>MoE Decoder Transformer implementation with MLX | | Experimental |
| 26 | srvCodes/continual_learning_with_vit<br>Code for our CVPR 2022 workshop paper "Towards Exemplar-Free Continual... | | Experimental |
| 27 | IDSIA/modern-srwm<br>Official repository for the paper "A Modern Self-Referential Weight Matrix... | | Experimental |
| 28 | czg1225/CoDe<br>[CVPR 2025] CoDe: Collaborative Decoding Makes Visual Auto-Regressive... | | Experimental |
| 29 | cifkao/context-probing<br>Black-box language model explanation by context length probing | | Experimental |
| 30 | softsys4ai/differentiable-proving<br>Code and data for the paper "Pretrained Language Models are Symbolic... | | Experimental |
| 31 | AIRI-Institute/Probing_framework<br>Framework for probing tasks | | Experimental |
| 32 | elijahnzeli1/CausalTorch<br>CausalTorch is a PyTorch library for building generative models with... | | Experimental |
| 33 | Shekswess/tiny-reasoning-language-model<br>Code repository dedicated to experimenting and research with tiny reasoning... | | Experimental |
| 34 | ashimmortallp/mHC-manifold-constrained-hyper-connections<br>🔍 Explore mHC for manifold-constrained hyper-connections in PyTorch,... | | Experimental |
| 35 | yyDing1/GNER<br>[ACL 2024 Findings] Code implementation of Paper "Rethinking Negative... | | Experimental |
| 36 | NellyW8/VeriReason<br>This is the Github Repo for the paper: VeriReason: Reinforcement Learning... | | Experimental |
| 37 | OrigamiDream/CoRT<br>CoRT: Contrastive Rhetorical Tagging - KISTI 2022 AI/ML Competition | | Experimental |
| 38 | Ultron09/Mirror_mind<br>A production-ready adaptive meta-learning framework for continuous... | | Experimental |
| 39 | microsoft/AMOS<br>[ICLR 2022] Pretraining Text Encoders with Adversarial Mixture of Training... | | Experimental |
| 40 | RitoCryo/DeepRWKV-Reasoning<br>🔍 Enhance reasoning in Large Language Models with DeepRWKV-Reasoning, using... | | Experimental |
| 41 | relign-ai/relign<br>post train language models on multi-step reasoning with reinforcement learning | | Experimental |
| 42 | DataArcTech/ChartMoE<br>[ICLR2025 Oral] ChartMoE: Mixture of Diversely Aligned Expert Connector for... | | Experimental |
| 43 | ianchute/generative-reflections<br>A two-model system for reasonable text generation | | Experimental |
| 44 | gpt-reasoning/ReasoningCombinatorials<br>[NeurIPS'25] Teaching Transformers to Solve Combinatorial Problems through... | | Experimental |
| 45 | aliuyar1234/proberoute<br>Research code for ProbeRoute, a probe-initialized sparse routing method for... | | Experimental |
| 46 | IDSIA/recurrent-fwp<br>Official repository for the paper "Going Beyond Linear Transformers with... | | Experimental |
| 47 | anastadimi/Contra-Sformer<br>Code for 'Keep Your Eye on the Best: Contrastive Regression Transformer for... | | Experimental |
| 48 | cpuheater/cause-life-is-a-game<br>Solving games with reinforcement learning | | Experimental |
| 49 | The-Swarm-Corporation/MoF<br>This work introduces Flow Matching Mixture of Experts (FM-MoE), a framework... | | Experimental |
| 50 | cattolatte/reflective-reasoning-transformer<br>🧠 R2T Prototype: An LLM pre-trained on causal graphs (not just text) to... | | Experimental |
| 51 | cui-shaobo/causal-strength<br>evaluating the causal strength between cause and effect | | Experimental |
| 52 | Pomilon-Intelligence-Lab/ALSI<br>Early baby steps towards a long-term vision regarding Mamba-2's state... | | Experimental |
| 53 | ImMohammadHosseini/MKP-RL<br>:sparkles: Solve multi_dimensional multiple knapsack problem using... | | Experimental |
| 54 | axonura/axonura-X1<br>The First AI Model Of Axonura | | Experimental |
| 55 | discover-Austin/Architectural-Emergence-of-Synchronization<br>Modular Recursive Workspace (MRW) - Complete Phase Transition Detection... | | Experimental |
| 56 | AndreaCossu/continual-pretraining-nlp-vision<br>Code to reproduce experiments from the paper "Continual Pre-Training... | | Experimental |
| 57 | matlok-ai/bampe-weights<br>This repository is for profiling, extracting, visualizing and reusing... | | Experimental |
| 58 | The-Swarm-Corporation/ClusterMoE<br>A novel neural network architecture that extends Mixture of Experts (MoE)... | | Experimental |
| 59 | Francesco-Sovrano/PROBE-SWE<br>Replication package for PROBE-SWE: a dynamic benchmark to generate,... | | Experimental |
| 60 | capybara-brain346/moe-router<br>A small Mixture-of-Experts (MoE) Transformer trained from scratch to learn... | | Experimental |
| 61 | mduffster/self-referent-test<br>Testing role-based pathways on small LLMs | | Experimental |
| 62 | Eran-BA/MoP<br>Mixture of Products (MoP) for Transformers — research prototype | | Experimental |
| 63 | TheAeryan/strips-transformer<br>Code for work "From Next Token Prediction to (STRIPS) World Models --... | | Experimental |
| 64 | torotoki/reasoning-minimal<br>Minimal code to train reasoning model with reinforcement learning. | | Experimental |
| 65 | nlx-group/Shortcutted-Commonsense-Reasoning<br>Code for the article "Shortcutted Commonsense: Data Spuriousness in Deep... | | Experimental |
| 66 | CheongWoong/impact_of_cooccurrence<br>A repository for analyzing the impact of co-occurrence statistics on factual... | | Experimental |
| 67 | AndrewBoessen/neural-game-engine<br>Neural network approach for modeling interactive game environments using... | | Experimental |
| 68 | NISL-MSU/MultiSetSR<br>Decomposable Neuro Symbolic Regression | | Experimental |
| 69 | cyan-ide/nn_models<br>Neural network / AI models / LLM models - implementations from scratch in pytorch | | Experimental |
| 70 | The-Swarm-Corporation/awesome-humanoid-papers<br>A list of awesome research papers for humanoids | | Experimental |
| 71 | AdamG012/moe-paper-models<br>A sumary of MoE experimental setups across a number of different papers. | | Experimental |
| 72 | nlx-group/Commonsense-Reasoning-Neuro-only-vs-Neuro-Symbolic-Methods<br>Code for the article "Commonsense Reasoning: how do Neuro-only and hybrid... | | Experimental |
| 73 | UIC-Liu-Lab/CPT<br>[EMNLP 2022] Continual Training of Language Models for Few-Shot Learning | | Experimental |
| 74 | omron-sinicx/transformer4sr<br>[NeurIPS 2023 AI4Science] "A Transformer Model for Symbolic Regression... | | Experimental |
| 75 | kreasof-ai/stable-latent-reasoning<br>Stable Latent Reasoning --- Enhancing Inference in Large Language Models... | | Experimental |
| 76 | bihani-g/rel-paradox<br>This repository contains code and experiments for the paper 'The Reliability... | | Experimental |
| 77 | chaowei312/dsan6650_final<br>Recursive reasoning with tiny transformers (<1M params): TRM + MoE + MCTS... | | Experimental |
| 78 | CheongWoong/knowledge_probing<br>A repository for factual knowledge probing with large language models. | | Experimental |
| 79 | bassrehab/steering-vectors-agents<br>Runtime control of LLM agent behaviors through activation steering vectors.... | | Experimental |
| 80 | UKPLab/starsem2023-arithmetic-based-pretraining<br>Code and data for the StarSem 2023 paper "Arithmetic-Based Pretraining --... | | Experimental |
| 81 | neuro-symbolic-ai/latent_mathematical_reasoning<br>Multi-Operational Mathematical Derivations in Latent Space | | Experimental |
| 82 | eljandoubi/PaliGemma<br>Coding PaliGemma from scratch using pytorch for inference. | | Experimental |
| 83 | alessoh/ssi1<br>Developing neural-symbolic transformer models for superintelligence method | | Experimental |
| 84 | moxin-org/CC-MoE<br>Collaborative Compression for Large-Scale MoE Deployment on Edge | | Experimental |