Chemistry LLM Benchmarks LLM Tools

Tools, datasets, and benchmarks for evaluating and fine-tuning large language models on chemistry and molecular property prediction tasks. Does NOT include general scientific LLM frameworks, materials science benchmarks, or chemistry software without LLM components.

There are 21 chemistry LLM benchmark tools tracked. The highest-rated is maxischuh/TwinBooster at 47/100, with 6 stars and 103 monthly downloads.

Get all 21 projects as JSON

curl "https://pt-edge.onrender.com/api/v1/datasets/quality?domain=llm-tools&subcategory=chemistry-llm-benchmarks&limit=20"

Open to everyone — 100 requests/day, no key needed. Get a free key for 1,000/day.
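The same request can be made from a script. The following is a minimal Python sketch that uses only the standard library and reuses the endpoint and query parameters from the curl command above. The function names `build_url` and `fetch_projects` are hypothetical helpers, and the response is returned as decoded JSON without assuming any particular schema. The curl command passes `limit=20` although 21 projects are listed; whether the API accepts a larger value is not stated here, so `limit` is left as a parameter.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the curl command above.
BASE_URL = "https://pt-edge.onrender.com/api/v1/datasets/quality"


def build_url(domain="llm-tools", subcategory="chemistry-llm-benchmarks", limit=20):
    """Build the request URL with the same query parameters as the curl command."""
    query = urlencode({"domain": domain, "subcategory": subcategory, "limit": limit})
    return f"{BASE_URL}?{query}"


def fetch_projects(url, timeout=30):
    """Fetch the URL and decode the body as JSON.

    The response schema is not documented here, so the decoded
    object is returned as-is rather than unpacked into fields.
    """
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_projects(build_url())` performs the network request and counts against the daily request allowance mentioned above.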

| # | Tool | Description | Score | Tier |
|---|------|-------------|-------|------|
| 1 | maxischuh/TwinBooster | Package for TwinBooster. Enables fast and powerful zero-shot molecular... | 47 | Emerging |
| 2 | theochem/ModelHamiltonian | Generate 1- and 2-electron integrals so that molecular quantum chemistry... | 47 | Emerging |
| 3 | lamalab-org/chembench | How good are LLMs at chemistry? | 43 | Emerging |
| 4 | pnnl/cactus | LLM Agent that leverages cheminformatics tools to provide informed responses. | 37 | Emerging |
| 5 | jan-janssen/LangSim | Application of Large Language Models (LLM) for computational materials... | 36 | Emerging |
| 6 | MasterAI-EAM/Darwin | An open-source project dedicated to build foundational large language model... | 34 | Emerging |
| 7 | andresilvapimentel/AI4Chem | AI4Chem is a code to test the ability of large language models (ChatGPT) to... | 33 | Emerging |
| 8 | lamalab-org/chemlift | Language-interfaced fine-tuning for chemistry | 31 | Emerging |
| 9 | lamalab-org/macbench | Probing the limitations of multimodal language models for chemistry and... | 29 | Experimental |
| 10 | jschrier/SynthGPT | Code and Data for "Large Language Models for Inorganic Synthesis Prediction" | 27 | Experimental |
| 11 | lamalab-org/chem-bench-app | Frontend for evaluating humans on chemistry questions | 26 | Experimental |
| 12 | google/task-oriented-queries | Task-oriented queries (e.g., one-shot queries to play videos, order food, or... | 25 | Experimental |
| 13 | chemkg/c3p | LLM-generated CHEBI classifiers | 23 | Experimental |
| 14 | ai4cat/AI4C-LitMiner | Developed for AI-driven catalyst discovery, integrating LLM-based knowledge... | 22 | Experimental |
| 15 | Eljefaso2949/QuantumChem-200K | 🧬 Discover and utilize QuantumChem-200K, a dataset of 200,000 organic... | 22 | Experimental |
| 16 | renjieli08/QuantumChem-200K | QuantumChem-200K: A Large-Scale Open Organic Molecular Dataset for... | 21 | Experimental |
| 17 | ChemFoundationModels/ChemLLMBench | Official Code for What can Large Language Models do in chemistry? A... | 19 | Experimental |
| 18 | jschrier/KRICT_hackathon_phosphors | KRICT ChemDX Hackathon project: Inorganic Phosphors | 12 | Experimental |
| 19 | ehrenhofer-group/LLM_Material_Property_Benchmark | A Python toolkit for evaluating Large Language Models (LLMs) in materials... | 11 | Experimental |
| 20 | drakedu/formalize | FORMALIZE is a lightweight framework that improves LLM-based program... | 11 | Experimental |
| 21 | apekshyasharma/AAII_Intelligence_Idex_Analysis | A data-driven benchmarking analysis of leading Artificial Intelligence... | 11 | Experimental |