facebookresearch/mmf
A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
Built on PyTorch with distributed training support, MMF provides reference implementations of state-of-the-art vision-language models and ships pre-built datasets for the VQA, TextVQA, TextCaps, and Hateful Memes challenges. The framework emphasizes modularity and scalability, letting researchers bootstrap multimodal projects from composable model and dataset components. It serves as both a research platform and the official starter codebase for several vision-language benchmarks.
5,622 stars. Actively maintained with 2 commits in the last 30 days.
Stars: 5,622
Forks: 944
Language: Python
License: —
Category:
Last pushed: Jan 12, 2026
Commits (30d): 2
Get this data via API
curl "https://pt-edge.onrender.com/api/v1/quality/ml-frameworks/facebookresearch/mmf"
Open to everyone: 100 requests/day with no key required. A free key raises the limit to 1,000/day.
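For programmatic access, the curl call above can be reproduced with the Python standard library. Only the endpoint URL comes from this page; the helper names and the assumption that the response is JSON are illustrative.

```python
import json
import urllib.request

# Base endpoint taken from the curl example on this page.
API_BASE = "https://pt-edge.onrender.com/api/v1/quality"


def build_url(category: str, owner: str, repo: str) -> str:
    """Build the per-repository quality endpoint URL."""
    return f"{API_BASE}/{category}/{owner}/{repo}"


def fetch_quality(category: str, owner: str, repo: str) -> dict:
    """Fetch the repo's quality data; assumes a JSON response body.

    Unauthenticated requests are limited to 100/day.
    """
    with urllib.request.urlopen(build_url(category, owner, repo)) as resp:
        return json.load(resp)


# The same repo shown on this card:
url = build_url("ml-frameworks", "facebookresearch", "mmf")
```

Calling `fetch_quality("ml-frameworks", "facebookresearch", "mmf")` performs the same request as the curl command, returning the stats shown above as a dictionary.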
Related frameworks
open-mmlab/mmpretrain
OpenMMLab Pre-training Toolbox and Benchmark
adambielski/siamese-triplet
Siamese and triplet networks with online pair/triplet mining in PyTorch
pliang279/awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
friedrichor/Awesome-Multimodal-Papers
A curated list of awesome Multimodal studies.
mlfoundations/open_flamingo
An open-source framework for training large multimodal models.