peteanderson80/Matterport3DSimulator

AI Research Platform for Reinforcement Learning from Real Panoramic Images.

51 / 100

Established

Renders agents within 90 real indoor environments from densely sampled 360° RGB-D panoramas, supporting both GPU (EGL) and CPU (OSMesa) off-screen rendering at ~1000 fps. Provides C++ and Python APIs with batched agent support, customizable camera parameters, and includes the Room-to-Room (R2R) navigation dataset for vision-and-language grounding tasks. Built on Matterport3D's real depth data rather than synthetic imagery, enabling research in embodied AI where agents follow natural language instructions through previously unseen buildings.
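As a rough illustration of the customizable camera parameters and batched agent support described above, the following is a minimal sketch rather than the project's documented setup. `make_simulator` and `horizontal_fov` are hypothetical helper names. The `MatterSim` method names are given as recalled from the project's README and should be checked against the version you build. The field-of-view conversion assumes an ideal pinhole camera.

```python
import math


def horizontal_fov(vfov_rad, width, height):
    """Horizontal field of view (radians) of an ideal pinhole camera,
    derived from its vertical field of view and image size in pixels."""
    return 2.0 * math.atan(math.tan(vfov_rad / 2.0) * width / height)


def make_simulator(width=640, height=480, vfov_deg=60.0, batch_size=1):
    """Hypothetical helper: configure and start a batched simulator.

    Assumes the compiled MatterSim Python module is importable and that
    the method names below match the version you built.
    """
    import MatterSim  # imported lazily; only present after building the simulator

    sim = MatterSim.Simulator()
    sim.setCameraResolution(width, height)     # image size in pixels
    sim.setCameraVFOV(math.radians(vfov_deg))  # vertical FOV, in radians
    sim.setBatchSize(batch_size)               # number of agents stepped together
    sim.initialize()
    return sim


if __name__ == "__main__":
    # The pure-math helper runs without the simulator installed.
    hfov = horizontal_fov(math.radians(60.0), 640, 480)
    print(f"horizontal FOV: {math.degrees(hfov):.1f} degrees")
```

The simulator import sits inside `make_simulator`, so the field-of-view helper can be used on machines where the simulator has not been built.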

683 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents

Maintenance 0 / 25

Adoption 10 / 25

Maturity 16 / 25

Community 25 / 25

Stars: 683

Forks: 138

Language: C++

License: not listed

Related tools

daveredrum/ScanRefer

[ECCV 2020] ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language

cambridgeltl/visual-spatial-reasoning

[TACL'23] VSR: A probing benchmark for spatial understanding of vision-language models.

TheShadow29/vognet-pytorch

[CVPR20] Video Object Grounding using Semantic Roles in Language Description...

jianghaojun/Awesome-3D-Vision-and-Language

A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Question Answering and 3D...

clairecyq/whos-waldo

Who's Waldo? Linking People Across Text and Images. ICCV 2021.
