fundamentalvision/BEVFormer

[ECCV 2022] This is the official implementation of BEVFormer, a camera-only framework for autonomous driving perception, e.g., 3D object detection and semantic map segmentation.

49 / 100 (Emerging)

BEVFormer uses a spatiotemporal transformer: spatial cross-attention aggregates features across the multi-camera views, and temporal self-attention recurrently fuses the previous frame's BEV features, yielding a unified bird's-eye-view (BEV) representation for both 3D detection and map segmentation. It is built on the MMDetection3D framework and ships configuration variants from tiny to base that trade memory for performance, ranging from 6.5GB to 28.5GB. BEVFormer reports 56.9% NDS on nuScenes, 9 points above prior camera-only methods, and the repository also includes BEVFormerV2 with improved backbone integration.
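The two attention stages described above can be sketched in a few lines of plain Python. This is only an illustration under loud assumptions: it uses dense scaled dot-product attention rather than the attention variant the repository implements, it skips camera projection, learned weights, normalization, and feed-forward layers, and the names `attend` and `bev_step` are hypothetical, not part of the BEVFormer code base.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Dense scaled dot-product attention over lists of vectors (a simplification)."""
    dim = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, values))
                    for i in range(len(values[0]))])
    return out


def bev_step(bev_queries, prev_bev, camera_feats):
    """One hypothetical encoder step: temporal fusion, then multi-camera fusion."""
    # Temporal self-attention: queries attend to the previous BEV (if any) and themselves.
    history = bev_queries if prev_bev is None else prev_bev + bev_queries
    temporal = attend(bev_queries, history, history)
    fused = [[a + b for a, b in zip(q, t)] for q, t in zip(bev_queries, temporal)]
    # Spatial cross-attention: every BEV query attends to features from all cameras.
    flat = [feat for cam in camera_feats for feat in cam]
    spatial = attend(fused, flat, flat)
    return [[a + b for a, b in zip(f, s)] for f, s in zip(fused, spatial)]
```

Calling `bev_step(queries, None, cams)` for the first frame and passing its result as `prev_bev` for the next frame mirrors the recurrent fusion: each frame's BEV output becomes the history for the following one.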

4,356 stars. No commits in the last 6 months.

Stale 6m, No Package, No Dependents
Maintenance 0 / 25
Adoption 10 / 25
Maturity 16 / 25
Community 23 / 25
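The four sub-scores above add up to the headline figure. The short check below assumes the overall score is a plain sum of the four categories, each capped at 25; that matches the numbers shown here but is not confirmed as the site's actual formula.

```python
# Sub-scores as listed on this page; each category is out of 25.
SUBSCORES = {"Maintenance": 0, "Adoption": 10, "Maturity": 16, "Community": 23}
MAX_PER_CATEGORY = 25


def overall_score(subscores):
    """Assumed aggregation: a plain sum of the category scores."""
    return sum(subscores.values())


total = overall_score(SUBSCORES)
maximum = MAX_PER_CATEGORY * len(SUBSCORES)
print(f"{total} / {maximum}")  # prints "49 / 100"
```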


Stars: 4,356
Forks: 709
Language: Python
License: Apache-2.0
Last pushed: Aug 15, 2024
Commits (30d): 0

Get this data via API

curl "https://pt-edge.onrender.com/api/v1/quality/computer-vision/fundamentalvision/BEVFormer"

Open to everyone: 100 requests/day with no key needed. A free key raises the limit to 1,000 requests/day.
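The same request can be made from Python using only the standard library. This is a minimal sketch with two assumptions: that the path pattern (category, owner, repository) generalizes beyond the single URL shown above, and that the endpoint returns JSON. The helper names `quality_url` and `fetch_quality` are hypothetical, and the response fields are not documented here, so the sketch returns the parsed payload as-is.

```python
import json
import urllib.request

BASE_URL = "https://pt-edge.onrender.com/api/v1/quality"


def quality_url(category, owner, repo):
    """Build the endpoint URL; the path layout is inferred from the curl example."""
    return f"{BASE_URL}/{category}/{owner}/{repo}"


def fetch_quality(category, owner, repo, timeout=10.0):
    """Fetch the quality record, assuming the response body is JSON."""
    url = quality_url(category, owner, repo)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call such as `fetch_quality("computer-vision", "fundamentalvision", "BEVFormer")` issues the same GET request as the curl command; with the keyless limit of 100 requests/day, caching the result locally is a sensible default.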