ShuoZhang-code/FusionOcc

FusionOcc

FusionOcc: Multi-Modal Fusion for 3D Occupancy Prediction, MM 2024 [paper]

INTRODUCTION

FusionOcc is a multi-modal fusion network for 3D occupancy prediction that fuses features from LiDAR point clouds and surround-view images. The model fuses the features of these two modalities in both 2D and 3D space. A semi-supervised method generates a dense depth map, which is integrated with the BEV images through a cross-modal fusion module. In 3D space, features of the voxelized point clouds are aligned and merged with the BEV image features produced by a view transformer. FusionOcc establishes a new baseline for further research on multi-modal fusion for 3D occupancy prediction and achieves a new state of the art on the Occ3D-nuScenes dataset.
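To make the "voxelized point clouds" step more concrete, here is a minimal pure-Python sketch of mapping LiDAR points to voxel indices on a regular grid. It is not taken from the FusionOcc code: the function name `voxel_indices` is hypothetical, and the range and voxel size in the usage lines are the values commonly quoted for Occ3D-nuScenes, used here as assumptions rather than read from this repository's config.

```python
import math


def voxel_indices(points, pc_range, voxel_size):
    """Return the set of (i, j, k) voxel indices containing at least one point.

    points:     iterable of (x, y, z) coordinates in metres.
    pc_range:   (x_min, y_min, z_min, x_max, y_max, z_max).
    voxel_size: (dx, dy, dz) edge lengths of one voxel.
    """
    lower = tuple(pc_range[:3])
    upper = tuple(pc_range[3:])
    occupied = set()
    for point in points:
        # Ignore points outside the half-open range [lower, upper).
        if not all(lo <= c < hi for c, lo, hi in zip(point, lower, upper)):
            continue
        occupied.add(
            tuple(
                int(math.floor((c - lo) / size))
                for c, lo, size in zip(point, lower, voxel_size)
            )
        )
    return occupied


if __name__ == "__main__":
    # Assumed grid: 80 m x 80 m x 6.4 m at 0.4 m resolution.
    pc_range = (-40.0, -40.0, -1.0, 40.0, 40.0, 5.4)
    voxel_size = (0.4, 0.4, 0.4)
    points = [(0.1, 0.1, 0.0), (0.2, 0.3, 0.1), (100.0, 0.0, 0.0)]
    print(sorted(voxel_indices(points, pc_range, voxel_size)))
```

In a real pipeline each occupied voxel would carry learned features rather than a single occupancy flag; the sketch only shows the indexing that lets point-cloud features line up with a grid of image-derived features.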

[Figure: FusionOcc pipeline]

Getting Started

# main prerequisites 
Python = 3.8
nuscenes-devkit = 1.1.11
PyTorch = 1.10.0
torch-scatter = 2.0.9
opencv-python = 4.9.0
Pillow = 10.0.1
mmcv-full = 1.5.3
mmdetection = 2.25.1
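
The prerequisites above can be checked against the current environment with a short script. This is a sketch, not part of the repository: the PyPI distribution names (for example `torch` for PyTorch and `mmdet` for mmdetection) are assumptions, and the prefix comparison is a deliberate simplification so that local build suffixes such as `1.10.0+cu113` still match.

```python
from importlib import metadata

# Versions come from the prerequisites list; the distribution names on the
# left are assumptions about how each package is published on PyPI.
REQUIRED = {
    "nuscenes-devkit": "1.1.11",
    "torch": "1.10.0",
    "torch-scatter": "2.0.9",
    "opencv-python": "4.9.0",
    "Pillow": "10.0.1",
    "mmcv-full": "1.5.3",
    "mmdet": "2.25.1",
}


def installed_version(name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def check_versions(required=REQUIRED, lookup=installed_version):
    """Return {name: found_version_or_None} for every unmet requirement."""
    problems = {}
    for name, wanted in required.items():
        found = lookup(name)
        # Prefix match tolerates suffixes like "+cu113" or a fourth component.
        if found is None or not found.startswith(wanted):
            problems[name] = found
    return problems


if __name__ == "__main__":
    for name, found in check_versions().items():
        print(f"{name}: need {REQUIRED[name]}, found {found or 'not installed'}")
```

The script prints nothing when every requirement is met. The Python version itself is not checked here.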

Expected data layout:

FusionOcc
├── data
│   ├── nuscenes
│   │   ├── maps
│   │   ├── samples
│   │   ├── sweeps
│   │   ├── lidarseg
│   │   ├── imgseg
│   │   ├── gts
│   │   ├── v1.0-trainval
│   │   ├── fusionocc-nuscenes_infos_train.pkl
│   │   ├── fusionocc-nuscenes_infos_val.pkl
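
Before launching a job it can help to confirm that the data directory matches the tree above. The following is a small sketch under the assumption that the script runs from the repository root, so that `data/nuscenes` resolves correctly; the function name `missing_entries` is hypothetical.

```python
from pathlib import Path

# Entries copied from the directory tree in this README.
EXPECTED = [
    "maps",
    "samples",
    "sweeps",
    "lidarseg",
    "imgseg",
    "gts",
    "v1.0-trainval",
    "fusionocc-nuscenes_infos_train.pkl",
    "fusionocc-nuscenes_infos_val.pkl",
]


def missing_entries(root="data/nuscenes", expected=EXPECTED):
    """Return the expected files or folders that do not exist under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries()
    if missing:
        print("Missing under data/nuscenes:", ", ".join(missing))
    else:
        print("Data layout looks complete.")
```

The check only tests for existence; it does not validate the contents of any folder or of the `.pkl` info files.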

Model Zoo

| Backbone | Config | Mask | Pretrain | mIoU | Checkpoints |
|---|---|---|---|---|---|
| Swin-Base | Base | ✖️ | ImageNet, nuImages | 56.62 | BaseWoMask |

Evaluation

To evaluate our pretrained models, first download the checkpoints listed above.

The config file is fusion_occ.py.

Run:

./tools/dist_test.sh $config $checkpoint $num_gpu
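
For readers unfamiliar with the mIoU column in the Model Zoo, the sketch below shows one common way to compute mean intersection-over-union over voxel labels, with an optional visibility mask. It is an illustration only and not the repository's evaluation code: the function name `mean_iou`, the flat label layout, and the choice to skip classes absent from both prediction and ground truth are all assumptions.

```python
def mean_iou(pred, target, num_classes, mask=None):
    """Mean IoU over classes for flat sequences of integer voxel labels.

    mask: optional sequence of booleans; voxels marked False are ignored.
    Classes that appear in neither pred nor target are left out of the mean.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for i, (p, t) in enumerate(zip(pred, target)):
        if mask is not None and not mask[i]:
            continue
        if p == t:
            intersection[p] += 1
            union[p] += 1
        else:
            # A mismatched voxel enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [inter / uni for inter, uni in zip(intersection, union) if uni > 0]
    return sum(ious) / len(ious) if ious else float("nan")


if __name__ == "__main__":
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    print(mean_iou(pred, target, num_classes=2))
    print(mean_iou(pred, target, num_classes=2, mask=[True, False, True, True]))
```

Benchmark protocols often add further conventions, such as excluding a free-space class from the mean, so numbers from this sketch should not be compared directly with the table.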

Training

Set the "load_from" path at the end of the config file to the pre-trained weights, then run:

./tools/dist_train.sh $config $num_gpu

To obtain the version trained without the mask, set the use_mask field in the config file to False and train for several epochs.
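
The two config changes described in this section might look like the excerpt below. This is a hypothetical fragment in the usual mmdetection config style, not a copy of fusion_occ.py: only the names `load_from` and `use_mask` come from this README, while the checkpoint path and the placement of `use_mask` inside the `model` dict are assumptions.

```python
# Hypothetical excerpt of an mmdetection-style Python config file.
model = dict(
    # Assumed location of the field; check where use_mask actually
    # appears in fusion_occ.py before editing.
    use_mask=False,
)

# At the end of the config file: path to the pre-trained weights.
# The path below is a placeholder, not a file shipped with the repository.
load_from = "ckpts/pretrained.pth"
```

All other fields of the real config stay unchanged.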

Alternatively, obtain pre-trained weights from BEVDet to train the model from the very beginning.

Acknowledgement

Many thanks to these excellent open-source projects, on which our code is based:

Some other related projects for Occ3D prediction:

BibTeX

If this work is helpful for your research, please consider citing the following paper:

@inproceedings{
    zhang2024fusionocc,
    title={FusionOcc: Multi-Modal Fusion for 3D Occupancy Prediction},
    author={Shuo Zhang and Yupeng Zhai and Jilin Mei and Yu Hu},
    booktitle={ACM Multimedia 2024},
    year={2024},
    url={https://openreview.net/forum?id=xX66hwZJWa}
}
