MSVoxelDNN_yehua
Keywords: point cloud compression, context model, deep generative models, G-PCC; based on VoxelDNN.
MSVoxelDNN is a lossless point cloud geometry compression method that builds on and optimizes VoxelDNN.
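In both VoxelDNN and MSVoxelDNN, a deep generative context model predicts each voxel's occupancy probability, and an arithmetic coder turns those probabilities into a bitstream; the ideal code length for a symbol predicted with probability p is -log2(p) bits. A minimal sketch of this relationship (not the repo's code; function name hypothetical):

```python
import math

def ideal_code_length(probs, occupancies):
    """Ideal total bits for coding binary `occupancies` with a context
    model that assigns probability `p` to occupancy == 1."""
    bits = 0.0
    for p, v in zip(probs, occupancies):
        p_actual = p if v == 1 else 1.0 - p  # probability of the actual symbol
        bits += -math.log2(p_actual)
    return bits

# A sharper model (probabilities closer to the true occupancies) needs fewer bits:
sharp = ideal_code_length([0.9, 0.1, 0.95], [1, 0, 1])
flat = ideal_code_length([0.5, 0.5, 0.5], [1, 0, 1])
```

This is why a better context model directly lowers bpov: the arithmetic coder's output length approaches the model's cross-entropy.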
our contributions
1. Ported the code from PyTorch to TensorLayer.
2. Since the authors do not provide pretrained models, we trained MSVoxelDNN's 24 models and VoxelDNN's 2 models ourselves.
3. Evaluated both the PyTorch and TensorLayer versions on the test sets.
4. Benchmarked and ran an ablation study on downsampling depths 3 and 1.
file structure
root
└── MSVoxelDNN-master: TensorLayer code
└── MSVoxelDNN: PyTorch source code
└── MSVoxelDNN-master/Model: model files for PyTorch and TensorLayer
└── MSVoxelDNN explanation.docx: introduction to the paper, migration notes, performance, and instructions
└── Multiscale deep context modeling for lossless point cloud geometry compression.pdf: original paper
└── datasets: see dataset.zip in the dataset part
└── MSVoxelDNN performance.xlsx: detailed performance list
environment
refer to MSVoxelDNN-master/readme.md and MSVoxelDNN explanation.docx
command
1. PyTorch
VoxelDNN training:
python3 -m training.voxel_dnn_training_torch -usecuda 1 -dataset /userhome/VoxelDNN/datasets/ModelNet40_200_pc512_oct3/ -dataset /userhome/VoxelDNN/datasets/MVUB/10bitdepth_2_oct4/ -dataset /userhome/VoxelDNN/datasets/8iVFBv2/10bitdepth_2_oct4/ -dataset /userhome/VoxelDNN/datasets/CAT1/10bitdepth_2_oct4/ -outputmodel Model/VoxelDNN32Torch/ -epoch 20 --scratch=1 -batch 8 -tf 0 -lr 3 -nfilters 64 -blocksize 32
MSVoxelDNN training:
bash training.sh
encode:
python3 -m ms_voxel_dnn_coder.ms_voxel_dnn_encoder -level 10 -depth 1 -ply 28_airplane_0270.ply -output bitstream_output -signaling msvxdnn -model Model/MSVoxelCNN/ -model8 Model/VoxelDNN32Torch/BL32_tf0/
2. TensorLayer
VoxelDNN training:
python -m training.voxel_dnn_training_tensorlayer -inputmodel /userhome/MSVoxelDNN/MSVoxelDNN-master/Model/voxeldnn32_tl_test/model_1.npz -dataset /userhome/VoxelDNN/datasets/8iVFBv2/10bitdepth_2_oct4/ -dataset /userhome/VoxelDNN/datasets/CAT1/10bitdepth_2_oct4/ -dataset /userhome/VoxelDNN/datasets/MVUB/10bitdepth_2_oct4/ -dataset /userhome/VoxelDNN/datasets/ModelNet40_200_pc512_oct3/
encode:
python3 -m ms_voxel_dnn_coder.ms_voxel_dnn_encoder_tensorlayer
ablation study
As Table II of the paper shows, the bpov of VoxelDNN is smaller than that of MSVoxelDNN. To improve performance, we therefore reduce the downsampling depth to 1 and rely more on VoxelDNN coding. We ran an ablation test on the PyTorch version comparing encoding time and bpov between downsampling depths 1 and 3.
The results show that, compared with depth=3, depth=1 achieves both a shorter encoding time (avg: 2568.8 s vs 6777.1 s) and a smaller bpov (avg: 3.039 vs 4.037).
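For reference, bpov (bits per occupied voxel) is simply the compressed size in bits divided by the number of occupied voxels in the point cloud. A minimal sketch (function name hypothetical, not the repo's code):

```python
def bpov(bitstream_bytes, num_occupied_voxels):
    """Bits per occupied voxel: compressed size in bits / occupied-voxel count."""
    return bitstream_bytes * 8 / num_occupied_voxels

# e.g. a 1000-byte bitstream for a cloud of 4000 occupied voxels
rate = bpov(1000, 4000)  # -> 2.0 bpov
```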
performance
- We test on the test sets to compare the TensorLayer (TL) and PyTorch (PT) versions. From the results below, the bpov of TL and PT are close, but PT's encoding time is usually shorter than TL's, possibly because 3D convolution runs faster in PyTorch.
| Encoded file | TL_EncTime (s) | TL_bpov | PT_EncTime (s) | PT_bpov |
| --- | --- | --- | --- | --- |
| sarah_vox10_0023.ply | 625.301 | 0.871 | 646.468 | 0.871 |
| sarah_vox9_0023.ply | 156.309 | 0.854 | 173.102 | 0.855 |
| phil_vox9_0139.ply | 176.158 | 0.923 | 161.953 | 0.923 |
| phil_vox10_0139.ply | 728.739 | 0.935 | 710.697 | 0.935 |
| redandblack_vox10_1550.ply | 411.620 | 0.784 | 369.163 | 0.786 |
| queen_vox10_0200.ply | 460.982 | 0.714 | 359.852 | 0.713 |
| longdress_vox10_1300.ply | 466.588 | 0.701 | 382.268 | 0.703 |
| basketball_player_vox11_00000200.ply | 1676.731 | 0.607 | 1385.684 | 0.609 |
| loot_vox10_1200.ply | 456.226 | 0.674 | 389.951 | 0.676 |
| dancer_vox11_00000001.ply | 1479.507 | 0.605 | 1188.119 | 0.607 |
| soldier_vox10_0690.ply | 589.835 | 0.703 | 472.866 | 0.704 |
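The per-file numbers above can be summarized with a short script (data copied verbatim from the table) showing that the TL/PT bpov gap never exceeds 0.002 while PyTorch encodes faster on average:

```python
# (TL_EncTime, TL_bpov, PT_EncTime, PT_bpov) per test file, from the table above
results = {
    "sarah_vox10_0023.ply":                 (625.301, 0.871, 646.468, 0.871),
    "sarah_vox9_0023.ply":                  (156.309, 0.854, 173.102, 0.855),
    "phil_vox9_0139.ply":                   (176.158, 0.923, 161.953, 0.923),
    "phil_vox10_0139.ply":                  (728.739, 0.935, 710.697, 0.935),
    "redandblack_vox10_1550.ply":           (411.620, 0.784, 369.163, 0.786),
    "queen_vox10_0200.ply":                 (460.982, 0.714, 359.852, 0.713),
    "longdress_vox10_1300.ply":             (466.588, 0.701, 382.268, 0.703),
    "basketball_player_vox11_00000200.ply": (1676.731, 0.607, 1385.684, 0.609),
    "loot_vox10_1200.ply":                  (456.226, 0.674, 389.951, 0.676),
    "dancer_vox11_00000001.ply":            (1479.507, 0.605, 1188.119, 0.607),
    "soldier_vox10_0690.ply":               (589.835, 0.703, 472.866, 0.704),
}

tl_time = sum(r[0] for r in results.values()) / len(results)   # average TL encoding time
pt_time = sum(r[2] for r in results.values()) / len(results)   # average PT encoding time
max_bpov_gap = max(abs(r[1] - r[3]) for r in results.values())  # largest TL-vs-PT bpov difference
print(f"avg TL time: {tl_time:.1f}s, avg PT time: {pt_time:.1f}s, max bpov gap: {max_bpov_gap:.3f}")
```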
Citation:
@article{nguyen2021multiscale,
title={Multiscale deep context modeling for lossless point cloud geometry compression},
author={Nguyen, Dat Thanh and Quach, Maurice and Valenzise, Giuseppe and Duhamel, Pierre},
journal={arXiv preprint arXiv:2104.09859},
year={2021}
}
contributors
name: Ye Hua
email: yeh@pcl.ac.cn