DVC
DVC: An End-to-end Deep Video Compression Framework
Keywords: end-to-end, video compression, deep learning
This paper proposes the first end-to-end deep video compression model that jointly optimizes all components of the video compression pipeline. It was published at CVPR 2019; the full reference is given in the Citation section.
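The joint optimization is driven by a rate-distortion objective of the form lambda * D + R. The snippet below is a minimal sketch of that trade-off, assuming distortion is measured as mean squared error and rate in bits per pixel; the function name `rd_loss` is hypothetical and the exact loss terms in this repository's training scripts may differ.

```python
def rd_loss(lmbda: float, distortion_mse: float, rate_bpp: float) -> float:
    """Rate-distortion objective: lambda * D + R.

    lmbda          -- trade-off weight (256, 512, 1024 or 2048 in this repo)
    distortion_mse -- assumed distortion term (mean squared error)
    rate_bpp       -- assumed rate term (bits per pixel)
    """
    return lmbda * distortion_mse + rate_bpp


# Same reconstruction, two different trade-off weights.
low = rd_loss(256, 0.001, 0.1)
high = rd_loss(2048, 0.001, 0.1)
```

A larger lambda penalizes distortion more heavily, so models trained with it tend to spend more bits for higher quality.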
Our Contributions
- Port the original PyTorch implementation to MindSpore.
- Provide pre-trained models for four rate points (lambda = 256, 512, 1024, 2048).
- Evaluate the MindSpore version on the UVG dataset.
File Structure
source
├── augmentation.py
├── dataset.py
├── drawuvg.py
├── flow_pretrain_np
├── get_trainfiles.py
├── main.py
├── net.py
├── run_1024.py #run training for lambda 1024
├── run_2048.py
├── run_256.py
├── run_512.py
├── run_test.py #run tests for all lambdas on UVG dataset
├── snapshot
│ ├── best_1024.ckpt #pretrained model for lambda 1024
│ ├── best_2048.ckpt
│ ├── best_256.ckpt
│ ├── best_512.ckpt
│ ├── train_1024.log # train log for lambda 1024
│ ├── train_2048.log
│ ├── train_256.log
│ └── train_512.log
├── subnet
│ ├── GDN.py
│ ├── __init__.py
│ ├── analysis.py
│ ├── analysis_mv.py
│ ├── analysis_prior.py
│ ├── basics.py
│ ├── bitEstimator.py
│ ├── endecoder.py
│ ├── flowlib.py
│ ├── ms_ssim_mindspore.py
│ ├── synthesis.py
│ ├── synthesis_mv.py
│ └── synthesis_prior.py
├── test-yh.py
├── test.py
├── train_1024.py #training for lambda 1024
├── train_2048.py
├── train_256.py
└── train_512.py
Environment
- mindspore-dev==2.0.0.dev20230109
- It is recommended to install through the link.
- cuda 11.1
- python 3.7
- pytorch_msssim
Command
Train
- Download the Vimeo-90K dataset to {vimeo90k_dir}.
python train_256.py -d {vimeo90k_dir}/vimeo_septuplet -l train.log --epochs 100 #for lambda 256
or
python run_256.py #for continuous training that avoids excessive memory growth
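One common way to keep memory from growing over a long run is to split training into several short-lived processes, so that each process releases its memory on exit. The sketch below illustrates that idea only; it is not the content of `run_256.py`. The helpers `build_train_command` and `run_in_chunks` are hypothetical, and the sketch assumes the training script resumes from its latest checkpoint, which the source does not confirm.

```python
import subprocess
import sys


def build_train_command(vimeo_dir, epochs, script="train_256.py", log_file="train.log"):
    """Assemble the training command using the flags shown above."""
    return [
        sys.executable, script,
        "-d", f"{vimeo_dir}/vimeo_septuplet",
        "-l", log_file,
        "--epochs", str(epochs),
    ]


def run_in_chunks(vimeo_dir, total_epochs=100, epochs_per_run=10, runner=subprocess.run):
    """Run training as a series of separate processes (hypothetical helper).

    Assumption: the training script picks up from its latest checkpoint,
    so consecutive runs continue the same training job.
    """
    done = 0
    while done < total_epochs:
        step = min(epochs_per_run, total_epochs - done)
        # check=True stops the loop if a run exits with an error.
        runner(build_train_command(vimeo_dir, step), check=True)
        done += step
    return done
```

Calling `run_in_chunks("{vimeo90k_dir}")` with the real dataset path would launch ten runs of ten epochs each under these assumptions.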
Test
- Download and prepare the UVG dataset, encoding it with H.265 to produce the reference images.
python test-yh.py -C DVC_mindspore_1024_YachtRide.csv -L 1024 -F YachtRide -d /userhome/DVC/PyTorch/data/UVG/images/ --ckpt_path /userhome/DVC/MindSpore/snapshot/best_1024.ckpt
or
python run_test.py #to test all lambdas on the full UVG dataset
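To evaluate every lambda on every sequence, the single-sequence command above can be generated in a loop. The sketch below only builds the argument lists and does not reproduce `run_test.py`. The helper names are hypothetical, and it assumes the seven 1080p UVG sequences commonly used for video compression benchmarks and a `snapshot/best_{lambda}.ckpt` naming scheme as in the file structure.

```python
# Assumed sequence list: the seven UVG sequences commonly used for evaluation.
UVG_SEQUENCES = [
    "Beauty", "Bosphorus", "HoneyBee", "Jockey",
    "ReadySteadyGo", "ShakeNDry", "YachtRide",
]
LAMBDAS = [256, 512, 1024, 2048]


def build_test_command(lmbda, sequence, image_dir, snapshot_dir="snapshot"):
    """Build one test-yh.py invocation with matching lambda and checkpoint."""
    return [
        "python", "test-yh.py",
        "-C", f"DVC_mindspore_{lmbda}_{sequence}.csv",
        "-L", str(lmbda),
        "-F", sequence,
        "-d", image_dir,
        "--ckpt_path", f"{snapshot_dir}/best_{lmbda}.ckpt",
    ]


def all_test_commands(image_dir, snapshot_dir="snapshot"):
    """Return one command per (lambda, sequence) pair."""
    return [
        build_test_command(lmbda, seq, image_dir, snapshot_dir)
        for lmbda in LAMBDAS
        for seq in UVG_SEQUENCES
    ]
```

Each returned list can be passed to `subprocess.run`. Deriving the CSV name and the checkpoint path from the same lambda keeps the two from drifting apart.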
Quality measurements on UVG
MindSpore version, models trained on subset of vimeo90k
| bpp | PSNR | MSSSIM | run_time | GPU memory (MiB) | lambda |
|---|---|---|---|---|---|
| 0.059 | 33.617 | 0.92 | 3277.053 | 8098 | 256 |
| 0.092 | 34.753 | 0.936 | 3464.094 | 8098 | 512 |
| 0.158 | 36.35 | 0.946 | 5020.429 | 8098 | 1024 |
| 0.266 | 37.272 | 0.956 | 5607.829 | 8098 | 2048 |
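For readers unfamiliar with the bpp and PSNR columns, the sketch below shows the standard definitions of both metrics. It assumes 8-bit frames (peak value 255) and is not taken from this repository's test scripts; the function names are hypothetical.

```python
import math


def psnr_from_mse(mse, max_value=255.0):
    """Peak signal-to-noise ratio in dB for a given mean squared error."""
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


def bits_per_pixel(total_bits, width, height, num_frames=1):
    """Average number of coded bits per pixel over num_frames frames."""
    return total_bits / (width * height * num_frames)
```

Under these definitions, a higher lambda in the table corresponds to both a higher bpp and a higher PSNR, which is the expected rate-distortion behaviour.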
Citation
@inproceedings{lu2019dvc,
title={Dvc: An end-to-end deep video compression framework},
author={Lu, Guo and Ouyang, Wanli and Xu, Dong and Zhang, Xiaoyun and Cai, Chunlei and Gao, Zhiyong},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={11006--11015},
year={2019}
}
Contributors
name: Ye Hua, Zhang Yongchi
email: yeh@pcl.ac.cn