Cheng2020Attention
Learned Image Compression with Discretized Gaussian Mixture Likelihoods and Attention Modules
Keywords: Image Compression, Gaussian Mixture Model
This paper generalizes the hyperprior from lossy to lossless compression and introduces an L2-norm term into the loss function to speed up training. It also investigates different parameterized distribution models for the latent codes and proposes Gaussian mixture likelihoods to achieve adaptive and flexible context models. The paper was published in 2020; readers can find the original paper via the link.
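As a sketch of how a discretized Gaussian mixture serves as a likelihood model for quantized latents (illustrative code only, not the repository's implementation; all names here are hypothetical):

```python
import math

def gaussian_cdf(x, mean, scale):
    """Standard-normal CDF evaluated at the normalized point (x - mean) / scale."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))

def gmm_likelihood(y_hat, weights, means, scales):
    """Probability mass of the quantized latent y_hat under a discretized
    K-component Gaussian mixture: sum_k w_k * (CDF(y+0.5) - CDF(y-0.5)),
    i.e. each mixture component integrated over the quantization bin."""
    return sum(
        w * (gaussian_cdf(y_hat + 0.5, m, s) - gaussian_cdf(y_hat - 0.5, m, s))
        for w, m, s in zip(weights, means, scales)
    )

# Example: a 3-component mixture; the entropy coder spends -log2(p) bits on y_hat
p = gmm_likelihood(0.0, weights=[0.5, 0.3, 0.2], means=[0.0, 1.0, -2.0], scales=[1.0, 0.5, 2.0])
bits = -math.log2(p)
```

Because the per-bin masses of a mixture with weights summing to one telescope over the CDF, the probabilities over all integer bins sum to one, which is what makes this usable as an entropy model.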
Our Contributions
- Translated the model from PyTorch to MindSpore
- Verified that the forward pass, backward pass, and parameter updates run successfully
- Tested the MindSpore version
File Structure
```
source
├── __pycache__
├── decoded files        # decoded images
├── model
│   ├── best.ckpt        # model file with the best performance
│   └── train_log.txt    # training log
├── model_baseline.py    # model definition (baseline)
├── test.py              # evaluation script
└── train.py             # training script
```
Environment
- mindspore-dev==2.0.0dev20230116
- It is recommended to install through the link.
- cuda 11.1
- python 3.7
Command
Please install compressai via

```shell
pip install -e .
```

under the directory CompressAI_MindSpore.
Train
Go to the directory source and train the model via:

```shell
python train.py -d "your/own/dataset/address" --seed 0 --batch-size 16 --test-batch-size 1 --save --lambda 0.01 -e 10
```
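The --lambda flag sets the rate-distortion trade-off. In the CompressAI convention, the training objective weights distortion by λ·255² and adds the rate in bits per pixel; a minimal sketch (illustrative, the exact loss in train.py may differ):

```python
def rate_distortion_loss(bpp, mse, lmbda=0.01):
    """CompressAI-style rate-distortion objective for images scaled to [0, 1]:
    lambda * 255^2 * MSE (distortion) + bits per pixel (rate).
    A larger lambda spends more bits for higher reconstruction quality."""
    return lmbda * 255 ** 2 * mse + bpp

# Example: a hypothetical operating point with 0.5 bpp and MSE of 1e-3
loss = rate_distortion_loss(bpp=0.5, mse=1e-3, lmbda=0.01)
```

Training with several lambda values (0.01, 0.03, 0.05, 0.08, as in the tables below) produces models at different points on the rate-distortion curve.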
Test

Evaluate the model via:

```shell
python test.py -d "your/own/dataset/address" --seed 0 --test-batch-size 1 --lambda 0.01 --pretrained_file "model/best.ckpt"
```
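The evaluation reports bpp and quality metrics such as PSNR. As a reminder of how PSNR relates to reconstruction error, a standalone sketch (helper names are illustrative):

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(mse_value, max_val=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    return 10.0 * math.log10(max_val ** 2 / mse_value)

# Example: two tiny 8-bit "images" differing slightly in three pixels
original = [52, 55, 61, 59]
decoded  = [51, 56, 61, 58]
quality_db = psnr(mse(original, decoded), max_val=255.0)
```

Higher PSNR means a closer reconstruction; the tables below report it together with MS-SSIM, a perceptual similarity metric.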
Comparison
Reconstructed image

MindSpore version: comparison of the original images with the reconstructed ones (original above, reconstructed below).
Quality measurements on Kodak
MindSpore version, models trained on DIV2K:

| bpp   | PSNR (dB) | MS-SSIM | run_time | GPU Memory (MiB) | lambda |
|-------|-----------|---------|----------|------------------|--------|
| 0.339 | 26.436    | 0.931   | 39.207   | 4002             | 0.01   |
| 0.555 | 27.445    | 0.949   | 19.941   | 4002             | 0.03   |
| 0.779 | 28.761    | 0.965   | 33.257   | 4002             | 0.05   |
| 0.986 | 28.911    | 0.967   | 34.209   | 4002             | 0.08   |
PyTorch version, models pretrained from the official release:

| bpp   | PSNR (dB) | MS-SSIM | enc_time | dec_time | GPU Memory (MiB) | quality |
|-------|-----------|---------|----------|----------|------------------|---------|
| 0.092 | 25.584    | 0.937   | 103.44   | 232.992  | 1464             | qp1     |
| 0.215 | 27.705    | 0.972   | 105.96   | 236.736  | 1464             | qp3     |
| 0.32  | 29.002    | 0.982   | 86.16    | 212.16   | 1620             | qp4     |
| 0.596 | 31.44     | 0.991   | 87.072   | 216.84   | 1620             | qp6     |
Citation
```bibtex
@misc{cheng2020learned,
      title={Learned Lossless Image Compression with a HyperPrior and Discretized Gaussian Mixture Likelihoods},
      author={Zhengxue Cheng and Heming Sun and Masaru Takeuchi and Jiro Katto},
      year={2020},
      eprint={2002.01657},
      archivePrefix={arXiv},
      primaryClass={eess.IV}
}
```
Contributors
- Chenhao Zhang