Cheng2020Attention
Learned Image Compression with Discretized Gaussian Mixture Likelihoods and Attention Modules
Keywords: Image Compression, Gaussian Mixture Model
This paper generalizes the hyperprior from lossy to lossless compression and introduces an L2-norm term into the loss function to speed up training. It also investigates different parameterized distribution models for the latent codes and proposes Gaussian mixture likelihoods to achieve adaptive and flexible context models. The paper was published in 2020; readers can find the original paper via the link.
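As a sketch of how a discretized Gaussian mixture serves as a likelihood model for quantized latents (illustrative code only, not the repository's implementation; all names here are hypothetical):

```python
import math

def gaussian_cdf(x, mean, scale):
    """Standard-normal CDF evaluated at the normalized point (x - mean) / scale."""
    return 0.5 * (1.0 + math.erf((x - mean) / (scale * math.sqrt(2.0))))

def gmm_likelihood(y_hat, weights, means, scales):
    """Probability mass of the quantized latent y_hat under a discretized
    K-component Gaussian mixture: sum_k w_k * (CDF(y+0.5) - CDF(y-0.5)),
    i.e. each mixture component integrated over the quantization bin."""
    return sum(
        w * (gaussian_cdf(y_hat + 0.5, m, s) - gaussian_cdf(y_hat - 0.5, m, s))
        for w, m, s in zip(weights, means, scales)
    )

# Example: a 3-component mixture; the entropy coder spends -log2(p) bits on y_hat
p = gmm_likelihood(0.0, weights=[0.5, 0.3, 0.2], means=[0.0, 1.0, -2.0], scales=[1.0, 0.5, 2.0])
bits = -math.log2(p)
```

Because the per-bin masses of a mixture with weights summing to one telescope over the CDF, the probabilities over all integer bins sum to one, which is what makes this usable as an entropy model.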
Our Contributions
- Translated the model from PyTorch to MindSpore
- Verified that the forward pass, backward pass, and parameter updates run successfully
- Tested the MindSpore version
File Structure
```
source
├── __pycache__
├── decoded files        # decoded images
├── model
│   ├── best.ckpt        # model file with the best performance
│   └── train_log.txt    # training log
├── model_baseline.py    # model definition (baseline)
├── test.py              # evaluation script
└── train.py             # training script
```
Environment
- mindspore-dev==2.0.0dev20230116
- It is recommended to install through the link.
- cuda 11.1
- python 3.7
Command
Please install compressai via

```shell
pip install -e .
```

under the directory CompressAI_MindSpore.
Train
Go to the directory source and train the model via:

```shell
python train.py -d "your/own/dataset/address" --seed 0 --batch-size 16 --test-batch-size 1 --save --lambda 0.01 -e 10
```
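The --lambda flag sets the rate-distortion trade-off. In the CompressAI convention, the training objective weights distortion by λ·255² and adds the rate in bits per pixel; a minimal sketch (illustrative, the exact loss in train.py may differ):

```python
def rate_distortion_loss(bpp, mse, lmbda=0.01):
    """CompressAI-style rate-distortion objective for images scaled to [0, 1]:
    lambda * 255^2 * MSE (distortion) + bits per pixel (rate).
    A larger lambda spends more bits for higher reconstruction quality."""
    return lmbda * 255 ** 2 * mse + bpp

# Example: a hypothetical operating point with 0.5 bpp and MSE of 1e-3
loss = rate_distortion_loss(bpp=0.5, mse=1e-3, lmbda=0.01)
```

Training with several lambda values (0.01, 0.03, 0.05, 0.08, as in the tables below) produces models at different points on the rate-distortion curve.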
Test

Evaluate the model via:

```shell
python test.py -d "your/own/dataset/address" --seed 0 --test-batch-size 1 --lambda 0.01 --pretrained_file "model/best.ckpt"
```
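The evaluation reports bpp and quality metrics such as PSNR. As a reminder of how PSNR relates to reconstruction error, a standalone sketch (helper names are illustrative):

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(mse_value, max_val=255.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_val^2 / MSE)."""
    return 10.0 * math.log10(max_val ** 2 / mse_value)

# Example: two tiny 8-bit "images" differing slightly in three pixels
original = [52, 55, 61, 59]
decoded  = [51, 56, 61, 58]
quality_db = psnr(mse(original, decoded), max_val=255.0)
```

Higher PSNR means a closer reconstruction; the tables below report it together with MS-SSIM, a perceptual similarity metric.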
Comparison
Reconstructed image

MindSpore version: comparison of the original images with the reconstructed ones (original above, reconstructed below).
Quality measurements on Kodak
MindSpore version, models trained on DIV2K:

| bpp   | PSNR (dB) | MS-SSIM | run_time | GPU Memory (MiB) | lambda |
|-------|-----------|---------|----------|------------------|--------|
| 0.339 | 26.436    | 0.931   | 39.207   | 4002             | 0.01   |
| 0.555 | 27.445    | 0.949   | 19.941   | 4002             | 0.03   |
| 0.779 | 28.761    | 0.965   | 33.257   | 4002             | 0.05   |
| 0.986 | 28.911    | 0.967   | 34.209   | 4002             | 0.08   |
PyTorch version, models pretrained from the official release:

| bpp   | PSNR (dB) | MS-SSIM | enc_time | dec_time | GPU Memory (MiB) | quality |
|-------|-----------|---------|----------|----------|------------------|---------|
| 0.092 | 25.584    | 0.937   | 103.44   | 232.992  | 1464             | qp1     |
| 0.215 | 27.705    | 0.972   | 105.96   | 236.736  | 1464             | qp3     |
| 0.32  | 29.002    | 0.982   | 86.16    | 212.16   | 1620             | qp4     |
| 0.596 | 31.44     | 0.991   | 87.072   | 216.84   | 1620             | qp6     |
Citation
```bibtex
@misc{cheng2020learned,
      title={Learned Lossless Image Compression with a HyperPrior and Discretized Gaussian Mixture Likelihoods},
      author={Zhengxue Cheng and Heming Sun and Masaru Takeuchi and Jiro Katto},
      year={2020},
      eprint={2002.01657},
      archivePrefix={arXiv},
      primaryClass={eess.IV}
}
```
Contributors
- Chenhao Zhang