mindocr-fork




This document shows how to convert OCR annotations to the general format (not including LMDB) used for model training.

You may also refer to convert_datasets.sh, a quick way to convert the annotation files of all datasets under a given directory.

To download and convert OCR datasets to the required data format, please refer to the following instructions: Chinese text recognition, CTW1500, ICDAR2015, MLT2017, SVT, Syntext 150k, TD500, Total Text, SynthText.

Text Detection/Spotting Annotation

Each line of the converted annotation file holds an image name, a tab (\t), and a JSON list of text instances:

img_61.jpg\t[{"transcription": "MASA", "points": [[310, 104], [416, 141], [418, 216], [312, 179]]}, {...}]
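As a minimal sketch of how a line in this format can be read back, the snippet below splits on the first tab and decodes the JSON list. The function name parse_det_line is hypothetical and is not part of the repository's tools; it only illustrates the layout shown above.

```python
import json


def parse_det_line(line):
    """Split one detection annotation line into (image_name, instances).

    Hypothetical helper for illustration; not part of the converter scripts.
    """
    # The image name and the JSON payload are separated by a single tab.
    image_name, payload = line.rstrip("\n").split("\t", 1)
    # The payload is a JSON list of {"transcription": ..., "points": ...} dicts.
    instances = json.loads(payload)
    return image_name, instances


sample = (
    'img_61.jpg\t[{"transcription": "MASA", '
    '"points": [[310, 104], [416, 141], [418, 216], [312, 179]]}]'
)
name, instances = parse_det_line(sample)
print(name, instances[0]["transcription"], len(instances[0]["points"]))
```

Each instance carries the text in "transcription" and the polygon vertices in "points" as [x, y] pairs.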

Taking the ICDAR2015 (ic15) dataset as an example, run the following commands to convert it to the required format:

# convert training annotation
python tools/dataset_converters/convert.py \
        --dataset_name  ic15 \
        --task det \
        --image_dir /path/to/ic15/det/train/ch4_training_images \
        --label_dir /path/to/ic15/det/train/ch4_training_localization_transcription_gt \
        --output_path /path/to/ic15/det/train/det_gt.txt

# convert testing annotation
python tools/dataset_converters/convert.py \
        --dataset_name  ic15 \
        --task det \
        --image_dir /path/to/ic15/det/test/ch4_test_images \
        --label_dir /path/to/ic15/det/test/ch4_test_localization_transcription_gt \
        --output_path /path/to/ic15/det/test/det_gt.txt
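To illustrate what such a conversion produces, here is a rough sketch that turns ground-truth lines into one record of the target format. It assumes each input line looks like x1,y1,x2,y2,x3,y3,x4,y4,transcription and may start with a UTF-8 byte order mark; that input layout is an assumption about the ICDAR2015 label files, and the function names are hypothetical. The actual logic in ic15.py may differ.

```python
import json


def ic15_line_to_instance(gt_line):
    """Convert one assumed 'x1,y1,...,x4,y4,text' line into an instance dict."""
    # Drop surrounding whitespace, then a BOM if the line starts with one.
    fields = gt_line.strip().lstrip("\ufeff").split(",")
    coords = [int(value) for value in fields[:8]]
    # The transcription itself may contain commas, so re-join the remainder.
    transcription = ",".join(fields[8:])
    points = [[coords[i], coords[i + 1]] for i in range(0, 8, 2)]
    return {"transcription": transcription, "points": points}


def to_det_record(image_name, gt_lines):
    """Build one 'image_name<TAB>json_list' line from the label lines of an image."""
    instances = [ic15_line_to_instance(line) for line in gt_lines if line.strip()]
    return image_name + "\t" + json.dumps(instances, ensure_ascii=False)


record = to_det_record(
    "img_61.jpg", ["\ufeff310,104,416,141,418,216,312,179,MASA\n"]
)
print(record)
```

Writing one such record per image yields a file in the same layout as det_gt.txt.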

Text Recognition Annotation

The annotation file for a text recognition dataset uses the following format:

word_7.png	fusionopolis
word_8.png	fusionopolis
word_9.png	Reserve
word_10.png	CAUTION
word_11.png	citi

Note that the image name and the text label are separated by \t.
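A short sketch of reading this format is given below. It splits each line on the first tab only, so labels containing further tabs or spaces stay intact. The helper name read_rec_labels is hypothetical and not part of the repository.

```python
def read_rec_labels(lines):
    """Parse 'image_name<TAB>label' lines into a list of (image_name, label).

    Hypothetical helper for illustration; not part of the converter scripts.
    """
    labels = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        # Split on the first tab only, keeping the label text untouched.
        image_name, text = line.split("\t", 1)
        labels.append((image_name, text))
    return labels


sample_lines = ["word_7.png\tfusionopolis\n", "word_9.png\tReserve\n", "\n"]
pairs = read_rec_labels(sample_lines)
print(pairs)
```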

To convert the ic15 recognition annotations, run:

# convert training annotation
python tools/dataset_converters/convert.py \
        --dataset_name  ic15 \
        --task rec \
        --label_dir /path/to/ic15/rec/ch4_training_word_images_gt/gt.txt \
        --output_path /path/to/ic15/rec/train/ch4_training_word_images_gt/rec_gt.txt

# convert testing annotation
python tools/dataset_converters/convert.py \
        --dataset_name  ic15 \
        --task rec \
        --label_dir /path/to/ic15/rec/ch4_test_word_images_gt/gt.txt \
        --output_path /path/to/ic15/rec/ch4_test_word_images_gt/rec_gt.txt