finetune-transformer-lm

Code and model for the paper "Improving Language Understanding by Generative Pre-Training"

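4th China Software Open Source Innovation Contest, Track 2: Task Challenge (Model King Challenge). This port fine-tunes the GPT language model on the ROCStories dataset using TensorFlow 1.15.0 on Huawei Ascend 910. Final accuracy: 87.60%; performance: 14.55 sec/epoch.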

Original reference paper: https://paperswithcode.com/paper/improving-language-understanding-by

Original reference code: https://github.com/openai/finetune-transformer-lm

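Final results

Ascend 910	Accuracy (ROCStories Test Accuracy)	Performance (sec/epoch)
Baseline	89.90%	24.72
Paper	86.5%	/
This port	87.60%	14.55

Note: the 14.70 sec/epoch measured for this port includes the first epoch, which involves compilation and takes 140+ seconds. Every later epoch is much faster, about 12.30 sec/epoch.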
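To illustrate how a single slow compile epoch pulls up the reported average, here is a minimal sketch that computes the mean epoch time with and without the first epoch. The function name `epoch_averages` and the timings are hypothetical and chosen only for illustration; they are not taken from train.py or from the actual log.

```python
from statistics import mean


def epoch_averages(epoch_seconds):
    """Return (mean over all epochs, mean excluding the first epoch)."""
    if len(epoch_seconds) < 2:
        raise ValueError("need at least two epochs to separate warm-up from steady state")
    overall = mean(epoch_seconds)      # includes the compile epoch
    steady = mean(epoch_seconds[1:])   # steady-state epochs only
    return overall, steady


# Hypothetical timings: one slow compile epoch followed by four fast ones.
times = [140.0, 12.0, 12.5, 12.0, 12.5]
overall, steady = epoch_averages(times)
print(f"all epochs: {overall:.2f} sec/epoch, after warm-up: {steady:.2f} sec/epoch")
```

The more epochs a run has, the closer the overall average gets to the steady-state figure, so the two numbers should be compared only for runs of the same length.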

Code layout


├── README.md                                 // documentation
├── dataset                                 // dataset directory
│    ├──cloze_test_test__spring2016 - cloze_test_ALL_test.csv
│    ├──cloze_test_val__spring2016 - cloze_test_ALL_val.csv
│    ├──ROCStories__spring2016 - ROCStories_spring2016.csv
│    ├──ROCStories_winter2017 - ROCStories_winter2017.csv
├──model                                 // pretrained model directory
│    ├──should contain 13 files; download them from the model folder of the original reference code
├──analysis.py
├──datasets.py
├──LICENSE
├──opt.py
├──text_utils.py
├──train.log                                 // training log
├──train.py                                 // training entry point
├──train_1p.sh                                 // training launch script
├──utils.py

Preparing the dataset

The ROCStories dataset can be downloaded from the associated website.

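If that download is inconvenient, the following Baidu Netdisk link can be used instead:

Link: https://pan.baidu.com/s/19DxrwMzjiAzC-Jbp-tVeEg

Extraction code: 65fu

Place the four downloaded .csv files in the dataset folder inside this code directory.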
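Before launching training, it can help to confirm that the four files are in place. The following is a minimal sketch, not part of the repository: the helper `missing_dataset_files` is hypothetical, and the file names are copied from the directory listing in this README, so adjust them if your downloaded files are named differently.

```python
from pathlib import Path

# File names as listed in this README's directory tree (assumed to match the download).
EXPECTED_CSVS = [
    "cloze_test_test__spring2016 - cloze_test_ALL_test.csv",
    "cloze_test_val__spring2016 - cloze_test_ALL_val.csv",
    "ROCStories__spring2016 - ROCStories_spring2016.csv",
    "ROCStories_winter2017 - ROCStories_winter2017.csv",
]


def missing_dataset_files(dataset_dir="dataset"):
    """Return the expected CSV names that are not present in dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in EXPECTED_CSVS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_dataset_files()
    if missing:
        print("Missing dataset files:")
        for name in missing:
            print("  " + name)
    else:
        print("All four dataset files found.")
```

Run it from the code directory so that the relative path `dataset` resolves correctly.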

Running

Currently this code implements the ROCStories Cloze Test result reported in the paper by running:

# Install the required third-party libraries and run the program
bash train_1p.sh

Training takes a while. Once it finishes, open train.log; the last line shows the accuracy metric:

ROCStories Test Accuracy: 87.60

Four lines above the accuracy line, the log reports the final performance metric (最终性能指标):
最终性能指标： 14.697019650936127 sec/epoch 100
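Both metrics can also be pulled out of the log programmatically. The sketch below is a hypothetical helper, not part of the repository; it assumes the log lines have exactly the two formats quoted in this README and relies on nothing else about the structure of train.log.

```python
import re

# Patterns based on the two log lines quoted in this README.
ACCURACY_RE = re.compile(r"ROCStories Test Accuracy:\s*(\d+(?:\.\d+)?)")
SPEED_RE = re.compile(r"最终性能指标[:：]\s*(\d+(?:\.\d+)?)\s*sec/epoch")


def parse_train_log(text):
    """Return the last accuracy and sec/epoch values found in the log text."""
    acc = ACCURACY_RE.findall(text)
    speed = SPEED_RE.findall(text)
    return {
        "accuracy": float(acc[-1]) if acc else None,
        "sec_per_epoch": float(speed[-1]) if speed else None,
    }


# Sample text built from the lines shown in this README.
sample = (
    "最终性能指标： 14.697019650936127 sec/epoch 100\n"
    "ROCStories Test Accuracy: 87.60\n"
)
print(parse_train_log(sample))
```

To apply it to a real run, read the file first, for example `parse_train_log(open("train.log", encoding="utf-8").read())`; the UTF-8 encoding is an assumption about how the log was written.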

Note: The median accuracy of runs with this codebase (using default hyperparameters) is 87.60%, slightly higher than the single-run result of 86.5% reported in the paper.
