Existing neural methods for data-to-text generation are still struggling to produce long and diverse texts: they are insufficient to model input data dynamically during generation, to capture inter-sentence coherence, or to generate diversified expressions. To address these issues, we propose a Planning-based Hierarchical Variational Model (PHVM). Our model first plans a sequence of groups (each group is a subset of input items to be covered by a sentence) and then realizes each sentence conditioned on the planning result and the previously generated context, thereby decomposing long text generation into dependent sentence generation sub-tasks. To capture expression diversity, we devise a hierarchical latent structure where a global planning latent variable models the diversity of reasonable planning and a sequence of local latent variables controls sentence realization.
This project is a TensorFlow implementation of our work.
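To make the plan-then-realize decomposition described above more concrete, here is a toy sketch in plain Python. It is not the PHVM model and uses no code from this repository: the `(attribute, value)` input format, the function names, the random shuffle that stands in for sampling a plan, the non-overlapping groups, and the sentence template are all assumptions made for illustration only.

```python
import random
from typing import List, Tuple

# Hypothetical input format: one (attribute, value) pair per input item.
Item = Tuple[str, str]


def plan_groups(items: List[Item], group_size: int, rng: random.Random) -> List[List[Item]]:
    """Toy planner: shuffle the items and cut them into consecutive groups.

    In PHVM the plan is produced by a learned model guided by a global
    latent variable; the shuffle here only stands in for that sampling step.
    """
    shuffled = list(items)
    rng.shuffle(shuffled)
    return [shuffled[i:i + group_size] for i in range(0, len(shuffled), group_size)]


def realize_sentence(group: List[Item], context: List[str]) -> str:
    """Toy realizer: fill a fixed template with the items of one group.

    `context` holds the sentences generated so far. A learned decoder would
    condition on it; this sketch only uses it to vary the opening words.
    """
    opener = "It features" if not context else "It also features"
    body = ", ".join(f"{value} {attribute}" for attribute, value in group)
    return f"{opener} {body}."


def generate(items: List[Item], group_size: int = 2, seed: int = 0):
    """Plan first, then realize one sentence per planned group, in order."""
    rng = random.Random(seed)
    plan = plan_groups(items, group_size, rng)
    sentences: List[str] = []
    for group in plan:
        sentences.append(realize_sentence(group, sentences))
    return plan, sentences


if __name__ == "__main__":
    demo_items = [("color", "red"), ("material", "cotton"), ("style", "casual")]
    demo_plan, demo_sentences = generate(demo_items)
    print(demo_plan)
    print(" ".join(demo_sentences))
```

The point of the sketch is the control flow: the whole plan is fixed before any sentence is written, and each sentence sees only its own group plus the text produced so far.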
Dataset
Our dataset contains 119K pairs of product specifications and the corresponding advertising text. For more information, please refer to our paper.
Preprocess
The dataset is expected in the data directory. The path to our dataset is ./data/data.jsonl
The preprocessed data is in ../data/processed/, except pre-trained word embeddings, which can be generated with the following command line:
bash preprocess.sh
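Before running the preprocessing script, it can help to look at a few raw records. The snippet below is a minimal sketch that assumes data.jsonl follows the JSON Lines convention suggested by its extension (one JSON object per line, UTF-8 encoded). The helper names `iter_jsonl` and `peek` are hypothetical and not part of this repository, and no record fields are assumed.

```python
import json
from itertools import islice
from typing import Any, Iterator, List


def iter_jsonl(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of the file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def peek(path: str, n: int = 3) -> List[Any]:
    """Return at most the first n records without reading the whole file."""
    return list(islice(iter_jsonl(path), n))


if __name__ == "__main__":
    # Path taken from this README; adjust it if the dataset lives elsewhere.
    for record in peek("./data/data.jsonl"):
        # Print only the top-level keys, since the schema is not documented here.
        print(sorted(record) if isinstance(record, dict) else type(record).__name__)
```

Reading lazily with a generator keeps memory use flat, which matters for a file with about 119K records.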
Train
./run.sh
Test
./test.sh
Our paper is available at https://arxiv.org/abs/1908.06605v2.
Please cite our paper if you find the paper or the code helpful.