SFT : Split Fine-Tuning

This repository contains the code for our paper An Efficient Split Fine-tuning Framework for Edge and Cloud Collaborative Learning.

dataset and pretrained model

We use Bert-base as our experiment neural network and finetune the model on 9 datasets from GLUE and SQuAD, the download link is shown below.

CoLA = "https://dl.fbaipublicfiles.com/glue/data/CoLA.zip",
SST = "https://dl.fbaipublicfiles.com/glue/data/SST-2.zip",
QQP = "https://dl.fbaipublicfiles.com/glue/data/QQP-clean.zip",
STS = "https://dl.fbaipublicfiles.com/glue/data/STS-B.zip",
MNLI = "https://dl.fbaipublicfiles.com/glue/data/MNLI.zip",
QNLI = "https://dl.fbaipublicfiles.com/glue/data/QNLIv2.zip",
RTE = "https://dl.fbaipublicfiles.com/glue/data/RTE.zip",
WNLI = "https://dl.fbaipublicfiles.com/glue/data/WNLI.zip"
MRPC_TRAIN = "https://dl.fbaipublicfiles.com/senteval/senteval_data/msr_paraphrase_train.txt"
MRPC_TEST = "https://dl.fbaipublicfiles.com/senteval/senteval_data/msr_paraphrase_test.txt"
SQuAD_TRAIN = "https://rajpurkar.github.io/SQuAD-explorer/dataset/train-v1.1.json"
SQuAD_DEV = "https://rajpurkar.github.io/SQuAD-explorer/dataset/dev-v1.1.json"
bert-base-uncased = "https://huggingface.co/bert-base-uncased/resolve/main/"

requirement

pytorch == 1.10.2
transformers == 4.20.1
scikit-learn == 1.1.1
scipy == 1.8.1
numpy == 1.22.3

usage

We have implemented the evaluation script and split finetune framework for our paper. Evaluation script is used to evaluate the impact on convergence results of our decomposition policies. The codes can be found in directories without "rpc" suffix. For these codes, There are 3 steps to reproduce our results.

download datasets and pretrained model
modify the arguments --pretrain_dir and --data_dir in config.py .
modify --split and --rank
run python train.py.

The directory with "rpc" suffix contains our implementation of split finetune framework. You should run the script on 2 interconnected machines. For test, you could also run it on 1 machine with 2 processes. The instructions is shown below.

download datasets and pretrained model
modify the arguments --edge_pretrain_dir, --cloud_pretrain_dir and --data_dir in config.py .
modify the MASTER_ADDR and MASTER_PORT in config.py
modify --split and --rank
run python edge.py on edge machine and python cloud.py on cloud machine.

descriptions for some key arguments:

--split: Int type. Indicates which layer to be decomposed and splited. The value range is determined according to the model structure. For example, value ranges from 1 to 12 for bert-base and 1 to 24 for bert-large.
--rank: Int type. Indicates the rank of svd decomposition of the split layer. The value range is determined according to the model structure. For example, value ranges from 1 to 768 for bert-base and 1 to 1024 for bert-large.

2.8 KiB Raw Permalink Blame History

SFT : Split Fine-Tuning

dataset and pretrained model

requirement

usage

2.8 KiB

Raw Permalink Blame History