#766 Large-model training started from the command line stays in "running" state with no logs or other output

Open
created 6 months ago by dx5235 · 4 comments
dx5235 commented 6 months ago
cd /code/ChatGLM2-6B-PT/ptuning; // enter the training code directory
conda activate chatglm2-pt; // switch the Python environment
CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node 1 main.py \
    --do_train \
    --train_file train_data/violation_train0925_4000.json \
    --validation_file train_data/violation_train0925_4000.json \
    --preprocessing_num_workers 8 \
    --prompt_column q \
    --response_column a \
    --overwrite_cache \
    --model_name_or_path /code/models/chatglm2-6b \
    --output_dir /model \
    --overwrite_output_dir \
    --max_source_length 256 \
    --max_target_length 256 \
    --per_device_train_batch_size 8 \
    --per_device_eval_batch_size 8 \
    --gradient_accumulation_steps 1 \
    --predict_with_generate \
    --max_steps 100 \
    --logging_steps 50 \
    --save_steps 1000 \
    --learning_rate 5e-3 \
    --pre_seq_len 256
dx5235 commented 6 months ago
Poster
There is no error at all; the job just keeps showing as running.
dx5235 commented 6 months ago
Poster
It failed after running for 20 minutes, and the logs do not show any reason.
dx5235 commented 6 months ago
Poster
How do I switch the conda environment in the run command here? Using this command reports: conda: not found
zoulk commented 5 months ago
I ran into the same problem. Have you solved it?
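On the conda: not found error reported above, one likely cause (an assumption, since the thread does not say how the command is launched) is that the command runs in a non-interactive shell that never loaded conda's initialization script. A minimal sketch of a workaround is to source conda's shell hook, etc/profile.d/conda.sh under the conda install root, before calling conda activate. The helper names find_conda_hook and activate_env and the candidate install roots in the usage comment are hypothetical and need to be adapted to the actual machine.

```shell
#!/bin/sh
# Hypothetical helper: print the path of conda's shell hook if it exists
# under one of the install roots passed as arguments; fail otherwise.
find_conda_hook() {
    for root in "$@"; do
        if [ -f "$root/etc/profile.d/conda.sh" ]; then
            printf '%s\n' "$root/etc/profile.d/conda.sh"
            return 0
        fi
    done
    return 1
}

# Hypothetical helper: load the hook, then activate the named environment.
# $1 is the environment name; the remaining arguments are candidate roots.
activate_env() {
    env_name=$1
    shift
    hook=$(find_conda_hook "$@") || {
        echo "conda.sh not found under: $*" >&2
        return 1
    }
    . "$hook"                    # defines the `conda` shell function
    conda activate "$env_name"
}

# Usage sketch (the install roots below are guesses, not taken from the thread):
#   activate_env chatglm2-pt "$HOME/miniconda3" "$HOME/anaconda3" /opt/conda \
#       && cd /code/ChatGLM2-6B-PT/ptuning \
#       && CUDA_VISIBLE_DEVICES=0 torchrun --nproc_per_node 1 main.py --do_train
#   (followed by the remaining training arguments from the first comment)
```

If sourcing the hook is not possible, another option is to skip activation and call the environment's executables by absolute path; with a default conda layout they usually sit in envs/chatglm2-pt/bin under the install root. Appending `2>&1 | tee train.log` to the training command also keeps a copy of stdout and stderr, which may help when the job fails without showing a reason.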