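Problem description
When training a yolov5 object detection model on Ascend, I get ModuleNotFoundError: No module named 'torch'. I created a debugging task and ran pip install torch>=1.7.0 in its Terminal window, and pip list then showed torch as installed, yet training the model still fails. Which step is wrong? Aren't the required libraries supposed to be installed in the debugging task?
Environment (GPU/NPU)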
Ascend NPU
Cluster (OpenI Qizhi / Intelligent Computing)
OpenI Qizhi cluster
Task type (debug/training/inference)
New training task
Task name
swim
Log output or screenshot of the problem
[Modelarts Service Log]2023-05-19 11:00:22,512 - INFO - bootstrap proc-rank-0-device-0
Traceback (most recent call last):
  File "/home/work/user-job-dir/V0003/train.py", line 20, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'
[Modelarts Service Log]2023-05-19 11:00:23,523 - ERROR - proc-rank-0-device-0 (pid: 308) has exited with non-zero code: 1
[Modelarts Service Log]2023-05-19 11:00:23,523 - INFO - Begin destroy training processes
[Modelarts Service Log]2023-05-19 11:00:23,523 - INFO - proc-rank-0-device-0 (pid: 308) has exited
[Modelarts Service Log]2023-05-19 11:00:23,523 - INFO - End destroy training processes
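Expected solution or suggestion
When you start a training task, you need to select the matching engine. Your task selected MindSpore-1.7-c81-python3.7-euleros2.8-aarch64, which is an image built for the MindSpore framework and does not have torch installed. If you need to use torch, you can debug in the debugging environment.
Alternatively, you can start a GPU training task; the officially provided images include preinstalled versions of torch.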
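The log suggests that the training task runs in the selected engine image rather than in the environment where pip install was run. One way to confirm this is a small check at the top of the training entry script. The following is only a sketch: the helper name missing_modules is hypothetical, and the list of modules to check is an assumption taken from the traceback.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Show which Python is running the job; a debugging task and a
    # training task may each use a different environment.
    print("interpreter:", sys.executable)

    # "torch" comes from the traceback in the log; extend the list as needed.
    missing = missing_modules(["torch"])
    if missing:
        sys.exit("missing in this environment: " + ", ".join(missing))
    print("all required modules found")
```

If this reports torch as missing inside the training task but not inside the debugging task, the two are using different environments, and the fix is to pick an engine image that ships torch rather than to reinstall it in the debugging terminal.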