#2022 Creating an NPU training task with 2 compute nodes fails with a runtime error

Closed
created 2 years ago by wangj · 1 comment
wangj commented 2 years ago
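Steps to reproduce:
1. Create a new NPU training task and specify 2 compute nodes.
2. The task runs and ends in the Failed state.
3. Click Modify to create a new version; this new task runs successfully.

Expected result: both V0001 and V0002 run successfully.

Actual result:
1. V0001 fails.
2. V0002 succeeds.

The difference between the two versions is that V0001 passes the device_target parameter and V0002 does not (known bug #2021).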
wangj added this to the V20220428 milestone 2 years ago
wangj added the bug label 2 years ago
liuzx was assigned by wangj 2 years ago
liuzx commented 2 years ago
Collaborator
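Multi-node training requires parallel training code. The current training example is single-node, single-device training, not parallel training.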
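To illustrate the point above: in data-parallel training, each worker processes its own disjoint slice of the dataset, which a single-device script does not do. Below is a minimal, framework-agnostic sketch of that one step. The function name `shard_indices`, the sample count, and the 2-worker setup are hypothetical and are not taken from the platform's training example.

```python
from typing import List


def shard_indices(num_samples: int, rank: int, world_size: int) -> List[int]:
    """Return the sample indices that the worker with this rank should process.

    Hypothetical helper for illustration only; it is not part of any framework.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")
    # Strided split: worker r takes samples r, r + world_size, r + 2 * world_size, ...
    return list(range(rank, num_samples, world_size))


if __name__ == "__main__":
    # Assumed setup: 2 workers (for example, 2 nodes with 1 device each), 10 samples.
    for rank in range(2):
        print(rank, shard_indices(10, rank, 2))
```

A real multi-node script would also need to initialize the framework's communication backend and synchronize gradients across workers; those parts are omitted here.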
liuzx closed this issue 2 years ago
wangj added the invalid label 2 years ago