#61 add llama inference

Open
frelam wants to merge 12 commits from frelam/MSAdapterModelZoo:master-lamma into master
frelam commented 2 months ago
1. Inference code from https://github.com/facebookresearch/llama
2. fairscale code from https://github.com/facebookresearch/fairscale.git

To run:
1. ./download.sh to download the model checkpoint
2. msrun --worker_num=2 --local_worker_num=2 example_chat_completion.py ./llama-2-13b-chat ./tokenizer.model --max_seq_len 512 --max_batch_size 6

Only the code lines flagged in the review comments were modified; nothing else needed adaptation.
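For reference, the run steps above as a shell session (commands copied from the description; the worker counts correspond to the 2-way model-parallel setup):

```bash
# 1. Download the model checkpoint (script from the upstream llama repo)
./download.sh

# 2. Launch 2-process distributed inference with the msrun launcher
msrun --worker_num=2 --local_worker_num=2 example_chat_completion.py \
    ./llama-2-13b-chat ./tokenizer.model --max_seq_len 512 --max_batch_size 6
```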
frelam changed title from [WIP] add llama inference to add llama inference 2 months ago
frelam reviewed 2 months ago
@@ -0,0 +165,4 @@
# def gather_from_model_parallel_region(input_: torch.Tensor) -> torch.Tensor:
# return _GatherFromModelParallelRegion.apply(input_)

def copy_to_model_parallel_region(input_: torch.Tensor) -> torch.Tensor:
frelam commented 2 months ago
These functions have been adapted. For now only the inference forward pass is considered, not the training backward pass, so they are not wrapped in torch.autograd.Function.
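A minimal sketch of what such an inference-only adaptation might look like (the bodies below are illustrative assumptions, not the PR's exact code):

```python
import torch
import torch.distributed as dist

def copy_to_model_parallel_region(input_: torch.Tensor) -> torch.Tensor:
    # In the forward pass the copy op is an identity; upstream fairscale wraps
    # it in torch.autograd.Function only so the backward can all-reduce the
    # gradient. For forward-only inference a plain function suffices.
    return input_

def reduce_from_model_parallel_region(input_: torch.Tensor) -> torch.Tensor:
    # Forward pass of the reduce op: sum partial results across ranks.
    # Assumes the default process group is the model-parallel group.
    dist.all_reduce(input_)
    return input_
```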
frelam reviewed 2 months ago
@@ -0,0 +5,4 @@
from dataclasses import dataclass
from typing import Optional, Tuple

# import fairscale.nn.model_parallel.initialize as fs_init
frelam commented 2 months ago
Changed the fairscale import to the copy of fairscale in the current directory, which has been ported and adapted for the inference scenario.
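Concretely, the import is redirected from the installed package to the local, inference-adapted copy; a sketch of the idea (the local layout is an assumption based on this comment):

```python
# Before: resolves to the installed upstream package
# import fairscale.nn.model_parallel.initialize as fs_init

# After: a fairscale/ directory ported for inference ships alongside the
# model code, so the same import now resolves to the adapted local copy.
import fairscale.nn.model_parallel.initialize as fs_init
```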
frelam reviewed 2 months ago
@@ -0,0 +80,4 @@
print("> initializing ddp with size {}".format(data_parallel_size))
print("> initializing pipeline with size {}".format(pipeline_length))

# groups = torch.LongTensor(range(world_size)).reshape(data_parallel_size, pipeline_length, model_parallel_size)
frelam commented 2 months ago
torch.LongTensor(range(xxx)) is not currently supported. I tried adding that initialization logic inside Tensor, but it is incompatible with graph mode, so that change has been dropped for now. This spot needs to be adapted on the network side instead.
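One possible network-side workaround, assuming the goal is the same rank grid as upstream fairscale, is to materialize the range as a list before building the tensor (a sketch, not code from this PR):

```python
import torch

world_size = 2
data_parallel_size, pipeline_length, model_parallel_size = 1, 1, 2

# torch.LongTensor(range(world_size)) is unsupported here; build the ranks
# from an explicit list, then reshape into (data, pipeline, model) groups.
groups = torch.tensor(list(range(world_size)), dtype=torch.int64).reshape(
    data_parallel_size, pipeline_length, model_parallel_size
)
```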
Erpim reviewed 2 months ago
@@ -0,0 +1,38 @@
---
name: Bug report
Erpim commented 2 months ago
The GitHub-specific files can be deleted.
frelam commented 2 months ago
done
Erpim commented 2 months ago
Collaborator
> 1. Inference code from https://github.com/facebookresearch/llama
> 2. fairscale code from https://github.com/facebookresearch/fairscale.git
>
> To run:
> 1. ./download.sh to download the model checkpoint
> 2. msrun --worker_num=2 --local_worker_num=2 example_chat_completion.py ./llama-2-13b-chat ./tokenizer.model --max_seq_len 512 --max_batch_size 6
>
> Only the code lines flagged in the review comments were modified; nothing else needed adaptation.

Please move this content into the README.
Erpim commented 2 months ago
Collaborator
Please update the top-level network list.
zoulq reviewed 2 months ago
@@ -0,0 +1,71 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# This software may be used and distributed according to the terms of the Llama 2 Community License Agreement.

zoulq commented 2 months ago
Have both example_chat_completion.py and example_text_completion.py been verified?
frelam commented 2 months ago
Yes.
zoulq commented 2 months ago
Collaborator
Are the original torch source files provided for comparison? If not, state the source-code URL clearly in readme.md and remove the common files unrelated to the code (bug_report.md, CONTRIBUTING.md, etc.).
frelam commented 2 months ago
Poster
> > 1. Inference code from https://github.com/facebookresearch/llama
> > 2. fairscale code from https://github.com/facebookresearch/fairscale.git
> >
> > To run:
> > 1. ./download.sh to download the model checkpoint
> > 2. msrun --worker_num=2 --local_worker_num=2 example_chat_completion.py ./llama-2-13b-chat ./tokenizer.model --max_seq_len 512 --max_batch_size 6
> >
> > Only the code lines flagged in the review comments were modified; nothing else needed adaptation.
>
> Please move this content into the README.

done
frelam commented 2 months ago
Poster
> Are the original torch source files provided for comparison? If not, state the source-code URL clearly in readme.md and remove the common files unrelated to the code (bug_report.md, CONTRIBUTING.md, etc.).

done
frelam commented 2 months ago
Poster
> Please update the top-level network list.

done