#61 add llama inference

Open
frelam wants to merge 12 commits from frelam/MSAdapterModelZoo:master-lamma into master
frelam commented 2 months ago
1. Inference code from https://github.com/facebookresearch/llama
2. fairscale code from https://github.com/facebookresearch/fairscale.git

To run:
1. ./download.sh to download the model checkpoint
2. msrun --worker_num=2 --local_worker_num=2 example_chat_completion.py ./llama-2-13b-chat ./tokenizer.model --max_seq_len 512 --max_batch_size 6

Only the code lines flagged in the review comments were modified; nothing else needed adaptation.
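For reference, the run steps above as a shell session (commands copied from the description; the worker counts correspond to the 2-way model-parallel setup):

```bash
# 1. Download the model checkpoint (script from the upstream llama repo)
./download.sh

# 2. Launch 2-process distributed inference with the msrun launcher
msrun --worker_num=2 --local_worker_num=2 example_chat_completion.py \
    ./llama-2-13b-chat ./tokenizer.model --max_seq_len 512 --max_batch_size 6
```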
frelam changed title from [WIP] add llama inference to add llama inference 2 months ago
frelam reviewed 2 months ago
@@ -0,0 +165,4 @@
# def gather_from_model_parallel_region(input_: torch.Tensor) -> torch.Tensor:
# return _GatherFromModelParallelRegion.apply(input_)

def copy_to_model_parallel_region(input_: torch.Tensor) -> torch.Tensor:
frelam commented 2 months ago
These functions have been adapted. For now only the inference forward pass is considered, not the training backward pass, so they are not wrapped in torch.autograd.Function.
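A minimal sketch of what such an inference-only adaptation might look like (the bodies below are illustrative assumptions, not the PR's exact code):

```python
import torch
import torch.distributed as dist

def copy_to_model_parallel_region(input_: torch.Tensor) -> torch.Tensor:
    # In the forward pass the copy op is an identity; upstream fairscale wraps
    # it in torch.autograd.Function only so the backward can all-reduce the
    # gradient. For forward-only inference a plain function suffices.
    return input_

def reduce_from_model_parallel_region(input_: torch.Tensor) -> torch.Tensor:
    # Forward pass of the reduce op: sum partial results across ranks.
    # Assumes the default process group is the model-parallel group.
    dist.all_reduce(input_)
    return input_
```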
frelam reviewed 2 months ago
@@ -0,0 +5,4 @@
from dataclasses import dataclass
from typing import Optional, Tuple

# import fairscale.nn.model_parallel.initialize as fs_init
frelam commented 2 months ago
Changed the fairscale import to the copy of fairscale in the current directory, which has been ported and adapted for the inference scenario.
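Concretely, the import is redirected from the installed package to the local, inference-adapted copy; a sketch of the idea (the local layout is an assumption based on this comment):

```python
# Before: resolves to the installed upstream package
# import fairscale.nn.model_parallel.initialize as fs_init

# After: a fairscale/ directory ported for inference ships alongside the
# model code, so the same import now resolves to the adapted local copy.
import fairscale.nn.model_parallel.initialize as fs_init
```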
frelam reviewed 2 months ago
@@ -0,0 +80,4 @@
print("> initializing ddp with size {}".format(data_parallel_size))
print("> initializing pipeline with size {}".format(pipeline_length))

# groups = torch.LongTensor(range(world_size)).reshape(data_parallel_size, pipeline_length, model_parallel_size)
frelam commented 2 months ago
torch.LongTensor(range(xxx)) is not currently supported. I tried adding that initialization logic inside Tensor, but it is incompatible with graph mode, so that change has been dropped for now. This spot needs to be adapted on the network side instead.
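One possible network-side workaround, assuming the goal is the same rank grid as upstream fairscale, is to materialize the range as a list before building the tensor (a sketch, not code from this PR):

```python
import torch

world_size = 2
data_parallel_size, pipeline_length, model_parallel_size = 1, 1, 2

# torch.LongTensor(range(world_size)) is unsupported here; build the ranks
# from an explicit list, then reshape into (data, pipeline, model) groups.
groups = torch.tensor(list(range(world_size)), dtype=torch.int64).reshape(
    data_parallel_size, pipeline_length, model_parallel_size
)
```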
Erpim reviewed 2 months ago
@@ -0,0 +1,38 @@
---
name: Bug report
Erpim commented 2 months ago
The GitHub-specific files can be deleted.
frelam commented 2 months ago
done
Erpim commented 2 months ago
Collaborator
> 1. Inference code from https://github.com/facebookresearch/llama
> 2. fairscale code from https://github.com/facebookresearch/fairscale.git
>
> To run:
> 1. ./download.sh to download the model checkpoint
> 2. msrun --worker_num=2 --local_worker_num=2 example_chat_completion.py ./llama-2-13b-chat ./tokenizer.model --max_seq_len 512 --max_batch_size 6
>
> Only the code lines flagged in the review comments were modified; nothing else needed adaptation.

Please move this content into the README.
Erpim commented 2 months ago
Collaborator
Please update the top-level network list.
zoulq reviewed 2 months ago
@@ -0,0 +1,71 @@
# Copyright (c) Meta Platforms, Inc. and affiliates.
# This software may be used and distributed according to the terms of the Llama 2 Community License Agreement.

zoulq commented 2 months ago
Have both example_chat_completion.py and example_text_completion.py been verified?
frelam commented 2 months ago
Yes.
zoulq commented 2 months ago
Collaborator
Are the original torch source files provided for comparison? If not, state the source-code URL clearly in readme.md and remove the common files unrelated to the code (bug_report.md, CONTRIBUTING.md, etc.).
frelam commented 2 months ago
Poster
> > 1. Inference code from https://github.com/facebookresearch/llama
> > 2. fairscale code from https://github.com/facebookresearch/fairscale.git
> >
> > To run:
> > 1. ./download.sh to download the model checkpoint
> > 2. msrun --worker_num=2 --local_worker_num=2 example_chat_completion.py ./llama-2-13b-chat ./tokenizer.model --max_seq_len 512 --max_batch_size 6
> >
> > Only the code lines flagged in the review comments were modified; nothing else needed adaptation.
>
> Please move this content into the README.

done
frelam commented 2 months ago
Poster
> Are the original torch source files provided for comparison? If not, state the source-code URL clearly in readme.md and remove the common files unrelated to the code (bug_report.md, CONTRIBUTING.md, etc.).

done
frelam commented 2 months ago
Poster
> Please update the top-level network list.

done