thomas-yanxin

thomas-yanxin 从镜像同步了 main 分支的代码到 thomas-yanxin/MedicalGPT

726bd2a626 update cohere.
5799ee483d update readme.
3b604e5f12 Merge pull request #360 from ker2xu/main Updates for readme and demo ipynb and a small update for deprecated function
a99e3ee0ef update stream output
030779456b 1. Add scikit-learn to requirements; 2. Update deprecated API of peft; 3. set CUDA_VISIBLE_DEVICES=0 in ppo part of demo ipynb such that users with multi CUDA devices can run it smoothly; 4; Modify the test step in demo ipynb to non-interactive; 5. Copy INSTALL step to ENG doc.
比较 20 提交 »

15 小时前

thomas-yanxin 从镜像同步了 v0.29.0-release 分支的代码到 thomas-yanxin/accelerate

e82de1215a Release: v0.29.3
02f6abcfd2 add strict arg to load_checkpoint_and_dispatch (#2641)
fa0bd4005c fix backend check (#2670) * fix backend check * reformat backend check * Update src/accelerate/state.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * Update src/accelerate/state.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * raise value error if backend mismatch * Update src/accelerate/state.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com>
比较 3 提交 »

15 小时前

thomas-yanxin 从镜像同步了 main 分支的代码到 thomas-yanxin/accelerate

abc86c0e35 Enable BF16 autocast to everything during FP8 + some tweaks to enable FSDP (#2655) * Basic autocasting stuff * Delay fp8 autocast until after DDP wrapping * More fixes * Bookmark: without dtype change * Bookmark: with dtype changes * Different alternative, better results * Didn't matter what order, same result * Revert + maintain * Fin * Refactor based on feedback * native_amp bool * Final nits
4450cb3132 Deprecate tqdm args + slight logic tweaks (#2673) * Deprecate + slight logic fix * Maybe fix test?
fd0dcd1c45 fix backend check (#2670) * fix backend check * reformat backend check * Update src/accelerate/state.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * Update src/accelerate/state.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> * raise value error if backend mismatch * Update src/accelerate/state.py Co-authored-by: Zach Mueller <muellerzr@gmail.com> --------- Co-authored-by: Zach Mueller <muellerzr@gmail.com>
f478201c28 Pin DS...again.. (#2679)
c7046845e7 Fix deepspeed moe test with version check (#2677)
比较 6 提交 »

15 小时前

thomas-yanxin 从镜像同步了 fully-remove-accelerate-config 分支的代码到 thomas-yanxin/accelerate

c89da3b9c6 Fix import circular

15 小时前

thomas-yanxin 从镜像同步了 develop 分支的代码到 thomas-yanxin/PaddleMIX

8b896d5338 ppdiffusers 0.24.0 release (#507)
487d493cb3 [ppdiffusers] AnimateAnyone Training Support (#501) Co-authored-by: xuzhang <westfish@126.com>
比较 2 提交 »

1 天前

thomas-yanxin 从镜像同步了 update-tokenizers-version 分支的代码到 thomas-yanxin/transformers

d5cb129e3d fixup
d5a20b8e94 Accounting for the breaking change.
ea98377415 oups
cfccbcc513 "tokenizers": "tokenizers>=0.19,<0.20",
884a51f516 last two
比较 11 提交 »

1 天前

thomas-yanxin 从镜像同步了 mi300-ci 分支的代码到 thomas-yanxin/transformers

91b1769b81 Update tests/trainer/test_trainer_seq2seq.py
781dfcf476 Update tests/trainer/test_trainer_seq2seq.py
0ff348e2df Update tests/trainer/test_trainer_seq2seq.py
a89331f70d Update tests/trainer/test_trainer_seq2seq.py
8f530869ae style
比较 7 提交 »

1 天前

thomas-yanxin 从镜像同步了 main 分支的代码到 thomas-yanxin/transformers

df96438484 Fix missing `prev_ci_results` (#30313) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
ce8e64fbe2 Dev version
5728b5ad00 FIX: Fixes unexpected behaviour for Llava / LLama & AWQ Fused modules + revert #30070 at the same time (#30317) * Update awq.py * style * revert felix PR * fix * add felix comments
005b957fb8 Add DBRX Model (#29921) * wip * fix __init__.py * add docs * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * address comments 1 * work on make fixup * pass configs down * add sdpa attention * remove DbrxBlock * add to configuration_auto * docstring now passes formatting test * fix style * update READMEs * add dbrx to modeling_auto * make fix-copies generated this * add DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP * config docstring passes formatting test * rename moe_loss_weight to router_aux_loss_coef * add to flash-attn documentation * fix model-path in tests * Explicitly make `"suli"` the default `ffn_act_fn` Co-authored-by: Wing Lian <wing.lian@gmail.com> * default to using router_aux_loss_coef over ffn_config[moe_loss_weight] * fix _flash_attn_uses_top_left_mask and is_causal * fix tests path * don't use token type IDs * follow Llama and remove token_type_ids from test * init ConfigTester differently so tests pass * remove multiple choice test * remove question + answer test * remove sequence classification test * remove token classification test * copy Llama tests and remove token_type_ids from test inputs * do not test pruning or headmasking; style code * add _tied_weights_keys parameter to pass test * add type hints * fix type check * update config tester * remove masked_lm test * remove encoder tests * initialize DbrxModelTester with correct params * style * torch_dtype does not rely on torch * run make fixup, fix-copies * use https://huggingface.co/v2ray/dbrx-base-fixed/blob/main/modeling_dbrx.py * add copyright info * fix imports and DbrxRotaryEmbedding * update DbrxModel docstring * use copies * change model path in docstring * use config in DbrxFFN * fix flashattention2, sdpaattention * input config to DbrXAttention, DbrxNormAttentionNorm * more fixes * fix * fix again! * add informative comment * fix ruff? * remove print statement + style * change doc-test * fix doc-test * fix docstring * delete commented out text * make defaults match dbrx-instruct * replace `router_aux_loss_coef` with `moe_loss_weight` * is_decoder=True * remove is_decoder from configtester * implement sdpa properly * make is_decoder pass tests * start on the GenerationTesterMixin tests * add dbrx to sdpa documentation * skip weight typing test * style * initialize smaller model Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Add DBRX to toctree * skip test_new_cache_format * make config defaults smaller again * add pad_token_id * remove pad_token_id from config * Remove all references to DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP * Update src/transformers/models/dbrx/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/dbrx/modeling_dbrx.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/model_doc/dbrx.md Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Update src/transformers/models/dbrx/configuration_dbrx.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/model_doc/dbrx.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix typo * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update docs, fix configuration_auto.py * address pr comments * remove is_decoder flag * slice * fix requires grad * remove grad * disconnect differently * remove grad * enable grads * patch * detach expert * nissan al ghaib * Update modeling_dbrx.py * Update src/transformers/models/dbrx/modeling_dbrx.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * replace "Gemma" with "Dbrx" * remove # type: ignore * don't hardcode vocab_size * remove ToDo * Re-add removed idefics2 line * Update test to use tiny-random! * Remove TODO * Remove one more case of loading the entire dbrx-instruct in the tests * Update src/transformers/models/dbrx/modeling_dbrx.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * address some comments * small model * add dbrx to tokenization_auto * More docstrings with add_start_docstrings * Dbrx for now * add PipelineTesterMixin * Update src/transformers/models/dbrx/configuration_dbrx.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove flash-attn2 import error * fix docstring Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add useage example * put on one line Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix ffn_act_fn Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * change "dbrx" to "DBRX" for display purposes. * fix __init__.py? * fix __init__.py * fix README * return the aux_loss * remove extra spaces * fix configuration_auto.py * fix format in tokenization_auto * remove new line * add more useage examples --------- Co-authored-by: Abhi Venigalla <abhi.venigalla@databricks.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Eitan Turok <eitan.turok@databricks.com> Co-authored-by: Eitan Turok <150733043+eitanturok@users.noreply.github.com> Co-authored-by: Wing Lian <wing.lian@gmail.com> Co-authored-by: Eitan Turok <eitanturok@gmail.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Matt <rocketknight1@gmail.com> Co-authored-by: Your Name <you@example.com> Co-authored-by: Mihir Patel <mihir.v.patel7@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
63c5e27efb Do not drop mask with SDPA for more cases (#30311) * overlooked * style * cleaner
比较 26 提交 »

1 天前

thomas-yanxin 从镜像同步了 master 分支的代码到 thomas-yanxin/lightning

c235f20e71 Remove the requirement for FSDPStrategy subclasses to only support GPU (#19781)

1 天前

thomas-yanxin 从镜像同步了 dev 分支的代码到 thomas-yanxin/BMTrain

5dad728f5f add grad scale for optim_manager
6670f5c889 Merge pull request #188 from OpenBMB/dev Update workflow config
5fde9e009d Merge pull request #187 from OpenBMB/dev Update Release document
dd2b5bc59d Merge pull request #182 from OpenBMB/dev BMTrain New Version Release v1.0.0
584359041a Pull request template (#158)
比较 5 提交 »

1 天前

thomas-yanxin 从镜像同步了 main 分支的代码到 thomas-yanxin/lit-parrot

e1df1b4cce Update README.md
d68d4271cc proposed LitGPT logo update (#1320)
c351717d70 Add deploy studio link (#1317)
21e9a0de85 Rewrite finetune command if subcommand is not provided (#1313)
c67de02705 Litgpt serve API (#1299) Co-authored-by: Luca Antiga <luca@lightning.ai>
比较 12 提交 »

1 天前

thomas-yanxin 从镜像同步了 main 分支的代码到 thomas-yanxin/peft

144b7345c2 ENH Support safetensor in multitask_prompt_tuning (#1662) Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
bdb856786e MNT Remove dreambooth git submodule (#1660) Leftover that was not removed in BOFT PR.
ed865e2812 FIX Bug with handling of active adapters (#1659) There was a bug for some models like IA3, LoHa, etc., where calling set_adapter would not correctly update the active_adapter. This is now fixed. Note that this is not about the active_adapter attribute on PeftModel or layers, which are handled separately. This PR also ensures that LoraModel, IA3Model, etc. consistently use self.active_adapters, not self.active_adapter. The latter should be treated more like a private attribute (but this isn't changed for backwards compatibility).
比较 3 提交 »

1 天前

thomas-yanxin 从镜像同步了 main 分支的代码到 thomas-yanxin/bitsandbytes

ffd7d0db6a (docs) integrations: fix omission in bf16 related warning (#1183) * (docs) integrations: fix omission in bf16 related warning * (docs) integrations: further clarifications to prior fix * (docs) integrations: fix punctuation Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * (docs) integrations: fix omitted code formatting --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
6cecb65a56 Update pandas requirement from ~=2.2.1 to ~=2.2.2 in the major group (#1182) Updates the requirements on [pandas](https://github.com/pandas-dev/pandas) to permit the latest version. Updates `pandas` to 2.2.2 - [Release notes](https://github.com/pandas-dev/pandas/releases) - [Commits](https://github.com/pandas-dev/pandas/compare/v2.2.1...v2.2.2) --- updated-dependencies: - dependency-name: pandas dependency-type: direct:development dependency-group: major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
比较 2 提交 »

1 天前

thomas-yanxin 从镜像同步了 docs-bf16-warning-fix 分支的代码到 thomas-yanxin/bitsandbytes

1ea5f203bd (docs) integrations: fix omitted code formatting
7eb44a93d4 (docs) integrations: fix punctuation Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
比较 2 提交 »

1 天前

thomas-yanxin 从镜像同步了 main 分支的代码到 thomas-yanxin/DB-GPT

00af9fed35 fix(awel): Fix awel check for empty DataFrame data bug (#1430)
8625690107 feat(model): Support CodeQwen1.5-7B-Chat (#1431)
比较 2 提交 »

1 天前

thomas-yanxin 从镜像同步了 master 分支的代码到 thomas-yanxin/Langchain-Chatchat

2c7feae7bb Update README.md (#3804)
52f2a2a046 Update requirements.txt
9bf0612e3d Feat/action update readme (#3779)
比较 3 提交 »

1 天前

thomas-yanxin 从镜像同步了 release/2.7 分支的代码到 thomas-yanxin/PaddleNLP

904c1fb812 add checkpoint_done to last model (#8223)

1 天前

thomas-yanxin 从镜像同步了 refactor-training-loop 分支的代码到 thomas-yanxin/PaddleNLP

fc0892b901 revert fix-sync 这个分支居然没有任何拦截？
eaaccdd6cd fix-sync
比较 2 提交 »

1 天前

thomas-yanxin 从镜像同步了 develop 分支的代码到 thomas-yanxin/PaddleNLP

3bb4bb751e add a100 test ground truth (#8249) * add a100 test ground truth * add requirements * cache is_a100 result * update * update * update sp allclose * fix check_result * add ground truth for llm_gpt_dygraph
909ff315d5 Add p2p_comm_overlap for Llama-2-70b benchmark. (#8276)
beb433a9ae [LLM] add memory stats to logger of trainer (#8269)
比较 3 提交 »

1 天前

thomas-yanxin 从镜像同步了 release/2.0 分支的代码到 thomas-yanxin/swift

d1376a6ed2 bump version
5347814d4e Fix loss scale (#720) (cherry picked from commit 87d24cba18125c2ee0677121fc21bdfe51b4acdc)
5cbaf3d889 Merge commit '3fecc8cfa2d0181589d711aff3da5b6904c291ac' into release/2.0 * commit '3fecc8cfa2d0181589d711aff3da5b6904c291ac': support Codeqwen-7b-chat model (#718) Fix bugs (#714) Fix many bug (#716) fix (#711) [doc] Update index.md (#709) support Llava-v1.6-34b model (#708) Support mPLUG-Owl2 (#706) fix minicpm-v-v2 bug (#703) fix readme (#704) Drop data by gradient_accumulation_steps (#626) Fix stream 0415 (#702) feat(model): support minicpm-v-2 (#699) bump version # Conflicts: # docs/source/Multi-Modal/minicpm-v-2最佳实践.md # swift/llm/utils/template.py # swift/version.py
3fecc8cfa2 support Codeqwen-7b-chat model (#718)
fde8927024 Fix bugs (#714)
比较 16 提交 »

1 天前