PCL-张晗 9338f32d36 Update 'README.md' 3 months ago
trlx update 3 months ago
README.md Update 'README.md' 3 months ago
dpo_accelerate_config.yaml update 3 months ago
dpo_bf16_accelerate_config.yaml update 3 months ago
train_DPO.py update 3 months ago

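This repository reproduces a series of offline alignment algorithms, including DPO, PRO, RRHF, and SPIN. It also includes CPPO, which our team published at ICLR 2024, and our latest research work, COPR. Discussion and feedback are welcome.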

Python
