Hanlard 0e55ac941f update 3 months ago
__init__.py update 3 months ago
accelerate_base_datatypes.py update 3 months ago
configs.py update 3 months ago
default_configs.py update 3 months ago
dpo_types.py update 3 months ago
ilql_types.py update 3 months ago
method_configs.py update 3 months ago
ppo_types.py update 3 months ago
spin_types.py update 3 months ago

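This repository reproduces a series of offline alignment algorithms, including DPO, PRO, RRHF, and SPIN. It also includes our team's CPPO, published at ICLR 2024, and our latest research work, COPR. Discussion is welcome.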

Python

Contributors (2)