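Reference example:
AISynergy/examples/quickstart_pytorch
Add the following parameters to the server-side interface to configure the encryption module and the compression module: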
syncore.server.run_server(
...
mask_protocol='DHProtocol',
compression_protocol='TopKProtocol',
server_with_compression=True,
client_with_compression=True,
)
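mask_protocol: name of the encryption mask protocol (TODO: only Diffie-Hellman key exchange is supported for now)
compression_protocol: name of the compression protocol (TODO: only top-k is supported for now)
server_with_compression: whether the server-side model should be compressed
client_with_compression: whether the client-side model should be compressed
Once these interface parameters are configured on the server, it creates a default DHProtocol instance for mask computation and a default TopKCompressionProtocol instance for communication compression.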
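To illustrate the idea behind a Diffie-Hellman based mask, here is a minimal sketch. It is not the DHProtocol implementation, whose internals are not described here. The toy group parameters `P` and `G`, the helpers `dh_keypair` and `shared_mask`, and the use of the shared secret as a NumPy random seed are all assumptions made for illustration; a real deployment would use a vetted cryptographic library and a proper key derivation function.

```python
import secrets

import numpy as np

# Toy group parameters (assumption, for illustration only; not secure choices).
P = 2**127 - 1  # a Mersenne prime
G = 3


def dh_keypair():
    """Return a (private, public) Diffie-Hellman key pair."""
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def shared_mask(own_private, peer_public, shape):
    """Derive a mask from the DH shared secret (hypothetical helper)."""
    secret = pow(peer_public, own_private, P)  # identical on both sides
    rng = np.random.default_rng(secret)        # shared secret used as seed
    return rng.standard_normal(shape)


# Two clients exchange public keys only.
a_private, a_public = dh_keypair()
b_private, b_public = dh_keypair()

w_a = np.array([1.0, 2.0, 3.0])
w_b = np.array([0.5, 0.5, 0.5])

mask_a = shared_mask(a_private, b_public, w_a.shape)
mask_b = shared_mask(b_private, a_public, w_b.shape)

# One client adds the mask, the other subtracts it, so the masks cancel in the sum.
masked_a = w_a + mask_a
masked_b = w_b - mask_b
aggregate = masked_a + masked_b
```

Under these assumptions the server sees only the masked updates, while their sum still matches the sum of the unmasked weights up to floating-point rounding.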
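Top-k compression protocol:
When the top-k protocol name is configured on the server-side interface, a TopKCompressionProtocol instance with default parameters is created on both the server and the client to handle top-k compression.
TopKCompressionProtocol is defined in
AISynergy/AISynergy-core/src/AISyncore/common/compression/topk_compression_protocol.py
compress_ratio currently defaults to 0.5; you can change the default value to run comparison experiments.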
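As a rough illustration of what top-k compression does, the sketch below keeps the largest-magnitude entries of an array and restores the rest as zeros. It is not the TopKCompressionProtocol code. The function names `topk_compress` and `topk_decompress` are hypothetical, and it assumes that compress_ratio is the fraction of entries kept, which this document does not state.

```python
import numpy as np


def topk_compress(weights, compress_ratio=0.5):
    """Keep the k largest-magnitude entries (assumes ratio = fraction kept)."""
    flat = weights.ravel()
    k = min(flat.size, max(1, int(flat.size * compress_ratio)))
    # argpartition places the k largest magnitudes in the last k positions.
    indices = np.argpartition(np.abs(flat), flat.size - k)[flat.size - k:]
    indices.sort()
    return flat[indices], indices, weights.shape


def topk_decompress(values, indices, shape):
    """Rebuild a dense array; dropped entries are restored as zeros."""
    flat = np.zeros(int(np.prod(shape)), dtype=values.dtype)
    flat[indices] = values
    return flat.reshape(shape)


w = np.array([[0.1, -3.0, 0.2], [2.0, -0.05, 0.5]], dtype=np.float32)
values, indices, shape = topk_compress(w, compress_ratio=0.5)
restored = topk_decompress(values, indices, shape)
```

With a ratio of 0.5 on this 6-element array, three entries survive; everything else comes back as zero, which is the source of the accuracy cost that the comparison experiments would measure.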
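Adding another compression protocol:
Use the TopKCompressionProtocol implementation as a reference.
1 Subclass CompressionProtocol, implement compress and decompress, and register the protocol name, for example:
compression_protocol_register.register('TopKProtocol', TopKCompressionProtocol)
2 Add the data format for the compressed parameters, the data format for the gRPC communication parameters, and the serialization methods:
Compressed parameter data format: AISynergy/AISynergy-core/src/AISyncore/common/typing.py
gRPC communication parameter data format: AISynergy/AISynergy-core/src/AISyncore/proto/transport.proto
Serialization methods: AISynergy/AISynergy-core/src/AISyncore/common/serde.py
3 Add conversion methods between the compressed parameter data format and the numpy data format:
(AISynergy/AISynergy-core/src/AISyncore/common/parameter.py)
weights_to_parameters
parameters_to_weights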
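Step 1 above can be sketched as follows. The `CompressionProtocol` base class and the register object defined here are local stand-ins so that the example runs on its own; the real ones live in AISyncore and their method signatures may differ. `Float16CompressionProtocol`, the `'Float16Protocol'` name, and the `get` lookup method are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod

import numpy as np


class CompressionProtocol(ABC):
    """Stand-in for the AISyncore base class (signatures are assumptions)."""

    @abstractmethod
    def compress(self, weights):
        ...

    @abstractmethod
    def decompress(self, compressed):
        ...


class ProtocolRegister:
    """Stand-in registry mapping protocol names to classes."""

    def __init__(self):
        self._protocols = {}

    def register(self, name, protocol_cls):
        self._protocols[name] = protocol_cls

    def get(self, name):  # hypothetical lookup method
        return self._protocols[name]


compression_protocol_register = ProtocolRegister()


class Float16CompressionProtocol(CompressionProtocol):
    """Hypothetical protocol: halve the payload by casting to float16."""

    def compress(self, weights):
        return [w.astype(np.float16) for w in weights]

    def decompress(self, compressed):
        return [c.astype(np.float32) for c in compressed]


compression_protocol_register.register('Float16Protocol', Float16CompressionProtocol)

# Look the protocol up by name, as the server-side configuration would.
protocol = compression_protocol_register.get('Float16Protocol')()
weights = [np.array([0.5, -1.25, 3.0], dtype=np.float32)]
compressed = protocol.compress(weights)
restored = protocol.decompress(compressed)
```

A real protocol would also need the data format, gRPC message, and serialization changes from steps 2 and 3, which this sketch leaves out.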
Test: compressing and then decompressing 100 numpy.float32 arrays of shape (10, 1000) takes about 1 s in total:
start encode 1649987906.2903526
TopKCompressionProtocol.compress duration(s): 0.052579641342163086
weights_to_parameters serde duration(s): 0.664482593536377
finish encode, start decode 1649987906.9550595
TopKCompressionProtocol.decompress duration(s): 0.03636360168457031
parameters_to_weights serde duration(s): 0.23481202125549316
finish decode 1649987907.197273
encode dureation(s): 0.6647069454193115
decode duration(s): 0.24221348762512207
total duration(s): 0.9069204330444336
Done
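The bottleneck is the conversion of the compressed data format.
Cause: in the current top-k implementation, once the position information (indices) has been built, it has to be converted into a mask with 1 bit per element. A raw bool value is stored as a char and takes 8 bits, so every 8 bool values are packed into one mask byte; this conversion is the performance bottleneck.
Possible solutions: 1. Run this conversion function in parallel across multiple processes, or optimize it through a C++ interface.
2. When the compression ratio is very small (few parameters remain after compression), send the int position information (indices) directly, with no conversion needed.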
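The two ideas above can be sketched with NumPy. This is not the project's conversion code: it swaps in NumPy's built-in `np.packbits` and `np.unpackbits` for the bool-to-bit packing, and the helper names as well as the 4-byte index size are assumptions. Whether this removes the measured bottleneck has not been tested here.

```python
import numpy as np


def indices_to_packed_mask(indices, size):
    """Pack kept positions into a mask with 1 bit per element."""
    mask = np.zeros(size, dtype=bool)
    mask[indices] = True
    return np.packbits(mask)  # 8 bool values per output byte


def packed_mask_to_indices(packed, size):
    """Recover the kept positions from the packed mask."""
    mask = np.unpackbits(packed, count=size).astype(bool)
    return np.flatnonzero(mask)


def cheaper_encoding(size, k, index_bytes=4):
    """Compare payload sizes: raw indices vs. packed mask (assumes 4-byte ints)."""
    mask_bytes = (size + 7) // 8
    return 'indices' if k * index_bytes < mask_bytes else 'mask'


indices = np.array([0, 3, 19])
packed = indices_to_packed_mask(indices, size=20)
recovered = packed_mask_to_indices(packed, size=20)

sparse_choice = cheaper_encoding(size=10_000, k=10)    # very small ratio
dense_choice = cheaper_encoding(size=10_000, k=5_000)  # ratio of 0.5
```

Under the 4-byte assumption, sending raw indices is the smaller payload only when fewer than roughly 1 in 32 entries are kept; above that, the packed mask is smaller.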
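More efficiency-versus-accuracy experiments remain to be run.
Please add top-k functional test results (compression efficiency, time cost, accuracy change, etc. across different environments and model sizes).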
Encryption module schematic: