Are you sure you want to delete this task? Once this task is deleted, it cannot be recovered.
WANGW09 3f4e65f5f4 | 2 years ago | |
---|---|---|
.. | ||
__pycache__ | 2 years ago | |
README.md | 2 years ago | |
__init__.py | 2 years ago | |
func.py | 2 years ago | |
parallel.py | 2 years ago | |
sync.py | 2 years ago | |
sync_bn.py | 2 years ago | |
test.py | 2 years ago | |
utils.py | 2 years ago |
Multi-Gpus Synchronized Batch Normalization implementation in PyTorch
This module is a synchronized version of Batch Normalization when using multi-gpus for deep learning, aka 'Syn-BN', as the mean and standard-deviation are reduced across all devices during training.
Traditionally, when using 'nn.DataParallel' to wrap module during training, the built-in PyTorch BatchNorm normalize the tensor on each device using the statistics only on that device, thus the statistics might be inaccurate.
Instead, in this synchronized version, the statistics will be computed over all training samples distributed on each devices.
Besides, in single-gpu or cpu-only case, this module behaves exactly same as the built-in PyTorch implementation.
Note that this module may exist some design problems, if you have any questions or suggestions, please feel free to open an issue or submit a pull request, let's make it better!
Usually, the working batch-size is typically large enough to obtain good statistics for some computer vision tasks, such as classification and detection, thus there is no need to synchronize BN layer during the training, while synchronization will slow down the training.
However, for the other computer vision tasks, such as semantic segmentation, which belongs to dense prediction problem, is very memory consuming, the working bath-size is usually very small(typically 2 or 4 in each GPU), thus it will hurt the performance without synchronization.
(The importance of synchronized batch normalization in object detection has been proved with an extensive analysis in the paper https://arxiv.org/abs/1711.07240)
To use the Syn-BN, I customize a data parallel wrapper named 'DataParallelWithCallBack', which inherits nn.DataParallel, it will call a callback function when in data parallel replication. This introduces a slight difference with typical usage of the nn.DataParallel.
Use it with a provided, customized data parallel wrapper:
from sync import DataParallelWithCallBack
from sync_bn import SynchronizedBatchNorm2d
sync_bn = SynchronizedBatchNorm2d(
num_features=3, eps=1e-5, momentum=0.1, affine=True, sync_timeout=15.
)
sync_bn = DataParallelWithCallBack(sync_bn, device_ids=[0, 1])
sync_bn.to(device)
Or, if you have already defined a model wrapped in nn.DataParallel like:
from torchvision import models
m = models.resnet50(pretrained=True)
m = nn.DataParallel(m, device_ids=[0,1])
m.to(device)
then you can use the method 'convert_model' to convert your model to use Syn-BN easily:
from func import convert_model
m = convert_model(m)
this will change all BNs into Syn-BNs which is contained in your model.
chrisway(cw), 2020.
动态注意力图时空网络(DAGSTNet)
Pickle Text Python other
Dear OpenI User
Thank you for your continuous support to the Openl Qizhi Community AI Collaboration Platform. In order to protect your usage rights and ensure network security, we updated the Openl Qizhi Community AI Collaboration Platform Usage Agreement in January 2024. The updated agreement specifies that users are prohibited from using intranet penetration tools. After you click "Agree and continue", you can continue to use our services. Thank you for your cooperation and understanding.
For more agreement content, please refer to the《Openl Qizhi Community AI Collaboration Platform Usage Agreement》