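TasNet uses an encoder-decoder framework to model the signal directly in the time domain and performs source separation on the non-negative encoder outputs. This approach removes the frequency decomposition step and reduces the separation problem to estimating source masks on the encoder outputs, which the decoder then synthesizes into waveforms. The system lowers the computational cost of speech separation and significantly reduces the minimum latency required for the output. TasNet is suitable for applications that need low-power, real-time implementations, such as hearable and telecommunication devices.
Paper: TASNET: TIME-DOMAIN AUDIO SEPARATION NETWORK FOR REAL-TIME, SINGLE-CHANNEL SPEECH SEPARATION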
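encoder: extracts speech features
separation: feeds the encoder output into a 4-layer LSTM and performs the separation
decoder: processes the separation result to produce the speech waveforms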
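The three stages above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in tasnet.py: the helper names (`segment`, `encode`, `softmax_masks`, `decode`) are hypothetical, the encoder is simplified to a rectified linear projection, segments are assumed not to overlap, and random scores stand in for the 4-layer LSTM separator.

```python
import math
import random

def segment(signal, L):
    """Split a waveform into non-overlapping length-L segments, zero-padding the tail."""
    padded = list(signal) + [0.0] * (-len(signal) % L)
    return [padded[i:i + L] for i in range(0, len(padded), L)]

def encode(seg, U):
    """Project a segment onto the N rows of U and rectify, so the weights are non-negative."""
    return [max(0.0, sum(u * x for u, x in zip(row, seg))) for row in U]

def softmax_masks(scores):
    """Turn scores[speaker][basis] into masks that sum to 1 over speakers for each basis."""
    nspk, N = len(scores), len(scores[0])
    masks = [[0.0] * N for _ in range(nspk)]
    for n in range(N):
        exps = [math.exp(scores[s][n]) for s in range(nspk)]
        total = sum(exps)
        for s in range(nspk):
            masks[s][n] = exps[s] / total
    return masks

def decode(weights, B):
    """Synthesize a length-L segment as the weighted sum of the N basis signals in B."""
    return [sum(w * row[t] for w, row in zip(weights, B)) for t in range(len(B[0]))]

random.seed(0)
L, N, nspk = 4, 6, 2  # toy sizes chosen for illustration only
U = [[random.uniform(-1, 1) for _ in range(L)] for _ in range(N)]  # encoder basis (N x L)
B = [[random.uniform(-1, 1) for _ in range(L)] for _ in range(N)]  # decoder basis (N x L)
mixture = [random.uniform(-1, 1) for _ in range(10)]

separated = [[] for _ in range(nspk)]
for seg in segment(mixture, L):
    w = encode(seg, U)
    # Stand-in for the LSTM separator: random scores instead of learned ones.
    scores = [[random.gauss(0, 1) for _ in range(N)] for _ in range(nspk)]
    masks = softmax_masks(scores)
    for s in range(nspk):
        masked = [m * x for m, x in zip(masks[s], w)]
        separated[s].extend(decode(masked, B))
```

Because the masks sum to 1 for every basis signal and the decoder is linear, the separated outputs in this sketch add up to the decoder's reconstruction of the unmasked encoder output.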
The dataset used is LibriMix, an open-source dataset for source separation in noisy environments.
To generate LibriMix, follow the open-source project: https://github.com/JorisCos/LibriMix
pip install -r requirements.txt
TasNet
├── ascend310_infer
├─ build.sh # launch main.cc
├─ CMakeLists.txt # CMakeLists
├─ main.cc # 310 main function
├─ requirements.txt # requirements
├─ README.md # descriptions
├── scripts
├─ run_distribute_train.sh # launch ascend training(8 pcs)
├─ run_standalone_train.sh # launch ascend training(1 pcs)
├─ run_eval.sh # launch ascend eval
├─ run_infer_310.sh # launch infer 310
├─ train.py # train script
├─ eval.py # eval
├─ preprocess.py # preprocess json
├─ data.py # postprocess data
├─ export.py # export mindir script
├─ network_define.py # define network
├─ tasnet.py # tasnet
├─ Loss.py # loss function
├─ train_wrapper.py # clip norm function
├─ preprocess_310.py # preprocess of 310
├─ postprocess.py # postprocess of 310
The parameters for data preprocessing, training, and evaluation are set in train.py and the other scripts.
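Data preprocessing parameters
in_dir directory of the raw dataset loaded before preprocessing
out_dir directory for the JSON files produced by preprocessing
sample_rate sampling rate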
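To make the role of these three parameters concrete, here is a hedged sketch of a preprocessing step that scans a directory of .wav files and writes a JSON manifest. The function name `build_manifest`, the `[path, num_samples]` entry format, and the 8000 Hz default are assumptions made for illustration; the actual preprocess.py may organize its output differently.

```python
import json
import os
import wave

def build_manifest(in_dir, out_dir, name, sample_rate=8000):
    """Write out_dir/<name>.json listing [path, num_samples] for each .wav file in in_dir."""
    entries = []
    for fname in sorted(os.listdir(in_dir)):
        if not fname.lower().endswith(".wav"):
            continue
        path = os.path.abspath(os.path.join(in_dir, fname))
        with wave.open(path, "rb") as wav:
            # Reject files whose sampling rate differs from the expected one.
            if wav.getframerate() != sample_rate:
                raise ValueError(
                    f"{path}: expected {sample_rate} Hz, got {wav.getframerate()} Hz"
                )
            entries.append([path, wav.getnframes()])
    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, name + ".json")
    with open(out_path, "w") as f:
        json.dump(entries, f, indent=2)
    return out_path
```

Storing the sample count next to each path lets a data loader sort or batch utterances by length without reopening the audio files.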
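Training and model parameters
in_dir directory of the raw dataset loaded before preprocessing
out_dir directory for the JSON files produced by preprocessing
train_dir training set
sample_rate sampling rate
data_url path to the training data on the cloud
train_url storage location for the trained model on the cloud
L length of each speech segment
N number of basis signals
hidden_size number of hidden units in each LSTM layer
num_layers number of LSTM layers
bidirectional whether the LSTM is bidirectional
nspk number of speakers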
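The model parameters determine the tensor sizes that flow through the network. The sketch below works through that arithmetic under two assumptions: segments do not overlap, and the separator emits one mask value per speaker per basis signal. The class `TasNetDims` is hypothetical, and the numbers passed to it are illustrative placeholders, not the defaults in train.py.

```python
import math
from dataclasses import dataclass

@dataclass
class TasNetDims:
    L: int               # samples per speech segment
    N: int               # number of basis signals
    hidden_size: int     # LSTM hidden units per direction
    num_layers: int      # number of stacked LSTM layers
    bidirectional: bool  # whether each LSTM layer runs in both directions
    nspk: int            # number of speakers

    def num_segments(self, num_samples):
        """Segments produced from a waveform, with the last one zero-padded."""
        return math.ceil(num_samples / self.L)

    def lstm_output_size(self):
        """Feature size per segment after the LSTM stack."""
        return self.hidden_size * (2 if self.bidirectional else 1)

    def mask_output_size(self):
        """One mask value per speaker and basis signal."""
        return self.nspk * self.N

# Placeholder values for illustration only.
dims = TasNetDims(L=40, N=500, hidden_size=500, num_layers=4, bidirectional=True, nspk=2)
print(dims.num_segments(32000))  # 800
print(dims.lstm_output_size())   # 1000
print(dims.mask_output_size())   # 1000
```

A smaller L yields more segments per utterance, so the LSTM processes a longer sequence for the same audio duration.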
Evaluation parameters
model_path path to the ckpt file
cal_sdr whether to compute SDR
data_dir path to the test set
Configuration parameters
device_target hardware target; only Ascend is supported
device_id device ID
Data preprocessing example:
python preprocess.py
Preprocessing is quick and takes roughly a few minutes.
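Training example:
python train.py
Alternatively, run the script:
bash ./scripts/run_standalone_train.sh [DEVICE_ID]
The results can be viewed in train.log.
The distributed training script is launched as follows:
bash run_distribute_train.sh [DEVICE_NUM] [RANK_TABLE_FILE]
The results can be viewed in paralletrain.log in the folder for the corresponding device ID.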
Evaluation example:
python eval.py
Parameters:
model_path ckpt file
data_dir path to the test set
Alternatively, run the script:
bash run_eval.sh [DEVICE_ID]
The results can be viewed in eval.log.
python export.py
./scripts/run_infer_310.sh [MINDIR_PATH] [TEST_PATH] [NEED_PREPROCESS]
Average SISNR improvement: 5.97
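For reference, the SI-SNR improvement reported above is the SI-SNR of the separated estimate minus the SI-SNR of the unprocessed mixture, both measured against the clean reference. The following is a minimal pure-Python sketch of the standard SI-SNR definition; the function names and the `eps` stabilizer are assumptions, and the repository's Loss.py and eval.py may compute the metric differently (for example, on batched tensors).

```python
import math

def si_snr(estimate, reference, eps=1e-8):
    """Scale-invariant signal-to-noise ratio in dB between two equal-length signals."""
    n = len(reference)
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    # Remove the mean so the measure ignores any DC offset.
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]
    # Project the estimate onto the reference to get the target component.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref) + eps
    target = [dot / ref_energy * r for r in ref]
    noise = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    noise_energy = sum(x * x for x in noise) + eps
    return 10 * math.log10(target_energy / noise_energy + eps)

def si_snr_improvement(estimate, mixture, reference):
    """Gain in SI-SNR of the separated estimate over the unprocessed mixture."""
    return si_snr(estimate, reference) - si_snr(mixture, reference)
```

Since the estimate is projected onto the reference before the ratio is taken, rescaling the estimate leaves the score essentially unchanged, which is what makes the metric scale-invariant.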
Parameter | TasNet |
---|---|
Resource | Ascend910 |
Upload date | 2022-9-2 |
MindSpore version | 1.6.1 |
Dataset | LibriMix |
Training parameters | 8p, epoch = 50, batch_size = 4 |
Optimizer | Adam |
Loss function | SI-SNR |
Output | SI-SNR (5.97) |
Loss | -9.52 |
Speed | 8p 5444.8 ms/step |
Total training time | 8p: about 36 h |
Randomness mainly comes from the following two sources: