Spoofing speech detection algorithm based on joint feature and random forest

Jiaqi YU; Zhihua JIAN; Jia XU; Lin YOU; Yunlu WANG; Chao WU

doi:10.11959/j.issn.1000-0801.2022089

Research and Development | Updated: 2024-06-05

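- Spoofing speech detection algorithm based on joint feature and random forest
- Telecommunications Science, 2022, Vol. 38, No. 6, pp. 91-99
- Author affiliations:
  1. School of Communication Engineering, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
  2. School of Cyberspace Security, Hangzhou Dianzi University, Hangzhou 310018, Zhejiang, China
- Author biographies:
  - Jiaqi YU (于佳祺, born 1997), male, master's student at the School of Communication Engineering, Hangzhou Dianzi University. Research interests: spoofing speech detection, feature extraction and analysis.
  - Zhihua JIAN (简志华, born 1978), male, Ph.D., associate professor and master's supervisor at the School of Communication Engineering, Hangzhou Dianzi University. Research interests: voice conversion, spoofing speech detection, voiceprint recognition.
  - Jia XU (徐嘉, born 1998), female, master's student at the School of Communication Engineering, Hangzhou Dianzi University. Research interests: speech spoofing and its detection.
  - Lin YOU (游林, born 1966), male, Ph.D., professor and master's supervisor at the School of Cyberspace Security, Hangzhou Dianzi University. Research interests: biological information processing, information security, cryptography.
  - Yunlu WANG (汪云路, born 1980), female, Ph.D., lecturer at the School of Cyberspace Security, Hangzhou Dianzi University. Research interests: audio information processing, information hiding.
  - Chao WU (吴超, born 1988), male, Ph.D., lecturer at the School of Communication Engineering, Hangzhou Dianzi University. Research interests: navigation signal processing, spoofing interference detection.
- Funding: The National Natural Science Foundation of China (61201301, 61772166, 61901154)
- Chinese Library Classification (CLC) number: TP391.42
- Online publication date: 2022-06
- Print publication date: 2022-06-20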
Jiaqi YU, Zhihua JIAN, Jia XU, et al. Spoofing speech detection algorithm based on joint feature and random forest[J]. Telecommunications Science, 2022, 38(6): 91-99. DOI: 10.11959/j.issn.1000-0801.2022089.

Abstract

To describe the characteristic information of speech signals more comprehensively and to improve the spoofing detection rate, a spoofing speech detection method is proposed that combines uniform local binary pattern texture features with constant Q cepstral coefficient acoustic features and uses a random forest as the classification model. The uniform local binary pattern is used to extract a texture feature vector from the spectrogram of the speech signal, and this vector is combined with the constant Q cepstral coefficients to form a joint feature. The joint feature vectors are then used to train a random forest classifier, which performs the spoofing speech detection. In the experiments, the proposed system was compared with several spoofing detection systems built from other feature parameters and from a support vector machine classifier model. The results show that the system combining the joint feature with the random forest model achieved the best detection performance among the systems compared.
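The feature construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes an 8-neighbour, 3×3 approximation of the local binary pattern, a 59-bin uniform-pattern histogram (58 uniform codes plus one shared bin for all other codes), and plain concatenation as the way the texture vector and the cepstral coefficients are joined. The function names (`transitions`, `ulbp_histogram`, `joint_feature`) and the toy inputs are hypothetical, and the cepstral values are placeholders rather than real constant Q cepstral coefficients.

```python
"""Sketch: uniform LBP histogram of a spectrogram joined with cepstral features."""


def transitions(code, bits=8):
    """Count circular 0/1 transitions in a `bits`-bit pattern."""
    mask = (1 << bits) - 1
    rotated = ((code << 1) | (code >> (bits - 1))) & mask  # circular left shift
    return bin(code ^ rotated).count("1")


# A pattern is "uniform" if it has at most two circular bit transitions.
# Each uniform code gets its own bin; all other codes share the last bin.
UNIFORM_CODES = [c for c in range(256) if transitions(c) <= 2]
BIN_OF = {c: i for i, c in enumerate(UNIFORM_CODES)}
N_BINS = len(UNIFORM_CODES) + 1

# Clockwise 3x3 neighbourhood offsets (row, col); an assumed approximation
# of a radius-1 circular neighbourhood, without interpolation.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def ulbp_histogram(image):
    """Return the normalised uniform-LBP histogram of a 2-D list of numbers."""
    rows, cols = len(image), len(image[0])
    hist = [0.0] * N_BINS
    for r in range(1, rows - 1):          # skip the border pixels
        for c in range(1, cols - 1):
            centre = image[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(NEIGHBOURS):
                if image[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            hist[BIN_OF.get(code, N_BINS - 1)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def joint_feature(spectrogram, cepstral):
    """Concatenate the texture histogram with a cepstral vector (assumed scheme)."""
    return ulbp_histogram(spectrogram) + list(cepstral)


if __name__ == "__main__":
    # Toy 5x6 "spectrogram" and placeholder cepstral values, for illustration only.
    toy_spectrogram = [[(3 * r + 5 * c) % 7 for c in range(6)] for r in range(5)]
    toy_cepstral = [0.1, -0.2, 0.3]
    vector = joint_feature(toy_spectrogram, toy_cepstral)
    print(len(vector))  # histogram bins plus the three placeholder values
```

Joint vectors built this way, one per utterance with a bona fide or spoofed label, could then be passed to any random forest implementation, for example scikit-learn's `RandomForestClassifier`. How the paper normalises, weights, or reduces the dimensionality of the two feature sets before joining them is not stated on this page, so the concatenation above should be read as one plausible choice.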
