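Lin Lang (born 1994), male, master's student at the College of Information Science and Engineering, Ningbo University. His research interests include multimedia communication and information security.
Wang Rangding (born 1962), male, Ph.D., professor and doctoral supervisor at the College of Information Science and Engineering, Ningbo University. His research interests include multimedia communication and forensics, information hiding and steganalysis, and intelligent meter reading and sensor network technology.
Yan Diqun (born 1979), male, Ph.D., associate professor and master's supervisor at the College of Information Science and Engineering, Ningbo University. His research interests include multimedia communication, information security, and deep-learning-based digital speech forensics.
Li Can (born 1992), female, master's student at the College of Information Science and Engineering, Ningbo University. Her research interests include multimedia communication and information security.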
Online publication date: 2018-05
Print publication date: 2018-05-20
Lang LIN, Rangding WANG, Diqun YAN, et al. A playback speech detection algorithm based on log inverse Mel-frequency spectral coefficient[J]. Telecommunications Science, 2018, 34(5): 90-98. DOI: 10.11959/j.issn.1000-0801.2018020.
The widespread availability and portability of high-fidelity recording and playback devices pose a serious challenge to the ability of speaker recognition systems to resist playback speech attacks. Spectrogram analysis shows that original speech and playback speech differ in the high-frequency region. To exploit this difference, the algorithm inverts the Mel filter bank used in computing Mel-frequency cepstral coefficients (MFCC) and takes the log Mel spectral coefficients obtained before the DCT as its features. A support vector machine (SVM) then classifies the speech under test. Experimental results show that the algorithm detects playback speech effectively. In addition, integrating the algorithm into a GMM-UBM speaker recognition system significantly improves the system's resistance to playback attacks.
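The feature extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the common Mel mapping `2595 * log10(1 + f / 700)`, triangular filters, and a "flip" of the bank in both filter order and frequency (the flipped-filter-bank idea), and every function name and parameter value here is hypothetical. Framing, windowing, the FFT, and the SVM classifier are left out; the input is assumed to be the power spectrum of one frame.

```python
import math

def hz_to_mel(f):
    # Common Mel-scale convention (an assumption; the abstract gives no formula).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular Mel filters over the n_fft // 2 + 1 non-negative FFT bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced on the Mel scale.
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for i in range(n_filters):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def inverted_mel_filterbank(n_filters, n_fft, sample_rate):
    """Mirror the Mel bank so the narrow filters sit at high frequencies."""
    bank = mel_filterbank(n_filters, n_fft, sample_rate)
    # Reverse the filter order and the frequency axis of each filter.
    return [row[::-1] for row in reversed(bank)]

def log_filterbank_energies(power_spectrum, bank, floor=1e-10):
    """Log filter-bank energies for one frame; no DCT is applied."""
    feats = []
    for row in bank:
        energy = sum(w * p for w, p in zip(row, power_spectrum))
        feats.append(math.log(max(energy, floor)))  # floor avoids log(0)
    return feats

# Illustrative values only: 8 filters, 64-point FFT, 16 kHz sampling.
inv_bank = inverted_mel_filterbank(8, 64, 16000)
frame = [0.0] * 33
frame[31] = 1.0  # energy concentrated near 7.75 kHz
features = log_filterbank_energies(frame, inv_bank)
```

In this sketch the last inverted filters are the narrow ones, so a frame with energy concentrated near the top of the band produces its largest coefficients at the end of the feature vector. Per-frame vectors like `features` would then be passed to a classifier such as an SVM, as the abstract describes.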