[ "任华健(1994- ),男,湖州师范学院硕士生,主要研究方向为自然语言处理、中文输入法" ]
[ "郝秀兰(1970- ),博士,女,湖州师范学院副教授、硕士生导师,主要研究方向为智能信息处理、数据与知识工程、自然语言理解等" ]
[ "徐稳静(1998- ),女,湖州师范学院硕士生,主要研究方向为自然语言处理、虚假新闻检测" ]
Online publication date: 2022-12
Print publication date: 2022-12-20
Huajian REN, Xiulan HAO, Wenjing XU. Deep learning Chinese input method with incremental vocabulary selection[J]. Telecommunications science, 2022, 38(12): 56-64. DOI: 10.11959/j.issn.1000-0801.2022294.
The core task of an input method is to convert the keystroke sequence typed by the user into a Chinese character sequence. Input methods based on deep learning have advantages in learning long-range dependencies and alleviating data sparsity, but existing approaches still have two shortcomings: the pipeline that separates pinyin segmentation from pinyin-to-character conversion leads to error propagation, and the models are too complex to meet the real-time requirements of an input method. To address these shortcomings, a deep learning input method model incorporating an incremental vocabulary selection algorithm was proposed, and several softmax optimization methods were compared. Experiments on People's Daily and Chinese Wikipedia data show that the model improves conversion accuracy by 15% over the current state-of-the-art model, and the incremental vocabulary selection algorithm makes the model 130 times faster without any loss of conversion accuracy.
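The reported speed-up comes from restricting the output softmax at each decoding step to the characters that are consistent with the current pinyin syllable, instead of scoring the full character vocabulary. The sketch below illustrates this idea in Python with NumPy; the toy vocabulary size, the `pinyin_to_char_ids` table, and the character ids are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of incremental vocabulary selection for a neural IME.
# Names, sizes, and the syllable-to-character table are assumptions for
# illustration only; they are not taken from the paper.
import numpy as np

VOCAB_SIZE = 10000   # toy size for the full Chinese character vocabulary
HIDDEN_DIM = 128

rng = np.random.default_rng(0)
W_out = rng.standard_normal((VOCAB_SIZE, HIDDEN_DIM)) * 0.01  # output projection
b_out = np.zeros(VOCAB_SIZE)

# Hypothetical lookup: each pinyin syllable maps to the small set of
# character ids that can legally realize it.
pinyin_to_char_ids = {
    "zhong": np.array([10, 42, 873]),
    "guo":   np.array([11, 305, 2048]),
}

def full_softmax_step(hidden):
    """Baseline: score every character in the vocabulary, O(|V|)."""
    logits = W_out @ hidden + b_out
    probs = np.exp(logits - logits.max())
    return probs / probs.sum()

def incremental_step(hidden, syllable):
    """Incremental vocabulary selection: score only the characters
    consistent with the current pinyin syllable, O(|candidates|)."""
    cand = pinyin_to_char_ids[syllable]
    logits = W_out[cand] @ hidden + b_out[cand]
    probs = np.exp(logits - logits.max())
    return cand, probs / probs.sum()

hidden = rng.standard_normal(HIDDEN_DIM)   # decoder state from the neural model
cand, probs = incremental_step(hidden, "zhong")
print("best candidate id:", cand[np.argmax(probs)])
```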