A review of low bit-rate speech codec for satellite communication

Wei Chenguang; Xu Jiayi; Guo Meng; Yang Lei

doi:10.11959/j.issn.1000-0801.2026098

Expert Views | Updated: 2026-03-05

- Telecommunications Science Vol. 42, Issue 2, Pages: 1-12(2026)
- Author affiliation: China Mobile Research Institute, Beijing 100053
- Funding: The National Natural Science Foundation of China (U21B2004)
- DOI: 10.11959/j.issn.1000-0801.2026098
  CLC: TN927+.2; TP393
- Received: 04 January 2026, Revised: 13 January 2026, Accepted: 15 January 2026, Published: 20 February 2026

Wei Chenguang, Xu Jiayi, Guo Meng, et al. A review of low bit-rate speech codec for satellite communication[J]. Telecommunications Science, 2026, 42(02): 1-12. DOI: 10.11959/j.issn.1000-0801.2026098.

Abstract

With the advancement of the space-ground integrated information network, device-to-satellite communication is transitioning from concept to reality, and phones that connect directly to satellites are becoming widespread. Achieving stable and clear voice communication over resource-limited satellite links is a key challenge for satellite voice services. Because satellite channels suffer from limited bandwidth, high path loss, and long transmission delays, the speech codecs used in terrestrial cellular networks cannot be applied directly to satellite communication scenarios. Low bit-rate speech coding is therefore crucial for satellite voice services. This review systematically summarizes low bit-rate speech coding technologies for satellite communication, introduces the principles, characteristics, and performance evaluation of the mainstream technical routes, analyzes the advantages and disadvantages of each method, and outlines future research directions.
