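1. School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083
2. Shunde Graduate School, University of Science and Technology Beijing, Foshan, Guangdong 528399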
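Yue CHEN (1998- ), female, is a master's student at the School of Computer and Communication Engineering, University of Science and Technology Beijing. Her main research interests are computer vision and artificial intelligence.
Yu GUO (1992- ), male, Ph.D., is a lecturer at the School of Computer and Communication Engineering, University of Science and Technology Beijing. His main research interests are wireless sensor networks, cloud computing, and multi-robot systems.
Yuanyan XIE (1996- ), female, is a Ph.D. student at the School of Computer and Communication Engineering, University of Science and Technology Beijing. Her main research interests are cloud robotics, service science, and cloud computing.
Zhenqiang MI (1983- ), male, Ph.D., is an associate professor at the School of Computer and Communication Engineering, University of Science and Technology Beijing. His main research interests are service computing, multi-robot systems, and point cloud computing in mobile environments.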
Published online: 2022-01
Published in print: 2022-01-20
Yue CHEN, Yu GUO, Yuanyan XIE, et al. Offline visual aid system for the blind based on image captioning[J]. Telecommunications Science, 2022, 38(1): 61-72. DOI: 10.11959/j.issn.1000-0801.2022014.
Abstract: To address the inconveniences of existing visual aid devices for the blind, a method for running an image captioning model on portable mobile devices based on model pruning is discussed. Image captioning models and model pruning techniques are reviewed, and an improved pruning algorithm for image captioning models is proposed. Experimental results show that, while preserving accuracy, the pruned image captioning model greatly reduces processing time and power consumption, and can describe the surrounding environment quickly and accurately, with voice broadcast, anytime and anywhere.