1. Shanghai Branch, China United Network Communications Co., Ltd., Shanghai 200050
2. School of Software Engineering, Tongji University, Shanghai 201804
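Baoyou WANG (1968-), male, Ph.D., is a senior engineer at the Shanghai Branch of China United Network Communications Co., Ltd. His main research interests are data science, data mining, and data signatures.
Jing QIAN (1970-), female, is an engineer at the Shanghai Branch of China United Network Communications Co., Ltd. Her main research interests are data science, mobile Internet, and communication network planning.
Shijin YUAN (1975-), female, Ph.D., is an associate professor at the School of Software Engineering, Tongji University. Her main research interests are big data and high-performance computing.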
Online publication date: 2017-01
Print publication date: 2017-01-15
Baoyou WANG, Jing QIAN, Shijin YUAN. Research and implementation on acquisition scheme of telecom big data based on Hadoop[J]. Telecommunications Science, 2017, 33(1): 135-142. DOI: 10.11959/j.issn.1000-0801.2017010.
ETL (extract, transform, load) is a critical step in implementing a data warehouse, so designing an ETL flow that processes big data effectively and improves the acquisition efficiency of an operations platform has real practical value. This paper first briefly introduces the main data collected by one operator's big data platform. Then, to improve the efficiency of massive data collection, it proposes a hybrid architecture that combines Hadoop and Oracle. Next, it proposes a dynamically triggered ETL scheduling flow and algorithm; compared with timer-started ETL scheduling, this approach can effectively shorten the excessively long waiting time of some flows and avoid resource contention and congestion. Finally, based on the system run logs of Hadoop and Oracle, it compares and analyzes the relationship between acquisition efficiency and data volume on the two platforms. Practice shows that the two systems complement each other in the hybrid big data platform, which effectively improves the timeliness of data collection and achieves good results in application.
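The abstract does not spell out the scheduling algorithm, so the following is only a minimal sketch of the general idea behind trigger-based scheduling, not the authors' method. It assumes each ETL job declares the inputs it waits for (source files or upstream jobs) and starts as soon as the last one arrives, instead of waiting for a fixed clock time. The class name `TriggeredScheduler`, the job names, and the integer timestamps are all hypothetical.

```python
from collections import defaultdict


class TriggeredScheduler:
    """Hypothetical sketch: start a job once all of its inputs are ready."""

    def __init__(self, dependencies):
        # dependencies maps a job name to the set of inputs it waits for.
        # Every job is assumed to have at least one input.
        self.pending = {job: set(inputs) for job, inputs in dependencies.items()}
        # Reverse index: input name -> jobs that are waiting on it.
        self.waiting_on = defaultdict(set)
        for job, inputs in dependencies.items():
            for name in inputs:
                self.waiting_on[name].add(job)
        self.started = []  # (job, start_time) pairs in start order

    def notify(self, name, time):
        """Record that `name` (a source file or a finished job) is ready."""
        for job in sorted(self.waiting_on.pop(name, ())):
            self.pending[job].discard(name)
            if not self.pending[job]:
                # Last missing input has arrived: trigger the job now.
                self.started.append((job, time))


# Hypothetical job graph: two load jobs feed one summary job.
deps = {
    "load_voice_cdr": {"voice_cdr_file"},
    "load_data_cdr": {"data_cdr_file"},
    "build_usage_summary": {"load_voice_cdr", "load_data_cdr"},
}
sched = TriggeredScheduler(deps)
sched.notify("voice_cdr_file", 1)   # file arrives, load_voice_cdr starts
sched.notify("load_voice_cdr", 2)   # load job reports completion
sched.notify("data_cdr_file", 3)    # file arrives, load_data_cdr starts
sched.notify("load_data_cdr", 4)    # summary job is now unblocked
print(sched.started)
```

With a timer-based schedule, each of these jobs would instead wait for its configured start time even when its inputs were ready much earlier, and jobs sharing a start time would compete for resources at the same moment. A real scheduler would also need to handle failures, retries, and resource limits, which this sketch omits.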