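1. School of Software Engineering, University of Science and Technology of China, Hefei 230051
2. Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093
3. National Engineering Laboratory for Information Security Technologies, Beijing 100093
4. National Computer Network Emergency Response Technical Team/Coordination Center of China, Beijing 100029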
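Huaming Du, male, is a master's student at the School of Software Engineering, University of Science and Technology of China. His main research interest is big data processing.
Peng Zhang, male, is a postdoctoral researcher at the Institute of Information Engineering, Chinese Academy of Sciences. His main research interests are data stream processing and cloud computing.
Kefu Xu, male, is an associate researcher and doctoral supervisor at the Institute of Information Engineering, Chinese Academy of Sciences. His main research interests are big data and cloud computing, and information content security.
Jianlong Tan, male, is a researcher and doctoral supervisor at the Institute of Information Engineering, Chinese Academy of Sciences. His main research interests are cloud computing and network security.
Yan Li, male, is an engineer at the National Computer Network Emergency Response Technical Team/Coordination Center of China. His main research interest is cloud computing.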
Published online: 2013-10
Published in print: 2013-10-20
Huaming Du, Peng Zhang, Kefu Xu, et al. TTDSP: A Cost-Effective Approach to Tracking Tuple in Data Stream Processing (Chinese title: 面向数据流处理的元组跟踪方法)[J]. Telecommunications Science (电信科学), 2013, 29(10): 49-57. DOI: 10.3969/j.issn.1000-0801.2013.10.010.
To guarantee that every tuple in a data stream is processed reliably, traditional data stream processing systems keep each tuple in memory until it has been fully processed, which incurs a large memory overhead. To address this issue, a cost-effective tuple tracking approach, TTDSP, is proposed that guarantees reliable tuple processing while reducing memory overhead. The approach comprises three strategies: a memory allocation strategy, a tuple acker selection strategy, and a checksum updating strategy. Together, these strategies allow each tuple acker (tuple tracking unit) to keep in memory only the XOR checksum of tuple identifiers rather than the tuples themselves. In addition, the load across tuple ackers is balanced through an improved consistent hashing scheme. Experiments on memory overhead and load balancing show that the approach tracks tuples effectively and ensures that they are processed reliably.
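The idea of keeping only an XOR checksum per tracked tuple, and of choosing an acker by consistent hashing, can be illustrated with a minimal Python sketch. It is not the TTDSP implementation: the class names `TupleAcker` and `HashRing`, the `update` and `lookup` methods, the virtual-node count, and the use of MD5 are all assumptions made for illustration, and the ring shown is plain consistent hashing with virtual nodes, not the improved variant the abstract refers to. The sketch assumes each tuple identifier is reported twice, once when the tuple is emitted and once when it is acknowledged, so the checksum returns to zero when nothing is outstanding.

```python
import bisect
import hashlib


class TupleAcker:
    """Hypothetical acker: stores one XOR checksum per root tuple, not the tuples."""

    def __init__(self):
        self.pending = {}  # root_id -> running XOR of reported tuple identifiers

    def update(self, root_id, value):
        """XOR `value` into the checksum for `root_id`.

        Returns True when the checksum reaches zero, i.e. every identifier
        reported as emitted has also been reported as acknowledged.
        Assumes identifiers are random and wide enough (e.g. 64 bits) that an
        accidental zero before completion is very unlikely.
        """
        checksum = self.pending.get(root_id, 0) ^ value
        if checksum == 0:
            self.pending.pop(root_id, None)
            return True
        self.pending[root_id] = checksum
        return False


class HashRing:
    """Plain consistent hashing with virtual nodes (an assumed stand-in)."""

    def __init__(self, node_names, replicas=64):
        # Each node is placed at `replicas` points on the ring to even out load.
        self._ring = sorted(
            (self._hash(f"{name}#{i}"), name)
            for name in node_names
            for i in range(replicas)
        )
        self._keys = [point for point, _ in self._ring]

    @staticmethod
    def _hash(key):
        # First 8 bytes of MD5 give a 64-bit position on the ring.
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def lookup(self, key):
        # First ring point clockwise from the key's position, wrapping around.
        idx = bisect.bisect(self._keys, self._hash(str(key))) % len(self._keys)
        return self._ring[idx][1]


# Usage: route a root tuple to an acker, then track a two-tuple chain a -> b.
ackers = {name: TupleAcker() for name in ("acker-0", "acker-1", "acker-2")}
ring = HashRing(ackers)
root_id, a, b = 0x1234, 0xA1, 0xB2
acker = ackers[ring.lookup(root_id)]
acker.update(root_id, a)             # tuple a emitted
acker.update(root_id, a ^ b)         # a acknowledged, child b emitted
done = acker.update(root_id, b)      # b acknowledged; checksum is now zero
```

In this sketch the memory held per root tuple is a single integer regardless of how many descendant tuples are in flight, which is the property the abstract attributes to storing the checksum instead of the tuples.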