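Haiyang Hu, male, Ph.D., is an associate professor at Hangzhou Dianzi University, a CCF senior member, and a member of the CCF Technical Committee on System Software. His main research interests are formal methods and workflow technology.
Zhanchen Liu, male, is a master's student at Hangzhou Dianzi University. His main research interests are graph databases and workflow technology.
Hua Hu, male, Ph.D., is a professor at Hangzhou Dianzi University and a CCF senior member. His main research interests are database technology and software collaboration technology.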
Published online: 2013-03
Published in print: 2013-03-20
Haiyang Hu, Zhanchen Liu, Hua Hu. Constructing View of Uncertain Data Provenance for Scientific Workflow in Cloud Computing[J]. Telecommunications Science, 2013, 29(3): 90-100. DOI: 10.3969/j.issn.1000-0801.2013.03.017.
A view of data provenance in a scientific workflow partitions the tasks in the data provenance graph (DPG) into a set of composite modules according to the data flow relations among them, and then abstracts and encapsulates the data on that basis. This reduces the workload of researchers analyzing the data provenance and shortens data query time. Nevertheless, when scientific workflow systems are developed and applied in cloud computing environments, inaccurate data collection and unreliable servers make the DPG uncertain, so an effective mechanism is needed for constructing sound views over an uncertain DPG. To address this, definitions of the uncertain DPG and of its sound view are given first, and a method for detecting unsound views is proposed on that basis. The various data flow relations between a task node and its first-order preceding nodes in the graph, together with the local expected support of composite modules, are then analyzed, and a method for constructing sound views with high support is presented. A corresponding polynomial-time algorithm is designed and its time complexity is analyzed. Finally, an example and experiments are presented to demonstrate the feasibility and effectiveness of the method.