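Kaihui GAO (1996- ), male, is a Ph.D. candidate in the Department of Computer Science and Technology, Tsinghua University. His main research interests are data center networks and network intelligence.
Dan LI (1981- ), male, is a professor and doctoral supervisor in the Department of Computer Science and Technology, Tsinghua University. His main research interests are trustworthy Internet, network intelligence, and data center networks.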
Online publication date: 2023-06
Print publication date: 2023-06-20
Kaihui GAO, Dan LI. Data center networks with performance guarantee: a survey[J]. Telecommunications Science, 2023, 39(6): 1-21. DOI: 10.11959/j.issn.1000-0801.2023125.
As an important information infrastructure, the data center network (DCN) supports numerous distributed applications, such as artificial intelligence training and cloud storage. These applications transfer large amounts of data over the network and therefore place high demands on the stability of DCN performance. In recent years, performance guarantees for DCNs have attracted wide attention from both academia and industry. First, the main challenges of providing performance guarantees in DCNs are analyzed, and research directions for improving performance stability are proposed. Second, three essential properties of a performance-guaranteed DCN are summarized: high availability, bandwidth guarantee, and bounded latency. Related work on these three aspects is systematically surveyed and compared from multiple perspectives. Finally, future trends in research on DCN performance guarantees are discussed.