Monday, November 23, 2015

5 Reasons Infiniband Will Lose Relevance at 100G

Proprietary technologies briefly lead the market because they introduce disruptive features not found in the available standard offerings. Soon after, those features are merged into the standard. We’ve seen this many times in the interconnects used in High Performance Computing (HPC). From 2001 through 2004 Myrinet adoption grew as rapidly in the Top500 as Ethernet, and if you were building a cluster at that time you likely used one or the other. Myrinet provided significantly lower latency, a higher performance switching fabric, and double the effective bandwidth, but it came with a larger price tag. In the below graph Myrinet made up nearly all of the declining grey line through 2010, by
which time the Top500 was split between Infiniband and Ethernet. Today Myrinet is gone, Infiniband is on top just edging out Ethernet, but its time in the sun has begun to fade as it faces challenges in five distinct areas.

1. Competition, in 2016 and beyond Infiniband EDR customers will have several attractive options: 25GbE, 50GbE and by 2017 100GbE along with Intel's Omni-Path. For the past several generations Infiniband has raced so far ahead of Ethernet that it left little choice. Recently though within HPC 10GbE adoption has been growing rapidly, and is responsible for much of Ethernet's growth in the past six months. During the same time 40GbE has seen little penetration, it’s often viewed as too expensive. In 2016 we will see an IEEE approved 25GbE and 50GbE standard emerge, along with new & affordable cabling/optics options. It should be noted that a single 50GbE link aligns very well with the most common host server bus connection PCIe Gen3 x8 which delivers roughly 52Gbps/unidirectionally. For 100GbE we'll need PCIe Gen4 x8. While 100Gbps could be done today with PCIe Gen3 x16 often HPC system architects leave this slot open for I/O hungry GPU cards. The second front Infiniband will be facing is Intel's Omni-Path technology which will also offer a 100Gbps solution, but it will be directly off the host CPU complex designed to be a routable extensible interconnect fabric. Intel made a huge splash at SC15 with Omni-Path & switching which is a fusion of intellectual property Intel picked up from Cray, Qlogic, and several other Infiniband acquisitions. Some view 2017 as the year when both 100GbE and Omni-Path will begin to chip away at Infiniband's performance revenue while 25/50GbE erodes the value focused HPC and Exascale customers Infiniband has been enjoying.

2. Bandwidth, if you’ve wanted something greater that 10GbE over a single link, you’ve pretty much had little choice up to this point. While 40GbE exists many view this as an expensive alternative. Recent pushes by two groups to flesh out 25GbE and 50GbE ahead of the IEEE have resulted in this standards group stepping up its' efforts.  All of this has accelerated the industries approach toward a unified 100GbE server solution for 2017. Add to this Arista and others pushing Intel to provide CLR4 as an affordable 4-channel 25G, 100G optical transceiver, and things get even more interesting.

3. Latency, has always been a strong reason for selecting Infiniband. Much of it's gains are the result of moving the communications stack into user space, and accelerating the wire to PCIe bus connection.  These tricks are not unique to Infiniband, others have played them all for Ethernet delivering performance ethernet controllers and OS bypass stacks which now offers similar latencies, when compared at similar speeds. This is why nearly all securities world-wide are traded through systems using Solarflare adapters leveraging their OS Bypass stack called OpenOnload while using standard UDP, and TCP protocols. The domain of low latency is no longer exclusive to RDMA as it can now be more easily, and transparently done using existing code, and via UDP and TCP transport layers over industry standard ethernet.

4. Single vendor, if you want Infiniband there really is only one vendor who offers a single end-to-end solution. End-to-end solution providers are great because they expose the single throat to choke when things eventually don't work it. Conversely, many customers will avoid adopting technologies where there is only one single provider, because it removes competition and choice from the equation. Also when that vendor stumbles, and they always do, with a single vendor you're stuck. Ethernet, the open industry standard, affords you options while also providing you with interoperability.

5. Return to a Single Network, ever since Fiber Channel intruded into the data center nearly two decades ago network engineers have been looking ways to remove it. Then along came exascale, HPC by another name, and Infiniband was also pulled into the datacenter. Some will say Infiniband can do all three, but clearly those people have never dealt with bridging real world Ethernet traffic with Infiniband traffic. At 100Gbps Ethernet should have what it needs in both features, and performance, to provide a pipeline for all three protocols over a single generic network fabric.

Given all the above it should be interesting to revisit this post in 2018 to see how the market reacted. For some perspective, in this blog back in December 2012 I wrote "How Ethernet Won the West" where I predicted that both Fiber Channel and Infiniband would eventually disappear. Fiber Channel as a result of Fiber Channel over Ethernet (FCoE), which never really took off, and Infiniband because everyone else was abandoning it including Jim Cramer. Turns out while I've yet to be right about either, Cramer nailed it.  Since January 2013, adjusting for splits and dividends, Mellanox stock has dropped 14%. 

No comments:

Post a Comment