200GE vs. 400GE: Data Center Network Development Focus

Date: 2020-07-09 | Author: Lu Hai | Tags: 200G, 400G, data center

The Internet connects more than four billion users worldwide and supports emerging digital applications such as Virtual Reality (VR) and Augmented Reality (AR), 16K video, autonomous driving, Artificial Intelligence (AI), 5G, and Internet of Things (IoT). In addition, by merging online and offline services in the education, medical, and office sectors, it has an impact on every aspect of our lives.

A Data Center Network (DCN), as the infrastructure for the development of Internet services, has moved from GE and 10GE networks to the "25GE access + 100GE interconnection" phase.

100GE Interconnection: Full-Fixed-Device Architecture Is Valued by Large Internet Enterprises

In the "25GE access + 100GE interconnection" architecture, a DCN implements large-scale access through three-layer networking. A single cluster can contain more than 100,000 servers.

As shown in the following figure, Points of Delivery (PoDs) at the T1 and T2 layers can be expanded flexibly, much like building blocks, and can therefore be constructed on demand.

Facebook F16 data center network topology

With the improvement of large-capacity forwarding chips and the reduction of 100GE optical interconnection costs, single-chip switches are used to construct a 100GE interconnection network. This single-chip multi-plane interconnection solution typically features 12.8Tbit/s chips. A single chip provides 128x100GE port density, and a single PoD can connect to 2000 servers.

Compared with a traditional solution composed of fixed and modular devices, a full-fixed-device networking solution increases the number of network nodes and of optical interconnection modules between devices, which in turn increases the Operations and Maintenance (O&M) workload. However, the high-performance forwarding chip effectively reduces the per-bit cost of DCN ports, which is clearly attractive to large Internet enterprises. On the one hand, these enterprises can quickly introduce a 100GE full-fixed-device architecture to reduce network construction costs; on the other, they can use their strong Research and Development (R&D) capabilities to automate network deployment and maintenance and so cope with the increased O&M workload.

As a result, large Internet enterprises often use the same 100GE network solution, and full-fixed-device networking has become the basis for 100GE network architecture evolution.

Direction of Network Acceleration Development

The "25GE access + 100GE interconnection" solution promotes unified chip selection and rapid growth, demonstrating how technology dividends drive rapid evolution of Internet Data Center (IDC) network architecture. With the launch of single-chip network products, 100GE inter-generational technology dividends are now available.

Given the inevitable bandwidth upgrade associated with such continuous and rapid service development, enterprises now face a choice: 200GE or 400GE?

Networks are never isolated, and the overall environment of the industry determines whether technologies can advance and become mature.

First, let's review the current status of the 200GE and 400GE industries from the perspective of network standards, servers, and optical modules.

200GE vs. 400GE Standards: Protocols and Standards Are Mature

During the evolution of Institute of Electrical and Electronics Engineers (IEEE) standards, work on the 200GE standard began after the 400GE standard.

After completing the Bandwidth Assessment I (BWA I) project survey, the IEEE 802.3 Ethernet Working Group initiated a project to formulate the 400GE standard in 2013. In 2015, the IEEE established the 802.3cd project and began to formulate the 200GE standard in order to further expand the market scope and include 50GE server and 200GE switch specifications.

As 200GE standards are derived from their 400GE equivalents, 200GE single-mode optical module specifications were finally included in the 802.3bs project. By then, the main designs of Physical Coding Sublayer (PCS), Physical Medium Attachment (PMA), and Physical Medium Dependent (PMD) had been completed for 400GE. The 200GE single-mode optical module specifications are generally formulated based on half of those for 400GE.

On December 6, 2017, IEEE 802 approved the IEEE 802.3bs 400GE Ethernet standard, including 400GE Ethernet and 200GE Ethernet single-mode optical module specifications, and the standard was officially released. IEEE 802.3cd defined the 200GE Ethernet multimode optical module standard, which was officially released in December 2018.

As described in the following table, 400GE standards cover all scenarios, from 100m, 500m, and 2km up to long-distance 80km.

Distance | Standard | Name | Electrical Port Rate | Optical Port Rate
100m | IEEE 802.3cd | 200G SR4 | 4x 56GE | 4x 50GE
100m | IEEE 802.3cm | 400G SR8 | 8x 56GE | 8x 50GE
100m | IEEE 802.3cm | 400G SR4.2 | 8x 56GE | 8x 50GE
500m | IEEE 802.3bs | 400G DR4 | 8x 56GE | 4x 100GE
2km | IEEE 802.3bs | 200G FR4 | 4x 56GE | 4x 50GE
2km | IEEE 802.3bs | 400G FR8 | 8x 56GE | 8x 50GE
2km | 100G Lambda MSA | 400G FR4 | 8x 56GE | 4x 100GE
10km | IEEE 802.3bs | 400G LR8 | 8x 56GE | 8x 50GE
6km | 100G Lambda MSA | 400G LR4 | 8x 56GE | 4x 100GE
80km | OIF | 400G ZR | 8x 56GE | DP-16QAM

50GE vs. 100GE Server: 100GE Servers Will Become the Mainstream

According to forecasts from Crehan, 50GE and 100GE Network Interface Cards (NICs) have been shipping since 2019. In 2018 and 2019, the industry was undecided about which rate should succeed 25GE NICs. Despite a reversal in shipment numbers in 2019, the forecast shows the 100GE server market surpassing 50GE by 2020, as the industry becomes fully confident in 100GE servers.

Forecast on the shipment trend of NICs and servers

Two mainstream Central Processing Unit (CPU) chip vendors, Intel and AMD, will launch Peripheral Component Interconnect Express (PCIe) 4.0 chips in Q3 2020, capable of 50Gbit/s Input/Output (I/O). For high-end applications, I/O can reach 100Gbit/s and 200Gbit/s. Both vendors are expected to launch additional chips in the first half of 2021 that raise I/O speeds to 100Gbit/s, and to 400Gbit/s for high-end applications.

In the context of chip development and server delivery prediction, 100GE servers will quickly become the mainstream.

200GE vs. 400GE Optical Module: 400GE Offers Lower Costs with a Mature Industry

As data center access servers evolve from 25GE to 100GE, should 200GE or 400GE be selected for the current 100GE interconnection network?

Item | 10GE Access + 40GE Interconnection | 25GE Access + 100GE Interconnection
Bandwidth | A | 2.5A
Cost | C | C
Power Consumption | B | B

As data center servers evolve from 10GE to 25GE, and network interconnection is upgraded from 40GE to 100GE, bandwidth increases 2.5-fold while interconnection costs and power consumption remain unchanged. The cost and power consumption per Gbit/s of interconnection therefore fall to 40% of their previous level. As a result, 100GE is replacing 40GE as the mainstream network interconnection solution in the 25GE era.
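The per-Gbit/s comparison can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `relative_cost_per_gbit` and the normalized cost of 1.0 (standing in for C or B in the table) are assumptions, not values from the article.

```python
def relative_cost_per_gbit(old_rate_gbps, new_rate_gbps, old_cost=1.0, new_cost=1.0):
    """Return the new per-Gbit/s cost as a fraction of the old per-Gbit/s cost.

    Costs are normalized placeholders (assumption): 1.0 stands for the
    unchanged cost C or power consumption B in the table.
    """
    # (new_cost / new_rate) / (old_cost / old_rate), rearranged
    return (new_cost * old_rate_gbps) / (old_cost * new_rate_gbps)


# 40GE -> 100GE interconnection at unchanged cost and power consumption
ratio = relative_cost_per_gbit(40, 100)
print(f"{ratio:.0%}")  # 40%
```

With cost held constant, the ratio is simply the old rate divided by the new rate, so the same function applies to power consumption.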

200GE and 400GE optical modules are different. Traditional optical modules use Non-Return-to-Zero (NRZ) signal transmission technology, in which high and low signal levels represent the digital logic signals 0 and 1, so one bit of logic information is transmitted in each clock cycle. Both 200GE and 400GE optical modules use Pulse Amplitude Modulation 4 (PAM4), a high-order modulation technology that uses four signal levels for transmission. Each level encodes two bits of logic information (00, 01, 10, or 11), so two bits are transmitted in each clock cycle.

Consequently, under the same baud rate, the bit rate of a PAM4 signal is twice that of an NRZ signal, doubling transmission efficiency and reducing transmission costs. From the perspective of optical module composition, 200GE and 400GE modules both use 4-lane mainstream architecture, and feature similar module design costs and power consumption.
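The relationship between signal levels and bit rate can be sketched as follows. The 25 GBd symbol rate is an illustrative assumption and not a figure from the article; the function name `bit_rate_gbps` is likewise hypothetical.

```python
import math


def bit_rate_gbps(baud_rate_gbaud: float, levels: int) -> float:
    """Bit rate of one lane: symbols per second times bits per symbol."""
    bits_per_symbol = math.log2(levels)  # 2 levels -> 1 bit, 4 levels -> 2 bits
    return baud_rate_gbaud * bits_per_symbol


BAUD = 25.0  # illustrative symbol rate in GBd (assumption)
nrz = bit_rate_gbps(BAUD, levels=2)   # NRZ: two signal levels
pam4 = bit_rate_gbps(BAUD, levels=4)  # PAM4: four signal levels
print(nrz, pam4, pam4 / nrz)  # 25.0 50.0 2.0
```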

Optical Module | 200GE | 400GE
Modulation Mode | PAM4 | PAM4
Implementation | 4x 50GE | 4x 100GE
Design Cost | C | C
Power Consumption | B | B

As the bandwidth of a 400GE module is twice that of a 200GE module at a similar module cost and power consumption, the technical cost and power consumption per Gbit/s of a 400GE module are half those of a 200GE module.

In addition to architecture design, module costs also depend on the scale of deployment. According to the delivery data of third-party consulting company Omdia (originally known as OVUM), the layout of 200GE and 400GE optical modules provided by the top eight suppliers is as follows.

The layout of 200GE and 400GE optical modules provided by the top eight suppliers

As shown in the preceding figure, 200GE modules fall into two categories, 100m SR4 and 2km FR4, and among these only the 100m SR4 modules are classified into five types. For 400GE, by contrast, the top eight vendors have deployed 100m, 500m, and 2km modules. The 400GE industry is therefore far more mature and offers customers a wider array of choices.

This analysis indicates that adopting PAM4 adds technical cost and power consumption at the module level, so the rate that spreads them over more bits is preferable. In the DCN field, which is sensitive to both costs and power consumption, the industry urgently needs to evolve from 200GE to the more efficient and competitive 400GE.

Summary: 400GE Momentum Surges, 200GE Evolution May Be Abandoned

DCNs are designed around the delivery of services. With this in mind, fast-growing digital construction will drive the rapid growth of 100GE servers and cement them as the mainstream option in 2020. In terms of costs, data center optical components account for more than half of the total cost of network devices. With the introduction of PAM4, the per-bit cost of a 400GE optical module is lower than that of a 200GE optical module. This reduction in optical module deployment cost directly lowers overall network construction costs.

In general, 400GE is enjoying strong momentum, while 200GE may become a temporary transition or even be skipped entirely.

400GE Networking Mode: High-Density 400GE All-Box Network Is Still Incoming

As the access and interconnection devices of data center servers, switches must provide larger capacity as server I/O rates increase. The switching capacity of their core component, the forwarding chip, doubles every generation. However, doubling forwarding chip capacity is far harder than doubling NIC capacity, because a forwarding chip must provide high bandwidth to the many servers connected to it.

The operational clock frequency of the chip lowers the performance by 20%, so additional area and power must be added to improve performance. As the area of the forwarding chip increases, so too does the power consumption, eventually leading to a power consumption bottleneck. A more advanced semiconductor technology is required to avoid such a restrictive bottleneck.

A typical 128-port 100GE high-density fixed switch is used here as an example. The switch uses a 12.8Tbit/s chip built on a 16nm process, and the chip's power consumption is approximately 350W. The maximum power consumption of the switch with 100GE optical modules is 1998W, and the maximum power consumption of a 25.6Tbit/s device with 128x 200GE ports is estimated at 3000W. The power consumption of the entire device and the demands placed on the chip's single-point heat dissipation capability therefore continue to increase, posing great challenges to the engineering design of network devices.
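To put the two device figures side by side, the short sketch below normalizes whole-device power by total port bandwidth. It uses only the 1998W and 3000W figures quoted above; the function name `watts_per_gbit` is a hypothetical helper for illustration.

```python
def watts_per_gbit(device_power_w: float, ports: int, port_speed_gbps: int) -> float:
    """Whole-device power divided by total front-panel bandwidth in Gbit/s."""
    return device_power_w / (ports * port_speed_gbps)


gen_100ge = watts_per_gbit(1998, ports=128, port_speed_gbps=100)  # 12.8Tbit/s device
gen_200ge = watts_per_gbit(3000, ports=128, port_speed_gbps=200)  # 25.6Tbit/s device
print(round(gen_100ge, 3), round(gen_200ge, 3))  # 0.156 0.117
```

Power per Gbit/s falls by roughly a quarter, but total device power still rises by about half, and it is this absolute figure that drives the heat dissipation challenge.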

To match the port count of a 128-port node on a 100GE network, a 400GE network node needs a forwarding chip with a capacity of 51.2Tbit/s (128 x 400Gbit/s). If future 51.2Tbit/s chips continue to use 7nm process technology, the estimated chip power consumption will reach 1000W, which is not practical for fixed devices given current heat dissipation technology.

As a result, building a high-density 128-port 400GE fixed switch on a 51.2Tbit/s forwarding chip depends on an upgrade to 5nm or 3nm process technology, which reduces the power consumption of the forwarding chip to less than 900W. With a 5nm or 3nm process, such chips are expected to be produced at scale and delivered in 2023.

400GE Era Has Arrived, with 400GE Fixed and Modular Devices Emerging as the Best Choice

The commercial delivery of high-density 400GE switch chips (51.2Tbit/s) has been delayed, and there are currently three options available for network devices.

Option 1: High-density 200GE fixed device. 128x 200GE ports are available using a 25.6Tbit/s chip.

Option 2: Low-density 400GE fixed device. 64x 400GE ports are available using a 25.6Tbit/s chip.

Option 3: High-density 400GE modular switch. Multiple chips are stacked to provide higher-density 400GE ports, and 400GE modular switches are provided to meet the 128x 400GE (or even higher) port density requirements.
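The port densities of the single-chip options follow from dividing chip capacity by port speed. A minimal sketch, in which `max_ports` is a hypothetical helper and capacities are expressed in Gbit/s to keep the arithmetic in integers:

```python
def max_ports(chip_capacity_gbps: int, port_speed_gbps: int) -> int:
    """Number of full-rate ports a single forwarding chip can serve."""
    return chip_capacity_gbps // port_speed_gbps


print(max_ports(25_600, 200))  # 128 -> high-density 200GE fixed device
print(max_ports(25_600, 400))  # 64  -> low-density 400GE fixed device
print(max_ports(51_200, 400))  # 128 -> the capacity a 128x 400GE fixed device needs
```

A modular switch sidesteps this limit by combining several chips, which is why it can reach 128x 400GE before a 51.2Tbit/s chip is available.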

High-Density 200GE Fixed Devices: A Lost 400GE Opportunity

100GE servers will soon become the mainstream, and 400GE optical connections are positioned to be the most cost-effective. However, given the current immaturity of the 51.2Tbit/s (128x 400GE) forwarding chip, enterprises that have deployed a 100GE full-fixed-device architecture are reluctant to consider 200GE. If 200GE is selected, direct evolution to 400GE will likely be abandoned. The result is repeated investment, which is costly because optical interconnection accounts for more than half of the entire DCN construction cost. Consequently, the 200GE solution is unable to take full advantage of 400GE technology dividends.

Low-Density 400GE Fixed Device: 75% Reduction of Server Cluster Scale

If 64x 400GE fixed switches are used for networking, the port density of T2 devices is half that of the 128-port devices in the 100GE network architecture, so the number of access servers in a PoD is also halved. In addition, 64x 400GE network devices are used at the T3 layer, which halves the scale once more. As a result, the entire server cluster shrinks to 25% of its original scale. Throughout the development of DCNs, rate upgrades have preserved the size of existing server clusters. Low-port-density 400GE network devices would greatly reduce the server cluster scale, which may fail to meet the requirements of service applications.
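The 25% figure can be reproduced with a simple model. The sketch assumes, as the paragraph above does, that cluster scale shrinks in proportion to port count at each of the two affected layers (T2 and T3); the function name `relative_cluster_scale` is hypothetical, and real topologies may scale differently.

```python
def relative_cluster_scale(new_ports: int, old_ports: int, layers: int = 2) -> float:
    """Cluster size relative to the original, assuming each affected layer
    scales linearly with switch port count (simplifying assumption)."""
    return (new_ports / old_ports) ** layers


scale = relative_cluster_scale(new_ports=64, old_ports=128)
print(scale)  # 0.25
```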

High-Density 400GE Modular Devices: The Best Choice, as Confirmed by Network Evolution

Let's review the history of 100GE network evolution. In the early stages, the development of cloud computing services and compute resource virtualization technologies promoted the maturity of 100GE industry standards. 25GE access servers gradually saw widespread use, and the rapid growth of 100GE optical interconnection further reduced costs.

Once the industry matured and the 100GE network era arrived, high-performance 100GE forwarding chips lagged behind and could not be obtained during the initial phase of 100GE network construction (phase 1 in the following figure). The industry initially used a multi-chip solution to build high-density 100GE modular switches, and this ensured that the network scale met expectations while maximizing the technical dividends of 100GE networks.

With the upgrade of chip performance and the launch of 6.4Tbit/s and 12.8Tbit/s chips, the network smoothly evolved from a 100GE modular switch to a 100GE fixed switch (phases 2 and 3 in the following figure).

High-performance 100GE forwarding chips lagged behind and could not be obtained during the initial phase of 100GE network construction

400GE networks will evolve in a similar manner, and while 51.2Tbit/s chip capability is currently unavailable, multi-chip 400GE modular switches are the better choice.

High-density 400GE modular devices can be deployed to maintain or even expand the network scale while reducing per-bit costs. Mainstream vendors within the industry have already released 400GE modular devices, which will promote the commercial use of 400GE networks. Once 51.2Tbit/s switching chips are launched, a 400GE architecture built from fixed and modular devices can evolve smoothly to a full-fixed-device architecture, which will ultimately become the mainstream DCN architecture of the 400GE era.