Inspur NF5488M5 Review A Unique 8x NVIDIA Tesla V100 Server
07 May 2020
Reprinted from serverthehome.com
The Inspur NF5488M5 is something truly unique. Although many vendors, including Inspur, can claim to have an 8x NVIDIA Tesla V100 system, the NF5488M5 may just be the highest-end 8x Tesla V100 system you can buy. Not only does it use 8x Tesla V100 SXM3 GPUs and support “Volta Next” GPUs with 350W+ TDPs, but it also has something special on the fabric side. The GPUs are connected together using NVSwitch technology, which means each GPU has 300GB/s of NVLink bandwidth to each of the other GPUs. Essentially, this is half of an NVIDIA DGX-2 or HGX-2 in a 4U chassis.
In our review, we are going to focus a bit more time than normal talking about the hardware, and what makes it completely different than other offerings on the market. By April 2019, Inspur was able to claim at least 51% of AI server market share in China, and this is one of the innovative designs that is helping Inspur continue to build market share. Just for a bit of context, we borrowed this server, but the power costs alone to do this review were several thousand dollars, so it was a big investment for STH to make. As such, we want to ensure that we are showing why we wanted to bring this to our readers.
Inspur NF5488M5 Hardware Overview
We are going to go deep into the hardware side of this server because it is an important facet to the solution. It is also something very unique on the market.
Inspur NF5488M5 Front Inspur Badge
We are going to start with the server overview, then deep-dive on the GPU baseboard assembly on the next page. Still, the wrapper around the GPU baseboard is unique and important so we wanted to cover that.
Inspur NF5488M5 Server Overview
The Inspur NF5488M5 is a 4U server that measures 448mm x 175.5mm x 850mm. We are going to start our overview at the front of the chassis. Here we can see two main compartments. The bottom is the GPU tray where we will end this overview. Instead, we are going to start with the top approximately 1U section which is the x86 compute server portion of the server.
Inspur NF5488M5 Front
One can see front I/O with a management port, two USB 3.0 ports, two SFP+ cages for 10GbE networking, as well as a VGA connector.
Inspur NF5488M5 Front Storage
Storage is provided by 8x 2.5″ hot-swap bays. All eight bays can utilize SATA III 6.0Gbps connectivity. The top set of four drives can optionally utilize U.2 NVMe SSDs.
Inside the system, we have a bit more storage. Here we have a dual M.2 SATA boot SSD module on a riser card next to the memory slots. These boot modules allow one to keep front hot-swap bays open for higher-value storage.
Inspur NF5488M5 Memory And M.2 Boot Card
On the topic of memory, this is a dual Intel Xeon Scalable system with a full memory configuration. That means each of the two CPUs can potentially take 12 DIMMs for 24 DIMMs total. In this class of machine, we see the new high-end 2nd Gen Intel Xeon Scalable Refresh SKUs along with some of the legacy 2nd Gen Intel Xeon Scalable L high-memory SKUs. Much of the chassis design is dedicated to optimizing airflow, but this is a lower TDP per U section of the server, even with higher-end CPU and memory configurations.
Inspur NF5488M5 Dual Intel Xeon Scalable CPU And 24x DDR4 DIMMs
Rounding out a few more bits to the solution, we can see the onboard SFF-8643 connectors providing SATA connectivity. The heatsink to the right of this photo is for the Lewisburg PCH.
Inspur NF5488M5 CPU Motherboard Storage Connectors
You may have noticed the module just behind those connectors. That module provides the CPU’s motherboard with power from the rear power supplies and power distribution boards.
Inspur NF5488M5 CPU Motherboard PDB Interface
Looking at the rear of the chassis, one can see an array of hot-swap modules. There are three basic types. The center is dominated by fan modules. To each side, one can see power supplies on top and I/O modules on the bottom.
Inspur NF5488M5 Rear
The NF5488M5 utilizes four 3kW power supplies which can be used on A+B data center power to provide redundant operation.
Inspur NF5488M5 4x 3kW PSUs Three Quarter
One of the most overlooked, but very important, features in a system like this is the fan modules. Each module consists of two heavy-duty fans in a hot-swap carrier. Due to the power consumption of this 4U system, these fan modules are tasked with moving a lot of air through the system reliably to keep the system running.
Inspur NF5488M5 Six Fan Modules
These six fan modules are hot-swappable and have status LEDs. That helps identify units that may need to be swapped out.
We also wanted to highlight the I/O modules. Here we have the two modules each with two Mellanox Infiniband cards. That gives us a ratio of two GPUs for each Infiniband card.
Inspur NF5488M5 Two IO Modules
The modules themselves have their own fans, as well as an additional slot. One slot is used for rear networking such as 10/25GbE. The other side has a slot designed for legacy I/O. One can hook up a dongle and have USB and other ports for hot aisle local management.
Behind the motherboard, and that wall of fans, one can see an array of PCIe cables. These cables carry PCIe signaling from the main motherboard on the top of the chassis to the PCBs on the bottom of the chassis. Inspur has a cable management solution to ensure that these cables can be used without blocking too much airflow to the CPUs, memory and other motherboard components.
Inspur NF5488M5 CPU Motherboard PCIe Cables
Here is a view looking down through those cables to the PCIe switch PCB. There, one can see a second Aspeed AST2520 BMC. There is a BMC on the main motherboard as well.
Inspur NF5488M5 AST2520 And Dual PEX9797 PCIe Distribution Board Cabling From Motherboard
On either side, there are large heatsinks. Those heatsinks cover the Broadcom (PLX) PEX9797 97 lane, 25 port, PCI Express Gen3 ExpressFabric switches. These are high-end Broadcom PCIe switches and are used to tie the various parts of the system together.
Inspur NF5488M5 Dual PEX9797 PCIe Distribution Board Power And PCIe Cables
Those PCIe lanes are connected to the GPU baseboard via high-density PCIe connectors.
Inspur NF5488M5 Dual PEX9797 PCIe Distribution Board Connectors To HGX 2 Board
Before we get to the GPU baseboard, you may have noticed the large red and black wires in this section. Those are the power feeds inside the system.
Next, we are going to look at the GPU baseboard assembly that this plugs into in more detail.
Inspur NF5488M5 GPU Baseboard Assembly
The GPU baseboard assembly slides out from the main 4U server chassis.
Inspur NF5488M5 Front IO And Storage With GPU Tray Partially Out
It actually has its own cover, and own sliding rail system internally. There are even side latches to keep the entire assembly secure. In effect, this is like a smaller version of a server rail kit, just found inside this single-node 4U server.
Inspur NF5488M5 HGX 2 Tray Rails And Release
Taking the cover off, we can see the large hard airflow guide that runs through this section. Airflow is a key design consideration in this chassis, and therefore this is a very heavy duty airflow guide.
Inspur NF5488M5 HGX 2 Tray Airflow Guide
Removing that cover, let us work our way through the GPU baseboard. PCIe passes from the CPUs, to the motherboard, to PCIe cables, then to those Broadcom PEX9797 PCIe switches, then through the high-density PCIe connectors and then to the GPU baseboard where it is distributed to each GPU.
Inspur NF5488M5 HGX 2 Board PCIe Side Heatsink
There are a total of eight NVIDIA Tesla V100 32GB SXM3 GPUs in our system, and the platform is ready for “Volta Next” GPUs. SXM3 GPUs like this are designed to run in this 54VDC system and have 350-400W TDPs. Our test system had caps set for 350W, and we saw idle power on each SXM3 GPU of around 50W as measured by nvidia-smi.
Inspur NF5488M5 Nvidia Smi
That is higher than the PCIe and SXM2 versions of the Tesla V100. While all are called “Tesla V100” GPUs, there is a significant gap in capabilities.
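For readers who want to collect the same per-GPU power readings on their own systems, here is a minimal sketch that wraps nvidia-smi's CSV query mode. The helper names and the sample text are our own illustration; the sample values are made up in the shape nvidia-smi prints and were not captured from the review system. It also assumes every GPU reports a numeric power value.

```python
import csv
import io
import subprocess

# Fields passed to nvidia-smi's --query-gpu option.
QUERY_FIELDS = ["index", "name", "power.draw", "power.limit"]


def parse_power_csv(text):
    """Parse output of `nvidia-smi --query-gpu=... --format=csv,noheader,nounits`."""
    rows = []
    for record in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not record:
            continue  # skip blank lines
        index, name, draw, limit = (field.strip() for field in record)
        rows.append({
            "index": int(index),
            "name": name,
            "draw_w": float(draw),    # current draw in watts
            "limit_w": float(limit),  # configured power cap in watts
        })
    return rows


def query_gpu_power():
    """Run nvidia-smi and return per-GPU readings (needs the NVIDIA driver installed)."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_power_csv(result.stdout)


# Hypothetical sample text for illustration only.
SAMPLE = (
    "0, Tesla V100-SXM3-32GB, 50.12, 350.00\n"
    "1, Tesla V100-SXM3-32GB, 49.87, 350.00\n"
)

if __name__ == "__main__":
    for gpu in parse_power_csv(SAMPLE):
        print(f"GPU {gpu['index']}: {gpu['draw_w']:.0f}W of {gpu['limit_w']:.0f}W cap")
```

On a live system, calling query_gpu_power() instead of parsing SAMPLE would return one entry per installed GPU.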
Inspur NF5488M5 HGX 2 Board PCIe Side
Each GPU has its own heatsink covered in an NVIDIA shroud. The whole assembly looks very impressive.
NVIDIA HGX 2 GPU Tray Coolers On Tesla V100 SXM3 GPUs
The other key feature of the Inspur Systems NF5488M5 is the interconnect on this board. Years ago, NVIDIA innovated well beyond simply using PCIe for inter-GPU communication. With the Pascal (Tesla P100) generation, NVIDIA introduced NVLink in the SXM2 modules. We actually have a guide on How to Install NVIDIA Tesla SXM2 GPUs using Tesla P100’s. SXM2 systems generally rely on direct attach GPU-to-GPU topologies, which limits their scale. The NF5488M5 is an SXM3 system with NVSwitch. At STH, we covered NVIDIA NVSwitch details during Hot Chips 30 when the company went into detail around how they work.
Inspur NF5488M5 HGX 2 Board NVSwitch Heatsink Right
There are a total of six NVSwitches on the GPU PCB. By connecting GPUs into this switched fabric, NVIDIA can provide full 300GB/s bandwidth from one GPU to another. With eight GPUs making memory transactions over NVLink, that effectively turns this into a large GPU set with 256GB of HBM2.
Inspur NF5488M5 HGX 2 Board NVSwitch Heatsink
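The headline figures above can be reproduced with some quick arithmetic. This is only a back-of-the-envelope sketch: the function names are our own, the 50GB/s per-link figure is the nominal bidirectional NVLink 2.0 rate (25GB/s each way), and the PCIe comparison assumes a Gen3 x16 link per GPU, which the review does not state.

```python
def nvlink_bandwidth_gbs(links, gbs_per_link_bidir=50.0):
    """Aggregate nominal NVLink bandwidth for one GPU, in GB/s (bidirectional)."""
    return links * gbs_per_link_bidir


def pooled_hbm2_gb(gpus, gb_per_gpu):
    """Total HBM2 reachable over the fabric when all GPUs are pooled, in GB."""
    return gpus * gb_per_gpu


def pcie_gen3_gbs(lanes, bidirectional=True):
    """Nominal PCIe Gen3 bandwidth in GB/s: 8GT/s per lane with 128b/130b encoding."""
    per_lane_one_way = 8.0 * (128 / 130) / 8  # GT/s * encoding efficiency / bits per byte
    one_way = per_lane_one_way * lanes
    return one_way * 2 if bidirectional else one_way


if __name__ == "__main__":
    nvlink = nvlink_bandwidth_gbs(links=6)          # six NVLinks per V100
    pool = pooled_hbm2_gb(gpus=8, gb_per_gpu=32)    # eight 32GB GPUs
    pcie = pcie_gen3_gbs(lanes=16)                  # assumed x16 link per GPU
    print(f"NVLink per GPU: {nvlink:.0f}GB/s")
    print(f"Pooled HBM2: {pool}GB")
    print(f"PCIe Gen3 x16: {pcie:.1f}GB/s, NVLink advantage: {nvlink / pcie:.1f}x")
```

These are nominal link rates, so measured numbers will come in lower.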
These NVSwitch modules require their own heat pipe coolers which you can see in these photos. In the Inspur NF5488M5, they are not being used to their full 16/18 port capacity (two ports are reserved in the NVSwitch design).
Inspur NF5488M5 HGX 2 Board NVSwitch Bridge Connectors
One may notice the large high-density connectors on the right side of the photo above. These are facing out towards the front of the chassis and are not being used here. By doing some investigation, we found out why. Looking into the forest of GPUs, we found an NVIDIA logo also screened on the GPU baseboard PCB.
NVIDIA HGX 2 SXM3 Board PCB NVIDIA Logo
We also found this label. The GPU baseboard is actually an NVIDIA HGX-2 baseboard. While NVIDIA sells its DGX-2 16-GPU machine, partners such as Inspur and others have their takes on the partner-oriented NVIDIA HGX-2. The Inspur 16-GPU offering they call the Inspur AGX-5. NVIDIA can bundle the HGX-2 baseboard along with the GPUs and NVSwitches for partners who can then innovate around that platform. While most have used the HGX-2 to provide DGX-2 alternatives with sixteen GPUs, the NF5488M5 is something different on the market with a single HGX-2 baseboard.
NVIDIA HGX 2 SXM3 Board PCB NVIDIA HGX 2 PN
Those high-density connectors we see in the front of the board are designed for bridges that extend the NVSwitch fabric between two HGX-2 baseboards in the sixteen GPU designs. Building a system around only a single HGX-2 baseboard is very innovative, since a full sixteen GPU HGX-2 system is too dense for many data center rack environments.
Next, we are going to look at some final chassis bits and show the system topologies which are important in a server like this.
Inspur NF5488M5 Other Chassis Impressions
We wanted to cover a few more chassis related items of the server. First, Inspur has a nice service guide underneath the chassis lid in English and Mandarin. This is a fairly complex system the first time you take it apart, so this is a great printed in-data center reference for the machine.
Inspur NF5488M5 Service Guide Under Lid
There is a nice warning label on the side that says that the server can weigh over 60kg. For some reference, the GPU box alone weighs over 23kg.
Inspur NF5488M5 Chassis Handles And Over 60kg Weight
To help move the unit, Inspur suggests having four people and includes handles. When we moved the unit out of the Inspur Silicon Valley office, we used four people to carry the system. Realistically, once in the data center, there may be some movement but we suggest using a server lift if you are installing these. Most data centers have them, but with such a heavy node, it makes a lot of sense here.
Inspur Systems NF5488M5 Topology
With training servers, topology is a big deal. We used Intel Xeon Platinum 8276 CPUs in our test system. The new 2nd Gen Intel Xeon Scalable Refresh SKUs are 2x UPI parts while the legacy parts are 3x UPI so that is something to consider.
Inspur NF5488M5 Platinum 8276 Lscpu
Each CPU has a set of GPUs, storage, Infiniband cards and other I/O attached to it. With the sheer number of devices, you may need to click this one to get a better view.
Inspur NF5488M5 Lstopo
In terms of the NVIDIA topology, one can see the NVIDIA GPUs along with Mellanox NICs. This topology shows the 6 bonded NVLinks per GPU on the switched architecture. There are also PCIe and UPI traversal routes. Overall, you can see the four Mellanox Infiniband cards and how they connect to the system.
Inspur NF5488M5 Nvidia Smi Topology
We can see the peer-to-peer topology is set up.
Inspur NF5488M5 Nvidia Smi P2p Topology
On the NVLink status, we can see the eight GPUs each with their six NVLinks that are up. We can also see the six NVSwitches each with eight links. Each GPU has a link to each NVSwitch. So if we are doing a GPU-to-GPU transfer, we are pushing 1/6th of that transfer over each of the switches on the HGX-2 baseboard.
Inspur NF5488M5 NVLink And NVSwitch Link Status
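To make the link counts concrete, here is a small model of the single-baseboard fabric as described above: eight GPUs, six NVSwitches, and one NVLink from every GPU to every switch. It is purely an illustration with our own function names; it does not query real hardware, and the even striping simply mirrors the 1/6th-per-switch description.

```python
from itertools import combinations

NUM_GPUS = 8
NUM_SWITCHES = 6


def build_fabric(num_gpus=NUM_GPUS, num_switches=NUM_SWITCHES):
    """Return {switch: set of attached GPUs}, one NVLink from every GPU to every switch."""
    return {sw: set(range(num_gpus)) for sw in range(num_switches)}


def links_per_gpu(fabric, gpu):
    """Count the NVLinks a GPU has into the fabric (one per attached switch)."""
    return sum(1 for gpus in fabric.values() if gpu in gpus)


def shared_switches(fabric, src, dst):
    """List the switches that both GPUs attach to, i.e. the usable paths."""
    return [sw for sw, gpus in fabric.items() if src in gpus and dst in gpus]


def split_transfer(fabric, src, dst, size_bytes):
    """Stripe a transfer evenly across every switch shared by the two GPUs."""
    paths = shared_switches(fabric, src, dst)
    share = size_bytes / len(paths)
    return {sw: share for sw in paths}


if __name__ == "__main__":
    fabric = build_fabric()
    print("links per GPU:", {g: links_per_gpu(fabric, g) for g in range(NUM_GPUS)})
    print("links per switch:", {sw: len(gpus) for sw, gpus in fabric.items()})
    pairs = list(combinations(range(NUM_GPUS), 2))
    print("GPU pairs with six paths:",
          sum(len(shared_switches(fabric, a, b)) == NUM_SWITCHES for a, b in pairs),
          "of", len(pairs))
    print("6GB transfer, GPU0 to GPU7:", split_transfer(fabric, 0, 7, 6e9))
```

Because every GPU pair shares all six switches, no pair is favored by placement, which is consistent with the uniform peer-to-peer results later in the review.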
On a 16x GPU HGX-2 or DGX-2 system, you would see more of the switch ports utilized to uplink to the switches on the other GPU baseboard via the bridges.
The addition of those switches makes this a significantly more robust architecture than the direct attach NVLink we find on DGX-1/ HGX-1 class systems.
Next, we are going to look at the management followed by some of the background behind why we are seeing this type of solution.
Inspur Systems NF5488M5 Management
Inspur’s primary management is via IPMI and Redfish APIs. That is what most hyperscale and CSP customers will utilize to manage their systems. Inspur also includes a robust and customized web management platform with its management solution.
Inspur Web Management Interface Dashboard
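As a rough illustration of what Redfish-based management looks like from a script, the sketch below builds a power-cycle request using the standard DMTF ComputerSystem.Reset action. The BMC hostname, system ID, and credentials are hypothetical placeholders, and we have not verified which reset types or system IDs Inspur's BMC exposes, so treat this as a generic Redfish example rather than Inspur-specific documentation.

```python
import base64
import json
import urllib.request


def build_reset_request(bmc_host, system_id, username, password,
                        reset_type="ForceRestart"):
    """Build (but do not send) a Redfish ComputerSystem.Reset request."""
    url = (f"https://{bmc_host}/redfish/v1/Systems/{system_id}"
           "/Actions/ComputerSystem.Reset")
    body = json.dumps({"ResetType": reset_type}).encode("utf-8")
    # HTTP Basic authentication; Redfish services also offer session logins.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    req = build_reset_request("bmc.example", "1", "admin", "secret")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # Sending would be: urllib.request.urlopen(req), with TLS verification
    # configured for the BMC's certificate.
```

In practice a client would first read the system resource to discover its ID and the reset types the BMC allows before posting the action.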
There are key features we would expect from any modern server. These include the ability to power cycle a system and remotely mount virtual media. Inspur also has a HTML5 iKVM solution that has these features included. Some other server vendors do not have fully-featured HTML5 iKVM including virtual media support as of this review being published.
Inspur Management HTML5 IKVM With Remote Media Mounted
Another feature worth noting is the ability to set BIOS settings via the web interface. That is a feature we see in solutions from top-tier vendors like Dell EMC, HPE, and Lenovo, but many vendors in the market do not have.
Inspur Management BIOS Settings
Another web management feature that differentiates Inspur from lower-tier OEMs is the ability to create virtual disks and manage storage directly from the web management interface. Some solutions allow administrators to do this via Redfish APIs, but not web management. This is another great inclusion here.
Inspur Management Storage Virtual Drive Creation
Based on comments in our previous articles, many of our readers have not used an Inspur Systems server and therefore have not seen the management interface.
It is certainly not the most entertaining subject. However, if you are considering these systems, you may want to know what the web management interface is like on each machine, and that tour can be helpful.
Inspur Systems NF5488M5 Background
A quick background on this system is in order. There are probably a few non-regular STH readers who do not know Inspur Systems today. For some context, in our recent piece, IDC 4Q19 Quarterly Server Tracker Dell Sinks Inspur and Lenovo Surge, you can see that Inspur is the third-largest server vendor by unit shipments. Unlike many of its competitors that greatly reduced ASPs to hold or gain share in Q4 2019, Inspur’s ASPs are actually growing at an industry-leading rate among large server vendors.
IDC 4Q19 Server Tracker Server ASP Heatmap
A big part of that is Inspur’s AI computing division. When I was at Inspur Partner Forum 2019 a big theme was 51% AI server market share in China. China is a huge AI market, so that gives some sense of just how many AI systems the company is moving. These accelerated servers raise ASP, so that is how we get both unit and ASP growth for Inspur. Inspur is not just focusing on traditional enterprise clients, but also hyper-scale data center customers. Last year we had a piece Visiting the Inspur Intelligent Factory Where Robots Make Cloud Servers where I actually toured an Inspur factory in Jinan, China. Other vendors have seen that video and declined to have me visit their factories because that is how advanced Inspur’s facilities are.
That large AI market share has a real impact on a server such as the Inspur NF5488M5. While Dell EMC is large as a company, it does not have an AI portfolio with a direct competitor to this machine due to a lack of focus in this market. Inspur ships so many AI servers that it has finely tuned solutions in the market such as the NF5488M5 that some other players simply do not have.
Inspur NF5488M5 Performance Testing
We wanted to show off a few views on what makes this different than other GPU compute servers we have tested.
Inspur NF5488M5 CPU Performance to Baseline
In the original draft of this piece, we had a deep-dive into CPU performance. Since we already have more in-depth CPU reviews, and CPU performance is not the focus of this system, we cut that section.
Inspur NF5488M5 Platinum 8276 Lscpu
We instead are just going to present our baseline Platinum 8276 performance versus the same performance in the Inspur NF5488M5.
Inspur NF5488M5 Intel Xeon Platinum 8276 Performance V. Baseline
As you can see, we generally stayed very close to our primary testbed which shows we are getting adequate cooling to the CPUs.
Inspur NF5488M5 P2P Testing
We wanted to take a look at what the peer-to-peer bandwidth looks like. For comparison, we have DeepLearning10, a dual root Xeon E5 server, and DeepLearning11 a single root Xeon E5 server, and DeepLearning12 a Tesla P100 SXM2 server. If you want to compare some of these numbers to an 8x Tesla V100 32GB PCIe server, you can check out our Inspur Systems NF5468M5 review.
Inspur NF5488M5 P2P Bandwidth
Here is the Unidirectional P2P bandwidth on the dual root PCIe server:
Inspur NF5488M5 P2pBandwidthLatencyTest Unidirectional BW
Here we can see the unidirectional P2P bandwidth is 143GB/s, versus about 9-18GB/s on the PCIe dual root server with the Tesla V100’s. It is also more consistent across the GPUs, whereas the PCIe server had a lot of variation depending on the placement of the GPUs.
Looking at bidirectional bandwidth:
Inspur NF5488M5 P2pBandwidthLatencyTest Bidirectional BW
We again see 266GB/s bandwidth between GPUs and very consistent results. Compare this with about 18-37GB/s on a PCIe Gen3 switched server. You can also see the 800GB/s figures here for the same GPU (e.g. 0,0), which was closer to 400GB/s on the Tesla P100 SXM2 generation.
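To put those figures side by side, the quick arithmetic below turns the measured numbers quoted above into speedup ranges over the PCIe servers. The helper name is our own and the inputs are simply the rounded figures from the text, so the ratios are approximate.

```python
def speedup_range(nvswitch_gbs, pcie_low_gbs, pcie_high_gbs):
    """Return (min, max) speedup of the NVSwitch figure over a PCIe bandwidth range."""
    return nvswitch_gbs / pcie_high_gbs, nvswitch_gbs / pcie_low_gbs


if __name__ == "__main__":
    uni_min, uni_max = speedup_range(143, 9, 18)   # unidirectional, GB/s
    bi_min, bi_max = speedup_range(266, 18, 37)    # bidirectional, GB/s
    print(f"Unidirectional: {uni_min:.1f}x to {uni_max:.1f}x over PCIe")
    print(f"Bidirectional:  {bi_min:.1f}x to {bi_max:.1f}x over PCIe")
    # Measured bidirectional bandwidth as a share of the nominal 300GB/s per GPU.
    print(f"Share of nominal NVLink rate: {266 / 300:.0%}")
```

The lower end of each range applies to the best-placed GPU pairs on the PCIe server, while the upper end applies to the worst-placed pairs.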
Just for good measure, we also had the CUDA bandwidth test:
Inspur NF5488M5 BandwidthTest
We wanted to show here that the on-device bandwidth is phenomenal at around 800GB/s, as you can see in these P2P numbers.
Inspur NF5488M5 P2P Latency
Beyond raw bandwidth, we wanted to show Inspur Systems NF5488M5 GPU-to-GPU latency. Again, see links above for comparison points:
Inspur NF5488M5 P2pBandwidthLatencyTest P2P Disabled Latency
Here are the P2P enabled latency figures:
Inspur NF5488M5 P2pBandwidthLatencyTest P2P Enabled Latency
These figures are again very low and consistent.
While this is not intended to be an exact performance measurement, it is a tool you can quickly use on your deep learning servers to see how they compare.
Raw Deep Learning/ AI Performance Increase over PCIe
We had data from the Inspur Systems NF5468M5 that we reviewed and so we ran some of the same containers on this system to see if we indeed saw a direct speedup in performance.
Inspur NF5488M5 8x Tesla V100 SXM3 With NVSwitch Out Of Box Performance Uplift Over V100 PCIe Server
Of course, the usual disclaimers here are that these are not highly optimized results, so you are seeing more of a real-world out-of-box speedup across workloads from a few companies that we help with testing on different machines as part of DemoEval. Realistically, if one uses newer frameworks, and optimizes for the system, better results are obtainable. The above chart took almost two weeks to generate, so we did not get to iterate on optimizations since we had a single system and a limited time running it.
This is one where we are just going to say that our testing confirmed what one would expect: faster GPUs and faster interconnects yield better performance, the degree of which depends on the application.
Inspur NF5488M5 Power Consumption
Our Inspur NF5488M5 test server used a quad 3kW power supply configuration. The PSUs are 80Plus Platinum level units.
- Ubuntu OS Idle Power Consumption: 1.1kW
- Average load during AI/ ML workloads: 4.6kW
- Maximum Observed: 5.1kW
Note these results were taken using two 208V Schneider Electric / APC PDUs at 17.7C and 72% RH. Our testing window shown here had a +/- 0.3C and +/- 2% RH variance.
Inspur NF5488M5 4x 3kW PSUs
We actually split the load across two PDUs just to ensure we did not go over on our testing. Realistically, we found that this can be powered by a 208V 30A circuit, even if it is only one per rack (or two using two PDUs and non-redundant power).
Of course, most deployments will see 18-60kW racks with these servers. However, one of the major benefits of a system like this versus a 16x GPU HGX-2 system is the fact that it uses less power so it can fit into more deployment scenarios.
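For planning purposes, the power figures above can be turned into some simple capacity checks. This sketch uses our own helper names and makes a few assumptions that are not from the review: the four PSUs are split evenly across two feeds, the 208V 30A circuit is treated as single-phase at its raw rating, and any derating required by local electrical codes is not modeled.

```python
PSU_COUNT = 4
PSU_WATTS = 3000
IDLE_W, AVG_W, PEAK_W = 1100, 4600, 5100  # measured figures quoted above


def feed_capacity_w(psu_count=PSU_COUNT, psu_watts=PSU_WATTS, feeds=2):
    """PSU capacity left on one feed when PSUs are split evenly across A+B feeds."""
    return (psu_count // feeds) * psu_watts


def circuit_capacity_w(volts=208, amps=30):
    """Raw single-phase circuit capacity in watts (no continuous-load derating)."""
    return volts * amps


def servers_per_rack(rack_kw, server_w=AVG_W):
    """Whole servers that fit a rack power budget at a given per-server draw."""
    return int(rack_kw * 1000 // server_w)


if __name__ == "__main__":
    one_feed = feed_capacity_w()
    print(f"One feed: {one_feed}W, covers observed peak: {one_feed >= PEAK_W}")
    print(f"208V 30A circuit: {circuit_capacity_w()}W raw vs {PEAK_W}W peak")
    for rack_kw in (18, 60):
        print(f"{rack_kw}kW rack: {servers_per_rack(rack_kw)} servers at average load, "
              f"{servers_per_rack(rack_kw, PEAK_W)} at peak")
```

Budgeting on average load versus observed peak is a design choice; sizing on the peak figure is the more conservative option.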
STH Server Spider Inspur NF5488M5
In the second half of 2018, we introduced the STH Server Spider as a quick reference to where a server system’s aptitude lies. Our goal is to start giving a quick visual depiction of the types of parameters that a server is targeted at.
STH Server Spider Inspur NF5488M5
The Inspur Systems NF5488M5 is a solution designed around one purpose: keeping eight SXM3 Tesla V100’s fed with data. To that end, it supports up to 2.8-3.2kW of GPUs plus the in-chassis and external fabrics to keep data in GPU pipelines.
There are a few key takeaways from our Inspur NF5488M5 review. This is a well-built server from a top AI/ deep learning server vendor. As such, the fit and finish are great and it is designed to be a high-end solution.
We see this server as a tool that AI and deep learning researchers can use to get the benefits of current Tesla V100 SXM3 and “Volta Next” GPUs and NVSwitch fabric without having to move up to a larger and more power-hungry sixteen GPU HGX-2 platform. With the SXM3 modules, we get the benefits of higher TDPs and the NVSwitch fabric to provide better performance than we would from traditional PCIe servers. The more an application can use the NVSwitch and NVLink fabric, the better the performance gains will be.
In terms of pricing, our suggestion here is to reach out to a sales rep. Our sense is that it will rightfully fall between that of an HGX-1 platform and an HGX-2 platform given the capabilities we are seeing. Although NVIDIA sells the DGX-2’s for $399K, we see HGX-2 platforms on the market for well under $300K which puts a ceiling in terms of pricing for the NF5488M5.
Beyond the hardware, there is much more that goes into a solution like this. That includes the storage, Infiniband and other networking, along with all of the clustering solutions, data science frameworks and tools. While we focused on the physical server here, we do want our readers to understand this is part of a much larger solution framework.
Still, for those who want to know more about the AI hardware that is available, this is a top-tier, if not the top-end 8x GPU solution on the market for the NVIDIA Tesla V100 generation.