Unpacking the Powerhouse: A Deep Dive Into Azure's HBv3 Virtual Machines

When you're pushing the boundaries of what's computationally possible, especially in fields like high-performance computing (HPC) or complex simulations, the underlying hardware matters. A lot. Azure's HBv3 virtual machines are designed precisely for these demanding workloads, and understanding their architecture is key to unlocking their full potential.

At its heart, an HBv3 server is built around two AMD EPYC 7V73X processors, each packing 64 cores, for a total of 128 physical cores humming away. Notably, simultaneous multithreading (SMT) is disabled here, so each of those 128 cores is dedicated to your work, no sharing. The cores are organized into 16 sections (8 per socket), each section containing 8 cores with uniform access to 96 MB of L3 cache. This layout is reinforced by specific AMD BIOS settings: Nodes per Socket (NPS) set to 2, L3 cache not exposed as its own NUMA domains ("L3 as NUMA" disabled), and 4 NUMA domains presented to the operating system. The net effect is that the server exposes its resources in a way that minimizes latency for applications that are aware of Non-Uniform Memory Access.
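To put those cache and core figures in perspective, here's a quick back-of-the-envelope sketch. It simply multiplies out the topology described above (16 sections, 8 cores and 96 MB of L3 per section):

```python
# Back-of-the-envelope totals for an HBv3 server, using the
# topology described above: 2 sockets, 8 sections (CCDs) per
# socket, 8 cores and 96 MB of L3 cache per section.
SOCKETS = 2
SECTIONS_PER_SOCKET = 8
CORES_PER_SECTION = 8
L3_PER_SECTION_MB = 96

total_cores = SOCKETS * SECTIONS_PER_SOCKET * CORES_PER_SECTION
total_l3_mb = SOCKETS * SECTIONS_PER_SOCKET * L3_PER_SECTION_MB

print(total_cores)   # 128 physical cores
print(total_l3_mb)   # 1536 MB of L3 cache server-wide
```

That works out to roughly 1.5 GB of L3 across the whole server, which is what makes these parts so attractive for cache-sensitive HPC codes.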

So, what does this translate to for you? The server boots up with 4 NUMA domains, 2 per CPU socket, each domain boasting 32 cores. Crucially, each NUMA domain has direct access to 4 channels of physical DRAM running at a zippy 3,200 MT/s. Azure, of course, reserves some resources for itself – 8 physical cores per server are set aside for the hypervisor to ensure smooth operation without interfering with your virtual machine.
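The memory numbers above can be turned into a theoretical bandwidth figure. This sketch assumes standard 64-bit (8-byte) DDR4 channels, which is the usual width; real measured (e.g., STREAM) bandwidth will come in below this theoretical peak:

```python
# Theoretical peak DRAM bandwidth per NUMA domain, assuming
# standard 64-bit (8-byte wide) DDR4 channels at 3,200 MT/s.
CHANNELS_PER_DOMAIN = 4
TRANSFER_RATE_MT_S = 3200      # mega-transfers per second
BYTES_PER_TRANSFER = 8         # 64-bit channel width
NUMA_DOMAINS = 4

gb_s_per_domain = CHANNELS_PER_DOMAIN * TRANSFER_RATE_MT_S * BYTES_PER_TRANSFER / 1000
gb_s_per_server = gb_s_per_domain * NUMA_DOMAINS

print(gb_s_per_domain)   # 102.4 GB/s per NUMA domain (theoretical)
print(gb_s_per_server)   # 409.6 GB/s server-wide (theoretical)
```

Keeping each NUMA domain's threads on its own 4 channels is exactly why NUMA-aware placement pays off on these machines.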

The topology is quite deliberate. Imagine the server's CPU sockets. Azure symmetrically reserves those 8 host cores across both sockets, taking a couple of cores from specific Core Complex Dies (CCDs) within each NUMA domain. The remaining cores are then allocated to your HBv3 VM. It's important to note that a CCD boundary isn't the same as a NUMA boundary. In HBv3, a group of four consecutive CCDs forms a NUMA domain, both at the host server level and within your VM. This means your VM will consistently see 4 NUMA domains, regardless of its specific size, each with a varying number of cores depending on the configuration.
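The CCD-versus-NUMA distinction above can be made concrete with a small mapping sketch. The core numbering here is illustrative (an assumption, not the exact host core IDs): 8 cores per CCD, and 4 consecutive CCDs forming one NUMA domain.

```python
# Illustrative core -> CCD -> NUMA-domain mapping for HBv3:
# 8 cores per CCD, 4 consecutive CCDs per NUMA domain.
# Core numbering is an assumption for illustration.
CORES_PER_CCD = 8
CCDS_PER_NUMA_DOMAIN = 4

def ccd_of(core: int) -> int:
    """CCD index a given core belongs to."""
    return core // CORES_PER_CCD

def numa_domain_of(core: int) -> int:
    """NUMA domain a given core belongs to."""
    return ccd_of(core) // CCDS_PER_NUMA_DOMAIN

print(numa_domain_of(0))    # 0 (CCD 0, first domain)
print(numa_domain_of(31))   # 0 (CCD 3, still domain 0)
print(numa_domain_of(32))   # 1 (CCD 4 starts domain 1)
print(numa_domain_of(127))  # 3 (CCD 15, last domain)
```

Note how cores 31 and 32 sit on different CCDs but only the jump at core 32 crosses a NUMA boundary — a CCD boundary alone doesn't change the domain.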

Azure offers several HBv3 VM sizes, each mirroring the physical layout and performance characteristics of different AMD EPYC 7003-series CPUs. You'll find sizes like Standard_HB120rs_v3, which aligns closely with a dual-socket EPYC 7773X, down to Standard_HB120-16rs_v3, similar to a dual-socket EPYC 72F3. In the naming, the 'r' denotes RDMA (InfiniBand) capability and the 's' premium storage support, while the number after the dash (such as -16) indicates a constrained core count. And here's the clever part: while the number of exposed cores changes, the global resources, like RAM, memory bandwidth, L3 cache, and network connectivity (InfiniBand and Azure Ethernet), remain constant. This is a fantastic feature for optimizing costs or licensing for specific software needs.
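The constrained-core trade-off is easiest to see as per-core resources. This sketch assumes nominal server-wide figures of 448 GB of RAM and 1,536 MB of L3 cache held constant across sizes (the totals are assumptions for illustration):

```python
# Per-core resources across HBv3 sizes, given that server-wide
# resources stay fixed. The 448 GB RAM and 1,536 MB L3 totals
# are nominal values assumed for illustration.
RAM_GB = 448
L3_MB = 1536

hbv3_core_counts = {
    "Standard_HB120rs_v3": 120,
    "Standard_HB120-96rs_v3": 96,
    "Standard_HB120-64rs_v3": 64,
    "Standard_HB120-32rs_v3": 32,
    "Standard_HB120-16rs_v3": 16,
}

for name, cores in hbv3_core_counts.items():
    print(f"{name}: {RAM_GB / cores:.1f} GB RAM/core, "
          f"{L3_MB / cores:.1f} MB L3/core")
```

The smallest size ends up with several times the memory and cache per core of the full 120-core size — exactly what memory-bandwidth-bound or per-core-licensed applications want.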

What's particularly neat is how Azure presents this hardware. The virtual NUMA allocation directly maps to the underlying physical NUMA topology. There's no confusing abstraction; you see the hardware as it is. You can even use tools like lstopo within your VM to see this precise layout for each VM size, confirming the direct mapping of physical resources.

Beyond the CPU, the networking is equally impressive. HBv3 VMs are equipped with NVIDIA Mellanox HDR InfiniBand adapters, capable of speeds up to 200 Gigabits/s. These are connected via SR-IOV, meaning network traffic bypasses the hypervisor for maximum efficiency. This allows you to load standard Mellanox OFED drivers directly into your HBv3 VMs, treating them almost like bare-metal systems for networking performance. Support for adaptive routing and Dynamic Connection Transport (DCT) further enhances communication reliability and speed for distributed workloads.
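For distributed workloads running over that InfiniBand fabric, pinning one MPI rank (or one rank's threads) per NUMA domain is a common starting point. A minimal sketch, assuming the full 120-core size with 30 cores per domain (the exact pinning flag syntax depends on your MPI launcher):

```python
# Sketch: build CPU-range strings for 4 MPI ranks, one per NUMA
# domain, on the 120-core HBv3 size (30 cores per domain).
# These strings could feed the pinning option of an MPI launcher;
# exact flag syntax varies by MPI implementation.
VM_CORES = 120
NUMA_DOMAINS = 4
cores_per_domain = VM_CORES // NUMA_DOMAINS

pin_ranges = [
    f"{d * cores_per_domain}-{(d + 1) * cores_per_domain - 1}"
    for d in range(NUMA_DOMAINS)
]
print(pin_ranges)   # ['0-29', '30-59', '60-89', '90-119']
```

Keeping each rank inside one domain keeps its memory traffic on local DRAM channels while the SR-IOV InfiniBand path handles inter-node communication.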

In essence, Azure's HBv3 VMs offer a powerful, transparent, and highly optimized platform for your most demanding computational tasks, bringing cutting-edge AMD hardware and advanced networking directly to your cloud environment.
