AMD has a clear-cut business plan: Its upcoming 32-core Zen chip will bring it back into high-performance servers, and the company has expressed a desire to make high-performing GPUs for such systems.
The combination could make for a powerful server and a signature re-entry for AMD into the server market, where it has virtually no presence after bad product decisions cost it market share. The company is already doing the background work so software can be written and installed on those servers.
AMD has released the latest version of its software tools, called ROCm, which are meant to make it easy to write and compile parallel programs for its GPUs and CPUs. The tools are targeted mainly at high-performance computing applications.
ROCm is key to AMD’s re-entry into the fast-growing data center market. AMD’s GPUs are rocking game consoles, PCs, and virtual reality systems; ROCm provides a base for the company to build GPUs for large-scale servers.
ROCm is a low-level programming framework like Nvidia’s CUDA. But instead of being closed source, it’s open source and can work with a wide range of CPU architectures like ARM, Power, and x86.
Beyond servers, AMD also doesn’t have a big presence in the high-performance computing (HPC) market. Most of the GPUs in the world’s top supercomputers are from Nvidia, but the ROCm tools give AMD a base from which to take market share from Nvidia.
The ROCm platform is targeted at large-scale server installations and at multiple GPUs in a cluster of racks, said Greg Stoner, senior director for Radeon Open Compute.
It’ll work with AMD’s latest Radeon Pro GPUs and current consumer GPUs based on the Polaris architecture. It can be used to run neural network clusters or for scientific computing.
But there’s a chicken-and-egg problem. Scientists use Nvidia’s CUDA because the company’s GPUs are already in supercomputers. AMD scrapped its FirePro brand, which was targeted at supercomputing, and is reestablishing the Radeon brand for servers. AMD hasn’t said whether it will release GPUs targeted at supercomputing to take on Nvidia’s Tesla, but it’s a lucrative market.
Stoner didn’t reveal any of AMD’s supercomputing GPU plans but said ROCm will play a big role as the company goes after the HPC space.
“We know what we have to do for future hardware” in the supercomputing market, Stoner said.
The specifications established by the Heterogeneous System Architecture (HSA) Foundation are core to ROCm. The specifications are designed to harness the joint computing power of CPUs, GPUs, and other processors in a system. AMD has said HSA specifications could replace OpenCL, which is widely used today for parallel programming.
AMD is chasing open standards, unlike Intel, which rules the server market and is trying to push its proprietary standards. AMD is a member of recently established organizations like Gen-Z and OpenCAPI, which are trying to push open interconnect standards. Stoner believes ROCm will receive a boost through those organizations.