As graphics processors become more common in computers, Nvidia is expanding its collaboration with standards and open source communities to bring downstream technologies that were previously limited to the company’s own development tools. Much of that effort centres on programming languages such as C++ and Fortran, which are thought to lag behind native implementations when it comes to executing code on highly parallel computers.
Nvidia’s CUDA parallel programming framework, which combines open and proprietary libraries, is the source of many of the technologies being opened up and mainstreamed. CUDA was introduced in 2007 as a set of programming tools and frameworks for developing GPU-based systems. However, as GPU utilisation grew across more applications and sectors, the CUDA philosophy shifted.
Nvidia is best recognised for its GPU dominance, but CUDA is at the heart of the company’s rebranding as a software and services supplier targeting a $1 trillion market cap. Nvidia’s long-term ambition is to become a full-stack provider with a focus on specific fields such as autonomous driving, quantum computing, health care, robotics, and cybersecurity.
Nvidia has created dedicated CUDA libraries in certain domains, as well as the hardware and services that businesses can use. The concept of an “AI factory,” announced by CEO Jensen Huang at the recent GPU Technology Conference, best exemplifies the full-stack strategy. Customers can drop applications into Nvidia’s mega datacenters, with the result being a customised AI model tailored to specific industry or application needs.
Nvidia may profit from the AI factory concept in two ways: through the use of its GPU capacity or through the use of its domain-specific CUDA libraries. On Nvidia GPUs, programmers can use open parallel programming frameworks such as OpenCL. CUDA, on the other hand, will deliver an extra last-mile performance boost for those willing to invest, because it is tuned to work closely with Nvidia’s GPUs.
While parallel programming is common in high-performance computing, Nvidia’s goal is to make it a norm in mainstream computing. The company is assisting in the standardisation of best-in-class tools for writing parallel code that is portable across hardware platforms regardless of brand, accelerator type, or parallel programming framework.
For one thing, Nvidia is a member of a C++ standards group that is building the groundwork for parallel execution of code that is portable across hardware. In that work, an execution context could be a CPU thread that primarily performs IO, or a CPU or GPU thread that does demanding computation. Nvidia is particularly engaged in delivering C++ programmers a standard language and infrastructure for asynchrony and parallelism.
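As a rough illustration of the two kinds of work described above, the sketch below runs an IO-style task and a compute-style task asynchronously. It assumes only facilities that have been in the standard library since C++11 (`std::async` and `std::future`), not the newer infrastructure the committee is working on, and it uses CPU threads only. The function names `load_label`, `sum_squares`, and `run_both` are hypothetical and chosen purely for illustration.

```cpp
#include <cassert>
#include <future>
#include <string>

// Stand-in for an IO-bound task (hypothetical; a real one would block
// on a file or a socket rather than return immediately).
std::string load_label() {
    return "result";
}

// Stand-in for a compute-bound task: sum of squares 1^2 + ... + n^2.
long sum_squares(int n) {
    assert(n >= 0);
    long total = 0;
    for (int i = 1; i <= n; ++i) {
        total += static_cast<long>(i) * i;
    }
    return total;
}

// Launches both tasks asynchronously and combines their results.
std::string run_both(int n) {
    // std::launch::async asks for each task to run on its own thread.
    std::future<std::string> label = std::async(std::launch::async, load_label);
    std::future<long> total = std::async(std::launch::async, sum_squares, n);

    // get() blocks until the corresponding result is ready.
    const std::string name = label.get();
    const long value = total.get();
    return name + "=" + std::to_string(value);
}
```

Calling `run_both(3)` returns the string `result=14`. Note that this code has no way to say where each task should run, which is the kind of gap a standard notion of execution contexts is meant to address.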
The earliest work focused on the memory model, which was incorporated in C++11 but had to be updated as parallelism and concurrency became more prevalent. The C++11 memory model emphasised concurrent execution across multicore CPUs, but it lacked hooks for parallel programming. The C++17 standard laid the foundation for higher-level parallelism features, but real portability will have to wait for future standards. C++20 is the current standard, with C++23 on the horizon.
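To make the contrast concrete, here is a minimal sketch of the low-level style that C++11 made possible, using only `std::thread` and `std::atomic`. The function name `parallel_sum` and the chunking scheme are illustrative assumptions, not something taken from Nvidia or the standards committee.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Sums `data` with `num_threads` worker threads, using C++11 facilities only.
// Hypothetical helper for illustration.
long parallel_sum(const std::vector<int>& data, unsigned num_threads) {
    assert(num_threads > 0);
    std::atomic<long> total{0};
    std::vector<std::thread> workers;

    // Split the input into roughly equal chunks, one per thread.
    const std::size_t chunk = (data.size() + num_threads - 1) / num_threads;

    for (unsigned t = 0; t < num_threads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &total, begin, end] {
            long local = 0;
            for (std::size_t i = begin; i < end; ++i) {
                local += data[i];
            }
            // Relaxed ordering is enough here: only the atomicity of the
            // addition matters, and join() below provides the synchronisation.
            total.fetch_add(local, std::memory_order_relaxed);
        });
    }

    for (auto& worker : workers) {
        worker.join();
    }
    return total.load(std::memory_order_relaxed);
}
```

The programmer has to manage threads, chunking, and memory ordering by hand. With the parallel algorithms added in C++17, the same reduction can in principle be written as a single call such as `std::reduce(std::execution::par, data.begin(), data.end(), 0L)`, leaving scheduling to the library; whether that call actually runs in parallel, and on what hardware, depends on the toolchain.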