A gold rush has begun in the technology industry with software that can write passages of text or draw pictures that look like they were created by a human.
Companies such as Microsoft and Google are racing to integrate cutting-edge artificial intelligence into their search engines, even as billion-dollar competitors such as OpenAI and Stability AI push ahead and release their software to the public.
Powering many of these applications is the Nvidia A100, a chip that has become one of the key tools of the artificial intelligence industry.
The A100 has become the "workhorse" for artificial intelligence professionals, said investor Nathan Benaich, who publishes a newsletter and report covering the AI industry, including a partial list of supercomputers using A100s. Nvidia controls about 95% of the market for graphics processors that can be used for machine learning, according to New Street Research.
The A100 is ideally suited to building the kinds of machine learning models that power tools such as ChatGPT, Bing AI, and Stable Diffusion. It can perform many simple calculations simultaneously, which matters both for training neural network models and for running them.
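To make that parallelism concrete, here is a short sketch written in PyTorch; it is purely illustrative, not code from Nvidia or any company mentioned here, and the sizes and names are made up. It shows how a single matrix multiplication, the basic operation inside a neural network, bundles billions of simple multiply-add operations that a GPU can execute at once.

```python
# A minimal, illustrative sketch (PyTorch) of why GPUs suit neural networks:
# one matrix multiplication packs billions of simple multiply-adds that the
# chip can run in parallel rather than one after another on a CPU.
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# A toy "layer": 4,096 inputs mapped to 4,096 outputs, applied to a batch of 512 examples.
weights = torch.randn(4096, 4096, device=device)
batch = torch.randn(512, 4096, device=device)

# This single call launches roughly 8.6 billion multiply-add operations.
activations = batch @ weights
print(activations.shape)  # torch.Size([512, 4096])
```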
The technology behind the A100 was originally used to render sophisticated 3D graphics in games. The chip is often called a graphics processing unit, or GPU, but these days Nvidia's A100 is configured and targeted at machine learning tasks and runs in data centers, not inside glowing gaming PCs.
Developing software like chatbots and image generators requires hundreds or thousands of Nvidia chips, which companies can either buy outright or rent through a cloud provider.
Artificial intelligence models, such as large language models, require hundreds of GPUs for training, and the chips must be powerful enough to crunch terabytes of data quickly and recognize patterns in it. After training, GPUs like the A100 are also needed for "inference," the process of using the model to generate text, predict outcomes, or identify objects in photos.
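The difference between training and inference can be shown with a toy example. The sketch below, again in PyTorch with a made-up miniature model and random data, is purely illustrative; it is not the code behind ChatGPT or any product mentioned in this article.

```python
# Illustrative only: "training" repeatedly adjusts a model's weights, which is
# why it needs large GPU clusters; "inference" just runs the finished model
# forward on new input.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Training: many passes of forward computation, loss measurement, and weight updates.
for step in range(100):
    inputs = torch.randn(64, 128)            # stand-in for real training data
    targets = torch.randint(0, 10, (64,))    # stand-in for real labels
    loss = loss_fn(model(inputs), targets)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Inference: a single forward pass with no gradient bookkeeping.
with torch.no_grad():
    prediction = model(torch.randn(1, 128)).argmax(dim=1)
print(prediction)
```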
As a result, AI companies need access to a lot of A100s to develop their models. Some entrepreneurs in the space even see the number of A100s they have access to as a sign of progress.
In January, Stability AI CEO Emad Mostaque wrote on Twitter that the company had 32 A100s a year earlier. "Dream big and stack as many GPUs as you can. Brrr." Stability AI is the company that helped develop Stable Diffusion, the image generator that drew attention last fall, and it has reportedly been valued at more than $1 billion.
Stability AI now has access to more than 5,400 A100 GPUs, according to one estimate from the State of AI report, which charts and tracks which companies and universities have the largest collections of A100 GPUs (the count does not include cloud providers, which don't publish their numbers publicly).
Nvidia’s riding the A.I. train
Nvidia stands to benefit from the hype cycle surrounding artificial intelligence. The company reported fiscal fourth-quarter earnings on Wednesday, and although overall sales declined 21%, investors pushed the stock up about 14% on Thursday, mainly because the company's artificial intelligence chip business, reported as data centers, grew 11% to more than $3.6 billion in sales during the quarter, showing continued growth.
Since the beginning of 2023, Nvidia shares have gained 65%, outpacing both the S&P 500 and other semiconductor stocks.
Jensen Huang, Nvidia's CEO, couldn't stop talking about AI on Wednesday, suggesting it's at the center of the company's strategy.
"There has been a surge in activity around the AI infrastructure that we built, and around inferencing using Hopper and Ampere to inference large language models; both have just gone through the roof in the last 60 days," Huang said. "Regardless of what our views were on the year as we entered it, there is no doubt that they have changed dramatically as a result of the last 60, 90 days."
Ampere is the codename for Nvidia's chip generation that includes the A100. The newer generation of chips, including the H100, which recently started shipping, is codenamed Hopper.
More computers needed
Unlike other kinds of software, such as serving a webpage, a machine learning task can take up an entire computer's processing power, sometimes for hours or days at a time.
Companies with a hit AI product often need to acquire more GPUs to handle peak periods or to improve their models.
These GPUs are not cheap to buy. In addition to a single A100 on a card that can be slotted into an existing server, many data centers use a system that includes eight A100 GPUs working together.
That system, Nvidia's DGX A100, has a suggested price of nearly $200,000, although it does come with all the chips needed. On Wednesday, Nvidia announced it would sell cloud access to DGX systems directly, which should reduce the entry cost for tinkerers and researchers.
There's no denying that A100s can be expensive.
According to an estimate from New Street Research, the OpenAI-based ChatGPT model inside Bing's search could require 8 GPUs to deliver a response to a question in less than one second.
Deploying the model in Bing for every user would require more than 20,000 8-GPU servers, suggesting Microsoft would face an infrastructure investment of around $4 billion.
"If you're from Microsoft and you want to scale that, at the scale of Bing, you're talking about maybe $4 billion. If you want to scale to the degree of Google, which serves 8 to 9 billion queries every day, the DGXs could cost as much as $80 billion," said Antoine Chkaiban, a technology analyst at New Street Research. "The numbers we came up with are staggering. But they simply reflect the fact that every single user of such a large language model requires a massive supercomputer while they are using it."
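The back-of-the-envelope arithmetic behind those figures can be reproduced from the numbers quoted in this article; the snippet below is only a rough reconstruction, not New Street Research's actual model.

```python
# Back-of-the-envelope arithmetic using only the figures quoted in this article;
# this is not New Street Research's actual model.
DGX_PRICE = 200_000      # approximate suggested price of one 8-GPU DGX A100, in dollars
BING_SERVERS = 20_000    # estimated 8-GPU servers needed to serve Bing's traffic

bing_cost = BING_SERVERS * DGX_PRICE
print(f"Bing-scale estimate: ${bing_cost / 1e9:.1f} billion")  # ~$4.0 billion

GOOGLE_COST = 80e9       # New Street Research's Google-scale estimate, in dollars
print(f"Google-scale implies roughly {GOOGLE_COST / DGX_PRICE:,.0f} DGX systems")  # ~400,000
```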
Stable Diffusion, the image generator, was trained using 32 machines with 8 A100s each, for a total of 200,000 compute hours, according to information posted online by Stability AI.
At the market price, training the model alone cost about $600,000, Stability AI CEO Mostaque said on Twitter, implying the price was unusually cheap compared with rivals. That figure does not include the cost of "inference," or deploying the model.
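Those published figures also imply a rough per-hour rate. The arithmetic below simply rederives it, under the assumption that the 200,000 compute hours count individual A100-hours; it is not Stability AI's own accounting.

```python
# Redoing the arithmetic implied by Stability AI's published figures;
# the per-hour rate and wall-clock time are inferred, not official numbers.
GPU_COUNT = 32 * 8           # 32 machines with 8 A100s each = 256 GPUs
TOTAL_GPU_HOURS = 200_000    # compute hours reported by Stability AI (assumed to be A100-hours)
TRAINING_COST = 600_000      # market-price estimate tweeted by Mostaque, in dollars

implied_rate = TRAINING_COST / TOTAL_GPU_HOURS
wall_clock_hours = TOTAL_GPU_HOURS / GPU_COUNT

print(f"Implied rate: ~${implied_rate:.2f} per A100-hour")                     # ~$3.00
print(f"Wall-clock time on 256 GPUs: ~{wall_clock_hours:.0f} hours "
      f"(~{wall_clock_hours / 24:.0f} days)")                                  # ~781 hours, ~33 days
```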
In an interview with Trade Algo, Huang said that for the amount of computation these kinds of models require, Nvidia's products are actually affordable compared with their counterparts from other companies.
"We were able to take a data center that would otherwise have cost $1 billion and reduce it down to a data center of $100 million," Huang explained. "Now, when you put $100 million in the cloud and share it with 100 companies, it is almost nothing."
Huang said startups can train models much more inexpensively with Nvidia's GPUs than with traditional computer processors.
"Now you could build something like a large language model, like a GPT, for something like $10, $20 million," Huang said. "There is no doubt that that is a very affordable price."
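Huang's framing rests on simple arithmetic, reproduced below using only the figures he cited; it is an illustration, not Nvidia's internal pricing.

```python
# Simple arithmetic behind Huang's framing, using only the figures he cited.
TRADITIONAL_DATA_CENTER = 1_000_000_000   # $1 billion, per Huang
GPU_DATA_CENTER = 100_000_000             # $100 million equivalent, per Huang
COMPANIES_SHARING = 100

print(f"Claimed cost reduction: {TRADITIONAL_DATA_CENTER // GPU_DATA_CENTER}x")  # 10x
print(f"Shared cost per company: ${GPU_DATA_CENTER // COMPANIES_SHARING:,}")     # $1,000,000
```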
New competition
Nvidia is not the only company making GPUs for artificial intelligence uses. AMD and Intel offer competing graphics processors, and big cloud companies such as Google and Amazon are developing and deploying their own chips designed specifically for AI workloads.
Still, "AI hardware remains strongly consolidated to NVIDIA" when it comes to compute, according to the State of AI report. As of December, more than 21,000 open-source AI papers reported using Nvidia chips.
The State of AI Compute Index also tracked researchers using Nvidia's V100, a chip released in 2017, but the A100 grew fast in 2022 to become the third-most-used Nvidia chip, just behind a $1,500-or-less consumer graphics chip intended for gaming.
The A100 also has the distinction of being one of the few chips to have export controls placed on it for national defense reasons. Last fall, Nvidia said in a filing to the Securities and Exchange Commission that the U.S. government had imposed a license requirement barring exports of the A100 and the H100 to China, Hong Kong, and Russia.
In the filing, Nvidia said the U.S. government indicated the new license requirement would address the risk that the covered products might be used in, or diverted to, a "military end use" or "military end user" in China or Russia. Nvidia has previously said it adapted some of its chips for the Chinese market to comply with U.S. export restrictions.
The A100's fiercest competition may be its own successor. The A100 was first introduced in 2020, an eternity ago in chip cycles. The H100, introduced in 2022, is now starting to be produced in volume; in fact, Nvidia said on Wednesday that the H100 brought in more revenue than the A100 in the quarter ending in January, even though the H100 is more expensive per unit.
According to Nvidia, the H100 is the first of its data center GPUs to be optimized for transformers, an increasingly important technique that many of the top AI applications rely on. Nvidia said on Wednesday that it wants to make AI training more than 1 million percent faster, which could eventually mean AI companies would not need to buy as many Nvidia chips.