Technology

The Market For ChatGPT and Generative AI Is Booming, But At A Steep Cost

March 13, 2023

Years before OpenAI’s ChatGPT emerged and caught the world’s attention with its ability to compose compelling sentences, a small startup called Latitude was wowing consumers with AI Dungeon, a game that let players use artificial intelligence to create fantastical stories from their prompts.

But as AI Dungeon grew in popularity, Latitude CEO Nick Walton recalled, the cost of maintaining the text-based role-playing game began to skyrocket. The game's text-generation software was powered by GPT language technology from OpenAI, the Microsoft-backed artificial intelligence research lab. And the more people played AI Dungeon, the bigger the bill Latitude had to pay OpenAI.

Compounding the predicament, Walton discovered that content marketers were using AI Dungeon to generate promotional copy, a use no one on his team had imagined, which drove the company's AI bill even higher.

At its peak in 2021, Walton estimates, the company was spending nearly $250,000 a month on OpenAI's so-called generative AI software and on Amazon Web Services to keep up with the millions of queries it had to handle every day.

“We joked that we had human employees and artificial intelligence employees, and we spent about the same amount on both of them,” Walton said. “Our company spent hundreds of thousands of dollars a month on artificial intelligence, and we are not a large startup, so our costs were very high.”

By the end of 2021, Walton said, the startup had incorporated open-source and free language models into its service to bring costs down. Latitude's generative AI bills have since fallen to under $100,000 a month, and the startup now charges players a monthly subscription for the more advanced AI features to help cover that cost.

The recent boom in generative AI highlights a bitter truth behind Latitude's high AI bills: the cost of developing and maintaining the software can be extraordinarily high, both for the companies that build the underlying technologies, generally referred to as large language models or foundation models, and for those that use the AI to run their own applications.

The high cost of machine learning is an uncomfortable reality in the industry, as venture capitalists eye companies that could potentially be worth trillions and big corporations such as Microsoft, Meta, and Google use their considerable capital to build a lead in the technology that smaller challengers cannot match.

If the margins for AI applications prove permanently lower than the margins of earlier software-as-a-service applications because of the high cost of computing, it could put a damper on the current boom.

Unlike previous computing booms, the high cost of training and "inference" (actually running the trained model) is structural. Even after a large language model has been built, running it still demands massive computing power: each time the model returns a reply to a prompt, it performs billions of calculations. Serving conventional web apps or pages requires far less computation by comparison.
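To make the scale of that computation concrete, here is a back-of-envelope sketch (an illustration, not a figure from the article) using the common rule of thumb that a dense transformer performs roughly two floating-point operations per parameter for each token it generates:

```python
# Back-of-envelope inference compute for a GPT-3-scale model.
# Assumption (not from the article): a forward pass costs roughly
# 2 FLOPs per parameter per generated token, a standard rule of thumb.

params = 175e9                      # 175 billion parameters (GPT-3 scale)
flops_per_token = 2 * params        # ~3.5e11 operations per token

reply_tokens = 500                  # hypothetical length of one reply
flops_per_reply = flops_per_token * reply_tokens

print(f"~{flops_per_token:.1e} FLOPs per generated token")   # ~3.5e+11
print(f"~{flops_per_reply:.1e} FLOPs per 500-token reply")   # ~1.8e+14
```

Even at a rough hundred trillion operations per reply, a single answer is cheap; the expense comes from serving millions of such replies every day.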

Performing these calculations also takes specialized hardware. Traditional computer processors can run machine learning models, but they are slow. Today, most training and inference happens on graphics processors, or GPUs, which were originally designed for 3D gaming but have become the standard for AI applications because they can perform many simple calculations simultaneously.

The majority of GPUs used in the AI sector are made by Nvidia, whose primary data-center chip costs $10,000. Scientists who build these models joke that they "melt GPUs."

Training models

Training a large language model such as GPT-3 is a complex and time-consuming process that could cost more than $4 million, according to analysts and technologists. Training more advanced language models could cost even more, according to Forrester analyst Rowan Curran, who specializes in artificial intelligence and machine learning.

When Meta released its largest LLaMA model last month, for example, the company said it used 2,048 Nvidia A100 GPUs to train the model on 1.4 trillion tokens (about 1,000 tokens equal 750 words), and that training took about 21 days.

That works out to roughly one million GPU-hours, which would cost over $2.4 million at AWS's dedicated prices. And at 65 billion parameters, LLaMA is considerably smaller than OpenAI's current GPT models: GPT-3, for instance, has 175 billion parameters.
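The arithmetic behind those figures is straightforward to reproduce (a sketch; the hourly rate below is back-solved from the article's numbers, not a quoted AWS price):

```python
# Reconstructing the LLaMA training math reported above.
gpus = 2048                   # Nvidia A100s, per Meta
days = 21                     # reported training time

gpu_hours = gpus * days * 24
print(f"{gpu_hours:,} GPU-hours")          # 1,032,192 -> "about a million"

# Implied hourly rate if the total came to ~$2.4 million.
# Assumption: back-solved from the article's figures, not a published price.
total_cost = 2.4e6
print(f"${total_cost / gpu_hours:.2f} per GPU-hour")   # ~$2.33
```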

Clement Delangue, the CEO of the AI startup Hugging Face, told Trade Algo in a recent interview that training the company's Bloom large language model took more than two months and required access to a supercomputer roughly equivalent to 500 GPUs.

Organizations that build large language models must be cautious about retraining them, he explained: retraining helps the software improve its capabilities, but each run costs a great deal.

“It's important to realize that these models aren't trained all the time, like every single day,” Delangue said. That is why some models, including ChatGPT, lack knowledge of recent events; ChatGPT's knowledge, he noted, ends in 2021.

Delangue said the retraining of Bloom version two will cost no more than $10 million. “It's not something we want to do every week."

Inference and who pays for it

To use a trained machine learning model to make predictions or generate text, engineers run it in a process called "inference," which can be far more expensive than training, because a popular product may require the model to run millions of times a day.

For a product as popular as ChatGPT, which UBS estimated reached 100 million monthly active users in January, Curran believes OpenAI may have spent $40 million just to process the millions of prompts people fed into the software in a month.

Costs skyrocket when these tools are used billions of times a day. Financial analysts estimate that Microsoft's Bing AI chatbot, which is powered by an OpenAI ChatGPT model, needs at least $4 billion of infrastructure to serve responses to all Bing users.

In Latitude's case, for example, the startup did not have to pay to train the underlying OpenAI language model it was accessing, but it did have to cover the inference costs: approximately half a cent per call across a couple of million requests a day, according to a Latitude spokesperson.
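Those figures imply a six-figure monthly inference bill, in line with the costs described earlier (a rough sketch; "a couple of million" requests is the article's approximation):

```python
# Rough monthly inference bill implied by the figures above.
cost_per_call = 0.005        # "approximately half a cent each call"
calls_per_day = 2_000_000    # "a couple of million requests a day" (approx.)

daily = cost_per_call * calls_per_day
monthly = daily * 30
print(f"${daily:,.0f} per day, ~${monthly:,.0f} per month")
# -> $10,000 per day, ~$300,000 per month
```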

“My calculations were relatively conservative,” Curran said.

Venture capitalists and tech giants have been investing billions of dollars in startups specializing in generative AI, sowing the seeds of the current boom. Microsoft, for instance, is reported to have invested as much as $10 billion in OpenAI, which oversees the GPT models. And Salesforce Ventures, Salesforce's venture capital arm, recently announced a $250 million fund for startups that use generative AI.

“VC dollars have moved from subsidizing your taxi ride and burrito delivery to generative AI computations and generative learning modeling,” Semil Shah of the venture firms Haystack and Lightspeed Venture Partners wrote on Twitter recently.

Many entrepreneurs see risks in relying on potentially subsidized AI models that they don't control and merely pay for on a per-use basis.

“Whenever I talk to my AI friends at startup conferences, I always tell them this: don't rely exclusively on OpenAI, ChatGPT, or any other large language model to create your chatbot,” said Suman Kanuganti, founder of personal.ai, a chatbot currently in beta testing. “Because businesses shift, these models are all owned by big technology companies, right? If they cut off your access, you're gone.”

Companies such as enterprise technology firm Conversica are exploring how to use the technology through Microsoft's Azure cloud service at its currently discounted price.

Conversica CEO Jim Kaskade declined to say how much the startup is paying, but he conceded that the subsidized cost is welcome as the company explores how to put language models to effective use.

“They would be charging a heck of a lot more if they were really trying to break even,” Kaskade said.

How it could change

The AI industry is still in its infancy, and it is unclear whether AI computing will remain expensive as it develops. Companies making foundation models, semiconductor makers, and startups all see business opportunities in reducing the cost of running AI software.

Nvidia, which holds about 95% of the market for AI chips, continues to develop more powerful versions designed specifically for machine learning, but improvements in total chip power across the industry have slowed in recent years.

Still, Nvidia CEO Jensen Huang believes that in 10 years AI will be "a million times" more efficient, thanks to improvements not only in chips but also in software and the other hardware components that make up computers.

“Moore's Law, in its best days, would have delivered 100x in a decade,” Huang said on an earnings call last month. “By designing new processors, new systems, new interconnects, new frameworks and algorithms, and working with data scientists and artificial intelligence researchers on new models, across that entire span, we have been able to speed up large language model processing by a million times.”

A number of startups see the high cost of AI as a business opportunity.

“Nobody ever suggested that you should build a system that was purpose-built for inference. What would that look like?” said Sid Sheth, founder of D-Matrix, a startup building a system that performs inference using the computer's memory rather than GPUs.

Today, he said, most inference runs on GPUs, particularly Nvidia's, and many companies buy the expensive DGX systems Nvidia sells. The problem with inference is a workload that spikes very rapidly, as ChatGPT's did when its user count grew from a few hundred to over a million within days. A GPU is not built to handle that kind of load and cannot keep up with it. "It was designed for graphics acceleration, for training," he said.

Delangue, the Hugging Face CEO, believes more organizations would be better served by focusing on smaller, more specific models that are cheaper to train and run than the large language models attracting most of the attention.

Meanwhile, OpenAI announced last month that it is lowering the cost for companies to access its GPT models. It now charges one-fifth of a cent for about 750 words of output.
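At that rate, the cost per request is tiny but scales linearly with traffic. A small sketch of the math (the traffic level below is hypothetical, not from the article):

```python
# Cost sketch at OpenAI's published rate of one-fifth of a cent
# per ~750 words, i.e. roughly $0.002 per 1,000 tokens of output.
price_per_1k_tokens = 0.002   # dollars

def monthly_cost(requests_per_day: int, tokens_per_request: int) -> float:
    """Rough monthly spend for a given traffic level (hypothetical inputs)."""
    daily = requests_per_day * tokens_per_request / 1000 * price_per_1k_tokens
    return daily * 30

# Hypothetical example: 1 million requests a day, ~500 tokens per reply.
print(f"${monthly_cost(1_000_000, 500):,.0f} per month")   # -> $30,000
```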

Latitude has taken notice of OpenAI's lower prices.

"I think it's safe to say that it's a really big change that we're looking forward to seeing happen in the industry, and we're continuously evaluating what we can do in order to deliver the best possible user experience to our customers," said a Latitude spokesperson. "Latitude will continue to evaluate all AI models so that we can make sure we have the most competitive game on the market."

