By Ismail Amla, Senior Vice President of Kyndryl Consult at Kyndryl
On Aug. 8, 2024, Google announced it was slashing the price of its Gemini 1.5 Flash AI model by over 70%. While not their top performer, Gemini 1.5 Flash is Google’s most popular model among developers building AI applications on top of Google technology. The company’s price reduction is the latest in a steady stream of cuts from OpenAI, Amazon, Anthropic, Hugging Face and other providers of AI services or compute. Even as the most powerful models remain pricey, less-costly mid-level models have improved capabilities across the board. For example, 1.5 Flash can now handle AI applications in over 100 languages, and Google has steadily expanded the size of the queries that 1.5 Flash will accept.
At the same time, ingenious developers continue to find new ways to train AI models and run inference on them, using less compute power and data without diminishing quality. Sometimes they even improve the AI’s capabilities. Quantization and pruning, for example, are two newer techniques that allow developers to train and run AI applications more efficiently.
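For readers who want a feel for what these two techniques do, here is a minimal sketch in plain Python. It assumes a model's weights are just a short list of numbers; the function names and sample values are illustrative only and do not come from any particular library or vendor. Quantization stores each weight as a small integer plus one shared scale, and pruning zeroes out the weights that matter least.

```python
def quantize_int8(weights):
    """Map float weights to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integers."""
    return [q * scale for q in quantized]


def prune_by_magnitude(weights, keep_fraction):
    """Keep only the largest-magnitude weights and set the rest to zero."""
    keep_count = int(len(weights) * keep_fraction)
    ranked = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(ranked[:keep_count])
    return [w if i in keep else 0.0 for i, w in enumerate(weights)]


# Illustrative weights only; real models have billions of them.
weights = [0.5, -1.27, 0.01, 0.9]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
pruned = prune_by_magnitude(weights, keep_fraction=0.5)
```

Each restored weight differs from the original by at most half of the scale, which is the small accuracy cost paid for storing one byte per weight instead of four. Production systems use far more careful variants of both ideas.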
A host of smaller AI models, such as Microsoft’s open-source Phi-3.5, cost little to modify and run and have made big inroads. Many AI experts believe these smaller models will handle most application tasks. This is the concept behind Apple Intelligence, which uses small models running directly on the phone to handle most AI applications. Meta is effectively subsidizing all of AI with its open-source Llama models, which rival the largest and most sophisticated models Google and OpenAI are releasing and come at the unbeatable price of zero. These free, open-source models ease fears that the high cost of training large foundation models would create an exclusive club of AI incumbents. We are also seeing the arrival of so-called AI routers, which can analyze a request and route it to the most cost-efficient model that can handle it, saving money without sacrificing performance.
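The routing idea can be sketched in a few lines of Python. Everything in this example is an assumption made for illustration: the model names, capability tiers and prices are invented, and the keyword-based difficulty check is a crude stand-in for whatever analysis a real router performs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    capability: int            # hypothetical tier: 1 = basic, 3 = most capable
    cost_per_1k_tokens: float  # made-up prices, for illustration only


MODELS = [
    Model("small-on-device", 1, 0.0),
    Model("mid-tier-hosted", 2, 0.10),
    Model("frontier-hosted", 3, 5.00),
]


def estimate_difficulty(request: str) -> int:
    """Crude heuristic stand-in for a real request classifier."""
    text = request.lower()
    if any(phrase in text for phrase in ("prove", "legal analysis", "multi-step")):
        return 3
    if len(text.split()) > 50 or "summarize" in text:
        return 2
    return 1


def route(request: str, models=MODELS) -> Model:
    """Pick the cheapest model whose capability tier covers the request."""
    needed = estimate_difficulty(request)
    capable = [m for m in models if m.capability >= needed]
    return min(capable, key=lambda m: m.cost_per_1k_tokens)
```

Under these assumptions, a simple question such as "What time is it in Lagos?" goes to the free on-device model, and only the hardest requests reach the expensive one. The savings come from reserving the costly model for the few requests that need it.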
The upshot of all this? In 2022, the tech world worried about “GPU Poverty,” where a handful of tech giants dominated AI applications and development with deep pockets and expensive expertise. Two years later, fears about overly expensive AI, driven by high GPU costs, seem misplaced. Some of the smartest people in the industry, like Abacus AI CEO Bindu Reddy, say AI is getting cheap so fast that it will soon be nearly free. Perplexity CEO Aravind Srinivas observes that AI costs have fallen by 100x in only two years. Sam Altman, CEO of OpenAI, has stated that he views AI as heading toward being “too cheap to meter.”
Today, we can glimpse a future where running basic AI applications is incredibly cheap and possibly even a loss-leader for other application needs. This opens tantalizing possibilities for new ways of doing business and new business models.
4 ways affordable AI may reshape business
A new economic model is emerging for AI agents: pay-per-unit of work. The logic is simple. Rather than pay for a “seat” for a human worker, enterprises can turn to AI that’s paid per unit of work, not per “agent.” And because AI is not a human worker, it can scale up or down based on need and demand.
This mirrors cloud computing consumption models but at a more granular level. As AI gets cheaper and more ubiquitous, this model becomes more viable due to the sheer scale of AI use. This model only works when AI is on a rapid path to becoming cheaper and cheaper, even surpassing Moore’s Law in pace of computational improvements and reductions in price per unit of compute. Not surprisingly, the model is gaining initial traction in areas such as customer support bots. But look for it to become more commonplace as AI agents begin to take hold.
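A quick back-of-the-envelope comparison shows why pay-per-unit pricing becomes more attractive as the unit price falls. The sketch below is hypothetical: the seat price, the per-ticket price and the ticket volumes are invented for illustration and are not drawn from any real vendor.

```python
def per_seat_cost(seats, price_per_seat):
    """Fixed monthly bill: paid whether or not the seats are busy."""
    return seats * price_per_seat


def per_unit_cost(units_of_work, price_per_unit):
    """Variable monthly bill: scales with the work actually done."""
    return units_of_work * price_per_unit


def break_even_units(seats, price_per_seat, price_per_unit):
    """Volume of work at which the two pricing models cost the same."""
    return per_seat_cost(seats, price_per_seat) / price_per_unit


# Hypothetical support team: 10 seats at 100.0 per month each,
# versus an AI agent billed 0.25 per resolved ticket.
seat_bill = per_seat_cost(10, 100.0)
quiet_month = per_unit_cost(1200, 0.25)
busy_month = per_unit_cost(6000, 0.25)
break_even = break_even_units(10, 100.0, 0.25)
```

With these made-up numbers, the per-unit bill is lower than the seat bill in a quiet month and higher in a busy one, with the crossover at 4,000 tickets. Each time the per-unit price halves, that crossover volume doubles, which is why the model depends on AI continuing to get cheaper.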
Super cheap AI will allow the introduction of a freemium tier for many professional services. Accounting, legal, and marketing will all become AI-addressable, and basic tasks in these areas could become free offers. Pay more and you will get access to a human expert. We have already seen freemium tiers in many existing software products targeting these types of services.
With cheap AI, these freemium offerings will become more sophisticated, with the AI offering a level of personalization and specialization formerly reserved for junior employees providing entry-level professional services. For example, AI will transform legal offerings, easily providing free incorporation forms and bespoke incorporation letters that account for a business’ location, size and legal status. It may also handle moderate degrees of specialization. This is the classic disruption pattern, and AI will accelerate it.
AI will become as commonplace as basic CPUs are in all electronic devices. The amount of compute and memory required to run specialized models continues to shrink each month as researchers find better ways to run AI with fewer resources. All electronics and connected devices will draft off innovations from Microsoft, Meta, Apple and Google, which are motivated by the need to put better AI into cell phones and other mobile devices. In fact, this is already happening. For industrial systems and Internet of Things (IoT) tech, super cheap AI that can run even on the most basic processors will allow them to become intelligent and autonomous, making real-time decisions without needing a connection to the cloud or other networks.
Not only will systems all have AI, they will participate in a semantic meta-layer that can recognize the content of data collected and react accordingly. For example, a video recognition system will know that you like to send sports videos of your children playing soccer to their grandparents and will do that automatically. Or, as Apple demonstrated with its September 2024 AI presentation, an AI will gather information from your calendar, email, maps and the web to help you get to an important dinner engagement on time and find nearby parking. The layer will not be artificial general intelligence (AGI) and will feel like a natural extension of what we already do. Because AI is so inexpensive, this additional layer will likely be free to use, like what we see with existing search engine models.
New jobs created by affordable AI-plus
In every era of technological upheaval, the longer-term impacts of a technology shift are impossible to predict. No one could have imagined that the smartphone would swallow so many other devices — radios, cameras, tape recorders, turn-by-turn GPS systems, video recorders, calculators, flashlights and more. This disruption was made possible by inexpensive, powerful processors connected to ever-accelerating networks. With AI, we don’t yet know the form factor or the ultimate areas of disruption.
Some are beginning to emerge. Some we can speculate on. However, a pattern is clearly developing. Cognitive tasks that tax human capabilities, like analyzing large volumes of data, running the same job over and over, or carefully checking the configuration files of numerous computers, are the lowest-hanging fruit. Note that we are talking about AI replacing tasks, not people, and augmenting humans.
If history is a guide, this will create more demand for higher-value work thanks to efficiencies created by AI. After all, spreadsheets replaced rooms of human “calculators,” but the category of finance jobs grew stronger after software automated basic calculations. Likewise, even as prominent venture capitalists predicted that AI would wipe out jobs such as junior attorneys and radiologists, we see few signs of mass unemployment in those professions. Rather, workers in those roles who use AI are seeing improvements in their capabilities, with the greatest improvements going to more junior workers.
AI will also flow into the places where cognitive labor is short or bottlenecks exist. With primary care doctors in short supply, AI-enabled diagnostics and decision support systems are improving the capabilities of nurse practitioners, who are often the first line of treatment in many doctors’ offices. AI can be expected to improve these workers’ lives and enable the creation of many more of these relatively high-paying jobs.
This is the new era of “AI+ jobs.” Just as search generated entire new industries around SEO, blogging and content curation, AI will likely spawn new industries around helping people and companies get the most out of AI. As the cost of AI continues to fall, the innovations that amaze us today will be considered standard in just a few short years — and this shift will make us all more productive.