Why are AI and machine learning moving away from cloud computing?
To localize delivery operations, a quick-service restaurant operator is running its AI models on equipment inside its restaurants. Meanwhile, a large pharmaceutical company is training machine learning models locally on its own servers.
Cloud computing isn't going away, but some companies that employ machine learning models, along with the tech firms that provide tools to manage them, say on-premises machine learning is becoming more popular.
The trend, while still nascent, runs counter to what cloud providers have long contended: that setting up machine learning on your own infrastructure is too costly and time-consuming.
"We still have a ton of customers that want to embark on a cloud migration, but in the last year or two we've seen an increase in clients wanting to bring workloads back on-premises because of cost," says Thomas Robinson, vice president of strategic partnerships and corporate development at Domino Data Lab. Robinson pointed to computationally intensive deep-learning models, including GPT-3 and other large language transformer models that companies now use in their conversational AI products and chatbots, as expensive to run on cloud servers.
The on-prem trend is spreading among big-box and grocery retailers that need to feed product, distribution and store-specific data into large machine learning models for inventory predictions, according to Vijay Raghavendra, chief technology officer at SymphonyAI, which works with grocery chain Albertsons.
Raghavendra left Walmart in 2020 after seven years with the company in senior engineering and merchant technology roles.
"This occurred after my employment at Walmart. When I worked there, they switched from having everything on-premise to having everything on the cloud. Raghavendra said Protocol.
Today, I believe there is more balance since they are reinvesting in their hybrid infrastructure, which combines on-premises infrastructure.
The cloud there is more of a balance since they are investing once again in their hybrid infrastructure, which combines on-premises infrastructure and the cloud.
" The cost of running it in the cloud does get quite expensive at a certain scale, so if you have the capacity, it might make sense to set up your own [co-location data center] and run those workloads there.
Some companies consider on-prem setups for the model-building stage, when ML and deep-learning models are trained before being deployed to production.
That process is computationally intensive, requiring terabytes or petabytes of data as engineers tweak and test multiple parameters, or combinations of different model types and inputs.
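To make that concrete, here is a minimal sketch of the kind of parameter sweep involved, using scikit-learn purely as an illustration (the article names no specific tooling): even a tiny grid multiplies into dozens of full training runs, and at terabyte scale each run is expensive.

```python
# Illustrative parameter sweep; scikit-learn and the grid values below are
# assumptions for this sketch, not tooling named in the article.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)

param_grid = {
    "n_estimators": [100, 300],    # number of boosting stages
    "learning_rate": [0.01, 0.1],  # optimization step size
    "max_depth": [3, 5],           # tree capacity
}

# 2 x 2 x 2 = 8 candidate models, each trained with 5-fold cross-validation:
# 40 full training runs for even this tiny grid. Real searches over deep
# models and terabytes of data are orders of magnitude larger.
search = GridSearchCV(GradientBoostingClassifier(random_state=0),
                      param_grid, cv=5)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```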
"The high cost of training is giving people some challenges," said Danny Lange, vice president of AI and machine learning at Unity Technologies, a gaming and automotive AI company. Training costs, he said, can reach millions of dollars.
Because letting engineers train on a bank of GPUs in a public cloud can become extremely costly very quickly, some companies are now considering moving training in-house to better manage those costs.
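A rough back-of-the-envelope sketch shows why heavy training workloads push companies toward owned hardware. All of the prices below are assumptions for illustration, not figures from the article or any provider's actual rates:

```python
# Illustrative break-even calculation; every number here is an assumption.
CLOUD_RATE_PER_HOUR = 32.0      # assumed on-demand rate for an 8-GPU cloud instance
SERVER_PURCHASE_COST = 250_000  # assumed cost of a comparable owned 8-GPU server
HOSTING_PER_MONTH = 2_000       # assumed colo power/space/operations cost

def months_to_break_even(utilization_hours_per_month: float) -> float:
    """Months after which buying beats renting, at a given usage level."""
    cloud_monthly = CLOUD_RATE_PER_HOUR * utilization_hours_per_month
    savings_per_month = cloud_monthly - HOSTING_PER_MONTH
    if savings_per_month <= 0:
        return float("inf")  # light usage: renting stays cheaper forever
    return SERVER_PURCHASE_COST / savings_per_month

for hours in (100, 300, 720):  # occasional, heavy, and round-the-clock training
    print(f"{hours:>4} h/month -> break-even in "
          f"{months_to_break_even(hours):.1f} months")
```

Under these assumptions, light usage never pays back the purchase, but near round-the-clock utilization recovers the hardware cost in about a year, which mirrors the at-scale argument the article's sources make.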
According to Robinson, the companies moving compute and data onto their own physical servers, housed in owned or leased co-location data centers, are likely to be at the forefront of AI and deep-learning adoption.
One pharmaceutical company Domino Data Lab works with acquired two Nvidia server clusters to run compute-intensive image recognition models on-prem, Robinson said, even though the client has publicized its cloud-centric approach.
High price? Consider poor broadband
Some companies prefer to run their own hardware for reasons beyond building enormous deep-learning models.
Retailers and fast-food chains that use region-specific machine learning models to localize delivery logistics or optimize store inventory would rather run ML inference workloads on their own servers inside their stores than shuttle data back and forth to run the models in the cloud, said Victor Thu, president of Datatron.
Some customers "don't want it on the cloud at all," Thu said, adding that Datatron has seen clients shift certain ML workloads onto their own machines, particularly retailers with poor internet connectivity in some locations.
Model latency is a more widely acknowledged reason for moving away from the cloud: once a model is deployed, the time it takes to shuttle data back and forth to cloud servers is often what motivates going in-house.
Some companies also shun the cloud to ensure that models embedded in mobile devices or semi-autonomous vehicles react quickly to new data.
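The latency argument reduces to simple arithmetic: local hardware can be slower per inference and still respond faster overall once the network round trip is removed. The numbers in this sketch are illustrative assumptions, not measurements:

```python
# Rough latency model; all values below are assumptions for illustration.
NETWORK_RTT_MS = 60.0  # assumed round trip from an edge site to a cloud region
CLOUD_INFER_MS = 10.0  # assumed model compute time on a fast cloud GPU
LOCAL_INFER_MS = 25.0  # assumed compute time on slower in-store hardware

cloud_total = NETWORK_RTT_MS + CLOUD_INFER_MS  # 70 ms per prediction
local_total = LOCAL_INFER_MS                   # 25 ms per prediction

print(f"cloud: {cloud_total:.0f} ms, local: {local_total:.0f} ms")
# Even on weaker hardware, the local path wins once the network round
# trip dominates, which is the situation the article describes.
```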
According to Robinson, where the data is created or where the model outputs are consumed often determines whether to operationalize a model on-premises or in the cloud.
Cloud providers have largely overcome early perceptions that their services were not secure enough for clients in highly regulated sectors, and with high-profile companies such as Capital One embracing the cloud, data security worries have become less common.
However, some businesses are forced to employ on-prem systems because of data security and privacy concerns.
AiCure takes a hybrid approach to managing the data and machine learning models behind its app, which patients use during clinical trials, according to CEO Ed Ikeguchi. The company keeps complete control over any process involving sensitive personally identifiable information (PII).
"We do most of our PII-type work locally," Ikeguchi said. Once the company can rely on aggregated and anonymized data, he said, "all of the abstracted data will operate in the cloud."
"We have clients that are incredibly security-aware," said Biren Fondekar, NetApp's vice president of customer experience and digital strategy, pointing to customers in the highly regulated financial services and healthcare industries that run NetApp's AI software in their private data centers.
Big cloud reacts
Even the biggest cloud providers are responding to the trend, quietly promoting their own on-premises machine learning technologies.
In a blog post published last year, AWS pitched its Outposts infrastructure for machine learning to customers that prefer to run ML outside the cloud because of low-latency requirements or large volumes of locally generated data.
Customers performing inference in the cloud face challenges, including the need for real-time inference and security requirements that prevent user data from being sent to or stored in the cloud, according to the post's authors, Josh Coen, a senior solutions architect at AWS, and Mani Khanuja, an artificial intelligence and machine learning specialist.
To address client concerns regarding region-specific compliance, data sovereignty, low latency, and local data processing, Google Cloud unveiled Google Distributed Cloud Edge in October.
Microsoft Azure has released tools to help users manage machine learning in a hybrid fashion, validating and debugging models locally before deploying them to the cloud.
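The validate-locally, deploy-to-cloud pattern is straightforward to sketch. The example below is a generic illustration of that workflow, not Azure's actual tooling; the quality gate and file name are placeholders:

```python
# Generic sketch of local validation before cloud deployment; this is an
# assumed workflow for illustration, not Azure's (or any vendor's) API.
import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    *load_iris(return_X_y=True), test_size=0.2, random_state=0)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Local smoke test: refuse to ship a model that fails a quality gate.
accuracy = model.score(X_test, y_test)
assert accuracy >= 0.9, f"model failed local validation ({accuracy:.2f})"

# Only after the local check passes would the artifact be uploaded to
# whatever cloud deployment target the team uses.
joblib.dump(model, "model.joblib")
print(f"validated locally (accuracy={accuracy:.2f}); ready to deploy")
```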
Snowflake, which integrates with Domino Data Lab's MLOps platform, is considering offering additional on-premises options to customers, according to Harsha Kapre, a senior product manager at the company. "I know we're actively thinking about it," he told Protocol.
In July, Snowflake announced that it would make its external tables data lake architecture, a tool used to prepare data for machine learning, available for customers to use with their own hardware.
As companies incorporate AI into their operations, more and more of their people are using machine learning models, which can be expensive to run in the cloud, according to Robinson. Some of these models now serve applications with so many users that the required compute soars, making it economically necessary to run them locally, he said.
However, others caution that the on-premises promise comes at a price.
Companies have to remember they're up against cloud providers that are experts at buying equipment and running it efficiently at low cost, Lange said; doing training in-house is more expensive and requires more expertise.
Bob Friday, chief AI officer of communications and AI network provider Juniper Networks, agreed.
If companies can, they should leave it to Google, AWS or Microsoft, Friday said, adding that on-prem doesn't make sense unless a firm has an edge use case requiring split-second decision-making, such as a semi-autonomous vehicle, or one that processes massive streaming video files.
Robinson, however, argued that companies with significant AI programs can save money by going on-prem: smaller businesses may not see cost savings from bringing AI operations in-house, but at scale, cloud infrastructure is much more expensive, especially for GPUs and other AI-optimized hardware.
He pointed to the Domino Data Lab pharmaceutical client that invested in Nvidia clusters because the price and availability of GPUs "were not palatable on AWS alone."
Cloud providers have also lagged in making the latest AI-accelerated hardware available to users, Robinson said, because AI hardware is advancing so rapidly.
Ultimately, the effort to fold on-prem infrastructure into machine learning, much like the journey toward multicloud and hybrid cloud strategies, may be a sign of maturity among companies that have advanced beyond just dipping their toes in AI.
"There has always been a little bit of a pendulum effect," Lange said. "Everyone gets to the cloud, and then they try to retreat a little. I believe the key is striking the right balance."