OpenAI’s Foundry will let customers buy dedicated compute to run its AI models

OpenAI is quietly launching a new developer platform that lets customers run the company’s newer machine learning models, like GPT-3.5, on dedicated capacity. In screenshots of documentation published to Twitter by users with early access, OpenAI describes the forthcoming offering, called Foundry, as “designed for cutting-edge customers running larger workloads.”

“[Foundry allows] inference at scale with full control over the model configuration and performance profile,” the documentation reads.

If the screenshots are to be believed, Foundry — whenever it launches — will deliver a “static allocation” of compute capacity dedicated to a single customer. Users will be able to monitor specific instances with the same tools and dashboards that OpenAI uses to build and optimize models. In addition, Foundry will provide some level of version control, letting customers decide whether or not to upgrade to newer model releases, as well as “more robust” fine-tuning for OpenAI’s latest models.

Foundry will also offer service-level commitments for instance uptime and on-calendar engineering support. Rentals will be based on dedicated compute units with three-month or one-year commitments; running an individual model instance will require a specific number of compute units (see the chart below).

Instances won’t be cheap. Running a lightweight version of GPT-3.5 will cost $78,000 for a three-month commitment or $264,000 over a one-year commitment. To put that into perspective, one of Nvidia’s recent-gen supercomputers, the DGX Station, runs $149,000 per unit.
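To make the two commitment prices easier to compare, the short sketch below converts each to an effective monthly rate. It uses only the two figures quoted above; the function and variable names are illustrative, and the calculation assumes the totals are spread evenly over the commitment period, which the leaked documentation does not confirm.

```python
# Prices quoted above for the lightweight GPT-3.5 instance, in USD.
THREE_MONTH_PRICE = 78_000   # total for a three-month commitment
ONE_YEAR_PRICE = 264_000     # total for a one-year commitment


def monthly_rate(total_price: float, months: int) -> float:
    """Effective monthly cost, assuming the total is spread evenly."""
    return total_price / months


three_month_monthly = monthly_rate(THREE_MONTH_PRICE, 3)
one_year_monthly = monthly_rate(ONE_YEAR_PRICE, 12)

# Fractional saving of the one-year plan relative to the three-month plan.
discount = 1 - one_year_monthly / three_month_monthly

print(f"Three-month plan: ${three_month_monthly:,.0f} per month")
print(f"One-year plan:    ${one_year_monthly:,.0f} per month")
print(f"Saving on the one-year plan: {discount:.1%}")
```

Under that even-spread assumption, the three-month plan works out to $26,000 per month and the one-year plan to $22,000 per month, a saving of roughly 15%.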

Eagle-eyed Twitter and Reddit users spotted that one of the text-generating models listed in the instance pricing chart has a 32k max context window. (The context window refers to the text that the model considers before generating additional text; a longer context window essentially lets the model “remember” more text.) GPT-3.5, OpenAI’s latest text-generating model, has a 4k max context window, suggesting that this mysterious new model could be the long-awaited GPT-4, or a stepping stone toward it.
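As a rough illustration of what a larger context window means in practice, the sketch below trims a long input to the most recent tokens that fit. It is a simplification, not OpenAI's implementation: the function name is hypothetical, each list element stands in for one token (real models use subword tokenizers), and "4k" and "32k" are assumed to mean 4,096 and 32,768 tokens.

```python
# Assumed sizes: "4k" and "32k" taken to mean 4,096 and 32,768 tokens.
CONTEXT_4K = 4_096
CONTEXT_32K = 32_768


def fit_to_context(tokens: list[str], max_context: int) -> list[str]:
    """Keep only the most recent tokens that fit in the context window.

    Hypothetical helper for illustration; anything older than the last
    `max_context` tokens is dropped and cannot influence the output.
    """
    if max_context <= 0:
        return []
    return tokens[-max_context:]


# A stand-in document of 40,000 "tokens", numbered in order.
document = [str(i) for i in range(40_000)]

visible_4k = fit_to_context(document, CONTEXT_4K)
visible_32k = fit_to_context(document, CONTEXT_32K)

print(len(visible_4k), "tokens visible, starting at token", visible_4k[0])
print(len(visible_32k), "tokens visible, starting at token", visible_32k[0])
```

In this toy setup, the smaller window sees only about the last tenth of the 40,000-token document, while the larger window sees more than four-fifths of it.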

OpenAI is under increasing pressure to turn a profit after a multi-billion-dollar investment from Microsoft. The company reportedly expects to make $200 million in 2023, a pittance compared to the more than $1 billion that’s been put toward the startup so far.

Compute costs are largely to blame. Training state-of-the-art AI models can cost millions of dollars, and running them generally isn’t much cheaper. According to OpenAI co-founder and CEO Sam Altman, it costs a few cents per chat to run ChatGPT, OpenAI’s viral chatbot, which is not an insignificant amount considering that ChatGPT had over a million users as of last December.

In moves toward monetization, OpenAI recently launched a “pro” version of ChatGPT, ChatGPT Plus, starting at $20 per month, and teamed up with Microsoft to develop Bing Chat, a controversial chatbot (putting it mildly) that’s captured mainstream attention. According to Semafor and The Information, OpenAI plans to introduce a mobile ChatGPT app in the future and bring its AI language technology into Microsoft apps like Word, PowerPoint and Outlook.

Separately, OpenAI continues to make its tech available through Microsoft’s Azure OpenAI Service, a business-focused model-serving platform, and maintain Copilot, a premium code-generating service developed in partnership with GitHub.

OpenAI’s Foundry will let customers buy dedicated compute to run its AI models by Kyle Wiggers originally published on TechCrunch


