When it comes to large language models, should you build or buy?

Last summer could only be described as an “AI summer,” especially with large language models making an explosive entrance. We saw huge neural networks trained on massive corpora of data that can accomplish exceedingly impressive tasks, none more famous than OpenAI’s GPT-3 and its newer, hyped offspring, ChatGPT.

Companies of all shapes and sizes across industries are rushing to figure out how to incorporate and extract value from this new technology. But OpenAI’s business model has been no less transformative than its contributions to natural language processing. Unlike almost every previous release of a flagship model, this one does not come with open-source pretrained weights — that is, machine learning teams cannot simply download the models and fine-tune them for their own use cases.

Instead, they must either pay to use the models as-is, or pay to fine-tune them and then pay four times the as-is usage rate to employ the fine-tuned versions. Of course, companies can still choose open-source peer models.

This has given rise to an age-old corporate — but entirely new to ML — question: Would it be better to buy or build this technology?

There is no one-size-fits-all answer to this question, and I’m not trying to provide one. I mean to highlight the pros and cons of both routes and offer a framework that might help companies evaluate what works for them, along with some middle paths that attempt to combine components of both worlds.

Buying: Fast, but with clear pitfalls

While building looks attractive in the long run, it requires leadership with a strong appetite for risk, as well as deep coffers to back said appetite.

Let’s start with buying. There are a whole host of model-as-a-service providers that offer custom models as APIs, charging per request. This approach is fast, reliable and requires little to no upfront capital expenditure. Effectively, this approach de-risks machine learning projects, especially for companies entering the domain, and requires limited in-house expertise beyond software engineers.

Projects can be kicked off without requiring experienced machine learning personnel, and the model outcomes can be reasonably predictable, given that the ML component is being purchased with a set of guarantees around the output.

Unfortunately, this approach comes with very clear pitfalls, primary among which is limited product defensibility. If you’re buying a model that anyone can purchase and integrating it into your systems, it’s not too far-fetched to assume your competitors can achieve product parity just as quickly and reliably. That will be true unless you can create an upstream moat through non-replicable data-gathering techniques or a downstream moat through integrations.

What’s more, for high-throughput solutions, this approach can prove exceedingly expensive at scale. For context, OpenAI’s DaVinci costs $0.02 per thousand tokens. Conservatively assuming 250 tokens per request and similar-sized responses, you’re paying $0.01 per request. For a product with 100,000 requests per day, you’d pay more than $300,000 a year. Obviously, text-heavy applications (attempting to generate an article or engage in chat) would lead to even higher costs.
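The arithmetic behind that estimate can be sketched in a few lines. This is a rough illustration only: the function name and parameters are hypothetical, the price and token counts are the figures quoted above, and a 365-day year is assumed.

```python
def annual_api_cost(price_per_1k_tokens, prompt_tokens, response_tokens,
                    requests_per_day, days_per_year=365):
    """Estimate yearly spend on a per-token API (hypothetical helper)."""
    # Both the prompt and the generated response are billed.
    tokens_per_request = prompt_tokens + response_tokens
    cost_per_request = tokens_per_request / 1000 * price_per_1k_tokens
    return cost_per_request * requests_per_day * days_per_year


# Figures from the text: $0.02 per 1,000 tokens, 250 tokens in, 250 tokens out,
# 100,000 requests per day.
estimate = annual_api_cost(0.02, 250, 250, 100_000)
print(f"Estimated annual cost: ${estimate:,.0f}")
```

Under these assumptions the estimate lands at roughly $365,000 a year, consistent with the "more than $300,000" figure. Doubling the response length to 500 tokens would raise it by about half, which is why text-heavy applications get expensive quickly.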

You must also account for the limited flexibility tied to this approach: You either use models as-is or pay significantly more to fine-tune them. It is worth remembering that the latter approach would involve an unspoken “lock-in” period with the provider, as fine-tuned models will be held in their digital custody, not yours.

Building: Flexible and defensible, but expensive and risky

On the other hand, building your own tech allows you to circumvent some of these challenges.

When it comes to large language models, should you build or buy? by Ram Iyer originally published on TechCrunch


