Generative AI, enabled by large language models (LLMs) like GPT-4, has sent shockwaves through the tech world. ChatGPT’s meteoric rise has prompted the global tech industry to reassess and prioritize gen AI, reshaping product strategies in real time.

The integration of LLMs has given product developers an easy way to incorporate AI-powered features into their products. But it’s not all smooth sailing. A glaring challenge looms large for product leaders: the GPU shortage and spiraling costs.

Rise of LLMs and GPU shortage

The growing number of AI startups and services has led to high demand for high-end GPUs such as A100s and H100s, overwhelming Nvidia and its manufacturing partner TSMC, both of which are struggling to meet demand. Online forums like Reddit are abuzz with frustrations over GPU availability, echoing the sentiment across the tech community. The situation has become so dire that both AWS and Azure have had no choice but to implement quota systems.

This bottleneck doesn’t just squeeze startups; it’s a stumbling block for tech giants like OpenAI. At a recent off-the-record meeting in London, OpenAI’s CEO Sam Altman candidly acknowledged that the computer chip shortage is stymieing ChatGPT’s growth. Altman reportedly lamented that the lack of computing power has resulted in subpar API availability and has prevented OpenAI from rolling out larger “context windows” for ChatGPT.


Prioritizing AI features

On the one hand, product leaders find themselves caught in a relentless push to innovate, facing expectations to deliver cutting-edge features that leverage the power of gen AI. On the other hand, they grapple with the harsh realities of GPU capacity constraints. It’s a complex juggling act, where ruthless prioritization becomes not just a strategic decision but a necessity.

Given that GPU availability is poised to remain a challenge for the foreseeable future, product leaders must think strategically about GPU allocation. Traditionally, product leaders have leaned on prioritization techniques like the Customer Value/Need vs. Effort Matrix. This method, while logical in a world where computational resources were abundant, now demands a bit of reevaluation.

In our current paradigm, where compute is the constraint and not software talent, product leaders must redefine how they prioritize various products or features, bringing GPU limitations to the forefront of strategic decision-making.

Planning around capacity constraints might seem unusual for the tech industry, but it’s a commonplace strategy in other industries. The underlying concept is simple: The most valuable factor is the time spent on the constrained resource, and the objective is to maximize the value per unit of time spent on that constraint.

Tech success metrics

As a former consultant, I’ve successfully applied this framework across various industries. I believe that tech product leaders can also use a similar approach to prioritize products or features while GPU constraints exist. When applying this framework, the most straightforward measure of value is profitability.

However, in tech, profitability might not always be the appropriate metric, particularly when venturing into a new market or product. Thus, I’ve adapted the framework to align with the success metrics typically used in tech, outlining a simple four-step process:

1. Contribution

First and foremost, identify your North Star metric. This is the contribution of each product or feature, something that encapsulates the essence of its value. Some concrete examples might include:

  • An increase in revenue and profit
  • Gains in market share
  • Growth in the number of daily/monthly active users

2. Number of GPUs required

Gauge the number of GPUs needed for each product or feature. Focus on key factors including:

  • Number of queries per user per day
  • Number of daily active users
  • Complexity of the query (how many tokens each query consumes)
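
One rough way to turn these factors into a GPU count is to convert daily demand into tokens per second and divide by what a single GPU can serve. The sketch below is only an illustration: the function name, the peak-to-average factor and the per-GPU throughput figure are assumptions rather than numbers from this article, and real throughput depends heavily on the model, the hardware and batching.

```python
import math


def estimate_gpus(daily_active_users, queries_per_user_per_day,
                  tokens_per_query, tokens_per_gpu_per_second,
                  peak_to_average=1.0):
    """Rough GPU count needed to serve one feature's inference load.

    tokens_per_gpu_per_second and peak_to_average are assumed inputs that
    you would have to measure for your own model and traffic pattern.
    """
    seconds_per_day = 24 * 60 * 60
    tokens_per_day = (daily_active_users * queries_per_user_per_day
                      * tokens_per_query)
    average_tokens_per_second = tokens_per_day / seconds_per_day
    peak_tokens_per_second = average_tokens_per_second * peak_to_average
    # Round up: a fraction of a GPU still means provisioning a whole one.
    return math.ceil(peak_tokens_per_second / tokens_per_gpu_per_second)


# Hypothetical feature: 1M daily users, 5 queries each, 1,000 tokens per
# query, an assumed 500 tokens/second per GPU, and peaks at 2x the average.
gpus = estimate_gpus(1_000_000, 5, 1_000, 500, peak_to_average=2.0)
print(gpus)
```

An estimate like this is only a starting point for step 2; it ignores training, redundancy and latency targets, all of which can raise the real requirement.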

3. Calculate contribution per GPU

Break it down to the specifics. How does each GPU contribute to the overall goal? Divide each product’s contribution by the number of GPUs it requires. Understanding this will give you a clear picture of where your GPUs are best allocated.
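
Written as a formula, the ratio looks like this. The symbols are introduced here for illustration only: C_i is the contribution of product i in whatever North Star metric you chose in step 1, and G_i is the number of GPUs it requires from step 2.

```latex
\text{contribution per GPU}_i = \frac{C_i}{G_i}
```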

4. Prioritize products based on contribution per GPU

Now, it’s time to make the tough decisions. Rank your products by their contribution per GPU, and then line them up accordingly. Focus on the products with the highest contribution per GPU first, ensuring that your limited resources are channeled into the areas where they’ll make the most impact.

With GPU constraints no longer a blind spot but a quantifiable factor in the decision-making process, your company can more strategically navigate the GPU shortage. To bring this framework to life, let’s visualize a scenario where you, as a product leader, are grappling with the challenge of prioritizing among four different products:

| | Product A | Product B | Product C | Product D |
|---|---|---|---|---|
| Revenue potential (contribution) | $100M | $80M | $50M | $25M |
| Number of GPUs required | 1,000 | 450 | 500 | 50 |
| Contribution per GPU | $0.1M/GPU | $0.18M/GPU | $0.1M/GPU | $0.5M/GPU |
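
The last row of the table can be reproduced with a few lines of Python. This is a minimal sketch: the dictionary layout and the helper name are illustrative choices, while the figures are the ones in the table (revenue in millions of dollars).

```python
# Revenue potential (in $M) and GPUs required, taken from the table above.
products = {
    "Product A": {"contribution": 100, "gpus": 1000},
    "Product B": {"contribution": 80, "gpus": 450},
    "Product C": {"contribution": 50, "gpus": 500},
    "Product D": {"contribution": 25, "gpus": 50},
}


def contribution_per_gpu(product):
    """Contribution divided by the GPUs needed to deliver it."""
    return product["contribution"] / product["gpus"]


# Highest contribution per GPU first.
ranked = sorted(products,
                key=lambda name: contribution_per_gpu(products[name]),
                reverse=True)

for name in ranked:
    print(f"{name}: ${contribution_per_gpu(products[name]):.2f}M per GPU")
```

Products D and B come out on top, while A and C tie at $0.1M per GPU.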

Although Product A has the highest revenue potential, it doesn’t yield the highest contribution per GPU. Surprisingly, Product D, with the least revenue potential, offers the most substantial return per GPU. By prioritizing based on this metric, you can maximize total potential revenue.

Let’s say you have a total of 1,000 GPUs at your disposal. A straightforward choice might have you opting for Product A, generating a revenue potential of $100 million. However, by applying the prioritization strategy described above, you can achieve $155 million in revenue:

| Priority order | Product | Revenue gain | GPUs |
|---|---|---|---|
| 1 | Product D | $25M | 50 |
| 2 | Product B | $80M | 450 |
| 3 | Product C | $50M | 500 |
| Total | | $155M | 1,000 |
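
The allocation above can be sketched as a simple greedy pass over the ranked products. The function name and the all-or-nothing rule (a product is funded only if its full GPU requirement fits in what is left) are assumptions made for this illustration. A pass like this is a heuristic, not a guaranteed optimum for every portfolio.

```python
def allocate_gpus(products, gpu_budget):
    """Fund products in descending order of contribution per GPU.

    products maps a name to (contribution, gpus_required). A product is
    funded only if its full GPU requirement still fits in the budget.
    """
    ranked = sorted(products.items(),
                    key=lambda item: item[1][0] / item[1][1],
                    reverse=True)
    funded, remaining, total = [], gpu_budget, 0
    for name, (contribution, gpus) in ranked:
        if gpus <= remaining:
            funded.append(name)
            remaining -= gpus
            total += contribution
    return funded, total, remaining


# Revenue potential in $M and GPUs required, as in the example above.
products = {
    "Product A": (100, 1000),
    "Product B": (80, 450),
    "Product C": (50, 500),
    "Product D": (25, 50),
}
funded, total, remaining = allocate_gpus(products, 1000)
print(funded, total, remaining)
```

With a budget of 1,000 GPUs, the pass funds Products D, B and C for $155 million and leaves no GPUs unused. Product A is skipped because it no longer fits once D and B are funded.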

The same methodology can be applied to other contribution metrics, such as market share gain:

| | Product A | Product B | Product C | Product D |
|---|---|---|---|---|
| Market share gain (contribution) | 5% | 4% | 2.5% | 1.25% |
| Number of GPUs required | 1,000 | 450 | 500 | 50 |
| Contribution per GPU | 0.005%/GPU | 0.009%/GPU | 0.005%/GPU | 0.025%/GPU |

Similarly, selecting Product A would have led to a market share gain of 5%. However, applying the prioritization strategy described above, you can achieve 7.75% in market share gain:

| Priority order | Product | Market share gain | GPUs |
|---|---|---|---|
| 1 | Product D | 1.25% | 50 |
| 2 | Product B | 4% | 450 |
| 3 | Product C | 2.5% | 500 |
| Total | | 7.75% | 1,000 |

Benefits and limitations

This alternative prioritization framework introduces a more nuanced and strategic approach. By zeroing in on contribution per GPU, you’re aligning resources where they can make the most substantial difference, whether in terms of revenue, market share or any other defining metric.

But the advantages don’t stop there. This method also fosters a greater sense of clarity and objectivity across product teams. In my experience, including my early days leading digital transformation at a healthcare company and later while working with various McKinsey clients, this approach has been a game-changer in scenarios where capacity constraints are a critical factor. It’s enabled us to prioritize initiatives in a more data-driven and rational way, sidelining the usual politics where decisions might otherwise fall to the loudest voice in the room.

However, no one-size-fits-all solution exists, and it’s worth acknowledging the potential limitations of this method. For instance, this approach may not always capture the strategic importance of certain investments. Thus, while exceptions to the framework can and should be made, they should be carefully considered rather than the norm. This maintains the integrity of the process and ensures that any deviations are made with a broader strategic context in mind.


Product leaders are facing an unprecedented situation with the GPU shortage, so finding new ways of managing resources is required. In the words of the great strategist Sun Tzu, “In the midst of chaos, there is also opportunity.”

The GPU shortage is indeed a challenge, but with the right approach, it can also be a catalyst for differentiation and success. The proposed prioritization framework, focusing on contribution per GPU, offers a strategic way to prioritize. By zeroing in on contribution per GPU, companies can maximize their return on investment, aligning resources where they’ll make the most impact and focusing on what matters most to the long-term success of their company.

Prerak Garg is senior director of cloud and AI corporate strategy at Microsoft and a former McKinsey and Company engagement manager.

