This is our tenth annual landscape and “state of the union” of the data, analytics, machine learning and AI ecosystem.
In 10+ years covering the space, things have never been as exciting and promising as they are today. All trends and subtrends we described over the years are coalescing: data has been digitized, in massive amounts; it can be stored, processed and analyzed fast and cheaply with modern tools; and most importantly, it can be fed to ever-more capable ML/AI models which can make sense of it, recognize patterns, make predictions based on it, and now generate text, code, images, sounds and videos.
The MAD (ML, AI & Data) ecosystem has gone from niche and technical to mainstream. The paradigm shift seems to be accelerating, with implications that go far beyond technical or even business matters, impacting society, geopolitics and perhaps the human condition. Suddenly, for some, it has become “everything, everywhere, all at once”.
There are still many chapters to write in this multi-decade megatrend, however. As in every year, this post is an attempt at making sense of where we are currently, across products, companies and industry trends.
Here are the prior versions: 2012, 2014, 2016, 2017, 2018, 2019 (Part I and Part II), 2020, 2021 and 2023 (Part I, Part II, Part III, Part IV).
Our team this year was Aman Kabeer and Katie Mills (FirstMark), Jonathan Grana (Go Fractional) and Paolo Campos, major thanks to all. And a big thank you as well to CB Insights for providing the card data appearing in the interactive version.
This annual state of the union post is organized in three parts:
- Part I: The landscape (PDF, Interactive version)
- Part II: 24 themes we’re thinking about in 2024
- Part III: Financings, M&A and IPOs
PART I: THE LANDSCAPE
Links
To see a PDF of the 2024 MAD Landscape in full resolution (please zoom!), please CLICK HERE
To access the interactive version of the 2024 MAD landscape, please CLICK HERE
Number of companies
The 2024 MAD landscape features 2,011 logos in total.
That number is up from 1,416 last year, with 578 new entrants to the map.
For reference, the very first version in 2012 had just 139 logos.
The intensely (insanely?) crowded nature of the landscape primarily results from two back-to-back massive waves of company creation and funding.
The first wave was the 10-ish-year-long data infrastructure cycle, which started with Big Data and ended with the Modern Data Stack. The long-awaited consolidation in that space has not quite happened yet, and the vast majority of the companies are still around.
The second wave is the ML/AI cycle, which started in earnest with Generative AI. As we are in the early innings of this cycle, and most companies are very young, we have been liberal in including young startups (a good number of which are still at the seed stage) in the landscape.
Note: those two waves are intimately related. A core idea of the MAD Landscape every year has been to show the symbiotic relationship between data infrastructure (on the left side); analytics/BI and ML/AI (in the middle) and applications (on the right side).
It gets harder every year to fit the ever-increasing number of companies on the landscape, but ultimately, the best way to think of the MAD space is as an assembly line – a full lifecycle of data from collection to storage to processing to delivering value through analytics or applications.
Two big waves + limited consolidation = lots of companies on the landscape.
Main changes in “Infrastructure” and “Analytics”
We’ve made very few changes to the overall structure of the left side of the landscape – as we’ll see below (Is the Modern Data Stack dead?), this part of the MAD landscape has seen a lot less heat lately.
Some noteworthy changes: We renamed “Database Abstraction” to “Multi-Model Databases & Abstractions”, to capture the rising wave around an all-in-one ‘Multi-Model’ database group (SurrealDB*, EdgeDB); killed the “Crypto / Web 3 Analytics” section we experimentally created last year, which felt out of place in this landscape; and removed the “Query Engine” section, which felt more like a part of a section than a separate section (all the companies in that section still appear on the landscape – Dremio, Starburst, PrestoDB etc).
Main changes in “Machine Learning & Artificial Intelligence”
With the explosion of AI companies in 2023, this is where we found ourselves making by far the most structural changes.
- Given the tremendous activity in the ‘AI enablement’ layer in the last year, we added 3 new categories next to MLOps:
- “AI Observability” is a new category this year, with startups that help test, evaluate and monitor LLM applications
- “AI Developer Platforms” is close in concept to MLOps but we wanted to recognize the wave of platforms that are wholly focused on AI application development, in particular around LLM training, deployment and inference
- “AI Safety & Security” includes companies addressing concerns innate to LLMs, from hallucinations to ethics to regulatory compliance
- If the very public beef between Sam Altman and Elon Musk has told us anything, it’s that the distinction between commercial and nonprofit is a critical one when it comes to foundational model developers. As such, we have split what was previously “Horizontal AI/AGI” into two categories: “Commercial AI Research” and “Nonprofit AI Research”
- The final change we made was another nomenclature one, where we amended “GPU Cloud” to reflect the addition of core infrastructure feature sets made by many of the GPU Cloud providers: “GPU Cloud / ML Infra”
Main changes in “Applications”
- The biggest update here is that…to absolutely no one’s surprise…every application-layer company is now a self-proclaimed “AI company” – which, as much as we tried to filter, drove the explosion of new logos you see on the right side of the MAD landscape this year
- Some minor changes on the structure side:
- In “Horizontal Applications,” we added a “Presentation & Design” category
- We renamed “Search” to “Search / Conversational AI” to reflect the rise of LLM-powered, chat-based interfaces such as Perplexity.
- In “Industry”, we rebranded “Gov’t & Intelligence” to “Aerospace, Defense & Gov’t”
Main changes in “Open Source Infrastructure”
- We merged categories that have always been close, creating a single “Data Management” category that spans both “Data Access” and “Data Ops”
- We added an important new category, “Local AI”, as builders sought to provide the infrastructure tooling to run AI & LLMs locally
PART II: 24 THEMES WE’RE THINKING ABOUT IN 2024
Things in AI are moving so fast, and getting so much coverage, that it is almost impossible to provide a fully comprehensive “state of the union” of the MAD space, as we did in prior years.
So here’s a different format: in no particular order, here are 24 themes that are top of mind and/or come up frequently in conversations. Some are fairly fleshed-out thoughts, some largely just questions or thought experiments.
- Structured vs unstructured data
This is partly a theme, partly something we find ourselves mentioning a lot in conversations to help explain the current trends.
So, perhaps as an introduction to this 2024 discussion, here’s one important reminder upfront, which explains some of the key industry trends. Not all data is the same. At the risk of grossly over-simplifying, there are two main families of data, and around each family, a set of tools and use cases has emerged.
- Structured data pipelines: that is, data that fits into rows and columns (see the sketch just after this list).
- For analytical purposes, data gets extracted from transactional databases and SaaS tools, stored in cloud data warehouses (like Snowflake), transformed, and analyzed and visualized using Business Intelligence (BI) tools, mostly for purposes of understanding the present and the past (what’s known as “descriptive analytics”). That assembly line is often enabled by the Modern Data Stack discussed below, with analytics as the core use case.
- In addition, structured data can also get fed into “traditional” ML/AI models for purposes of predicting the future (predictive analytics) – for example, which customers are most likely to churn
- Unstructured data pipelines: that is, the world of data that typically doesn’t fit into rows and columns, such as text, images, audio and video. Unstructured data is largely what gets fed into Generative AI models (LLMs, etc), both to train them and to use them (inference).
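As a rough, hedged illustration of the difference, here is a minimal sketch of both pipeline styles in Python – the `embed` function is a placeholder for any real embedding model, and all names and data are illustrative:

```python
import pandas as pd

# Structured pipeline: rows and columns -> transform -> descriptive analytics
orders = pd.DataFrame({
    "customer": ["a", "b", "a", "c"],
    "amount": [120.0, 40.0, 75.0, 200.0],
})
revenue_by_customer = orders.groupby("customer")["amount"].sum()  # BI-style rollup

# Unstructured pipeline: raw text -> vector embeddings -> feed to Generative AI.
# `embed` is a stand-in for a real embedding model (commercial or open source).
def embed(text: str) -> list[float]:
    return [float(ord(c)) for c in text[:8]]  # placeholder, not a real embedding

support_tickets = ["My invoice is wrong", "The app crashes on login"]
vectors = [embed(t) for t in support_tickets]  # ready for a vector store / LLM
```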
Those two families of data (and the related tools and companies) are experiencing very different fortunes and levels of attention right now.
Unstructured data (ML/AI) is hot; structured data (Modern Data Stack, etc) is not.
- Is the Modern Data Stack dead?
Not that long ago (call it, 2019-2021), there wasn’t anything sexier in the software world than the Modern Data Stack (MDS). Alongside “Big Data”, it was one of the rare infrastructure concepts to have crossed over from data engineers to a broader audience (execs, journalists, bankers).
The Modern Data Stack basically covered the kind of structured data pipeline mentioned above. It gravitated around fast-growing cloud data warehouses, with vendors positioned upstream of them (like Fivetran and Airbyte), on top of them (dbt) and downstream of them (Looker, Mode).
As Snowflake emerged as the biggest software IPO ever, interest in the MDS exploded, with rabid, ZIRP-fueled company creation and VC funding. Entire categories became overcrowded within a year or two – data catalogs, data observability, ETL, reverse ETL, to name a few.
A real solution to a real problem, the Modern Data Stack was also a marketing concept and a de-facto alliance amongst a number of startups across the value chain of data.
Fast forward to today, and the situation is very different. In 2023, we wrote that the MDS was “under pressure”, and that pressure will only continue to intensify in 2024.
The MDS is facing two key issues:
- Putting together a Modern Data Stack requires stitching together various best-of-breed solutions from multiple independent vendors. As a result, it’s costly in terms of money, time and resources. This is not looked upon favorably by the CFO office in a post-ZIRP era of budget cuts
- The MDS is no longer the cool kid on the block. Generative AI has stolen all the attention from execs, VCs and the press – and it requires the kind of unstructured data pipelines we mentioned above.
Watch: MAD Podcast: Is the Modern Data Stack Dead? With Tristan Handy, CEO, dbt Labs (Apple, Spotify)
- Consolidation in data infra, and the big getting bigger
Given the above, what happens next in data infra and analytics in 2024?
It may look something like this:
- Many startups in and around the Modern Data Stack will aggressively reposition as “AI infra startups” and try to find a spot in the Modern AI Stack (see below). This will work in some cases, but going from structured to unstructured data may require a fundamental product evolution in most cases.
- The data infra industry will finally see some consolidation. M&A has been fairly limited to date, but some acquisitions did happen in 2023, whether tuck-ins or medium-size acquisitions – including Stemma (acquired by Teradata), Manta (acquired by IBM), Mode (acquired by Thoughtspot), etc (see PART III below)
- There will be a lot more startup failure – as VC funding dried up, things have gotten tough. Many startups have cut costs dramatically, but at some point their cash runway will end. Don’t expect to see flashy headlines, but this will (sadly) happen.
- The bigger companies in the space, whether scale-ups or public companies, will double down on their platform play and push hard to cover ever more functionality. Some of it will be through acquisitions (hence the consolidation) but a lot of it will also be through homegrown development.
- Checking in on Databricks vs Snowflake
Speaking of big companies in the space, let’s check in on the “titanic shock” (see our MAD 2021 blog post) between the two key data infra players, Snowflake and Databricks.
Snowflake (which historically comes from the structured data pipeline world) remains an incredible company, and one of the highest-valued public tech stocks (14.8x EV/NTM revenue as of the time of writing). However, much like a lot of the software industry, its growth has dramatically slowed down: it finished fiscal 2024 with 38% year-over-year product revenue growth, totaling $2.67 billion, and was projecting 22% NTM revenue growth as of the time of writing. Perhaps most importantly, Snowflake gives the impression of a company under pressure on the product front – it’s been slower to embrace AI, and comparatively less acquisitive. The recent, and somewhat abrupt, CEO transition is another interesting data point.
Databricks (which historically comes from the unstructured data pipeline and machine learning world) is experiencing all-around strong momentum, reportedly (as it’s still a private company) closing FY’24 with $1.6B in revenue, up 50%+. Importantly, Databricks is emerging as a key Generative AI player, both through acquisitions (most notably, MosaicML for $1.3B) and homegrown product development – first and foremost as a key repository for the kind of unstructured data that feeds LLMs, but also as a creator of models, from Dolly to DBRX, a new generative AI model the company announced at the time of writing.
The major new evolution in the Snowflake vs Databricks rivalry is the launch of Microsoft Fabric. Announced in May 2023, it’s an end-to-end, cloud-based SaaS platform for data and analytics. It integrates a lot of Microsoft products, including OneLake (open lakehouse), PowerBI and Synapse Data Science, and covers basically all data and analytics workflows, from data integration and engineering to data science. As always for large company product launches, there’s a gap between the announcement and the reality of the product, but combined with Microsoft’s major push in Generative AI, this could become a formidable threat (as an additional twist to the story, Databricks largely sits on top of Azure).
- BI in 2024: is Generative AI about to transform data analytics?
Of all parts of the Modern Data Stack and structured data pipelines world, the category that has felt the most ripe for reinvention is Business Intelligence. We highlighted in the 2019 MAD how the BI industry had almost entirely consolidated, and talked about the emergence of metrics stores in the 2021 MAD.
The transformation of BI/analytics has been slower than we’d have expected. The industry remains largely dominated by older products, Microsoft’s PowerBI, Salesforce’s Tableau and Google’s Looker, which sometimes get bundled in for free in broader sales contracts. Some more consolidation happened (Thoughtspot acquired Mode; Sisu was quietly acquired by Snowflake). Some young companies are taking innovative approaches, whether scale-ups (see dbt and their semantic layer/MetricFlow) or startups (see Trace* and their metrics tree), but they’re generally early in the journey.
In addition to potentially playing a powerful role in data extraction and transformation, Generative AI could have a profound impact in terms of superpowering and democratizing data analytics.
There’s certainly been a lot of activity. OpenAI launched Code Interpreter, later renamed to Advanced Data Analysis. Microsoft launched a Copilot AI chatbot for finance workers in Excel. Across cloud vendors, Databricks, Snowflake, open source and a substantial group of startups, a lot of people are working on or have released “text to SQL” products, to help run queries into databases using natural language.
The promise is both exciting and potentially disruptive. The holy grail of data analytics has been its democratization. Natural language, if it were to become the interface to notebooks, databases and BI tools, would enable a much broader group of people to do analysis.
Many people in the BI industry are skeptical, however. The precision of SQL and the nuances of understanding the business context behind a query are considered big obstacles to automation.
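To make the “text to SQL” pattern concrete, here is a minimal, hedged sketch – the prompt format is illustrative and `call_llm` is a stand-in for any chat-completion API, not a specific vendor’s:

```python
# Minimal "text to SQL" sketch; `call_llm` is injected so this works with
# any LLM API (commercial or open source). Schema is a toy example.
SCHEMA = """
CREATE TABLE customers (id INT, name TEXT, signup_date DATE, churned BOOLEAN);
CREATE TABLE orders (id INT, customer_id INT, amount REAL, created_at DATE);
"""

def text_to_sql(question: str, call_llm) -> str:
    prompt = (
        "You are a SQL assistant. Given this schema:\n"
        f"{SCHEMA}\n"
        "Write a single SQL query that answers the question. "
        "Return only SQL, no explanation.\n"
        f"Question: {question}"
    )
    return call_llm(prompt)

# In production, the generated SQL should be validated and sandboxed
# (parsed, run against a read-only replica) -- precision and business
# context are exactly where the skeptics expect this to break.
```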
- The Rise of the Modern AI Stack
A lot of what we’ve discussed so far had to do with the world of structured data pipelines.
As mentioned, the world of unstructured data infrastructure is experiencing a very different moment. Unstructured data is what feeds LLMs, and there’s rabid demand for it. Every company that’s experimenting with or deploying Generative AI is rediscovering the old cliché: “data is the new oil”. Everyone wants the power of LLMs, but trained on their (enterprise) data.
Companies big and small have been rushing into the opportunity to provide the infrastructure of Generative AI.
Several AI scale-ups have been aggressively evolving their offerings to capitalize on market momentum – everyone from Databricks (see above) to Scale AI (which evolved their labeling infrastructure, originally developed for the self-driving car market, to partner as an enterprise data pipeline with OpenAI and others) to Dataiku* (which launched their LLM Mesh to enable Global 2000 companies to seamlessly work across multiple LLM vendors and models).
Meanwhile a new generation of AI infra startups is emerging, across a number of domains, including:
- Vector databases, which store data in a format (vector embeddings) that Generative AI models can consume (see the sketch after this list). Specialized vendors (Pinecone, Weaviate, Chroma, Qdrant etc) have had a banner year, but some incumbent database players (MongoDB) were also quick to react and add vector search capabilities.
- Frameworks (LlamaIndex, LangChain etc), which connect and orchestrate all the moving pieces
- Guardrails, which sit between an LLM and users and make sure the model provides outputs that follow the organization’s rules.
- Evaluators, which help test, analyze and monitor Generative AI model performance – a hard problem, as demonstrated by the general distrust in public benchmarks
- Routers, which help direct user queries across different models in real time, to optimize performance, cost and user experience
- Cost guards, which help monitor the costs of using LLMs
- Endpoints, effectively APIs that abstract away the complexities of underlying infrastructure (like models)
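As a hedged illustration of the first category, here is a toy in-memory vector store in plain numpy – a sketch of what Pinecone/Weaviate/Qdrant-style systems do at scale, with indexing, filtering and persistence layered on top:

```python
import numpy as np

class VectorStore:
    """Toy vector store: embeddings in, nearest documents out."""

    def __init__(self, dim: int):
        self.vectors = np.empty((0, dim))
        self.payloads: list[str] = []

    def add(self, vector: np.ndarray, payload: str) -> None:
        self.vectors = np.vstack([self.vectors, vector])
        self.payloads.append(payload)

    def search(self, query: np.ndarray, k: int = 3) -> list[str]:
        # Cosine similarity between the query and every stored vector
        norms = np.linalg.norm(self.vectors, axis=1) * np.linalg.norm(query)
        scores = self.vectors @ query / np.clip(norms, 1e-9, None)
        return [self.payloads[i] for i in np.argsort(-scores)[:k]]
```

A retrieval-augmented (RAG) pipeline then prepends the top-k retrieved chunks to the LLM prompt; frameworks like LlamaIndex and LangChain orchestrate exactly that flow.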
We’ve been resisting using the term “Modern AI Stack”, given the history of the Modern Data Stack.
But the expression captures the many parallels: many of those startups are the “hot companies” of the day and, just like MDS companies before them, they tend to travel in packs, forging marketing alliances and product partnerships.
And this new generation of AI infra startups is going to face some of the same challenges as MDS companies before them: are any of those categories big enough to build a multi-billion dollar company? Which part will big companies (mostly cloud providers, but also Databricks and Snowflake) end up building themselves?
WATCH – we featured many emerging startups on the MAD Podcast.
- Where are we in the AI hype cycle?
AI has a multi-decade history of summers and winters. Just in the last 10-12 years, this is the third AI hype cycle we’ve experienced: there was one in 2013-2015 after deep learning came into the limelight post-ImageNet 2012; another one around 2017-2018 during the chatbot boom and the rise of TensorFlow; and now, since November 2022, with Generative AI.
This hype cycle has been particularly intense, to the point of feeling like an AI bubble, for a number of reasons: the technology is incredibly impressive; it is very visceral and crossed over to a broad audience beyond tech circles; and for VCs sitting on a lot of dry powder, it’s been the only game in town as just about everything else in technology has been depressed.
Hype has brought all the usual benefits (“nothing great has ever been achieved without irrational exuberance”; a “let a thousand flowers bloom” phase, with lots of money available for ambitious projects) and the usual noise (everyone is an AI expert overnight, every startup is an AI startup, too many AI conferences/podcasts/newsletters… and dare we say, too many AI market maps???).
The main issue of any hype cycle is the inevitable blowback.
There’s a fair amount of “quirkiness” and risk built into this market phase: the poster-child company for the space has a very unusual legal and governance structure; there are a lot of “compute for equity” deals happening (with potential round-tripping) that are not fully understood or disclosed; a lot of top startups are run by teams of AI researchers; and a lot of VC dealmaking is reminiscent of the ZIRP times: “land grabs”, big rounds and eye-watering valuations for very young companies.
There certainly have been cracks in AI hype (see below), but we’re still in a phase where every week a new thing blows everyone’s minds. And news like the reported $40B Saudi Arabia AI fund seems to indicate that money flows into the space are not going to stop anytime soon.
- Experiments vs reality: was 2023 a headfake?
Related to the above – given the hype, how much has been real so far, vs merely experimental?
2023 was an action-packed year: a) every tech vendor rushed to include Generative AI in their product offering, b) every Global 2000 board mandated their teams to “do AI”, and some enterprise deployments happened at record speed, including at companies in regulated industries like Morgan Stanley and Citibank, and c) of course, consumers showed rabid interest in Generative AI apps.
As a result, 2023 was a year of big wins: OpenAI reached $2B in annual run rate; Anthropic grew at a pace that allowed it to forecast $850M in revenues for 2024; Midjourney grew to $200M in revenue with no investment and a team of 40; Perplexity AI went from 0 to 10 million monthly active users, etc.
Should we be cynical? Some concerns:
- In the enterprise, a lot of the spend was on proofs of concept, or easy wins, often coming out of innovation budgets.
- How much was driven by executives wanting to not appear flat-footed, vs solving actual business problems?
- In consumer, AI apps show high churn. How much was it mere curiosity?
- Both in their personal and professional lives, many report not being entirely sure what to do with Generative AI apps and products
- Should we view Inflection AI’s decision to fold quickly as an admission that the world doesn’t need yet another AI chatbot, or even another LLM provider?
- LLM companies: maybe not so commoditized after all?
Billions of venture capital and corporate money are being invested in foundational model companies.
Hence everyone’s favorite question in the last 18 months: are we witnessing a phenomenal incineration of capital into ultimately commoditized products? Or are those LLM providers the new AWS, Azure and GCP?
A troubling fact (for the companies involved) is that no LLM seems to be building a durable performance advantage. At the time of writing, Claude 3 Sonnet and Gemini Pro 1.5 perform better than GPT-4 which performs better than Gemini 1.0 Ultra, and so on and so forth – but this seems to change every few weeks. Performance also can fluctuate – ChatGPT at some point “lost its mind” and “got lazy”, temporarily.
In addition, open source models (Llama 3, Mistral and others like DBRX) are quickly catching up in terms of performance.
Separately – there are a lot more LLM providers on the market than might have been expected at first. A couple of years ago, the prevailing narrative was that there could be only one or two LLM companies, with a winner-take-all dynamic – in part because there was a tiny number of people around the world with the necessary expertise to scale Transformers.
It turns out there are more capable teams than first anticipated. Beyond OpenAI and Anthropic, there are a number of startups doing foundational AI work – Mistral, Cohere, Adept, AI21, Imbue, 01.AI to name a few – and then of course the teams at Google, Meta, etc.
Having said that – so far, the LLM providers seem to be doing just fine. OpenAI and Anthropic revenues are growing at extraordinary rates, thank you very much. Even if the LLM models do get commoditized, the LLM companies still have an immense business opportunity in front of them. They’ve already become “full stack” companies, offering applications and tooling to multiple audiences (consumer, enterprise, developers) on top of the underlying models.
Perhaps the analogy with cloud vendors is indeed pretty apt. AWS, Azure and GCP attract and retain customers through an application/tooling layer and monetize through a compute/storage layer that is largely undifferentiated.
- LLMs, SLMs and a hybrid future
For all the excitement about Large Language Models, one clear trend of the last few months has been the acceleration of small language models (SLMs), such as Llama-2-13b from Meta, Mistral-7b and Mixtral 8x7b from Mistral and Phi-2 and Orca-2 from Microsoft.
While LLMs are getting ever bigger (GPT-3 reportedly having 175 billion parameters, GPT-4 reportedly 1.7 trillion, and the world waiting for an even more massive GPT-5), SLMs are becoming a strong alternative for many use cases, as they are cheaper to operate, easier to finetune, and often offer strong performance.
Another accelerating trend is the rise of specialized models, focused on specific tasks like coding (Code-Llama, Poolside AI) or industries (e.g. Bloomberg’s finance model, or startups like Orbital Materials building models for materials science).
As we are already seeing across a number of enterprise deployments, the world is quickly evolving towards hybrid architectures, combining multiple models.
Although prices have been going down (see below), big proprietary LLMs are still very expensive and experience latency problems, so users/customers will increasingly deploy combinations of models – big and small, commercial and open source, general and specialized – to meet their specific needs and cost constraints (a minimal sketch below).
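Here is a minimal sketch of what such a hybrid setup can look like – routing cheap, simple queries to a small model and escalating the rest to a large one. The heuristic, model names and `call_model` signature are all illustrative assumptions, not a real API:

```python
SMALL_MODEL = "local-7b"       # cheap, fast, possibly running on own hardware
LARGE_MODEL = "frontier-llm"   # expensive commercial endpoint

def needs_large_model(prompt: str) -> bool:
    # Toy heuristic; real routers use trained classifiers,
    # per-task evals, or explicit cost/latency budgets.
    return len(prompt) > 500 or "analyze" in prompt.lower()

def route(prompt: str, call_model) -> str:
    model = LARGE_MODEL if needs_large_model(prompt) else SMALL_MODEL
    return call_model(model=model, prompt=prompt)
```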
Watch: MAD Podcast with Eiso Kant, CTO, Poolside AI (also: Apple Podcasts, Spotify)
- Is traditional AI dead?
A funny thing happened with the launch of ChatGPT: much of the AI that had been deployed up until then got labeled overnight as “Traditional AI”, in contrast to “Generative AI”.
This was a little bit of a shock to many AI practitioners and companies that up until then were considered to be doing leading-edge work, as the term “traditional” clearly suggests an impending wholesale replacement of all forms of AI by the new thing.
The reality is a lot more nuanced. Traditional AI and Generative AI are ultimately very complementary as they tackle different types of data and use cases.
What is now labeled as “traditional AI”, or occasionally as “predictive AI” or “tabular AI”, is also very much part of modern AI (deep learning based). However, it generally focuses on structured data (see above), and problems such as recommendations, churn prediction, pricing optimization, inventory management. “Traditional AI” has experienced tremendous adoption in the last decade, and it’s already deployed at scale in production in thousands of companies around the world.
In contrast, Generative AI largely operates on unstructured data (text, images, videos, etc.). It is exceptionally good at a different class of problems (code generation, image generation, search, etc).
Here as well, the future is hybrid: companies will use LLMs for certain tasks and predictive models for others. Most importantly, they will often combine them – LLMs may not be great at providing a precise prediction, like a churn forecast, but an LLM can call on the output of another model that is focused on providing that prediction, and vice versa, as sketched below.
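A minimal, hedged sketch of that combination, using the common tool-calling pattern: a classical churn model is exposed as a function the LLM can invoke when it needs a precise number (the data and features are toy placeholders):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# "Traditional AI" side: a predictive model trained on structured features
# (tenure in months, monthly spend). Toy data for illustration only.
X = np.array([[1, 20], [24, 80], [3, 15], [36, 120]])
y = np.array([1, 0, 1, 0])  # 1 = churned
churn_model = LogisticRegression().fit(X, y)

def predict_churn(tenure_months: int, monthly_spend: float) -> float:
    """Tool the LLM can call when it needs a precise churn probability."""
    return float(churn_model.predict_proba([[tenure_months, monthly_spend]])[0, 1])

# The Generative AI side: an agent loop registers `predict_churn` as a tool;
# the LLM decides when to call it and phrases the result in natural language.
```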
- Thin wrappers, thick wrappers and the race to be full stack
“Thin wrappers” was the dismissive term everyone loved to use in 2023. It’s hard to build long-lasting value and differentiation if your core capabilities are provided by someone else’s technology (like OpenAI), the argument goes. And reports a few months ago that startups like Jasper were running into difficulties, after experiencing a meteoric revenue rise, seem to corroborate that line of thinking.
The interesting question is what happens over time, as young startups build more functionality. Do thin wrappers become thick wrappers?
In 2024, it feels like thick wrappers have a path towards differentiation by:
- Focusing on a specific problem, often vertical – as anything too horizontal runs the risk of being in the “kill zone” of Big Tech
- Building workflow, collaboration and deep integrations, that are specific to that problem
- Doing a lot of work at the AI model level – whether finetuning models with specific datasets or creating hybrid systems (LLMs, SLMs, etc) tailored for their specific business
In other words, they will need to be both narrow and “full stack” (both applications and infra).
- Interesting areas to watch in 2024: AI agents, Edge AI
There’s been plenty of excitement over the last year around the concept of AI agents – basically the last mile of an intelligent system that can execute tasks, often in a collaborative manner. This could be anything from helping to book a trip (consumer use case) to automatically running full SDR campaigns (productivity use case) to RPA-style automation (enterprise use case).
AI agents are the holy grail of automation – a “text to action” paradigm where AI just gets stuff done for us.
Every few months, the AI world goes crazy for an agent-like product, from BabyAGI last year to Devin AI (an “AI software engineer”) just recently. However, in general, much of this excitement has proven premature to date. There’s a lot of work to be done first to make Generative AI less brittle and more predictable before complex systems involving several models can work together and take actual actions on our behalf. There are also missing components – such as the need to build more memory into AI systems. However, expect AI agents to be a particularly exciting area in the next year or two.
Another interesting area is Edge AI. As much as there is a huge market for LLMs that run at massive scale and are delivered as endpoints, a holy grail in AI has been models that can run locally on a device, without GPUs – in particular phones, but also intelligent, IoT-type devices. The space is very vibrant: Mixtral, Ollama, Llama.cpp, Llamafile, GPT4All (Nomic). Google and Apple are also likely to be increasingly active.
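As a concrete (and hedged) example of the pattern, here is a minimal sketch of querying a locally running Ollama server from Python – the model name and port are Ollama’s defaults at the time of writing, and no GPU cloud or API key is involved:

```python
import json
import urllib.request

def local_generate(prompt: str, model: str = "llama2") -> str:
    # Assumes an Ollama server is running locally and the model has been
    # pulled (e.g., `ollama run llama2`); 11434 is Ollama's default port.
    payload = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # single JSON response instead of a token stream
    }).encode("utf-8")
    req = urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```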
- Is Generative AI heading towards AGI, or towards a plateau?
It’s almost a sacrilegious question to ask given all the breathless takes on AI, and the incredible new products that seem to come out every week – but is there a world where progress in Generative AI slows down rather than accelerates all the way to AGI? And what would that mean?
The argument is twofold: a) foundational models are a brute force exercise, and we’re going to run out of resources (compute, data) to feed them, and b) even if we don’t run out, ultimately the path to AGI is reasoning, which LLMs are not capable of doing.
Interestingly, this is more or less the same discussion as the industry was having 6 years ago, as we described in a 2018 blog post. Indeed what seems to have changed mostly is the sheer amount of data and compute we’ve thrown at (increasingly capable) models.
How close we are to any kind of “running out” is very hard to assess. The frontier for “running out of compute” seems to be pushed back further every day. NVIDIA recently announced its Blackwell GPU system, which the company says can deploy a 27 trillion parameter model (vs 1.7 trillion for GPT-4).
The data part is complex – there’s a more tactical question around running out of legally licensed data (see all the OpenAI licensing deals), and a broader question around running out of textual data, in general. There is certainly a lot of work happening around synthetic data. Yann LeCun discussed how taking models to the next level would probably require them to be able to ingest much richer video input, which is not yet possible.
From the narrow perspective of participants in the startup ecosystem (founders, investors), perhaps the question matters less, in the medium term – if Generative AI stopped making progress tomorrow, we’d still have years of opportunity ahead deploying what we currently have across verticals and use cases.
- The GPU wars (is NVIDIA overvalued?)
Are we in the early innings of a massive cycle where compute becomes the most precious commodity in the world, or are we dramatically over-building GPU capacity in a way that’s sure to lead to a big crash?
As pretty much the only game in town when it comes to Generative AI-ready GPUs, NVIDIA certainly has been having quite the moment, with its share price up five-fold to a $2.2 trillion valuation and total sales up three-fold since late 2022, massive excitement around its earnings, and Jensen Huang at GTC rivaling Taylor Swift for the biggest event of 2024.
Perhaps this was also in part because it was the ultimate beneficiary of all the billions invested by VCs in AI?
Regardless, for all its undeniable prowess as a company, NVIDIA’s fortunes will be tied to how sustainable the current gold rush will turn out to be. Hardware is hard, and predicting with accuracy how many GPUs need to be manufactured by TSMC in Taiwan is a difficult art.
In addition, competition is trying its best to react, from AMD to Intel to Samsung; startups (like Groq or Cerebras) are accelerating, and new ones may be formed, like Sam Altman’s rumored $7 trillion chip company. A new coalition of tech companies including Google, Intel and Qualcomm is trying to go after NVIDIA’s secret weapon: its CUDA software that keeps developers tied to NVIDIA chips.
Our take: as the GPU shortage subsides, there may be short- to medium-term downward pressure on NVIDIA, but the long term for AI chip manufacturers remains incredibly bright.
- Open source AI: too much of a good thing?
This one is just to stir a pot a little bit. We’re huge fans of open source AI, and clearly this has been a big trend of the last year or so. Meta made a major push with its Llama models, France’s Mistral went from controversy fodder to new shining star of Generative AI, Google released Gemma, and HuggingFace continued its ascension as the ever so vibrant home of open source AI, hosting a plethora of models. Some of the most innovative work in Generative AI has been done in the open source community.
However, there’s also a general feeling of inflation permeating the community. Hundreds of thousands of open source AI models are now available. Many are toys or weekend projects. Models go up and down the rankings, some of them experiencing meteoric rises by GitHub star standards (a flawed metric, but still) in just a few days, only to never transform into anything particularly usable. It’s been dizzying for many.
Our take: the market will be self-correcting, with a power law of successful open-source projects.
- How much does AI actually cost?
The economics of Generative AI is a fast-evolving topic. And not surprisingly, a lot of the future of the space revolves around it – for example, can one seriously challenge Google in search, if the cost of providing AI-driven answers is significantly higher than the cost of providing ten blue links? And can software companies truly be AI-powered if the inference costs eat up chunks of their gross margin?
The good news, if you’re a customer/user of AI models: we seem to be in the early phase of a race to the bottom on price, which is happening faster than one might have predicted. One key driver has been the parallel rise of open source AI (Mistral etc) and commercial inference vendors (Together AI, Anyscale, Replicate) taking those open models and serving them as endpoints. There are very low switching costs for customers (other than the complexity of working with different models producing different results), and this is putting pressure on OpenAI and Anthropic. An example of this has been the significant cost drops for embedding models, where multiple vendors (OpenAI, Together AI etc) dropped prices at the same time.
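Some back-of-the-envelope math shows why those switching costs matter so much – the per-token prices below are illustrative placeholders, not real quotes from any vendor:

```python
# Illustrative inference cost arithmetic; all prices are hypothetical.
PRICE_PER_1K_TOKENS = {
    "frontier-llm": {"input": 0.01, "output": 0.03},
    "open-small-model": {"input": 0.0002, "output": 0.0002},
}

def monthly_cost(model: str, requests: int, in_tokens: int, out_tokens: int) -> float:
    p = PRICE_PER_1K_TOKENS[model]
    per_request = (in_tokens / 1000) * p["input"] + (out_tokens / 1000) * p["output"]
    return requests * per_request

# 1M requests/month, 1,000 tokens in and 500 tokens out per request:
print(monthly_cost("frontier-llm", 1_000_000, 1000, 500))      # 25000.0
print(monthly_cost("open-small-model", 1_000_000, 1000, 500))  # 300.0
```

Two orders of magnitude apart on hypothetical numbers, but directionally this is why gross-margin-sensitive software companies shop around, and why incumbent prices keep dropping.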
From a vendor perspective, the costs of building and serving AI remain very high. It was reported in the press that Anthropic spent more than half of the revenue it generated paying cloud providers like AWS and GCP to run its LLMs. There’s the cost of licensing deals with publishers as well.
On the plus side, maybe all of us as users of Generative AI technologies should just enjoy the explosion of VC-subsidized free services.
Watch: MAD Podcast with Nomic
- Big companies and the shifting political economy of AI: Has Microsoft won?
This was one of the first questions everyone asked in late 2022, and it’s even more top of mind in 2024: will Big Tech capture most of the value in Generative AI?
AI rewards size – more data, more compute, more AI researchers tends to yield more power. Big Tech has been keenly aware of this, and unlike incumbents in prior platform shifts, intensely reactive to the potential disruption ahead.
Among Big Tech companies, it certainly feels like Microsoft has been playing 4-D chess. There’s obviously the relationship with OpenAI, but Microsoft also partnered with open source rival Mistral. It invested in ChatGPT rival Inflection AI (Pi), only to acqui-hire it in spectacular fashion recently. And ultimately, all those partnerships seem to only create more need for Microsoft’s cloud compute – Azure revenue grew 24% year-over-year to reach $33 billion in Q2 2024, with 6 points of Azure cloud growth attributed to AI services.
Meanwhile, Google and Amazon have partnered with and invested in OpenAI rival Anthropic (at the time of writing, Amazon just committed another $2.75B to the company, in the second tranche of its planned $4B investment). Amazon also partnered with open source platform Hugging Face. Google and Apple are reportedly discussing an integration of Gemini AI in Apple products. Meta is possibly undercutting everyone by going whole hog on open source AI. Then there is everything happening in China.
The obvious question is how much room there is for startups to grow and succeed. A first tier of startups (OpenAI and Anthropic, mainly, with perhaps Mistral joining them soon) seem to have struck the right partnerships, and reached escape velocity. For a lot of other startups, including very well funded ones, the jury is still very much out.
Should we read into Inflection AI’s decision to let itself get acquired, and Stability AI’s CEO troubles, an admission that commercial traction has been harder to achieve for a group of “second tier” Generative AI startups?
- Fanboying OpenAI – or not?
OpenAI continues to fascinate – the $86B valuation, the revenue growth, the palace intrigue, and Sam Altman being the Steve Jobs of this generation.