Generative AI that can do WHAT?

It’s just been a year since ChatGPT burst into the world and launched the term ‘generative AI’ into the mainstream. It’s been a big year for AI. While the year-end numbers are still being calculated, over $15.2 billion was invested in generative AI startups globally in the first half of 2023, according to Pitchbook. VC investment in generative AI for the whole of 2023 is expected to be well over $20 billion.

While much of the funding was raised by a handful of companies building their own LLMs and foundation models (including OpenAI, Anthropic, Cohere, Mistral AI, Stability AI and others), a number of application-layer unicorns have also been created, including Character.ai, Runway ML, Synthesia, Hugging Face and more.

The pace of technological advancement has been frantic. From open source LLMs to new APIs, both startups and tech giants continue to push the envelope on what can be automated and improved using generative AI.

But while mainstream tools like ChatGPT or DALL-E 3 are well known, there’s a very long tail of products, projects and research papers that, while relatively unknown, will drop your jaw. Let me dive into a few examples from the world of video.

Animate Anyone by Alibaba

Animate Anyone is a research project from Alibaba’s AI research team that can animate a single picture into a dancing character video with remarkable consistency and control. To grasp the potential implications, consider that TikTok has 400 million videos uploaded to its platform daily. How many of those are of people dancing? It’s hard to know for sure, but technology like this could transform content creation for social media.

Seamless Communication by Meta AI

Another example of cool new tech that is relatively obscure is the Seamless Communication suite of models by Meta AI. These AI translation models not only enable a character to speak in another language, but also keep their tone of voice, pauses and emphasis. They also support low-latency streaming translation.

Gaussian Avatars by the Technical University of Munich and Toyota

Gaussian Avatars are photorealistic head avatars built from rigged 3D Gaussians, edited and rendered in real time. While the technology is still not 100% reliable, you can imagine the potential nightmare it could create for impersonation on video calls, or in ads…

The list of examples, with really incredible results, goes on and on. While each of these demos has a lot of promise, their significance is not in the specific tech features, but in what they represent for the generative AI space as a whole.

The big tech giants have the three prerequisites for AI innovation

Incumbents have a lot of power when it comes to AI research. To create groundbreaking technology in generative AI, companies need three things:

1) access to top-notch AI researchers – $$$

2) access to vast amounts of data – $$

3) access to Nvidia GPUs and cloud resources – $$$$ (it’s expensive to train a model and subsequently to offer it to the public)

All of these lend themselves well to large tech companies. Alibaba, Google, Microsoft, Amazon, and even companies like Toyota, can satisfy all three criteria. But for startups, it’s a difficult feat unless they have access to deep pools of capital, hence the rounds of hundreds of millions raised by Anthropic, Mistral, Runway ML and others.

It’s tricky for investors to allocate in this space

The biggest risk for investors allocating capital in the generative AI space (apart from FOMO-driven decisions) is the risk of commoditisation. The space is moving so quickly that what is novel today becomes abundantly available tomorrow. There are many examples already, such as services that offered AI-generated avatar pictures for a fee. While there may still be people paying $19 for a picture, it won’t take long until they learn they can do it for free (or for the price of a subscription that includes OpenAI’s DALL-E 3).

In addition, investors should care about where the data used to train the models came from. There’s a reason why OpenAI offers to cover the legal fees of business customers sued for copyright infringement. In doing so, OpenAI joins IBM, Microsoft, Amazon, Getty Images, Shutterstock and Adobe, which have also explicitly said they’ll indemnify generative AI customers over IP rights claims.

Asking for forgiveness rather than permission might work for startups, but it’s definitely an inhibitor to adoption for enterprise clients. A bank, for example, wouldn’t risk a lawsuit for using pirated content. Several lawsuits are currently in motion and may set precedents for the future. Despite Biden’s executive order on AI, which stipulates that training data should be licensed, this area is still a mess.

Regulation will have a big impact on the generative AI space

It’s not a question of ‘IF’ generative AI will be regulated; the question of ‘HOW’ is still wide open. The UK government recently hosted an AI safety summit at Bletchley Park, and the EU is about to launch its AI Act, a well-intended set of rules that companies will struggle to comply with, potentially forcing products out of the market and inhibiting innovation.

In addition, experts already warn that we don’t have the guardrails in place for companies rushing to deploy AI. Hackers and bad actors are already leveraging generative AI technology for nefarious purposes, including spam, phishing and impersonation.

Open Source might be the future, but it faces big hurdles

Tools like Chatbot Arena are useful for comparing the quality of results across models. For example, it’s still pretty clear that ChatGPT Plus, based on GPT-4, the largest language model commercially available, outperforms Anthropic’s Claude (although the latter is catching up), which in turn outperforms Llama 2, and so on. But companies like Mistral claim that new LLMs can be much smaller, more accurate when focused on specific tasks, and fully open source.
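As an aside for the technically curious: leaderboards like Chatbot Arena rank models from pairwise human votes using an Elo-style rating. Here’s a minimal, illustrative sketch of that kind of update – not Chatbot Arena’s actual code, just the general mechanism:

```python
# Minimal, illustrative sketch of an Elo-style rating update - the kind of
# pairwise mechanism that leaderboards like Chatbot Arena build on.

def elo_update(rating_a, rating_b, winner, k=32.0):
    """Return updated ratings after one head-to-head vote.

    winner: "a", "b" or "tie"
    """
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    score_a = {"a": 1.0, "b": 0.0, "tie": 0.5}[winner]
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b

# Example: two models start level at 1000; model A wins one user vote.
a, b = elo_update(1000.0, 1000.0, "a")
print(round(a), round(b))  # 1016 984
```

Aggregated over thousands of such votes, this is how relative model quality gets ranked in practice.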

Mistral AI believes in the democratisation of AI and the power of open collaboration. They aim to make their models and research readily available to the public, fostering innovation and accelerating the development of beneficial AI technologies. But companies like OpenAI and Anthropic advocate that we should be careful about which companies get to train new models. This could create a situation in which regulation helps the incumbents concentrate power and slows down open source development significantly.

Listen to Anthropic’s CEO talk about the challenge of Open Source:

It’s hard to make money

While OpenAI is reportedly on track to close 2023 with $1 billion in revenue, many companies in the space, primarily in the application layer (i.e. not developing the generative AI tech directly, but rather using an API and building a wrapper product), have struggled to maintain their revenue over time. Pay attention to what I said – it’s not that they struggled to generate revenue altogether – some made a quick buck by being first to market or offering a novel use of the tech. But many also suffered from churn just as quickly, as their services became commoditised.
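To make the ‘wrapper’ point concrete, here’s a rough sketch of what many application-layer products boil down to: a prompt template around a third-party model API. This is purely illustrative – it assumes OpenAI’s Python client (v1.x) and an API key in the environment, and the use case and prompt are made up:

```python
# Minimal sketch of an application-layer "wrapper" product: a prompt template
# around a third-party LLM API. Illustrative only - assumes the openai Python
# package (v1.x) and an OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def write_product_description(product_name: str, audience: str) -> str:
    """The entire 'product' is a thin layer of prompting on someone else's model."""
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a marketing copywriter."},
            {"role": "user",
             "content": f"Write a short product description for {product_name}, "
                        f"aimed at {audience}."},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(write_product_description("a reusable coffee cup", "commuters"))
```

When the underlying model gets better, cheaper or free to use directly, there’s very little defensibility in a layer this thin – which is exactly the commoditisation and churn problem described above.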

A good exception to this challenge is Jasper. The company was valued at $1 billion just before the launch of ChatGPT, and it’s safe to assume they’ve lost a lot of customers to free alternatives. But they are still in business. The primary reason is that they’ve been able to build workflow automations that save time and streamline tasks for their clients. I suspect a lot of companies will adopt similar tactics to survive; the question is, will that be enough?

We need to talk about hallucinations

While incredibly impressive, no model is free of limitations. One of the major limitations of LLMs is the risk of model hallucinations. Read: the model’s tendency to write very convincing answers that are totally wrong or made up. While products like Anthropic’s Claude.ai claim to reduce hallucinations, no model has eliminated them entirely, so outputs still need human verification.

We need to talk about AGI, and its risks

While some, like Meta’s head of AI Yann LeCun, downplay the existential threat AI could pose to humanity, others, like OpenAI’s co-founder and chief scientist Ilya Sutskever, are pretty certain that we’re on the path to achieving AGI (artificial general intelligence), a tool so powerful that it will be able to teach itself and ‘think’ for itself in the foreseeable future (no specific timeline, but assume a decade to be pragmatic). Ilya suggests that AI might treat humans as inferiors, akin to how humans treat animals. I suspect that a lot of what happened behind closed doors, which led to the board firing (and the investors later reinstating) Sam Altman as CEO, had to do with the path and timeline to AGI.

Once we achieve AGI, most of the technologies developed (and funded) before that point risk becoming obsolete, let alone incumbent technology products that have yet to be powered by AI. While regulation is trying to address some of these challenges, there are not enough people, professional bodies and companies talking about the risks and the need to balance innovation with safeguarding measures.

***

If you’ve read this far, allow me to indulge in a short self-pitch. Given our background in media and entertainment at Remagine Ventures, we’ve always cared about technology that automates content creation, distribution and monetisation. Combined with a natural curiosity, that led us to make our first generative AI investments in 2019. We’ve since invested in 5 other generative AI startups, including companies developing foundation models as well as fast-growing startups in the application layer. We’re primarily focused on Israel and the UK. If you have an original approach and are building the next big thing in the generative AI space, we’d love to talk to you and give you a friendly investor perspective.

Eze is managing partner of Remagine Ventures, a seed fund investing in ambitious founders at the intersection of tech, entertainment, gaming and commerce with a spotlight on Israel.

He is a former general partner at Google Ventures, head of Google for Entrepreneurs in Europe and founding head of Campus London, Google’s first physical hub for startups.

He is also the founder of Techbikers, a non-profit bringing together the startup ecosystem on cycling challenges in support of Room to Read. Since its inception in 2012, Techbikers has built 11 schools and 50 libraries in the developing world.
