Voice.ai raises $6 million as its real-time voice changer nears 500,000 users

Image credits: Bryce Durbin/TechCrunch

Services like Midjourney and ChatGPT have pushed the boundaries of how AI can create images and text from basic text prompts. Now, audio appears to be the inevitable next frontier: music generated from word prompts, AI tutors for language learning and speech simulators have all seen developments in recent months. Voice.ai hopes to be part of that conversation (heh) with technology that allows users to change (and disguise) their voices in real time, and now it has raised its first external funding on the back of initial growth.

With more than 480,000 users and a library of over 50,000 voice filters, Voice.ai has raised $6 million, funding it plans to use to take its voice-changing technology to new places.

Mucker Capital and M13 led the round. Until now, Voice.ai has grown by word of mouth (the startup has a Discord channel with over 120,000 people), supported by $3 million in self-funding.

Currently, the company’s tools, available as apps for Mac, PC, Android and iOS, are being adopted by gamers, content creators, Vtubers and others on TikTok, Zoom, Discord, Minecraft, GTA5, Fortnite, Valorant, League of Legends, Among Us, Skype, WhatsApp and other platforms. The Voice.ai interface allows them to create a new voice or select from approximately 50,000 pre-created voices (created and shared by users like them), which can be used as-is or modified, for live use on supported platforms or for recordings.

The plan is to use the funding to hire more technical talent and build new SDKs and APIs to work with other platforms like Meta, Unreal and Unity; enable multilingual support; and add new applications, like singing, where the voice takes center stage.

The startup doesn’t specify this, but it will be interesting to see if it also uses part of the funds to increase server capacity.

It is no small matter. Anecdotally, we’ve heard that GPU capacity is one of the biggest limiting factors in how far many AI apps are able to scale right now. (It’s partly why you’re seeing big deals that include strategic investors who provide processing and server capacity.)

For Voice.ai specifically, your voice is processed locally and channeled to wherever it will be used through what founder and CEO Heath Ahrens described to me as a virtual audio cable. But when you look at reviews of its apps, a common complaint is that when you sign up you’re placed on a waitlist because “overwhelming demand has our servers at capacity,” with the promise that you’ll be notified when the service increases that capacity.

There are dozens of text-to-speech and voice-to-speech services on the market today, and already a lot of activity among them: Spotify acquired Sonantic last year, and Snap bought an AI voice assistant even earlier; another startup, Sanas, is working on changing your accent; and there are voice simulators such as Murf and Acapela, among many others. Voice.ai falls into the same general category as Respeecher and ElevenLabs, two voice-to-voice AI startups that allow users to apply masks to completely change or transform their voices, in some cases creating fully synthetic voices instead of real ones.

Respeecher, founded and based in Ukraine, has made a name for itself by helping build a new Darth Vader voice for new Star Wars installments, based on how James Earl Jones sounded 45 years ago when he originated the role. (In keeping with a character hell-bent on destroying worlds, Darth’s voice was delivered to the Hollywood client from the company’s offices in Ukraine as Russia marched into the country.)

ElevenLabs has famously (or infamously, as the case may be) built a platform that’s scarily good at cloning voices, and earlier this month it raised $19 million in its most recent funding round from a group of big investors.

Voice.ai is trying, in that mix, to position itself as the AI voice editing app for the everyman.

“There are a lot of companies looking to bring a different flavor of voice technology to businesses,” Ahrens told TechCrunch in an email (ironically, a live interview with him could not be arranged). Ahrens has some experience building B2B AI technology: His two previous companies, iSpeech for text-to-speech and Haystack for facial recognition, are built around API offerings.

“What sets Voice.ai apart is that we focus on bringing technology that was previously reserved for enterprise businesses right into the hands of consumers in an affordable way.” Many users, he noted, “come to us from the classic DSP voice changers and voice modulators they used to use in the past and which are still popular with many gamers and streamers.”

“Affordable” comes in two tiers, with most users currently on a free service that requires them to provide computational power to train Voice.ai’s models; the service is built on the company’s own private data set, drawn from millions of unique users. No prices are provided on the site; we are asking for those details.

“We believe in making technology accessible and plan to work with the open source community to democratize voice AI technology,” added Ahrens.

Voice.ai also says it takes a fundamentally different approach to the challenge of changing a voice, tapping into some of the ethos that has built up around the use of avatars by Vtubers, gamers and others online.

“Most voice AI companies entering the space are looking to build enterprise-focused, scalable text-to-speech solutions or costly voice-to-voice services for production studios,” Ahrens said. “We start on the opposite end of the spectrum and try to offer value to people who are looking to expand the way they play online. The core value proposition of our spoken word is not that it can replicate a particular person perfectly. It is that it preserves the fundamental elements of a user’s speech, their emotion, rhythm and emphasis, while replacing the sound of the voice, in order to create a completely unique new end result, in real time.”

It may be due to the way demographics on interactive platforms like gaming skew, but for now Voice.ai’s audience is 70% male and 30% female, with new categories opening up around not only who is using the technology, but also why.

This includes not only those who use avatars and build voices to match them, or those seeking greater privacy protections, but also, he said, transgender users who can represent themselves with voices that match their identities, as well as users exploring online personas that are entirely new to them.

There is already a user base that taps into Voice.ai’s direct-to-consumer offerings, but one reason Mucker is investing in the startup is that it believes there is an opportunity to build a network of developers who use and complement Voice.ai’s technology.

“Voice.ai is poised to revolutionize the AI developer community in a way similar to AdMob’s impact on the mobile app developer community,” said Omar Hamoui, partner at lead investor Mucker Capital. (Hamoui previously founded the mobile advertising startup AdMob, which was later acquired by Google, so he has first-hand experience building tools for mobile developers.) Offering user-friendly solutions that were once exclusive to large companies, Voice.ai aims to democratize access for developers worldwide.

Karl Alomar, the former COO of DigitalOcean who led the investment for M13, said investors will take an active role in the next phase of development. “At DigitalOcean, we also saw the value of building a builder community, by builders,” he said. “We are excited for creators and developers to build on the Voice.ai platform.”
