You may have heard of deepfake images and videos, those eerily realistic clips made with artificial intelligence. Now Meta (formerly Facebook) has announced a new artificial intelligence model called Voicebox that’s all about audio. It’s a powerful text-to-speech system that can create synthetic speech from nothing more than text prompts.
What is Voicebox?
At the heart of Voicebox is an artificial intelligence model that creates synthetic speech from simple text prompts. In other words, you give it some text and it reads it aloud in a human-sounding voice. It’s similar to the text-to-speech feature you might be using on your phone or computer, but it takes things to a whole new level.
What sets Voicebox apart is its ability to replicate a specific speech style based on very short audio samples – we’re talking just two seconds! This means you could have a synthetic voice that sounds like your favorite celebrity or even yourself. It’s almost like having an on-demand voice actor ready to read whatever you want in the voice style of your choice.
Competing AI Speech Models
Speechify and ElevenLabs are also players in the text-to-speech game. Speechify is an app that converts text into audio. It can read books, articles, notes, emails, PDFs, images and web pages aloud, and the company also claims to offer voice cloning, voice editing, and voice sampling features. In addition, Speechify offers hundreds of free classic audiobooks and a desktop app designed to help people with dyslexia.
ElevenLabs, on the other hand, is a startup that uses artificial intelligence to generate synthetic speech with context-aware emotion and natural language understanding. It provides a platform for creating and customizing high-quality spoken audio in any voice and style for industries as diverse as video games, animation, digital assistants, education, entertainment, advertising, and podcasting. It also has a tool for detecting synthetic voices and verifying their authenticity. ElevenLabs works with actors who provide voice samples and get paid when their voice clones are used. The company uses a proprietary deep learning model to create AI speech.
They’re all cool, but they don’t have the same versatility as Voicebox, which can mimic a real voice from just a few seconds of audio. It’s like comparing a Swiss Army knife to a couple of really good spoons. They all have their uses, but one of them definitely has more.
The Power of Voicebox
But it’s not just about creating fake voices. Voicebox can also clean up your audio by removing annoying background noise, such as a dog barking while you’re trying to record. And it’s not limited to English. The AI also speaks French, Spanish, German, Polish, and Portuguese, and can even translate passages from one language to another while maintaining the same speech style.
Meta’s Voicebox: Breakthrough or Threat?
Unfortunately or fortunately, depending on where you stand on AI, Meta doesn’t plan to open-source Voicebox right away. That has people wondering whether the company is trying to avoid potential problems. For example, AI voice technology could be misused in harassment campaigns. Alternatively, Meta may have future plans to monetize the model.
Voicebox’s massive training data source
One of the interesting things about Voicebox is that it was trained on a huge amount of data: more than 60,000 hours of audiobook utterances in English and another 50,000 hours of audiobook utterances in multiple languages. Meta says it uses public domain audiobooks as its primary data source, but that it also draws on other sources such as podcasts, talks and radio shows. However, public domain audiobooks come with challenges and limitations around quality, consistency, coherence, and speaker identity. Meta claims to have addressed some of these issues through data processing and model design.
For more of my security alerts, subscribe to my free CyberGuy Reports newsletter by heading to CYBERGUY.COM/Newsletters
The double-edged sword of technology
The rise of AI voices is a touchy subject, especially for voice actors and, more recently, writers. They worry that companies will use AI to synthesize their voices without paying them. The audiobook market has been growing significantly, and companies have been looking to cut costs, so this may end up being another problem for voice pros.
Make no mistake, though; it’s not just about jobs. People also worry about how deepfake voices are used in scams. For example, there was one case where a synthetic voice impersonating a CEO was used in a major heist. There are also concerns that deepfake voices could be used to fool voice biometric systems, which are used in services such as online banking.
You see, as cool as this technology sounds, it also has a dark side. Imagine getting a call from your boss asking you to transfer a large sum of money to close an account. You do what you’re told because, well, it’s your boss. But it isn’t really your boss. It’s a fake synthetic voice created using artificial intelligence to sound just like your boss. Wild, isn’t it? And this isn’t the plot of a movie; it really happened. It was reportedly the first time a fake voice had been used in a heist, and it left law enforcement and artificial intelligence experts scratching their heads.
It’s not just about heists. Deepfake speech can be used to trick systems that rely on voice recognition. We’re talking about things like online banking that uses your voice as a form of identification. If criminals are able to create a convincing fake of your voice, they could potentially gain access to your account. It’s a bit like forging a signature, but with your voice.
Countering the Deepfake Threat
So while we marvel at the amazing things technology can do, it’s also important to be aware of the potential risks and stay one step ahead. It’s a high-tech game of cat and mouse, with artificial intelligence experts and companies working to spot and stop these deepfake voices before they can do any harm.
Fortunately, there are efforts to fight back against the potential misuse of deepfake voices. For example, some countries have begun to pass laws to regulate deepfakes. Additionally, there are projects like the Automatic Speaker Verification Spoofing and Countermeasures Challenge (ASVspoof), where scientists and engineers are working on ways to combat deepfake speech attacks.
Kurt’s key takeaways
We’re living in an age where technology is advancing at breakneck speed and changing the way we work, communicate and even listen. While the potential of AI like Meta’s Voicebox is certainly exciting, it’s clear we also need to proceed with caution. There’s a fine line between innovation and intrusion, and we’re still finding the balance.
With all these advances and potential risks, what do you think about the future of artificial intelligence and deepfake technology? Do you think it is a boon or a curse? Let us know by writing us at Cyberguy.com/Contact
Copyright 2023 CyberGuy.com. All rights reserved.