"Celebrating 6 years of our Purpose and Beliefs Awards" | Source: Electronic Arts
Electronic Arts’ newly issued patent aims to enable players to make their in-game characters communicate using their own voices.
The patent outlines a system that involves inputting speech content data into a synthesiser module, which generates source acoustic features representing the desired voice or style.
The heart of the patent is a voice converter, which uses an acoustic feature encoder to meld the source acoustic characteristics with a target speaker embedding associated with the player, replicating the player’s voice for the in-game character.
This technology can generate speech audio in the player’s voice, allowing the character to speak with the player’s unique, personal voice, and it can even capture non-verbal elements of speech like tone, emotions, and emphasis.
The input data for this system can take the form of text, allowing players to type the dialogue they want their character to deliver, making it a more practical and versatile approach compared to traditional methods that require extensive recorded speech data.
Today, we came across a newly issued Electronic Arts patent titled “Generating speech in the voice of a player of a video game,” which was filed in October 2020. The patent, published earlier this month, outlines a technology for producing speech audio within a video game, enabling players to make their in-game characters communicate using their own voices.
“A computer-implemented method of generating speech audio in a video game is provided. The method includes inputting, into a synthesizer module, input data that represents speech content. Source acoustic features for the speech content in the voice of a source speaker are generated and are input, along with a speaker embedding associated with a player of the video game into an acoustic feature encoder of a voice convertor,” reads the patent’s abstract.
“One or more acoustic feature encodings are generated as output of the acoustic feature encoder, which are inputted into an acoustic feature decoder of the voice convertor to generate target acoustic features. The target acoustic features are processed with one or more modules, to generate speech audio in the voice of the player.”
In video games, players often create their own characters, and they might want these characters to speak in a specific voice. However, earlier methods of achieving this needed a large amount of recorded speech data, making them impractical. Electronic Arts’ patent describes a system that can make in-game characters speak in a player’s voice without requiring extensive recorded speech data.
The process begins with a “synthesiser module” that takes input data representing speech content. The synthesiser generates “source acoustic features” that capture the desired voice or style from a source speaker, which could be another character or actor.
The heart of this patent is a “voice converter,” which has been trained to convert acoustic features from a source speaker into those of the player. This technology enables the video game to replicate a player’s voice, giving their character a unique, personal touch.
The voice converter comprises two essential elements. The first is an acoustic feature encoder that melds the source acoustic characteristics with a “target speaker embedding” linked to the player, which essentially represents the player’s distinct voice; this encoder generates “acoustic feature encodings” for every point in the speech content. The second is an acoustic feature decoder, which receives these encodings and converts them into “target acoustic features” that mirror the speech content in the player’s voice.
After creating the target acoustic features, the technology can further process them using different modules. One key module mentioned is a “vocoder,” responsible for converting the target acoustic features into actual speech audio in the player’s voice.
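To make the flow of data easier to follow, here is a minimal Python sketch of the four stages described above: synthesiser, acoustic feature encoder, acoustic feature decoder, and vocoder. It is purely illustrative and is not taken from the patent. Every function name, the feature sizes, and the arithmetic inside each stage are hypothetical stand-ins for what would, in a real system, presumably be trained neural networks.

```python
import math
from typing import List

Frame = List[float]  # one frame of acoustic features

FEATURE_DIM = 4  # hypothetical size of each acoustic feature frame


def synthesize(text: str) -> List[Frame]:
    """Toy synthesiser: one source-speaker feature frame per input character."""
    return [[((ord(ch) + i) % 10) / 10.0 for i in range(FEATURE_DIM)] for ch in text]


def encode(source: List[Frame], speaker_embedding: Frame) -> List[Frame]:
    """Toy encoder: joins every source frame with the player's speaker embedding."""
    return [frame + speaker_embedding for frame in source]


def decode(encodings: List[Frame]) -> List[Frame]:
    """Toy decoder: shifts the content features by the mean of the embedding part."""
    targets = []
    for enc in encodings:
        content, speaker = enc[:FEATURE_DIM], enc[FEATURE_DIM:]
        shift = sum(speaker) / len(speaker)
        targets.append([value + shift for value in content])
    return targets


def vocode(target: List[Frame], samples_per_frame: int = 8) -> List[float]:
    """Toy vocoder: renders each target frame as one short sine cycle."""
    audio = []
    for frame in target:
        amplitude = sum(frame) / len(frame)
        for n in range(samples_per_frame):
            audio.append(amplitude * math.sin(2 * math.pi * n / samples_per_frame))
    return audio


def generate_player_speech(text: str, speaker_embedding: Frame) -> List[float]:
    """Chains the four stages: text -> source features -> encodings -> target features -> audio."""
    source = synthesize(text)
    encodings = encode(source, speaker_embedding)
    target = decode(encodings)
    return vocode(target)


audio = generate_player_speech("Hello", [0.3, 0.1])
print(len(audio))  # 5 frames x 8 samples per frame = 40
```

The point of the sketch is the shape of the pipeline rather than the maths: the same typed text produces different audio when a different speaker embedding is supplied, which mirrors how the patent describes swapping in a given player's voice.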
The patent also notes that the input data can take the form of text, allowing players to type the dialogue they desire their character to deliver. Additionally, the system can capture the non-verbal elements of speech, such as tone, emotions, or emphasis.
In August, we came across a comparable Electronic Arts patent, which aimed to ensure that character voices in video games authentically reflect their ages as the storyline progresses. It appears that the company is devoting significant effort to making video games more immersive and realistic for players.
The patent’s potential applications are vast. Players could interact with in-game characters in their own voice, making dialogues and conversations feel more personal and engaging. This innovation could open up entirely new avenues for narrative-driven video games, where the player’s character becomes an extension of themselves.
Nevertheless, it’s essential to emphasise that the system is currently just a patented concept. Whether Electronic Arts intends to integrate this envisioned system into their existing or future video game franchises is a matter that only time will clarify.
What do you think about this? Do tell us your opinions in the comments below!
From writing short stories in his room to finding true enthusiasm for video game and computer hardware journalism, Huzaifa plays video games and writes all the latest and greatest news about them. Currently pursuing a Bachelor of Science degree in Data Science, he dives deep into the news, authenticating every tiny detail to serve his audience. When he’s not breaking news, he becomes a master storyteller, conjuring up captivating tales from the depths of his imagination. With a wealth of experience as a video game journalist, he has also worked with publishers like eXputer, The Nerd Mag and Gamesual, making him an expert in the gaming news industry.