[Exclusive] Revealing the technology behind Duplex, the new Google Assistant feature: does it really leave Siri far behind, as rumored? Google's 2018 developer conference (Google I/O 2018) introduced many new products and features, such as Android P, Gmail, Gboard, and TPU v3, but the highlight was undoubtedly Duplex, a new addition to the personal assistant Google Assistant that calls businesses such as restaurants and hair salons to make appointments on the user's behalf.
Over the past few days the media has reported on it with amazement, even comparing it with Siri. The editor found a post on the Google AI Blog by Yaniv Leviathan, Principal Engineer, and Yossi Matias, Vice President of Engineering, in which they reveal the technology behind Duplex.
Google Duplex is a tool that carries out tasks in specific domains by making phone calls.
Such domains include, for example, booking a restaurant table or a haircut. Within them, Google Duplex can hold human-computer conversations that sound very natural, as if a real person were on the phone.
Technology used by Google Duplex
Thanks to recent advances in language understanding, interaction, timing, and speech generation, Google Duplex's conversations sound quite natural.
To handle the challenges of natural phone conversation, the heart of Duplex is a recurrent neural network (RNN) built with TensorFlow Extended (TFX). To achieve high accuracy, Google trained Duplex's RNN on a corpus of anonymized phone conversation data.
The network uses the text output of Google's automatic speech recognition (ASR), together with features from the audio, the conversation history, the parameters of the conversation (such as the service to be booked and the current time), and more.
Google trained a separate understanding model for each task, although some of the training corpus is shared across tasks. Finally, Google further refined the model using TFX's hyperparameter optimization.
Incoming speech is first processed by the ASR system. The resulting text is fed to the RNN together with context data and other inputs, and the response text the RNN generates is read aloud by the text-to-speech (TTS) system.
In summary, the techniques used by Google Duplex include:
1. Google's own ASR (speech recognition) technology converts the other party's speech into text.
2. An RNN (recurrent neural network) model is built with TensorFlow and trained on a corpus of anonymized phone conversation data. The trained model generates a text response based on the text transcribed from the other party's speech.
3. A combined TTS engine (Tacotron and WaveNet) converts the text generated by the deep learning model into speech as the final response.
4. Duplex works with Google Assistant: Google Assistant can call Duplex in the background to carry out a task.
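The ASR, model, and TTS stages listed above can be pictured as a single conversational turn. The following is only a minimal sketch of that data flow: `DialogueContext`, `run_turn`, and the three toy functions are hypothetical names invented for illustration, not Google components, and the audio features the article mentions are left out to keep it short.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DialogueContext:
    """Non-audio inputs the article says are fed to the model (hypothetical type)."""
    task_params: Dict[str, str]                        # e.g. service to book, time
    history: List[str] = field(default_factory=list)   # earlier turns, oldest first


def run_turn(
    audio: bytes,
    ctx: DialogueContext,
    asr: Callable[[bytes], str],
    respond: Callable[[str, DialogueContext], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """One turn: speech in, speech out."""
    heard = asr(audio)                    # 1. speech -> text
    ctx.history.append(f"them: {heard}")
    reply = respond(heard, ctx)           # 2. text + context -> reply text
    ctx.history.append(f"us: {reply}")
    return tts(reply)                     # 3. reply text -> speech


# Toy stand-ins so the sketch runs; a real system would plug in trained models.
def toy_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def toy_respond(heard: str, ctx: DialogueContext) -> str:
    if "what time" in heard.lower():
        return f"At {ctx.task_params['time']}, please."
    return f"I'd like to book a {ctx.task_params['service']}."


def toy_tts(text: str) -> bytes:
    return text.encode("utf-8")


ctx = DialogueContext(task_params={"service": "haircut", "time": "noon"})
out1 = run_turn(b"Hello, how can I help?", ctx, toy_asr, toy_respond, toy_tts)
out2 = run_turn(b"Sure, what time?", ctx, toy_asr, toy_respond, toy_tts)
print(out1.decode(), "|", out2.decode())
```

Passing the three stages in as functions keeps the turn logic independent of any particular recognizer, model, or synthesizer, which is a design choice of this sketch rather than something the article describes.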
Why Google Duplex conversations sound so natural
Google combines a concatenative TTS engine with a synthesis TTS engine (using Tacotron and WaveNet) to control intonation according to the context.
The system can also produce speech disfluencies (such as "hmm" and "uh"), which make the speech sound more natural. They are added when the concatenative TTS has to join widely differing sound units, or when a synthetic wait is inserted, so the system can signal to the other party in a natural way that "I'm listening" or "I'm still thinking about it." (People often do the same while gathering their thoughts.) Google's user studies also confirmed that conversations containing such disfluencies sound more familiar and natural to people.
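As a rough illustration of the "filler before a wait" idea, the sketch below prefixes a reply with a filler when a noticeable pause is expected. The function name `add_disfluency`, the 500 ms threshold, and the choice of fillers are all assumptions made for this example; the article does not say how Duplex decides where to place them.

```python
def add_disfluency(reply: str, wait_ms: int, threshold_ms: int = 500) -> str:
    """Prefix a filler when the reply follows a noticeable wait (assumed rule)."""
    if wait_ms < threshold_ms:
        return reply                      # short wait: answer directly
    # Longer waits get a "thinking" filler, moderate ones a brief hesitation.
    filler = "Hmm..." if wait_ms >= 2 * threshold_ms else "Uh..."
    return f"{filler} {reply}"


for wait in (100, 600, 1200):
    print(wait, "->", add_disfluency("At noon, please.", wait))
```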
The system's latency must also match people's expectations. For example, after a person says something simple on the phone, such as "hello?", they expect a quick reply and are more sensitive to delay. When the system detects that low latency is required, it uses faster but less accurate models. In extreme cases it does not even wait for the RNN and instead uses a fast approximation directly (usually paired with a more hesitant response, just as a person would hesitate when they have not fully understood the other party).
This approach lets the system respond in under 100 ms in these situations. Interestingly, Google found that in some cases it is necessary to add delay to make the conversation sound more natural, for example when replying to a very complex sentence.
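The trade-off between a fast approximate reply and a slower full reply can be sketched as a small planning function. Everything here is an assumption for illustration: the names `Plan` and `plan_response`, the use of word count as a stand-in for "simple" versus "complex" utterances, and the 600 ms added delay are not taken from Google's description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    strategy: str       # which responder to use
    min_delay_ms: int   # extra delay to add before speaking


def plan_response(utterance: str) -> Plan:
    """Pick a response strategy from utterance length (a crude proxy)."""
    n_words = len(utterance.split())
    if n_words <= 2:
        # Simple utterance such as "Hello?": reply at once with an approximation.
        return Plan("fast_approximation", 0)
    if n_words >= 15:
        # Very complex sentence: use the full model and pause before replying.
        return Plan("full_model", 600)
    return Plan("full_model", 0)


print(plan_response("Hello?"))
print(plan_response("Do you have a table for four at seven tonight?"))
```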
In summary:
1. Duplex is limited to specific domains, which lets engineers design in detail for each domain and so achieve targeted, very natural results.
2. The input to the neural network model includes not only the ASR text but also the conversation history, so the model can better understand the context and generate a more accurate response.
3. Duplex uses fillers such as "um" and "uh" to mark pauses, or stretches certain words, as if it were taking time to think of an answer, which makes the spoken response sound more natural.
4. Because people sometimes expect a fast reply during a conversation, for example after saying "Hello?", in such extreme cases Duplex does not even wait for the deep learning model and uses a faster approximate response instead, which also makes the exchange feel more natural.
Reportedly, Google will start testing Duplex within Google Assistant this summer, beginning with restaurant reservations, hair salon bookings, and questions about holiday opening hours.