2020-06-08 21:53:49 来源：P生活画
苹果、亚马逊、微软以及 Google 都提供语音助理服务，孰优孰劣？根据路透社报导，苹果的语音助理 Siri 在辨识语音、回答问题方面或许不再具优势，但 Siri 一大优势是能说最多种语言，现在正要学习说上海话，我们来看看它怎么做到。进入本文前，想想以下单字英文怎么说：
量身订做 含糊的 规模化
The voice-assistant wars are in full swing, with Apple, Amazon, Microsoft and now Google all offering electronic assistants to take your commands.
Many researchers believe that Apple has squandered its lead when it comes to understanding speech and answering questions. However, there is at least one thing Siri can do that the other assistants cannot: speak 21 languages localized for 36 countries, a very important capability in a smartphone market where most sales come from outside the United States.
许多研究人员认为在语音辨识和回答问题方面，苹果的领先优势已消耗殆尽，不过有件事目前只有 Siri 做到：说 36 个国家的 21 种语言。此功能在智慧手机市场极为重要，因为大部分智慧手机都销往美国以外地区。
Microsoft Cortana, by contrast, has eight languages tailored for 13 countries. Google's Assistant, which began in its Pixel phone but has since moved to other Android devices, speaks four languages. Amazon's Alexa features only English and German. Siri will even soon start to learn Shanghainese, a special dialect of Wu Chinese spoken only around Shanghai.
微软 Cortana 为 13 个国家制定了 8 种语言。Google 助理会说 4 种语言，这项服务出自 Google 自家手机 Pixel，现已开放其他 Android 系统手机使用。亚马逊的 Alexa 只会说英语和德语。而 Siri 马上要开始学上海话了，这是一种只在上海及其周边地区使用的吴语方言。
At Apple, the company starts working on a new language by bringing in humans to read passages in a range of accents and dialects, which are then transcribed by hand so the computer has an exact representation of the spoken text to learn from, said Alex Acero, head of the speech team at Apple. Apple also captures a range of sounds in a variety of voices. From there, an acoustic model is built that tries to predict word sequences.
苹果语音团队负责人 Alex Acero 说，要发展新语言功能时，会让有各种方言和口音的真人念出文字段落，然后再手动转录，这样电脑就可以拥有准确的学习样本。苹果还会从不同的声音中捕捉各种语音，接着建立一个声学模型，以尝试预测词语序列。
Apple then deploys "dictation mode," its speech-to-text translator, in the new language, Acero said. When customers use dictation mode, Apple captures a small percentage of the audio recordings and makes them anonymous. The recordings, complete with background noise and mumbled words, are transcribed by humans, a process that helps cut the speech recognition error rate in half.
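The article does not say how Apple measures the "speech recognition error rate." A common measure is word error rate (WER): the number of word substitutions, insertions and deletions needed to turn the recognizer's output into the human transcript, divided by the length of the transcript. The Python sketch below assumes that metric. The function name and the sample sentences are made up for illustration and do not come from Apple.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a human transcript and two recognizer outputs.
human_transcript = "set a timer for ten minutes"
before_training = "set a time for tin minutes"   # two wrong words
after_training = "set a timer for tin minutes"   # one wrong word

wer_before = word_error_rate(human_transcript, before_training)
wer_after = word_error_rate(human_transcript, after_training)
print(f"before: {wer_before:.2f}, after: {wer_after:.2f}")
```

In this toy case the second output has half the error rate of the first, which is the kind of improvement the human transcriptions are said to make possible.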
After enough data has been gathered and a voice actor has been recorded to play Siri in a new language, Siri is released with answers to what Apple expects will be the most common questions, Acero said. Once released, Siri learns more about what real-world users ask and is updated every two weeks with more tweaks.
收集了足够资料、配音员为 Siri 录制讲新语言的声音后，Siri 即可发布。发布时，Siri 能回答出苹果预期最常见的问题。发布后，Siri 也能从用户的实际问题学习，每两周作调整并更新。
However, script-writing does not scale, said Charles Jolley, creator of an intelligent assistant named Ozlo. "You can't hire enough writers to come up with the system you'd need in every language. You have to synthesize the answers," he said.
不过，智慧助理 Ozlo 的创造者 Charles Jolley 说，撰写脚本无法规模化，「不可能聘雇够多的作者，来打造每种语言所需的系统，必须人工合成回答。」
The founders of Viv, a startup founded by Siri's original creators that Samsung acquired last year, are working on just that. "Viv was built to specifically address the scaling issue for intelligent assistants," said Dag Kittlaus, the CEO and co-founder of Viv. "The only way to leapfrog today's limited-functionality versions is to open the system up and let the world teach them."
「Siri 之父」的新创公司 Viv，正着手解决这个问题。这间公司去年由三星收购。Viv 的联合创始人兼 CEO Dag Kittlaus 说：「Viv 想解决智慧助理的规模化问题，想要让当今功能局限的版本升级，唯一的方法就是开放系统，让世界来教它们。」
1. In full swing 如火如荼；全力进行
By ten o'clock, the party was in full swing.