Alexa AI’s machine translation controls for formality, ranks first place
Alexa AI的机器翻译可控制正式程度,排名第一
Amazon’s Alexa AI team recently ranked first in a shared task at the International Conference on Spoken Language Translation (IWSLT). The task focused on developing a machine translation (MT) system that can produce output with different levels of formality in the target language.
近日,亚马逊的Alexa AI团队在国际口语翻译会议(IWSLT)的一项共享任务中排名第一。该任务的重点是开发能够以目标语言输出不同正式程度译文的机器翻译(MT)系统。
Amazon’s English-to-Japanese MT model outperformed the second-place entry by nearly 10%. As with other context-dependent forms of language, such as slang, formality can be difficult for MT systems to get just right, especially since languages express formality quite differently.
亚马逊的英日机器翻译模型比第二名的参赛系统高出近10%。与俚语等其他依赖语境的语言形式一样,正式程度对机器翻译系统来说很难把握得恰到好处,尤其是因为不同语言表达正式程度的方式差别很大。
“Machine translation (MT) models typically return a single translation for each input, without regard to the intended use case or target audience,” the company wrote in an Aug. 15 blog post. “This kind of unconditional translation is useful in many cases but fails to account for differences in language use in different parts of the world.”
该公司在8月15日发表的一篇博客文章中称:“机器翻译(MT)模型通常为每个输入只返回一个译文,而不考虑预期的用例或目标受众。这种无条件的翻译在许多情况下很有用,但并未考虑到世界各地语言使用的差异。”
In the field of sociolinguistics, sentences and other utterances can be ranked according to their level of formality — for instance, the English sentence, “I am going to the store, would you like me to buy you something?” is much more formal than: “I’m heading to the store, want anything?” The former is more likely to be uttered between acquaintances or professional colleagues in a formal situation, while the latter is more likely to be uttered among friends and family members in a casual or informal setting.
在社会语言学领域,句子和其他话语可以根据其正式程度进行排序。例如,英语句子“I am going to the store, would you like me to buy you something?”比“I’m heading to the store, want anything?”正式得多。前者更可能出现在正式场合中熟人或同事之间的交谈,而后者更可能出现在随意或非正式场合中朋友和家人之间的交谈。
In the shared task’s overview, the IWSLT organizers wrote that formality can cause difficulties in translation, since some languages express formality in ways that others don’t. For example, the English “Are you tired?” can be translated into German as the more formal “Sind Sie müde?” or as “Bist du müde?”, the latter being less formal and more familiar.
在共享任务的概述中,IWSLT的组织者表示,正式程度会给翻译带来困难,因为一些语言表达正式程度的方式是其他语言所没有的。例如,英语“Are you tired?”译成德语时,既可以用更正式的“Sind Sie müde?”,也可以用“Bist du müde?”,后者不那么正式,语气更亲近。
“Leaving the model to choose between different valid options can lead to translations that use an inappropriate degree of formality, which can be perceived as rude or jarring for speakers from certain cultures and in certain use cases, such as customer support chat,” Amazon’s blog post continues.
亚马逊的博客文章还写道:“让模型自行在不同的有效选项之间进行选择,可能会导致译文使用不恰当的正式程度。对于来自某些文化的使用者,以及在客户支持聊天等某些用例中,这会让人觉得粗鲁或突兀。”
By annotating formal and informal language in the training data, as well as leveraging post-editing techniques, Amazon was able to create an MT system with more accuracy in producing output that matches the formality of the input. When translating from English to Japanese, the model’s formal accuracy was 95.5% and informal accuracy was 100%.
通过对训练数据中的正式和非正式语言进行标注,并利用译后编辑技术,亚马逊创建了一个机器翻译系统,它能更准确地产生与输入的正式程度相匹配的输出。在英译日时,该模型的正式表达准确率为95.5%,非正式表达准确率为100%。
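The article does not say how Amazon's annotations feed into the model or how accuracy was scored. As a rough illustration only, the Python sketch below shows one common way formality labels can be attached to training data (a tag token prepended to the source sentence) and how a per-level accuracy figure such as the ones quoted could be computed. The tag strings, the `Example` class, and both function names are hypothetical and are not taken from Amazon's system or the IWSLT evaluation scripts.

```python
from dataclasses import dataclass

# Hypothetical tag tokens; the article does not say how formality is marked.
FORMALITY_TAGS = {"formal": "<formal>", "informal": "<informal>"}


@dataclass
class Example:
    source: str     # English input sentence
    target: str     # reference translation
    formality: str  # "formal" or "informal" annotation of the target


def tag_source(example: Example) -> str:
    """Prepend the formality tag so a model could condition on it."""
    return f"{FORMALITY_TAGS[example.formality]} {example.source}"


def formality_accuracy(judged: list[str], requested: list[str]) -> float:
    """Share of outputs whose judged formality equals the requested level."""
    if not judged or len(judged) != len(requested):
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(j == r for j, r in zip(judged, requested))
    return matches / len(judged)


# The German pair quoted in the article, annotated as formal.
example = Example("Are you tired?", "Sind Sie müde?", "formal")
print(tag_source(example))
```

At inference time, under this assumed setup, the same tag would be prepended to request a formal or informal translation, and `formality_accuracy` would be run separately over the formal and informal test sets.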