Language Log: Dramatic reading of ASR voicemail transcription

Following up on the recent post about ASR error rates, here's Mary Robinette Kowal doing a dramatic reading of the Google Voice transcript of three phone calls (voicemail messages?) from John Scalzi.

John Scalzi's reaction:

All of the human experience is in there. All in one minute and eight seconds. It. Is. Magic.

I've never tried getting Google Voice transcripts of voicemail, because I've been trying to ignore voicemail entirely for many years, ever since my university's hospital circulated my office phone number as the fax number for submitting applications for new radiation safety badges. My voicemail filled up with hundreds of recordings of plaintive fax-machine noises, and thus became even more useless as a communications medium than voicemail normally is. I no longer even have an office phone (why pay for something I don't use?), but the habit of ignoring voicemail has stuck with me.

However, I make extensive use of the ASR "note to self" feature on my Android cell phone, and it generally works pretty well. For example, my email inbox now contains this dictated "note to self":

Language Log post about dramatization of John Scalzi is voicemail messages

which has one substitution (is for 's) in 11 words, for a "word error rate" of 1/11 = 9%.

[I presume that] Google's ASR system can do this sort of thing so well because Google knows a great deal about me, including my relationship to Language Log, and (probably) the fact that I recently visited John Scalzi's web site. The system is using an adaptive language model, for which the perplexity of what I said is radically lower than it would be in the case of a model of the English language at large.

[Update: No, it ISN'T using a personally-adapted language model, according to a comment by Vincent Vanhoucke, who should know. That makes the performance all the more impressive, since the effective perplexity will obviously be much higher than if it were able to make use of what Google knows about me.]
It doesn't always work so well. Another dictated note in my email inbox reads

Language Log post about the relationship between perplexity and we're there a

which amusingly substitutes "we're there a" for "word error", yielding a WER more like 30%. But usually, 10% WER is about what I see, which I think is pretty good for material dictated into a cell phone in a restaurant, on a street corner, or in a moving train. Even the more spectacular errors, like that last one, generally leave the overall message interpretable (at least to me).
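
The two WER figures quoted in this post can be checked with a word-level edit distance: count the fewest substitutions, insertions, and deletions needed to turn the intended text into the transcript, then divide by the number of intended words. The following is a minimal Python sketch of that calculation, not a description of any scoring tool Google uses. The function name `word_error_rate` is made up for illustration, the possessive 's is treated as its own token (which is what makes the first note 11 words long), and the intended ending of the second note is assumed to be just "word error".

```python
def word_error_rate(reference, hypothesis):
    """Return (edits, wer) for two lists of tokens.

    edits is the word-level Levenshtein distance; wer is edits divided
    by the number of reference tokens.
    """
    if not reference:
        raise ValueError("reference must contain at least one token")
    n, m = len(reference), len(hypothesis)
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j hypothesis tokens.
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of an extra word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    edits = prev[m]
    return edits, edits / n


# First note: the possessive 's is split off as a separate token (assumption).
ref1 = "Language Log post about dramatization of John Scalzi 's voicemail messages".split()
hyp1 = "Language Log post about dramatization of John Scalzi is voicemail messages".split()

# Second note: the intended text is assumed to end in "word error".
ref2 = "Language Log post about the relationship between perplexity and word error".split()
hyp2 = "Language Log post about the relationship between perplexity and we're there a".split()

for ref, hyp in ((ref1, hyp1), (ref2, hyp2)):
    edits, wer = word_error_rate(ref, hyp)
    print(f"{edits} edit(s) over {len(ref)} reference words, WER = {wer:.0%}")
```

Under these assumptions the first note scores 1 edit in 11 words (about 9%), and the second scores 3 edits in 11 words (two substitutions plus one inserted word, about 27%), in line with the "more like 30%" estimate. If the intended phrase had instead been "word error rate", the count would be 3 substitutions in 12 words, or 25%.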
