Google, which bought YouTube in 2006, has been working to improve the service, but the article highlights the huge challenges it faces. To be accurate, the software must be able to recognise a million words, the different ways they can be pronounced, and how their sound changes with context (for example, whether a person’s voice is raised).
It must also distinguish between speech that needs to be captioned and background noise, a task made harder by the very poor audio quality of many videos posted on YouTube.
So far, more than 60 million YouTube videos have been auto-captioned, but the results are extremely variable. Arielle Schacter, a student from New York, said, “The reality…is that auto-captioning is often wrong. Instead of being able to read the actual dialogue, I am forced to view nonsensical statements or letters/numbers.”
Google has said that the most recent version of its software has reduced error rates by 20%. Ken Harrenstien, the technical head of the project, who is deaf, is confident that auto-captioning will improve, although it “may never be perfect”.
In the meantime, owners of YouTube videos which have been auto-captioned have the option of downloading the caption files and correcting the errors.
The full article can be read on the Scientific American website.
For more information on the different ways that a YouTube video can be captioned, see the online media section of Media Access Australia’s website.