I frequently listen to Uncontrolled Vocabulary, an LIS call-in talk show podcast run by Greg Schwartz.
When he posts each episode, Greg also posts a list of the show’s participants and a summary of what was discussed, and that makes the podcast somewhat searchable. If one wants to know when gaming has been discussed, one can use the search field in the right sidebar and get results like these, which show three episodes Greg noted as having discussed gaming.
Greg puts a lot of time and effort into Uncontrolled Vocabulary, but it’d be much more searchable if Greg transcribed every episode and made that transcription available for searching. It’d be even cooler if Greg indexed the transcription against timestamps in the audio files so we could jump to the point in the audio where a particular search term is spoken. However, Greg has a job, a family, and a life, so that’s just not a reasonable thing to suggest he do.
EveryZing as the Solution:
Fortunately, EveryZing is already doing it for him.
EveryZing machine-transcribes each episode of Uncontrolled Vocabulary and lets you search that transcript. When it finds your search terms, it links you to the moment in the audio where the search term is spoken.
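To make the idea concrete, here is a minimal Python sketch of searching a timestamped transcript. I don't know how EveryZing actually implements this, so treat everything here as an assumption: the sample data is made up, and the names TRANSCRIPT, build_index, find_phrase, and as_timestamp are hypothetical. It only shows the general technique of storing each transcribed word with its time offset so that a search hit can point to a moment in the audio.

```python
from collections import defaultdict

# Hypothetical sample data: (seconds from start of audio, word as transcribed).
# A real machine transcript would have thousands of entries per episode.
TRANSCRIPT = [
    (12.0, "welcome"),
    (12.6, "to"),
    (13.1, "uncontrolled"),
    (13.9, "vocabulary"),
    (754.2, "gaming"),
    (754.9, "initiatives"),
    (1310.5, "gaming"),
]


def build_index(transcript):
    """Map each lowercased word to the list positions where it occurs."""
    index = defaultdict(list)
    for position, (_, word) in enumerate(transcript):
        index[word.lower()].append(position)
    return index


def find_phrase(transcript, index, phrase):
    """Return the start time (in seconds) of every occurrence of the phrase."""
    words = phrase.lower().split()
    if not words:
        return []
    hits = []
    # Look up the first word, then check that the following words match.
    for start in index.get(words[0], []):
        window = transcript[start:start + len(words)]
        if [w.lower() for _, w in window] == words:
            hits.append(transcript[start][0])
    return hits


def as_timestamp(seconds):
    """Format a time offset as MM:SS, e.g. for use as a link label."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


if __name__ == "__main__":
    index = build_index(TRANSCRIPT)
    for seconds in find_phrase(TRANSCRIPT, index, "gaming initiatives"):
        print(as_timestamp(seconds))
```

Each timestamp returned by find_phrase is what a player would need in order to start playback at the right moment; how the player receives that offset is outside this sketch.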
This link will take you to EveryZing’s index of Uncontrolled Vocabulary episodes. From here you can search not just Greg’s notes on each show, but the transcripts themselves. If we use EveryZing to search for “gaming,” we can see that it is mentioned in seven episodes.
Say I want to hear the moment in Episode 50 where the phrase “gaming initiatives” appears.
All we have to do is click the hyperlinked timestamp and EveryZing will load that episode in a Flash player and cue it up to that moment in time. Even cooler, I can embed that player at that timestamp on a Web page…like this:
Machine transcription is far from perfect, and it is entertaining to see “ALA” transcribed as “Malay,” but I’m pretty impressed by the potential of this technology.
Almost two years ago, I wrote:
Eventually, the metadata of an audio file (any audio file) should contain not just a text transcript of the audio content, but a searchable transcript, indexed to minutes and seconds of the audio. Let’s say you want to download the latest Library 2.0 Gang podcast specifically because you want to hear the first thing Michael Stephens has to say on the topic du jour. You should be able to search the Podcast for the word “Stephens”, select the first hit in the returned search results, and be taken instantly to the first moment in the audio when the word “Stephens” is spoken.
Imagine the usefulness of such a feature for a clinician attempting to find specific details in a podcast he/she has downloaded.
We’re not quite there yet. The transcription is not included in the audio file itself and portable audio players don’t yet have the software to search it, but EveryZing shows we’re definitely closer. You can search the transcription of NEJM Interviews, JAMA Audio Commentaries, Johns Hopkins PodMed, MedlinePlus: NLM Director’s Comments and others.
Other neat features of EveryZing: