Transcribed Podcasts and Audio Books

Jon Udell is tagging some of his links to podcasts with "transcriptavailable"; the transcripts have been generated manually. This could be a nice source of data for experiments with information retrieval from podcasts.

Sort of relatedly, I just discovered LibriVox, which hosts volunteer recordings of out-of-copyright literary works (e.g. Project Gutenberg books). I sampled War of the Worlds and the quality seems great. Worth a browse.