Crowdsourcing the Transcription of Archival Materials

28 12 2010

The New York Times has published an interesting article about the use of crowdsourcing to transcribe digital material. The article discusses Transcribe Bentham and other crowdsourced transcription projects. It also mentions that Sharon Leon, a historian at the Center for History and New Media at George Mason University, recently received a grant from the National Endowment for the Humanities to design a free digital tool that any archive or library could use to open transcription to the public.

The article also quotes people who are skeptical of the whole idea of crowdsourcing the transcription of primary sources. Critics of handing transcription work to online volunteers include Daniel Stowell, who has been director and editor of the Papers of Abraham Lincoln project in Springfield, Illinois, since 2000. Unlike Sharon Leon’s effort, Stowell’s initiative is a traditional transcription project carried out by a small number of employees trained in paleography. Like Sharon Leon’s project, Dr. Stowell’s is funded by the NEH. Stowell reports that his office experimented with hiring an unspecified number of “nonacademic transcribers”, but they produced so many transcription errors that “we were spending more time and money correcting them as creating them from scratch.” He also reports that when tens of thousands of unpublished and rarely seen documents written by or to Lincoln were digitally scanned in advance of the Lincoln bicentennial celebration in 2009, the National Center for Supercomputing Applications at the University of Illinois, Urbana-Champaign, created a prototype for crowdsourced transcription. This prototype was ultimately abandoned for reasons not specified in the article.