Scripto: New Open Source Software for Creating Crowdsourced Transcription Websites

January 1, 2011

I have written before about crowdsourcing the transcription of primary sources, and about Transcribe Bentham in particular. I would now like to draw your attention to Scripto, new open-source software that allows archives and libraries to crowdsource the transcription of archival materials. This information was sent to me by Prof. Sharon Leon of George Mason University, the head of the project, and I am taking the liberty of re-posting it here.

The lead programmer for Scripto is Jim Safley, who is Web Programmer and Digital Archivist at the Center for History and New Media (CHNM) at George Mason University (GMU). He received his undergraduate degree in history at GMU and is currently working towards his master’s degree in American history. Safley began his archiving career in 1999 at the National Archives and Records Administration and then moved through several related positions, including records manager at the Phi Beta Kappa national headquarters and archivist assistant at GMU’s Special Collections and Archives. After arriving at CHNM in 2002, he applied his traditional archiving experience to digital archiving, web programming, and database administration. His interests include metadata standards, database design, web technologies, progressive history, and the history of technology. Safley was also involved in developing the September 11 Digital Archive.

The Scripto software is currently being developed by the Center for History and New Media at George Mason University to support transcription for its Papers of the War Department, 1784-1800 project. The software will then be made available for others to use and, if they wish, modify. The team plans to launch the crowdsourced transcription tool toward the end of January. After that, they will begin writing connector scripts so that the tool can be used with common content management systems (Omeka, Drupal, WordPress, etc.).

Scripto uses the MediaWiki API and editing interface, plus some additional scripting, to capture the transcriptions and pass them back to the content management system (CMS). It therefore provides all of the versioning and annotation capabilities of MediaWiki, while making the current version of each transcription available to the main CMS for searching and for association with the rest of the standardized archival metadata. This is one of the differences between Scripto and the system that Transcribe Bentham is using: the Bentham project is contained entirely within the MediaWiki interface and has no way to export standardized metadata. On the other hand, the Transcribe Bentham project has created an interface for TEI (Text Encoding Initiative) mark-up of the texts. The people at the Scripto project have not added this modification to their use of MediaWiki, but since the tool is open source, another programmer could add it on an individual basis or release it as a plugin for Scripto.
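To make the idea of passing the current transcription back to the CMS more concrete, here is a minimal sketch in Python. It is not Scripto's own code, which I have not seen. It assumes a wiki that exposes the standard MediaWiki Action API (`action=query` with `prop=revisions`), and the function names, the `wiki.example.org` address, and the `cms_record` dictionary standing in for a CMS item are all hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_revision_query(api_url, page_title):
    """Build a MediaWiki Action API URL asking for the latest wikitext of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content|timestamp",
        "rvslots": "main",
        "titles": page_title,
        "format": "json",
        "formatversion": "2",
    }
    return api_url + "?" + urlencode(params)


def extract_current_text(response):
    """Return the latest revision's wikitext from a parsed API response, or None."""
    pages = response.get("query", {}).get("pages", [])
    if not pages or "revisions" not in pages[0]:
        return None  # the page is missing or has no transcription yet
    return pages[0]["revisions"][0]["slots"]["main"]["content"]


def sync_transcription(api_url, page_title, cms_record):
    """Copy the wiki's current transcription into a CMS record (a plain dict here).

    The cms_record dict is a hypothetical stand-in for a real CMS item.
    """
    with urlopen(build_revision_query(api_url, page_title)) as reply:
        text = extract_current_text(json.load(reply))
    if text is not None:
        cms_record["transcription"] = text
    return cms_record
```

Calling something like `sync_transcription("https://wiki.example.org/w/api.php", "Document 12", record)` would leave the full edit history in the wiki and copy only the latest text into the record, which is the division of labor described above.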

Date : January 1, 2011
Tags: Center for History and New Media at George Mason University, digital humanities, Scripto, Transcribe Bentham, Transcription in a Digital World, War Department Papers

The Past Speaks