New Video on Transcribe Bentham

13 02 2011

2010: The Year of Crowdsourced Transcription

4 02 2011

2010 was the year that collaborative manuscript transcription finally caught on, according to a recent blog post by Ben Brumfield.

Brumfield states that:

Probably the biggest news this year was TranscribeBentham, a project at University College London to crowdsource the transcription of Jeremy Bentham’s papers. This involved the development of Transcription Desk, a MediaWiki-based tool which is slated to be released under an open-source license. The team of volunteers had transcribed 737 pages of very difficult handwriting when I last consulted the Benthamometer. The Bentham team has done more than any other transcription tool to publicize the field — explaining their work on their blog, reaching out through the media (including articles in the Chronicle of Higher Education and the New York Times), and even highlighting other transcription projects on Melissa Terras’s blog.


I agree that Transcribe Bentham was the big digital humanities news story of 2010.


Scripto: New Open Source Software for Creating Crowdsourced Transcription Websites

1 01 2011

I have written before about crowdsourcing the transcription of primary sources, and I have posted before about Transcribe Bentham. I would now like to bring your attention to Scripto, new open-source software that allows archives and libraries to crowdsource the transcription of archival materials. This information was sent to me by Prof. Sharon Leon of George Mason University, the head of the project, and I am taking the liberty of re-posting it here.

The lead programmer for Scripto is Jim Safley, who is Web Programmer and Digital Archivist for the Center for History and New Media (CHNM). He received his undergraduate degree in history at GMU and is currently working towards his master’s degree in American history. Beginning his archiving career in 1999 at the National Archives and Records Administration, Safley moved through several related positions, including records manager at Phi Beta Kappa national headquarters and archivist assistant at GMU’s Special Collections and Archives. Arriving at CHNM in 2002, Safley applied his traditional archiving experience to his work in digital archiving, web programming, and database administration. His interests include metadata standards, database design, web technologies, progressive history and history of technology. Safley was involved in developing the September 11 Digital Archive.

The Scripto software is currently being developed by the Center for History and New Media at George Mason University for its Papers of the War Department, 1784-1800 transcription project. The software will then be made available for others to use and, if they wish, modify. The team will launch the tool for crowdsourced transcription toward the end of January. After that point, they will begin writing connector scripts for the tool so that it can be used with common content management systems (Omeka, Drupal, WordPress, etc.).

Scripto uses the MediaWiki API and editing interface, plus some additional scripting, to capture the transcriptions and pass them back to the CMS. Thus, it provides all of the versioning and notation capacities of MediaWiki, but makes the current version of the transcription available to the main CMS for search and association with the rest of the standardized archival metadata. This is one of the differences between Scripto and the system that Transcribe Bentham is using; the Bentham project is totally contained within the MediaWiki interface and has no way to export standardized metadata. Additionally, the Transcribe Bentham project has created an interface for TEI (Text Encoding Initiative) mark-up of the texts. The people at the Scripto project have not added this modification to their use of MediaWiki, but since the tool is open source, another programmer could add that modification on an individual basis or could release a plugin for the Scripto system.
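For readers curious about what "passing the current version back to the CMS" might look like in practice, here is a minimal sketch in Python. It is not Scripto's actual code. It only assumes that the wiki exposes the standard MediaWiki `action=query` / `prop=revisions` API and that the response uses the older JSON layout in which the page text sits under the `"*"` key. The wiki address, the function names, the CMS record fields and the sample response are all hypothetical, made up for illustration.

```python
import json
from urllib.parse import urlencode


def build_revision_query(api_url, page_title):
    """Build a MediaWiki API URL asking for the latest revision of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "content|timestamp",
        "titles": page_title,
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def extract_current_text(api_response):
    """Pull the latest revision text out of a parsed API response.

    Assumes the older response layout, where pages are keyed by page id
    and the wikitext of a revision is stored under the "*" key.
    """
    pages = api_response["query"]["pages"]
    page = next(iter(pages.values()))
    return page["revisions"][0]["*"]


def sync_transcription(cms_record, api_response):
    """Return a copy of a (hypothetical) CMS record with the transcription attached."""
    updated = dict(cms_record)
    updated["transcription"] = extract_current_text(api_response)
    return updated


# A hand-written sample response, so the sketch runs without a live wiki.
SAMPLE_RESPONSE = json.loads("""
{
  "query": {
    "pages": {
      "101": {
        "pageid": 101,
        "title": "Letter 42",
        "revisions": [
          {"timestamp": "2011-01-01T12:00:00Z", "*": "Honoured Sir,"}
        ]
      }
    }
  }
}
""")

if __name__ == "__main__":
    # Hypothetical wiki address; a real installation would have its own.
    print(build_revision_query("https://example.org/w/api.php", "Letter 42"))
    record = {"id": 7, "title": "Letter 42"}
    print(sync_transcription(record, SAMPLE_RESPONSE))
```

The point of the design is that the wiki keeps the full edit history, while the CMS only ever stores the latest text alongside its own metadata, so search in the CMS always reflects the current transcription.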

Crowdsourcing The Transcription of Archival Materials

28 12 2010

The New York Times has published an interesting article about the use of crowdsourcing to transcribe digital material. The article discusses Transcribe Bentham and other crowdsourced transcription projects. It also mentions that Sharon Leon, a historian at the Center for History and New Media at George Mason University, recently received a grant from the National Endowment for the Humanities to design a free digital tool that any archive or library could use to open transcription to the public.

The article also quotes people who are skeptical of the whole idea of crowdsourcing the transcription of primary sources. Critics of handing transcription work to online volunteers include Daniel Stowell, who has been director and editor of the Papers of Abraham Lincoln project in Springfield, Illinois since 2000. Unlike Sharon Leon’s effort, Stowell’s initiative is a traditional transcription project done by a small number of employees who have been trained in paleography. Like Sharon Leon’s project, Dr. Stowell’s transcription is funded by the NEH. Stowell reports that his office experimented with the hiring of an unspecified number of “nonacademic transcribers”, but they produced so many transcription errors that “we were spending more time and money correcting them as creating them from scratch.” He also reports that when tens of thousands of unpublished and rarely seen documents written by or to Lincoln were digitally scanned in advance of the Lincoln bicentennial celebration in 2009, the National Center for Supercomputing Applications at the University of Illinois, Urbana-Champaign, created a prototype for crowdsourced transcription. This prototype was ultimately abandoned for reasons not specified in the article.

Primary Sources Are Going Online

29 11 2010


Library and Archives Canada Building, 395 Wellington Street, Ottawa, Ontario

Library and Archives Canada recently completed the digitization of the papers of Sir John A. Macdonald.

Macdonald in 1883. Image from LAC. Mikan: 3218716


"Come Into My Office" Image of the Office of Sir John A. Macdonald

Previously, scholars wishing to look at the correspondence of Macdonald had to look at microfilms of the originals. There is now a database online that allows you to download images of the correspondence in PDF format.

The search engine for the Macdonald correspondence looks like this:

I have pasted an image of an actual document in the Macdonald correspondence below. In this case, it is a rare letter that Laurier sent to Macdonald.

Laurier to Macdonald, 7 February 1884

LAC’s wonderful decision to put the Macdonald papers online is part of a growing trend to digitize primary sources and place them online. The Library of Congress has put Abraham Lincoln’s Papers online. See here.

The wonderful thing about the LoC’s Lincoln Papers search engine is that you can view both images of the primary sources and plain text transcriptions of each item of correspondence. For instance, I found this letter from a private citizen in Canada to Lincoln dated 25 Feb 1863.

Here is the transcription of the letter, which was completed by the folks at the Lincoln Studies Center, Knox College, Galesburg, Illinois.

P. Tertius Kempson to Abraham Lincoln, Wednesday, February 25, 1863 (Support and autograph request from Canada; endorsed by Elbridge G. Spaulding)

From P. Tertius Kempson to Abraham Lincoln, February 25, 1863

Fort Erie C. W.

Feby 25th 1863.

Honoured Sir,

Englishmen and Canadians are charged that their sympathies have been with the Southern Rebellion and Slavery and my cheeks flush with shame for my countrymen, when I own that this has been too much the case– Thank God, there are numerous glorious exceptions and as a proof of this I take the liberty of sending you a Copy of a Speech delivered recently by the foremost man in Canada and I am happy in being able to assure you that it contains the sentiments and views of thousands of Canadians and millions of British Subjects;

Yes! honoured Sir, you have our earnest and most constant prayers that you may entirely succeed in ridding the Great and Glorious Union of the foul Canker worm of Slavery.

I had the honour and happiness of a personal introduction to you when you passed through Buffalo; May I ask you to enable me to perpetuate the remembrance of yourself and the honour I then enjoyed by giving me a line or two in autograph that I may be able to leave to my children & my childrens children, as a heir loom in remembrance of the great apostle of Liberty of the 19th Century–

By confering upon me this small favor, I shall ever be yours most respectfully & gratefully

P. Tertius Kempson

Another wonderful recent initiative is the Transcribe Bentham project, which seeks to transcribe the papers of Jeremy Bentham, the great philosopher. In this case, the transcription is being done by crowdsourcing. Images of all of the correspondence in the Bentham collection were placed online on a website that allows interested members of the public to try their hands at transcribing the documents. The results are monitored by trained archivists and paleographers to maintain quality control.

Transcribe Bentham Project