Moving Towards Open Data in Business History

13 04 2018

I have long been an advocate of Open Data and of the adoption of an Open Data norm in the field of business history. I recently published a paper in the journal Business History that outlines why our field needs to adopt Open Data and a system called Active Citation. What Open Data would mean in practice is that whenever a business historian cites a primary source (e.g., a letter in an archive or an article in a historic newspaper), the footnote must include a hyperlink to a scanned image of the document. This system would have a number of advantages. First, it would accelerate the digitization of primary sources: once a primary source has been put online for one purpose, it can be re-used by another researcher. Second, the creation of an Open Data rule in business history would be yet another victory for the research transparency movement. In the last half decade, a variety of academic disciplines have embraced research transparency, and Open Data is a big part of that movement. The requirement that raw data be published alongside the article based on that data is designed to counteract the impression that researchers sometimes use data selectively or in an otherwise unprincipled way.


Although the impetus for research transparency and Open Data has come largely from academics concerned about the misrepresentation of data, the movement has been able to make so much progress in recent years because it has had a backer with deep pockets, the philanthropist John Arnold. Arnold was recently profiled in Wired magazine. I would encourage anyone interested in Open Data and research transparency to check out this article.

In view of the importance of Open Data to the future of the field of business history, it is exciting to see that an increasing number of business-historical data sources are being made freely available online.

I see from The Exchange, the blog of the Business History Conference, that The Newberry Library in Chicago has announced a major revision to its policy regarding the re-use of collection images: “images derived from collection items are now available to anyone for any lawful purpose, whether commercial or non-commercial, without licensing or permission fees to the library.”

This reform of the Newberry Library’s rules will make it easier for researchers who base papers on materials in its collection to use Open Data in those papers. I would like to congratulate the Newberry Library on its wise decision and to encourage other repositories of business-historical materials to follow this example whenever they are legally allowed to do so.

The Business Historian and the Archive in the Post-Snowden Era

9 10 2013

Back in May, Stefan Schwarzkopf  of Copenhagen Business School posted an essay about the relationship between business historians and corporate archives on NEP-HIST.  You can download the entire paper here.

Abstract:  Archival records are a constitutive element of business historical research, and such research, in turn, is fundamental for a holistic understanding of the role of enterprise in modern capitalist societies. Despite an increasing debate within business history circles about the need to theorize the historian as author and creator of narratives, a fuller reflection on the uses and limitations of the archive in business historical research has not yet taken place. This article takes its lead from theories of organisational epistemology, and asks to what extent business historians are trapped by an outdated, realist methodology and epistemology which is in danger of ignoring the multiple roles that archives play in their knowledge production.

Essentially, Schwarzkopf is asking business historians to be more critical in their use of this particular set of primary sources. Scholars working in other branches of history have recently become much more conscious of the ways in which the selectiveness of their archives biases their work. For instance, historians of criminal law are aware that police archives give us the perspective of State employees, not that of the people who were deemed to be criminals in a given era. Religious historians are acutely aware that church archives give us the perspectives of the missionaries, not those of the so-called “pagans” they were attempting to convert. The adoption of critical stances towards archives by other groups of historians has been driven by the emergence of postmodernist and postcolonial perspectives on the sociology of knowledge. Most business historians, according to Schwarzkopf, are stuck in an outdated and uncritical mindset towards the corporate archives that are the foundation of their research. They are, in his view, naive empiricists.

Stephanie Decker, a business historian at Aston Business School, has followed up Schwarzkopf’s piece with a short reaction essay of her own. Decker’s piece draws on her research into African business history and development policies, which has involved trips to the World Bank Archive in Washington, DC.

In my view, the most interesting part of the essays by Decker and Schwarzkopf relates to digital technology. Since the 1960s, government and corporate archivists have been struggling with the issue of how to save data recorded on punch cards, magnetic tapes, and successive generations of electronic storage media. (You can read about the first generation of digital archivists in the US government here.) More recently, there have been efforts to put documents online. For instance, you can now read the handwritten letters of Abraham Lincoln from the comfort of your own home.

Decker applauds organizations such as the World Bank for digitizing parts of their vast holdings and putting scanned images of certain document categories online. She points out that since digitization of hard copies is inevitably selective, it may bias future historical research towards topics and perspectives supported by those documents which happened to be put online. Decker writes:

How does digitisation affect how archives are used, and vice versa? Will it determine what the collection stands for, more so than the entire body of files? Perhaps not a new problem for libraries that contain individual high value items that eclipse the totality of their collection, but certainly a phenomenon that will spread with digitisation. Just consider decisions to digitise parts of archival collections that are of greater public interest, such as World Bank’s digitisation of the Robert McNamara’s files. Faced with the impossibility of digitising an archive as vast as theirs, files of greater relevance to present-day audiences are prioritised, negating the need for people to physically enter 1818 H Street, NW, and engage with the overall collection. Is this a manipulation by the archivists, or is this it the pressure of demand shaping organisational responses?

Schwarzkopf asserts that digital records are easier to manipulate and delete than hard copies. (For the time being, let’s assume this claim is correct, although I’m sceptical because many electronic documents leave traces that can be recovered by experts.) Selective editing of archived emails may create problems for future business historians interested in the early internet era (i.e., the present). Observing that business communications (reports, emails, memos, etc.) are increasingly becoming digital-only, he suggests that there is little to stop governments and corporations employing twenty-first century Winston Smiths to deal with their own digital records in the same way.

Schwarzkopf is here alluding to the protagonist of George Orwell’s 1984. I’m not suggesting that Schwarzkopf’s concerns about the deletion of incriminating documents are invalid. There have been examples of such documents “disappearing” from archives or of the archives simply remaining permanently closed to researchers. Enron employees shredded many documents in the last few days of that company’s existence. Deutsche Bank kept records related to the Holocaust secret for years. In 2005, Hydro One, a Canadian state-owned enterprise, suddenly closed its archive to all researchers when a non-academic began to use the archive to find material to support a lawsuit against the company. Art has imitated life: the plot of the film Michael Clayton revolves around the efforts of a fictitious company to hide a document.

Personally, however, I am less worried than Schwarzkopf about selective editing by people looking to hide incriminating documents than about the simple accidental deletion of documents. The most serious problem of all is the deliberate deletion of documents to save on storage costs.

Today’s business historians depend on documents that others have kindly saved for us.  Since 1934, the Business Archives Council in the UK has been helping companies to save their archives and make them available to outside researchers.  There are equivalent organizations in other countries. These efforts, which were supported by the business historians of earlier decades, make our research possible today. We have an obligation to help preserve today’s corporate records for the future.

This means that we need to think about how we can save the data formats that are being created today. It occurs to me that cloud computing might allow us to do this cheaply.  Many companies now outsource the storage of their email and other data to trusted firms such as Amazon Web Services. Interestingly enough, the cloud computing divisions of Amazon and IBM are now suing each other for the right to store the data created by the CIA and the NSA in the United States.

It occurs to me that the Business Archives Council or some other charitable organization might undertake to save all or part of the data that a cross-section of companies upload to the cloud. Perhaps it could be a completely separate organization. Let’s call it the Business E-Archives Council or BEAC for short. Under this scenario, the customers of the cloud computing firm would consent to the release of part of their data to BEAC on the understanding that the data would be kept in a secure environment and would only be released to researchers after a predetermined period.

It seems to me that there are three basic reasons  a company might be reluctant to consent to a heritage organization copying a cross-section of their electronic files for posterity.

1) The first is concern that the data might fall into the hands of enemies of the firm.

2) The second is the sheer administrative hassle of asking IT people in their company to liaise with the archivists at the heritage organization to decide which email accounts to copy and how to go about transferring the files over. Sharing information with others costs real resources, most notably time. That’s true regardless of whether one is setting up a data backup account for one’s home computer or arranging cloud computing services for a major bank. It is difficult enough to synchronize data systems within a given organisation (such as a university at the start of term), let alone ask IT staff to allow outsiders to get involved.

3) The third concern relates to public relations in the post-Edward Snowden era. A company might be reluctant to do business with a cloud computing firm that had announced it was allowing the BEAC to record some documents. After all, it might worry that consumers would be concerned about the protection of their data.

The first and third concerns could be addressed through a variety of legal and social mechanisms. First, you could reinforce the confidentiality agreements between the companies and the BEAC by incorporating the BEAC through a special piece of legislation that removes any doubt about the BEAC’s right to protect the data for the term prescribed in the contract. In other words, the BEAC’s charter would be a special act of Parliament or Congress. Including representatives of the country’s top companies and business leaders on the board of the BEAC would also bolster the credibility of the organization with the data-generating companies. Including prominent citizens of the country in question on the board might help to allay consumer fears about the BEAC.

Cloud computing would help to reduce the costs of participating for the companies. Since company X has already done the difficult work of making its systems work with those of the cloud computing company, allowing the BEAC to record a cross-section of its archived emails would not cost it any person-hours. I admit that the connection between the cloud computing companies and BEAC would cost some money to set up, but surely it is easier for the BEAC to deal with just one or two cloud computing firms than with all of the companies served by these firms.

The benefits of recording the electronic data for future generations of business historians would be massive. Exciting things happen when academics meet big data. Consider what the Google Books Ngram Viewer allows literary scholars and people in the field of corpus linguistics to do. I would love to do keyword frequency counts of the internal correspondence of the companies I study.
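To give a rough sense of what a keyword frequency count involves, here is a minimal sketch in Python. It assumes the correspondence has already been exported as plain text. The function name, the sample messages, and the keywords are all hypothetical and invented purely for illustration; a real project would need far more careful tokenization and handling of archive formats.

```python
import re
from collections import Counter


def keyword_counts(documents, keywords):
    """Count how often each keyword occurs across a list of plain-text documents."""
    wanted = {k.lower() for k in keywords}
    # Start every keyword at zero so absent terms still appear in the result.
    counts = Counter({k: 0 for k in wanted})
    for text in documents:
        # Crude tokenization: lower-case runs of letters and apostrophes.
        for token in re.findall(r"[a-z']+", text.lower()):
            if token in wanted:
                counts[token] += 1
    return counts


# Hypothetical sample messages, invented for illustration only.
sample_emails = [
    "The merger talks resume on Monday. Legal wants the merger memo first.",
    "Quarterly dividend approved; no news on the merger.",
]

print(keyword_counts(sample_emails, ["merger", "dividend", "tariff"]))
```

Matching is on whole words and is case-insensitive. Tracking how the counts change from year to year would then just mean grouping the documents by date before counting.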


Podcasts About Business Archives

2 09 2013

AS: Business archives are crucial to the work of business historians. That’s why I was excited to learn that the National Archives of the UK has podcast two talks in which the archivists of prominent European firms talk about their jobs and the role of internal archives in maintaining the social memory of their respective employers. 

 The Archives of F Hoffmann-La Roche

In the first podcast, Dr Andrea Tanner, archivist at Fortnum and Mason, shares some of the delicious secrets of the archive.

In the second podcast, Alexander Bieri, the archivist at the Swiss pharmaceutical company F Hoffmann-La Roche, talks about the international angle of Roche’s work and the role of corporate archives in the company today.

Engaging Corporate Heritage

22 03 2012

Public history blogger Krista McCracken has published a great post on corporate archives and institutional memory. In her post, she offers reasons why companies ought to care about preserving old records.

She writes: On the most basic level institutional memory can help prevent the repetition of past mistakes.  Often the biggest gaps in institutional memory occur during a change in administration or management.  For example, a newly hired administrator implements new methods without being aware of what has worked or failed in the past, and he makes the same mistake that was made six months ago.  Institutional memory isn’t designed to stall innovation (though it can be misused that way).  Rather, it can help organizations avoid reinventing the wheel. Having records which highlight past work allow for informed decisions to be made in the present.

She also says that company archives

can help cultivate institutional culture and pride.  Remembering past triumphs and projects can help employees see the long term impact of their work and the institution at large.  Celebrating anniversaries and other important dates in the organization’s history can further instill pride and a sense of longevity.

I think that the second point is a really important one. Long-established companies like invoking images of their histories in advertising aimed at the public and in communication for internal consumption. It builds confidence among consumers and pride among workers. That’s why older bank branches often have the year of incorporation displayed prominently: it helped to reinforce the idea that the bank (and the depositors’ money) would be around for the long term. Although most bank branches built after the introduction of government-run deposit insurance systems don’t display the dates of founding so prominently, banks still pride themselves on their heritage.

There are vast numbers of TV commercials that refer to the history of the company in question. For a recent example, see below:

As Deidre Simmons’s history of the archives department of the Hudson’s Bay Company shows, corporate archives can certainly help with all of these activities.






FT Article on Slavery and the City of London

30 06 2009

The front page of this weekend’s edition of the Financial Times carried a story about historical research that has uncovered new evidence regarding the details of the City of London’s involvement in slavery. [Note: story includes video of interview with noted historian Catherine Hall] The most interesting fact revealed in the article is that Nathan Mayer Rothschild accepted slaves as collateral for a loan. The House of Rothschild had previously been famous for arranging the loan that allowed the British government to borrow the money needed to compensate slaveholders when slavery was abolished in the British Empire in the 1830s.

I’m glad that the FT ran this story, because it gives readers a sense of the historical importance of corporate archives (although in this case the key documents were uncovered at the National Archives in Kew). However, I’m not certain why information about the Rothschilds’ indirect involvement in slavery is terribly newsworthy.  After all, the House of Rothschild were the bankers of the Empire of Brazil at a time when that country had slavery. Like many other firms in Britain, America, and elsewhere, many City firms were indirect beneficiaries of slavery. We knew this already.