
    “Everything On Paper Will Be Used Against Me”

    by  • June 26, 2013 • 2013-2014 Provost Digital Innovation Grant Winners

    Project Name: “Everything On Paper Will Be Used Against Me”: A Computational Analysis of Henry A. Kissinger’s Vietnam-Era Correspondence
    Grantee: Micki Kaufman
    Discipline: History
    Funding Cycle: 2013-2014
    Project Status: Cycle Complete
    White Paper: PDIGrantReport-MKaufman

    About the Project

    On April 8, 2013, Wikileaks made 1.7 million diplomatic cables covering the period 1973-1976 available for public review. This release was not a ‘leak’ per se: Wikileaks has stated that all of the documents in the archive were obtained legally, either from publicly available State Department files at the National Archives and Records Administration (NARA) or following various Freedom of Information Act (FOIA) requests. If so, the material has been properly declassified and is freely available from NARA via the web, but it is locked inside a ‘big data’ problem like those beginning to appear in academic and journalistic endeavors from social science to history.

    Scarcity of information is the more common frustration for historians. This is especially true for researchers of antiquity, but not exclusively so. For students of twentieth- and twenty-first-century history, the opposite problem is also increasingly common: overwhelmed by a deluge of information, and confronted by a vast field of haystacks within which they must locate the needles (and presumably use them to knit together a valid historical interpretation), historians have already struggled with what is now understood as ‘big data’.

    Exhaustive efforts by historians to approach vast troves of information have often employed a traditional ‘close-reading’ methodology, in which each author’s thesis is illustrated by hand-picked, ostensibly representative samples presented as valid proof of the underlying argument. Ensuring that such examples are indeed representative becomes increasingly difficult as the size of the archive grows. As larger and larger archives of human cultural output accumulate, historians are beginning to employ other tools and methods, including those developed in fields such as computational biology and linguistics, to overcome ‘information overload’ and facilitate new historical interpretations.

    This project applies ‘big data’ computational techniques to research the Digital National Security Archive (DNSA)’s recently released Kissinger Collections, more than 2 gigabytes of text comprising 17,000 meeting memoranda (‘memcons’) and telephone conversation transcripts (‘telcons’) detailing Kissinger’s correspondence during the period 1969-1977. It is a first effort at ‘Diplonomics’. The declassification of the Kissinger material by the State Department in 2001-2006 (more than 250,000 pages’ worth of ‘big data’) and the hosting of that material on the DNSA’s Kissinger Collection web site therefore present both an opportunity and a challenge for historians. While having this large volume of information online is valuable, the restriction to a web-based ‘search’ interface can render it of limited use to researchers. The application of more sophisticated computational techniques permits a comprehensive analysis of the historical records of the Kissinger collection at the DNSA and facilitates meaningful historical interpretations. While this new way of looking at history is based on data, unlike other methods of historical analysis (e.g. ‘cliometrics’) it is the variations in the content of the text itself, rather than economic data, that are measured. By assembling a database of document ‘metadata’ and leveraging sophisticated techniques like ‘n-gram counting’ and ‘topic modeling’ (adopted from the fields of computational biology, linguistics, and ‘machine learning’), one can study the underlying ways in which the documents’ words are used, associated, and related.
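    To make ‘n-gram counting’ concrete, the following is a minimal Python sketch, not the project’s actual code. It counts runs of n consecutive words and groups the counts by one metadata field, the document’s year. The sample records, their field names (`id`, `year`, `text`), and the helper names are all hypothetical and stand in for whatever the real metadata database contains.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(tokens, n):
    """Count every run of n consecutive tokens.

    Runs are taken over the whole token list, so an n-gram may
    span a sentence boundary in this simplified version.
    """
    return Counter(zip(*(tokens[i:] for i in range(n))))


def counts_by_year(docs, n):
    """Sum n-gram counts across documents that share a 'year' value."""
    totals = {}
    for doc in docs:
        year_counter = totals.setdefault(doc["year"], Counter())
        year_counter.update(ngram_counts(tokenize(doc["text"]), n))
    return totals


# Hypothetical sample records: invented text, not quotations from the archive.
documents = [
    {"id": "memcon-001", "year": 1969,
     "text": "The peace talks in Paris resumed. The peace talks stalled."},
    {"id": "telcon-001", "year": 1972,
     "text": "He asked about the peace talks and the cease fire."},
]

bigrams = counts_by_year(documents, 2)
for year in sorted(bigrams):
    print(year, bigrams[year][("peace", "talks")])
```

    Comparing how often a phrase such as ‘peace talks’ appears from year to year is one simple way to measure variation in the text itself. Topic modeling goes further by grouping words that tend to occur together, and it is not shown here.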

    Micki Kaufman is a third-year doctoral student in the Department of History at the CUNY Graduate Center. She received her B.A. in US History summa cum laude, Phi Beta Kappa from Columbia University in 2011. She is a co-author of “General, I Have Fought Just As Many Nuclear Wars As You Have,” published in the December 2012 American Historical Review, has served as a digital humanities consultant for the Hertog Global Strategy Initiative, the Blinken European Institute and the Gotham Center, and has taught US History at Hunter College.