By Professor Stephanie Decker
Department of Strategy and International Business, Birmingham Business School
Much of the buzz around Artificial Intelligence (AI), and more specifically Generative Artificial Intelligence (GAI), focuses on what the widespread access to these new technologies will mean for the future of society, professions and the workplace. But AI also has the potential to significantly change the way we know our past and how we access our heritage – which is increasingly digital. Most important here are perhaps emails – the “killer app” of the 1990s – which profoundly changed the way we communicate, even before the advent of social media.
Replacing letters and memos in organisational and private use, email has remained a ubiquitous and essential feature of modern life, but only rarely features in archival or heritage collections (with a few notable exceptions, for example, the Carcanet emails at the University of Manchester Library). In public scandals or official hearings, email is more likely to make an appearance. Most recently, in the Post Office scandal, internal email provided evidence that flaws in the Horizon accounting system were known to key people in the organisation.
One might think that email is the historian’s and archivist’s dream of a historical record, providing first-hand, in-depth insights into historical figures and events. But the reality is more complex – data privacy in general, and GDPR more specifically, complicate the archiving and use of digital records. Nevertheless, legal issues are not the only or even the main reason why we are not being inundated with digital records in the archival and heritage space – after all, GDPR provides an exception for historical research. A much bigger issue is the seamless intermingling of the personal and the professional, the mundane and the important, in people’s (private or organisational) email inboxes. This has hindered many attempts to deposit email collections, as well as many other types of digital records that would previously have been preserved on paper, because it is so difficult to review large collections of digital records. The Enron Email dataset, one of the few email collections in the public domain, contains about one million emails – small by today’s standards. Most digital collections are still too vast for humans to review and judge whether sensitive or personal material should be removed before they are released as historical records. Consequently, very few digital collections are being made available, even though archival closure periods (the time period in which ‘normal’ records are not accessible) are normally 20 or 30 years. So, files from 1994 or 2004 would normally be digital, and most correspondence would have taken place by email. But these aspects of human activity remain in what are sometimes referred to as ‘dark archives’, that is, records not accessible to the public.
This is where AI can help. The problems with reviewing the vast amounts of digital records that have been created since the advent of personal computing and email correspondence are solvable with AI tools that can be designed to search, discover, review and summarise the content of vast datasets. There are, however, surprisingly few tools that provide meaningful access to digital sources beyond simple keyword or phrase searches – and discovery as a ‘digital flaneur’ should allow you to find the kind of information you did not know you were looking for. Take the Wayback Machine, for example: it is an extensive, publicly available digital archive, but in practice its users struggle to use it effectively.
In a collaboration between social scientists and computing scientists, we developed such an AI tool for email archives called EMCODIST, which seeks to bring light to dark archives. We used this tool on a currently closed email archive of a company from the Dot-Com boom era. With a few searches, we ended up with a trove of relevant email conversations, which formed the basis of a digital history showcase that presents the history of the company in four case studies, which are freely available here.
But beyond researchers using digital archives, there lies a whole uncanny valley of GAI uses of historical data, which are in some ways more advanced than AI tools that facilitate any research use. Hello History offers ChatGPT-supported conversations with historical figures like Cleopatra and Tupac Shakur. When I asked Cleopatra how she maintained her power in a male-dominated society, the answer was that she aligned herself with powerful men such as Julius Caesar and Mark Antony. Tupac offered his view on the evolution of hip hop since his death; good to know that he remains ever-present in spirit. Perhaps even more concerning, you can also chat with Santa Claus. And if you think that this erodes the line between historical reality and make-believe too much, then Project December can take you into the realm of chatting with your deceased loved ones.
In a future filled with our digital heritage, this kind of death capitalism may become as commonplace as genealogy websites, such as Ancestry, which provide data to the amateur family historian for a fee. And while Jane Austen may have had one of her characters in Northanger Abbey quipping in 1818 that she did not understand why history was so boring given that much of it must be invention, we may need to consider how much of our past we want to experience through the mediation (and hallucination) of GAI tools – especially when actual digital records of our recent history remain so tantalisingly scarce.
As AI becomes more pervasive in society, it may mediate not just our visions of the future, but also our understanding of our past. If you want to know more, come visit the AI Dialogues exhibition in The Exchange in Birmingham where I talk about these issues alongside many other colleagues from the University to demystify what AI means to us today.
The views and opinions expressed in this article are those of the author and do not necessarily reflect the official policy or position of the University of Birmingham.