Authorship and Source Integrity Challenges for Future Historians in the Age of AI
The Rise of AI-Generated Content
The advent of AI-generated content presents a number of challenges for historians. The current focus is often on student work and the concern that tools like ChatGPT may be used to generate essays, raising questions about whether the credit given is deserved. However, there is another side to this issue: how will future historians treat AI-generated sources? Will it be possible to detect whether a piece of work was created by AI?
There are certainly a number of tools that claim to detect AI involvement in the production of texts. Yet it is difficult to be certain of their accuracy unless one understands the underlying algorithm, a level of forensic work that very few historians are likely to undertake. Even for those willing to do that work, many of these algorithms are so-called "black boxes," kept secret by their creators for commercial reasons.
AI in Historical Research: From OCR to Handwriting Recognition
Will historians ever have retrospective access to the algorithms behind these tools? It is possible in some cases, especially if programmers choose to release them as open-source software. Nonetheless, the history of algorithms, let alone the history of the documents they have been used to produce, will be a very challenging field to study. We should also recognize how far we already rely on various forms of AI in historical research. For example, many digitized documents depend on optical character recognition (OCR) or even AI-driven handwriting recognition.
These technologies are now intrinsic parts of our research process, even though we know that OCR can often be unreliable. While we might acknowledge this in our research findings, there is no definitive way to overcome the issue, although the technology continues to improve.
The Future of Authorship: Navigating AI-Influenced Texts
Another issue arises with texts that one might consider "ordinary." Even these may have elements of AI prediction built into them. For instance, as you type a sentence, many word processing programs suggest what might come next. If you accept that suggestion, are you still the original author of the sentence? The same could be said of predictive text on mobile phones. Of course, the word suggested might be the one you were going to use anyway, but who can really say? Moreover, there is currently no way to retrospectively analyze documents, unless they were specifically preserved for this purpose, to determine which parts were human-generated and which were computer-generated.
Preserving the Integrity of Historical Sources in an AI-Driven World
None of this is intended to discourage the use of AI, which is becoming ubiquitous and indeed unavoidable. However, authorship will be much harder to discern in the future, particularly when the traditional giveaways, such as handwriting or specific typewriter models, are no longer present to provide clues. Historians may therefore need to explore whether there are ways to identify authorship, or at least to find clues to it, bearing in mind that such concerns are not new, especially when dealing with ancient documents of anonymous origin.