The Application of Artificial Intelligence in Literary Text Analysis: Modern Approaches and Examples
In recent years, artificial intelligence (AI) has become an integral part of research in the field of digital humanities. One of the most promising directions is the use of AI for analyzing literary works, which opens new opportunities for understanding and interpreting texts.
Analysis of Style and Authorship
A study led by Arianna Di Bernardo applied machine learning methods to the analysis of Latin texts. Its primary goal was to identify stylistic features and determine the authorship of medieval manuscripts. Using natural language processing (NLP) algorithms, the researchers identified the probable authors of anonymous texts with a high degree of accuracy. This highlights the potential of AI for solving complex attribution problems in literary studies.
One stage of the study addressed authorship attribution through text similarity analysis followed by clustering. The results showed that the model grouped texts with high accuracy (>90%). The similarity matrix can also be used to examine the relationships between different authors. In addition, the study constructed a network of author connections that displays graphically how similar the stylistic characteristics of their texts are.
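To make the similarity-then-clustering idea concrete, here is a minimal sketch in Python. It is not the method used in the study: the character trigram features, the cosine measure, the threshold-based grouping, and the function names (`ngram_profile`, `similarity_matrix`, `cluster`) are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def ngram_profile(text, n=3):
    """Count overlapping character n-grams, a common stylometric feature."""
    cleaned = " ".join(text.lower().split())
    return Counter(cleaned[i:i + n] for i in range(len(cleaned) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items() if gram in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def similarity_matrix(texts, n=3):
    """Pairwise cosine similarity between the n-gram profiles of all texts."""
    profiles = [ngram_profile(t, n) for t in texts]
    return [[cosine(p, q) for q in profiles] for p in profiles]

def cluster(matrix, threshold):
    """Assumed grouping rule: texts linked by similarity >= threshold share a label."""
    labels = [None] * len(matrix)
    next_label = 0
    for start in range(len(matrix)):
        if labels[start] is not None:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:
            i = stack.pop()
            for j, sim in enumerate(matrix[i]):
                if labels[j] is None and sim >= threshold:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels

# Short Latin fragments used only as sample input.
texts = [
    "arma virumque cano troiae qui primus ab oris",
    "arma virumque cano troiae qui primus ab oris italiam",
    "gallia est omnis divisa in partes tres",
    "gallia est omnis divisa in partes tres quarum unam",
]
matrix = similarity_matrix(texts)
labels = cluster(matrix, threshold=0.5)
print(labels)
```

The same matrix can serve as the edge weights of an author network: each text or author becomes a node, and each similarity above the chosen threshold becomes a link.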
Named Entity Recognition and Relationship Extraction in Literary Works
One of the most intriguing applications of AI in literary text analysis is the automatic extraction of social networks from literary works. This approach enables researchers to identify complex interactions between characters, analyze their roles in the narrative, and construct models of social structures represented in the text.
A study published in PeerJ Computer Science tested various NLP tools to assess how well they extract character networks from literary works automatically. The researchers examined how different NLP methods identify characters, their relationships, and types of interactions, producing visual maps that help interpret the social aspects of the text.
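A simple way to picture a character network is to count how often two characters appear in the same sentence. The sketch below assumes this co-occurrence heuristic and a hand-supplied list of names instead of a real named entity recognizer; it does not reproduce the tools evaluated in the study, and the passage and the function name `cooccurrence_network` are invented for illustration.

```python
import re
from collections import Counter
from itertools import combinations

def cooccurrence_network(text, characters):
    """Return edge weights: how many sentences mention both characters of a pair."""
    sentences = re.split(r"(?<=[.!?])\s+", text)
    edges = Counter()
    for sentence in sentences:
        # Whole-word match so that a name inside a longer word is not counted.
        present = sorted(
            name for name in characters
            if re.search(rf"\b{re.escape(name)}\b", sentence)
        )
        for pair in combinations(present, 2):
            edges[pair] += 1
    return edges

# Invented sample passage, not taken from any literary work.
passage = (
    "Anna spoke with Boris at the ball. Clara smiled at Dmitri. "
    "Boris later wrote to Anna. Dmitri and Boris left for the city."
)
network = cooccurrence_network(passage, ["Anna", "Boris", "Clara", "Dmitri"])
for (a, b), weight in network.most_common():
    print(a, b, weight)
```

In practice the name list would come from an NER step, and aliases or pronouns referring to the same character would need to be merged before counting.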
Sentiment and Emotional Analysis
The SenticNet project, originally developed at the MIT Media Laboratory in 2009 and now supported by Nanyang Technological University in Singapore, aims to improve machine understanding of the emotional components of text. SenticNet replaces the traditional “bag of words” model with an approach that treats text as a set of concepts and narratives, allowing for more accurate interpretation of the emotional context of literary works.
This tool is applied in various fields, including social data analysis, human-computer interaction, financial forecasting, and healthcare. The primary goal of SenticNet is to make conceptual and emotional information conveyed in natural language more accessible and understandable to machines. This is achieved by transitioning from word frequency analysis to a more profound comprehension of meaning and emotional tone.
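The difference between word-level and concept-level scoring can be illustrated with a toy example. The sketch below does not use SenticNet or its API: both lexicons, their polarity values, and the function names are hypothetical and exist only to show why scoring a multiword concept as a unit can change the result.

```python
# Hypothetical polarity values in [-1, 1]; not taken from SenticNet.
WORD_POLARITY = {"cold": -0.4, "comfort": 0.6}
CONCEPT_POLARITY = {("cold", "comfort"): -0.5}

def bag_of_words_score(tokens):
    """Sum the polarity of each word independently."""
    return sum(WORD_POLARITY.get(token, 0.0) for token in tokens)

def concept_score(tokens):
    """Prefer a known two-word concept; otherwise fall back to the single word."""
    score, i = 0.0, 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in CONCEPT_POLARITY:
            score += CONCEPT_POLARITY[pair]
            i += 2
        else:
            score += WORD_POLARITY.get(tokens[i], 0.0)
            i += 1
    return score

tokens = "his words were cold comfort".split()
print("bag of words:", bag_of_words_score(tokens))
print("concept level:", concept_score(tokens))
```

With these assumed values, the word-by-word sum comes out positive because "comfort" outweighs "cold", while the concept-level score is negative because "cold comfort" is scored as a single idiom.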
Translating Literary Texts While Preserving Style
Translating literary texts is one of the most challenging tasks in machine translation, as it requires not only the accurate transfer of meaning but also the preservation of style, rhythm, and emotional depth of the original text. One of the key research areas in this field is the application of neural network models for poetry translation.
In the article “Don’t Go Far Off: An Empirical Study on Neural Poetry Translation”, researchers Tuhin Chakrabarty, Arkadiy Saakyan, and Smaranda Muresan examine how well modern machine learning models convey the stylistic and semantic features of poetic works when translating them into other languages.
The authors of the study focused on comparing different approaches to machine translation of poetic texts, including:
• Standard neural machine translation (NMT) systems, such as Google Translate, which are trained on large corpora of texts but often fail to retain rhyme and metaphors.
• Specialized poetic models, trained on corpora of poetic works, allowing them to better preserve rhythm and imagery.
• Multilingual training, where the model is trained on multiple languages simultaneously, enabling it to capture common poetic patterns and stylistic features.
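The multilingual setup in the last item can be pictured as pooling parallel poem lines from several source languages into one training set. The sketch below assumes a common convention of marking each example with language tags; the tag format, the function name `build_multilingual_examples`, and the placeholder lines are illustrative assumptions and are not taken from the paper.

```python
import random

def build_multilingual_examples(corpora, target_lang="en", seed=0):
    """Pool (source, target) line pairs from every language into one shuffled list.

    corpora maps a source-language code to a list of (source, target) pairs.
    """
    examples = []
    for lang, pairs in corpora.items():
        for source, target in pairs:
            examples.append({
                "input": f"<{lang}> {source}",          # assumed source-language tag
                "target": f"<{target_lang}> {target}",  # assumed target-language tag
            })
    # Shuffle so that each training batch mixes languages.
    random.Random(seed).shuffle(examples)
    return examples

# Placeholder strings standing in for real poem lines.
corpora = {
    "es": [("verso uno", "line one"), ("verso dos", "line two")],
    "de": [("Zeile eins", "line one")],
}
for example in build_multilingual_examples(corpora):
    print(example["input"], "->", example["target"])
```

A single model fine-tuned on such a pooled set sees poetic lines from every language, which is one way it could pick up patterns shared across them.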
The study results demonstrated that multilingual models trained on specially curated poetic corpora significantly outperform traditional NMT models in poetry translation. They are better at preserving rhythm, metaphors, and even elements of rhyme, which had previously been one of the key challenges in machine translation.
The researchers tested their models by translating works by renowned poets such as Pablo Neruda, Emily Dickinson, and Rainer Maria Rilke. One of the most insightful findings was a comparison between machine-generated translations and those produced by human translators. In some cases, the models successfully adapted original metaphors into the target language while maintaining their semantic depth. However, for more complex cases, human translators were still needed to refine the results.
Conclusion
The integration of artificial intelligence into literary text analysis is opening new horizons for research in the digital humanities. Modern technologies have already demonstrated their effectiveness in several areas, including style analysis, authorship attribution, identification of character relationships, interpretation of emotional contexts in texts, and even poetry translation.
Studies show that AI can not only automate complex literary analysis processes but also provide new methodological approaches to studying literary works. Machine learning algorithms allow researchers to examine classical texts from a fresh perspective, uncovering hidden patterns, literary influences, and stylistic characteristics that might have remained unnoticed through traditional analysis.
In the field of text attribution, AI enables high-accuracy authorship identification and helps detect stylistic similarities between different writers. Meanwhile, NLP-based character network analysis allows for the creation of interactive relationship maps, making it easier to study complex narrative structures. Tools like SenticNet enable a more detailed examination of sentiment and deeper meanings in texts, while specialized neural network models for translation present new possibilities for adapting literary works into different languages while preserving their original rhythm, style, and imagery.
Despite these advancements, challenges remain in incorporating AI into literary research. One of the main concerns is the necessity of deep literary interpretation, which extends beyond linguistic analysis and requires understanding of historical, cultural, and philosophical contexts. Additionally, the use of AI in literary analysis raises questions about algorithmic biases and the importance of combining machine analysis with expert human interpretation.
In the future, advancements in AI-driven literary analysis are expected to lead to the development of even more sophisticated models capable of not only analyzing existing texts but also predicting literary trends, modeling the styles of specific authors, and even generating literary works that could become independent subjects of study. Thus, artificial intelligence is not only complementing traditional methods of literary analysis but also introducing fundamentally new ways of interacting with texts. Further collaboration between AI researchers and humanities scholars will help deepen our understanding of literary processes and expand the boundaries of literary studies.