Historica Tech Lab scheme

Technological results. Developing an ETL Pipeline and Graph Database for Historical Data with LLMs

At Historica Tech Lab, we recently developed an ETL (Extract, Transform, Load) pipeline integrated with a graph database to transform and analyze historical data. As part of this research, we set out to prototype a "historical ontology": a knowledge repository of human history. Using large language models (LLMs), the pipeline converts unstructured historical texts into structured data, a step forward for digital humanities research.

ETL Pipeline Innovation

The pipeline consists of three stages:

1) Data Extraction
2) Data Transformation
3) Data Loading

The pipeline is designed to extract data from historical texts, transform that data, and load it into a graph database. We built the ontology on Amazon Neptune to ensure efficient storage of and access to the data. Amazon Neptune was chosen for its flexibility, scalability, and performance, which are crucial for modeling complex relationships in historical data and executing intricate queries.
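To make the three stages concrete, here is a minimal sketch of the extract, transform, load flow. It is not our production code: the regex in `extract` is only a stand-in for the LLM extraction step, the in-memory `store` dictionary is a stand-in for the graph database, and every function name and the triple format are assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical record shape: (subject, relation, object) triples.
Triple = Tuple[str, str, str]


def extract(text: str) -> List[Triple]:
    """Stand-in for the LLM extraction step.

    A real pipeline would send `text` to a language model. This regex only
    recognizes phrases of the form "<Name> founded <Name>" followed by a
    period or comma.
    """
    pattern = re.compile(r"([A-Z][\w ]*?) founded ([A-Z][\w ]*?)[.,]")
    return [(m.group(1), "FOUNDED", m.group(2)) for m in pattern.finditer(text)]


def transform(triples: List[Triple]) -> Dict[str, list]:
    """Normalize names and split triples into unique nodes and edges."""
    nodes: List[str] = []
    edges: List[Triple] = []
    for subj, rel, obj in triples:
        subj, obj = subj.strip(), obj.strip()
        for name in (subj, obj):
            if name not in nodes:
                nodes.append(name)
        edge = (subj, rel, obj)
        if edge not in edges:
            edges.append(edge)
    return {"nodes": nodes, "edges": edges}


def load(graph: Dict[str, list], store: Dict[str, list]) -> None:
    """Add nodes and edges to `store`, skipping items already present."""
    for key in ("nodes", "edges"):
        existing = store.setdefault(key, [])
        for item in graph[key]:
            if item not in existing:
                existing.append(item)


def run_pipeline(text: str, store: Dict[str, list]) -> None:
    """Run the three stages in order on one piece of text."""
    load(transform(extract(text)), store)


if __name__ == "__main__":
    demo_store: Dict[str, list] = {}
    run_pipeline("Romulus founded Rome, according to legend.", demo_store)
    print(demo_store)
```

Because `load` skips items that are already present, running the pipeline twice on the same text leaves the store unchanged, which is a useful property when source texts are reprocessed.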

Historical Ontology Development

The created "historical ontology" categorizes data into Units of Topography (UT), Units of Stratigraphy (US), and Actors (AC), detailing events, material evidence, and participants.
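As an illustration, the three categories could be modeled as node labels in the graph. The sketch below is an assumption rather than our actual schema: the `OntologyNode` fields (`node_id`, `name`) are hypothetical, and we use openCypher, one of the query languages Amazon Neptune supports, purely as an example. The function only builds a parameterized query string; it does not connect to any database.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Tuple


class Category(Enum):
    """The three ontology categories named in the article."""
    UT = "Unit of Topography"
    US = "Unit of Stratigraphy"
    AC = "Actor"


@dataclass(frozen=True)
class OntologyNode:
    node_id: str         # hypothetical identifier scheme
    category: Category   # becomes the node label
    name: str            # hypothetical display-name property


def to_merge_query(node: OntologyNode) -> Tuple[str, Dict[str, Any]]:
    """Build a parameterized openCypher MERGE statement for one node.

    The label comes from the fixed Category enum, so interpolating it is
    safe; the id and name travel as query parameters.
    """
    query = (
        f"MERGE (n:{node.category.name} {{id: $id}}) "
        "SET n.name = $name"
    )
    return query, {"id": node.node_id, "name": node.name}


if __name__ == "__main__":
    actor = OntologyNode("ac-1", Category.AC, "Alexander")
    print(to_merge_query(actor))
```

Using `MERGE` rather than `CREATE` means that loading the same node twice updates it instead of duplicating it.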

Integrated with OpenAI models, the ontology lets users query the database in natural language, making it accessible to researchers who do not know a specialized query language.
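One common way to build such an interface is to have the model translate a question into a graph query, then check the query before running it. The sketch below assumes that approach, which may differ from our implementation. The model call is passed in as a plain function (`complete`), so no specific API is assumed, and `SCHEMA_HINT`, `build_prompt`, `is_read_only`, and `question_to_query` are all hypothetical names.

```python
import re
from typing import Callable

# Hypothetical schema summary given to the model with every question.
SCHEMA_HINT = (
    "Node labels: UT (Unit of Topography), US (Unit of Stratigraphy), AC (Actor)."
)

# Keywords that modify the graph; a generated query containing any is rejected.
WRITE_KEYWORDS = ("CREATE", "MERGE", "DELETE", "SET", "REMOVE", "DROP")


def build_prompt(question: str) -> str:
    """Combine the schema summary and the user's question into one prompt."""
    return (
        "Translate the question into a single read-only openCypher query.\n"
        f"{SCHEMA_HINT}\n"
        f"Question: {question}\n"
        "Query:"
    )


def is_read_only(query: str) -> bool:
    """Conservative check: reject any query containing a write keyword.

    This can reject harmless queries (for example, one that mentions a
    property named "set"), which is the safer failure mode.
    """
    tokens = re.findall(r"[A-Za-z]+", query.upper())
    return not any(keyword in tokens for keyword in WRITE_KEYWORDS)


def question_to_query(question: str, complete: Callable[[str], str]) -> str:
    """Ask the model for a query and refuse anything that is not read-only."""
    query = complete(build_prompt(question)).strip()
    if not is_read_only(query):
        raise ValueError("generated query is not read-only")
    return query
```

In use, `complete` would wrap whichever model client is available, while a canned function such as `lambda prompt: "MATCH (n:AC) RETURN n.name"` is enough for testing the surrounding logic.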

Future Prospects and Challenges

The ETL pipeline and historical ontology prototype have demonstrated high efficiency and accuracy. We expect this approach to change how historical data is processed and analyzed, enabling new research and discoveries.

Challenges remain in extracting data accurately from poorly structured texts and in optimizing the ETL pipeline for large data volumes. Security is also crucial for sensitive historical data. We plan to expand our historical ontology to more regions and periods and to improve our LLM-based solutions for more accurate data extraction. We expect these advancements to have broad applications in digital humanities and beyond.

FAQs

How can I contribute to or collaborate with the Historica project?
If you're interested in contributing to or collaborating with Historica, you can use the contact form on the Historica website to express your interest and detail how you would like to be involved. The Historica team will then be able to guide you through the process.
What role does Historica play in the promotion of culture?
Historica acts as a platform for promoting cultural objects and events by local communities. It presents these in great detail, from previously inaccessible perspectives, and in fresh contexts.
How does Historica support educational endeavors?
Historica serves as a powerful tool for research and education. It can be used in school curricula, scientific projects, educational software development, and the organization of educational events.
What benefits does Historica offer to local cultural entities and events?
Historica provides a global platform for local communities and cultural events to display their cultural artifacts and historical events. It offers detailed presentations from unique perspectives and in fresh contexts.
Can you give a brief overview of Historica?
Historica is an initiative that uses artificial intelligence to build a digital map of human history. It combines different data types to portray the progression of civilization from its inception to the present day.
What is the meaning of Historica's principles?
The principles of Historica represent its methodological, organizational, and technological foundations:
1) Methodological principle of interdisciplinarity: This principle involves integrating knowledge from various fields to provide a comprehensive and scientifically grounded view of history.
2) Organizational principle of decentralization: This principle encourages open collaboration from a global community, allowing everyone to contribute to the digital depiction of human history.
3) Technological principle of reliance on AI: This principle focuses on extensively using AI to handle large data sets, reconcile different scientific domains, and continuously enrich the historical model.
Who are the intended users of Historica?
Historica is beneficial to a diverse range of users. In academia, it's valuable for educators, students, and policymakers. Culturally, it aids workers in museums, heritage conservation, tourism, and cultural event organization. For recreational purposes, it serves gamers, history enthusiasts, authors, and participants in historical reenactments.
How does Historica use artificial intelligence?
Historica uses AI to process and manage vast amounts of data from various scientific fields. This technology allows for the constant addition of new facts to the historical model and aids in resolving disagreements and contradictions in interpretation across different scientific fields.
Can anyone participate in the Historica project?
Yes, Historica encourages wide-ranging collaboration. Scholars, researchers, AI specialists, bloggers, and history enthusiasts are all welcome to contribute to the project.