The New Spain Fleets: Delving into three centuries of socioeconomic colonial history through Artificial Intelligence

Lead Research Organisation: Lancaster University
Department Name: History

Abstract

The encounter between Europe and the Americas in the late fifteenth and early sixteenth centuries undoubtedly changed the world. Spain would establish control over most of the Americas, inaugurating the modern global system we live in today. The riches and resources to which the Spanish crown gained access were beyond imagination, and establishing the Viceroyalties of New Spain and Peru would enable the creation of a previously unseen and unparalleled social, economic, and political network. This was only made possible through Charles I's mandate to establish the Spanish Fleets that, between 1520 and 1790, carried out all transatlantic voyages between Spain and the Americas.

The Spanish Fleets transported not only gold, silver, and all the traditional riches found in these territories, but also enabled the exchange of all sorts of resources, including plants, animals, foods and, most importantly, people and knowledge, both Indigenous and European. This would have unprecedented consequences, ranging from population growth in Europe thanks to the introduction of new and highly resistant crops, to the acquisition of new resources and knowledge that would eventually impact the development of areas such as medicine, astronomy, chemistry, geography, history, and literature, among many others, on both sides of the Atlantic. The Spanish Fleets were overseen by a royal institution, the House of Trade (Casa de la Contratación). It recorded thousands of trips in an extensive document collection that provides invaluable information about the people who shaped these first global networks and the activities that would eventually mould the history of modern Latin America and the world.

This project will create a step-change in the way historical archaeology collects evidence and analyses information while transforming our knowledge about the New Spain Fleets (NSF), one of the most important maritime institutions and infrastructures of early modern history. This will be accomplished by, firstly, making readily available an unprecedented collection with thousands of documents about the NSF in computer-readable format; and secondly, by transforming our knowledge about the Spanish colonial maritime trade through five cutting-edge historical case studies delving into key social, economic, and spatial aspects of the NSF.

Making use of innovative computational methods based on artificial intelligence techniques, the NSF project will 1) create an unparalleled digital collection bringing together thousands of historical documents related to the NSF from two major archives; 2) carry out the semi-automated and automated transcription of this collection, unlocking historical information in thousands of documents; 3) use automated annotation methods to identify, mine, and analyse meaningful information from these sources; 4) create an online platform that will allow any scholar to explore, query, and extract information from the New Spain Fleets documents; and 5) carry out, in five case studies, a series of historical analyses that will substantially advance our knowledge of the social, economic, and scientific revolutions facilitated by the New Spain Fleets.

In doing so, the project will give researchers and the interested public access to information and records that until now have been accessible only to specialists. Furthermore, the case studies will explore a series of topics that are far from being fully understood, thus opening a variety of potential new research areas and studies at a scale that is impossible at the moment. These will include the early migration of Indigenous peoples to Europe, the social networks of people in the fleets, the trade routes and the economic impact of the goods transported, the unofficial Spanish slave trade, the commercialisation and study of American plants and animals, and the exchange of scientific ideas about health, disease, and medicine.

Publications

 
Title Humanities-NLP Annotation Software 
Description This software tool was created to annotate any document with entities and fields that can be used for Natural Language Processing research and for training Named Entity Recognition Machine Learning models. The team has completed the design, and the tool is still in development. 
Type Of Material Improvements to research infrastructure 
Year Produced 2025 
Provided To Others? No  
Impact This is still in development and in the testing phase of V1. 
URL https://annotator-dev.streamlit.app/
 
Title ML Historical Calligraphy Classifier 
Description This is a machine learning tool that allows historians to automatically classify historical documents by calligraphy type. This is still in development, although the team is already using its first iteration. 
Type Of Material Improvements to research infrastructure 
Year Produced 2025 
Provided To Others? No  
Impact We have managed to carry out a series of tests with the model. The results are outlined in a journal article now in press: Murrieta-Flores, P., Vega-Sánchez, R., Sánchez-Diaz, A., and Cruz-Ríos, F., Implementing Artificial Intelligence approaches for the automated transcription of large scale historical collections in colonial archives: A case for 16th and 17 century Spanish and Nahuatl. (2025-In Press). Submitted to the Science & Technology of Archaeological Research. 
URL https://lanc-ner.streamlit.app/
 
Title ML Historical Calligraphy Classifier 
Description This model is still in development. It enables researchers to carry out the automated classification of handwritten historical documents in Spanish. It currently distinguishes five types of calligraphy, and we are still training it to refine it further. Some of the results are about to be published in: Murrieta-Flores, P., Vega-Sánchez, R., Sánchez-Diaz, A., and Cruz-Ríos, F., Implementing Artificial Intelligence approaches for the automated transcription of large scale historical collections in colonial archives: A case for 16th and 17 century Spanish and Nahuatl. (2025-In Press). Submitted to the Science & Technology of Archaeological Research. 
Type Of Material Computer model/algorithm 
Year Produced 2024 
Provided To Others? No  
Impact We are now able to automatically classify hundreds of documents by calligraphy type. The purpose is to feed the classified documents into specific Handwritten Text Recognition models. 
URL https://lanc-ner.streamlit.app/
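
The routing step described in the Impact field above, in which classified documents are passed to a script-specific Handwritten Text Recognition (HTR) model, might be sketched as follows. This is only an illustration under stated assumptions and not the project's code: the script labels, the HTR model names, the confidence threshold, and the `route_documents` function are all hypothetical placeholders, since the record does not name the five calligraphy types or the models used.

```python
from collections import defaultdict

# Hypothetical mapping from a predicted calligraphy class to an HTR model.
# The labels and model names are placeholders, not the project's own.
HTR_MODEL_FOR_SCRIPT = {
    "script_type_1": "htr_model_1",
    "script_type_2": "htr_model_2",
}


def route_documents(predictions, min_confidence=0.8):
    """Group document ids by the HTR model that should transcribe them.

    `predictions` maps a document id to a (script_label, confidence) pair.
    Documents with low confidence or an unknown label are set aside for
    manual review instead of being sent to an HTR model.
    """
    batches = defaultdict(list)
    for doc_id, (script, confidence) in predictions.items():
        if confidence >= min_confidence and script in HTR_MODEL_FOR_SCRIPT:
            batches[HTR_MODEL_FOR_SCRIPT[script]].append(doc_id)
        else:
            batches["manual_review"].append(doc_id)
    return dict(batches)


# Example classifier output (invented values for illustration only).
predictions = {
    "doc_001": ("script_type_1", 0.95),
    "doc_002": ("script_type_2", 0.55),
    "doc_003": ("script_type_2", 0.91),
}
batches = route_documents(predictions)
```

Keeping a manual-review bucket is a design choice of this sketch: it keeps uncertain classifications out of a mismatched HTR model.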
 
Description Universidad de Alicante 
Organisation University of Alicante
Country Spain 
Sector Academic/University 
PI Contribution We started a collaboration with Prof Juan Antonio Perez, a specialist in Large Language Models at the Computer Science Department. The aim of the collaboration is to experiment with Generative Artificial Intelligence for the standardization of historical documents from early modern Spanish into modern Spanish. We have proposed applying for a small internal grant from the university to expand this research line, which I believe can be really productive. We have contributed by defining the problem and some possible solutions, and have already carried out a series of controlled experiments, while the Alicante colleagues are helping and advising us, drawing on their expertise in the training of LLMs.
Collaborator Contribution We have contributed by defining the problem and some possible solutions, and have already carried out a series of controlled experiments, while the Alicante colleagues are helping and advising us, drawing on their expertise in the training of LLMs.
Impact This is still in development and I will report accordingly.
Start Year 2025
 
Title Humanities-NLP Annotation Software 
Description This application allows researchers to annotate any textual document with entities and fields. The tool takes any controlled vocabulary and any text, allowing for the creation of datasets for Natural Language Processing and their use with other software such as Geographical Text Analysis. This is still in development, but already in testing and use by the team. The aim is to release it in due course under an open-source licence. 
Type Of Technology Webtool/Application 
Year Produced 2025 
Impact With this software we will be able to annotate historical documents at large scale and train ML models to automatically identify categories of interest to researchers and the public in general. 
URL https://annotator-dev.streamlit.app/
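
To illustrate the kind of output such an annotation tool produces, the following is a minimal sketch of vocabulary-based annotation: it takes a controlled vocabulary and a text and returns labelled character spans of the sort used to train Named Entity Recognition models. It is not the project's software. The `Span` type, the `annotate` function, the labels, and the example sentence are all assumptions made for illustration, and simple exact matching stands in for whatever method the tool actually uses.

```python
import re
from typing import NamedTuple


class Span(NamedTuple):
    start: int   # character offset where the match begins
    end: int     # character offset just past the match
    label: str   # entity label taken from the controlled vocabulary
    text: str    # the matched surface form


def annotate(text, vocabulary):
    """Return non-overlapping labelled spans for vocabulary terms in `text`.

    `vocabulary` maps an entity label to a list of terms. Matching is
    case-insensitive and restricted to whole words.
    """
    spans = []
    for label, terms in vocabulary.items():
        for term in terms:
            pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
            for match in pattern.finditer(text):
                spans.append(Span(match.start(), match.end(), label, match.group()))

    # Sort by position, preferring the longest match at a given start,
    # then drop any span that overlaps one already kept.
    spans.sort(key=lambda s: (s.start, -(s.end - s.start)))
    kept, last_end = [], 0
    for span in spans:
        if span.start >= last_end:
            kept.append(span)
            last_end = span.end
    return kept


# Invented example vocabulary and sentence, for illustration only.
vocabulary = {"PLACE": ["Sevilla", "Veracruz"], "GOOD": ["silver"]}
text = "The fleet sailed from Sevilla to Veracruz carrying silver."
spans = annotate(text, vocabulary)
```

Character offsets are used here because they can be converted to token-level tags later, whichever tokeniser a downstream model requires.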