Published: Ontologies Applied to Archival Records: A Preliminary Proposal for Information Retrieval

A paper I co-authored with Dr Thiago Henrique Bragato Barros et al. has been published in the proceedings of the 2025 IEEE International Conference on Big Data (BigData). The research for the paper, “Ontologies Applied to Archival Records: A Preliminary Proposal for Information Retrieval”, was led by Dr Barros and his team at Universidade Federal do Rio Grande do Sul, Brazil, with input from me and a recently retired Edinburgh Napier University colleague, Dr David Haynes.

Dr Barros, who has joined Napier as an associate researcher during a year-long sabbatical, presented the work at the 10th Computational Archival Science Workshop, held as part of the larger BigData conference.

You can access the paper on the Napier repository here.

Cite: Barros, T. H. B., Batista, R. R. d. C., Ryan, F., da Silva, M. C., & Haynes, D. (2025). Ontologies Applied to Archival Records: A Preliminary Proposal for Information Retrieval. 2025 IEEE International Conference on Big Data (BigData) (pp. 5954-5959). IEEE. https://doi.org/10.1109/bigdata66926.2025.11402543

ABSTRACT
Archives preserve records that document actions, rights, and memory. Yet, queries against archival catalogues often underperform when faced with term ambiguity, complex provenance, multi-level description, and evolving institutional contexts. This paper proposes a preliminary, ontology-driven approach to improve information retrieval (IR) over archival descriptions and digital objects. We review relevant literature from ontology engineering and information science, outline design principles aligned with archival theory (provenance, original order, context), and present a modular ontology pattern, ARCO (Archival Records, Contexts & Operations), covering Records, Agents, Functions, Activities, Mandates, Places, Events and Concept Schemes. We define competency questions for retrieval and describe indexing and reasoning workflows. We close with implementation considerations for public-sector environments and future work on authority control, multilingual access, and alignment to domain vocabularies and linked open data. Background claims draw from handbooks on ontologies and ontology-driven information systems, visual knowledge modelling, and e-government data publishing.
