Conference Agenda

Session Overview
Session
Session 4B: Long Papers
Time: Thursday, 15/Sept/2022, 9:30am - 11:00am

Session Chair: Elisa Beshero-Bondar, Penn State Behrend
Location: ARMB: 2.16

Armstrong Building: Lecture Room 2.16. Capacity: 100

Presentations
ID: 138 / Session 4B: 1
Long Paper
Keywords: IPIF, Prosopography, Personography, Linked Open Data

From TEI Personography to IPIF data

R. W. J. Hadden¹, G. Vogeler²,¹

¹Austrian Academy of Sciences, Austria; ²University of Graz, Austria

The International Prosopography Interchange Format (IPIF) is an open API and data model for the interchange, access, querying and merging of prosopographical data in a regularised format. This paper discusses the challenges of converting TEI personographies into the IPIF format, and more general questions about using the TEI for so-called 'factoid' prosopographies.



ID: 147 / Session 4B: 2
Long Paper
Keywords: data modeling, information retrieval, data processing, digital philology, digital editions

TEI as Data: Escaping the Visualization Trap

R. Rosselli Del Turco¹, E. Magnanti², G. Cerretini³

¹Università di Torino, Italy; ²University of Vienna, Austria; ³Università di Pisa, Italy

Over the last few years, the TEI Guidelines and schemas have continued to grow in capability and expressive power. A well-encoded TEI document is a small treasure trove of textual data that could be queried to quickly derive information of different types. However, in many edition browsing tools, e.g. EVT (http://evt.labcd.unipi.it/), access to such data is intended mainly for visualization. This approach seems hardly compatible with the strategy of setting up databases to query the data, and it leads to a split between environments: Digital Scholarly Editions (DSEs) to browse edition texts versus databases to perform powerful and sophisticated queries. It would be interesting to expand the capabilities of EVT, and possibly other tools, by adding functionality that allows them to process TEI documents in order to answer complex user queries. This requires both an investigation to define the text model in terms of TEI elements and a subsequent implementation of the desired functionality, to be tested on a suitable TEI project that can adequately represent the text model.

The Anglo-Saxon Chronicle stands out as an ideal environment in which to test such a method. The wealth of information that it records about early medieval England makes it an optimal footing upon which to enhance computational methods for textual criticism, knowledge extraction and data modeling for primary sources. Applying such a method here could prove essential for retrieving knowledge that is otherwise difficult to extract from a text that survives in multiple versions. Bringing together, cross-searching and querying information dispersed across all the witnesses of the tradition would allow us to broaden our understanding of the Chronicle in unprecedented ways. Interconnecting the management of a wide spectrum of named entities and realia, which is one of the greatest assets of TEI, with the representation of historical events would make it possible to gain new knowledge about the past. Most importantly, it would lay the groundwork for a Digital Scholarly Edition of the Anglo-Saxon Chronicle, a project never undertaken so far.

Therefore, we decided to implement a new functionality capable of extracting and processing a greater amount of information by cross-referencing various types of TEI/XML-encoded data. We developed a TypeScript library that defines and exposes a series of APIs allowing the user to perform complex queries on the TEI document. Besides the cross-referencing of people, places and events hinted at above, on the basis of standard TEI elements such as <listPerson>/<person>, <listPlace>/<place>, <listEvent>/<event> etc., we plan to support ontology-based queries, defining the relationships between different entities by means of RDF-like triples. In a similar way, it will be possible to query textual variants recorded in the critical apparatus by typology and witness distribution. This library will be integrated into EVT to interface directly with its existing data structures, but its use is not limited to EVT. We are currently designing a dedicated GUI within EVT to make the query system intuitive and user-friendly.
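The kind of query described above can be illustrated with a minimal TypeScript sketch. It is not the library's actual API: the record shapes, the function names `eventsInvolving` and `matchTriples`, the predicate names and the placeholder data are all assumptions made for illustration. The sketch also assumes the records have already been extracted from <listPerson>, <listPlace> and <listEvent>.

```typescript
// Hypothetical, simplified records as they might look after extraction
// from <listPerson>/<person>, <listPlace>/<place> and <listEvent>/<event>.
interface Entity {
  id: string;
  label: string;
}

interface EncodedEvent {
  id: string;
  label: string;
  personRefs: string[]; // ids of persons involved in the event
  placeRefs: string[];  // ids of places involved in the event
  witnesses: string[];  // sigla of the witnesses recording the event
}

// RDF-like triple (subject, predicate, object); null in a pattern is a wildcard.
type Triple = readonly [string, string, string];
type TriplePattern = readonly [string | null, string | null, string | null];

interface Corpus {
  persons: Entity[];
  places: Entity[];
  events: EncodedEvent[];
  triples: Triple[];
}

// Cross-reference: events that involve a person, optionally at a given place.
function eventsInvolving(corpus: Corpus, personId: string, placeId?: string): EncodedEvent[] {
  return corpus.events.filter(
    (e) =>
      e.personRefs.indexOf(personId) >= 0 &&
      (placeId === undefined || e.placeRefs.indexOf(placeId) >= 0)
  );
}

// Ontology-style query: triples matching a pattern with optional wildcards.
function matchTriples(triples: readonly Triple[], pattern: TriplePattern): Triple[] {
  return triples.filter((t) =>
    pattern.every((term, i) => term === null || term === t[i])
  );
}

// Placeholder data, invented for the example only.
const corpus: Corpus = {
  persons: [
    { id: "p1", label: "Person One" },
    { id: "p2", label: "Person Two" },
  ],
  places: [{ id: "pl1", label: "Place One" }],
  events: [
    { id: "e1", label: "Event One", personRefs: ["p1", "p2"], placeRefs: ["pl1"], witnesses: ["A", "E"] },
    { id: "e2", label: "Event Two", personRefs: ["p2"], placeRefs: [], witnesses: ["A"] },
  ],
  triples: [
    ["p1", "siblingOf", "p2"],
    ["p2", "rulerOf", "pl1"],
  ],
};

const shared = eventsInvolving(corpus, "p2", "pl1").map((e) => e.id);
const rulers = matchTriples(corpus.triples, [null, "rulerOf", "pl1"]);
console.log(shared, rulers);
```

Keeping the triples separate from the entity records means relationship queries do not depend on how the entities are nested in the TEI source. A query by witness distribution could filter on the `witnesses` field in the same way.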



ID: 120 / Session 4B: 3
Long Paper
Keywords: linked data, conversion, reconciliation, software development

LINCS’ Linked Workflow: Creating CIDOC-CRM from TEI

C. Crompton, H. Zafar, A. Defours

University of Ottawa, Canada

TEI data is so often carefully curated, without any of the noise and error common to algorithmically created data, that it is a perfect candidate for linked data creation. However, while most small TEI projects boast clean, beautifully crafted data, linked data creation is often out of reach for these project teams, both technically and financially. This paper reports (following where others have trodden; see note 1) on the workflow, mappings, and tools that the Linked Infrastructure for Networked Cultural Scholarship (LINCS) project uses for creating linked data from TEI resources.

The process of creating linked data is far from straightforward, since TEI is by nature hierarchical and takes its meaning from the deep nesting of elements. Any one element in TEI may draw its meaning from its relationship to an ancestor well up the tree (for example, a persName appearing inside a listPerson inside the teiHeader is more likely to be a canonical reference to a person than a persName whose parent is a paragraph). Furthermore, the meaning of TEI elements is not always well represented in existing ontologies, and the time and money required to represent TEI-based information about people, places, time, and cultural production as linked data are out of reach of many small projects.
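The dependence on ancestors can be sketched in TypeScript as follows. This is an illustration only, not the LINCS implementation: the `TeiNode` type, the function names, and the binary "canonical" versus "mention" rule are assumptions that reduce the "more likely" of the example above to a crude yes/no test.

```typescript
// Hypothetical minimal tree node: an element name plus a link to its parent.
interface TeiNode {
  name: string;
  parent?: TeiNode;
}

// Collect the names of all ancestors, nearest first.
function ancestorPath(node: TeiNode): string[] {
  const path: string[] = [];
  for (let n: TeiNode | undefined = node.parent; n !== undefined; n = n.parent) {
    path.push(n.name);
  }
  return path;
}

type PersNameRole = "canonical" | "mention";

// Assumed heuristic: a persName below listPerson within teiHeader is treated
// as a canonical reference; any other persName is treated as a mention.
function classifyPersName(node: TeiNode): PersNameRole {
  const ancestors = ancestorPath(node);
  const inListPerson = ancestors.indexOf("listPerson") >= 0;
  const inHeader = ancestors.indexOf("teiHeader") >= 0;
  return inListPerson && inHeader ? "canonical" : "mention";
}

// Simplified element chains; real documents have more intermediate levels.
const header: TeiNode = { name: "teiHeader" };
const listPerson: TeiNode = { name: "listPerson", parent: header };
const person: TeiNode = { name: "person", parent: listPerson };
const headerName: TeiNode = { name: "persName", parent: person };

const paragraph: TeiNode = { name: "p", parent: { name: "body" } };
const inlineName: TeiNode = { name: "persName", parent: paragraph };

console.log(classifyPersName(headerName)); // "canonical"
console.log(classifyPersName(inlineName)); // "mention"
```

The point of the sketch is that the element name alone is not enough to classify the node: the same persName yields a different result depending on the path above it, which a flat element-to-class mapping cannot capture.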

This paper introduces the LINCS workflow for creating linked data from TEI. We will present the named entity recognition and reconciliation service, NSSI (pronounced "nessy"), and its integration into a TEI-friendly vetting interface, Leaf Writer. Following NSSI reconciliation, Leaf Writer users can download their TEI with the entity URIs in idno elements for their own use. If they wish to contribute to LINCS, they may then load the TEI document they have exported from Leaf Writer into XTriples, a customized version of the tool of the same name from Mainz’s Digitale Akademie, which converts TEI to CIDOC-CRM either for private use or for integration into the LINCS repository. We have adopted the XTriples tool because it meets the needs of a very common type of TEI user: the director or team member of a project who is not going to be able to learn the intricacies of CIDOC-CRM, or perhaps even of linked data principles, but would still like to contribute their data to LINCS. That said, we are keen to get feedback from the expert users of the TEI community on our workflow, CIDOC-CRM mapping, and tools.
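The last step of this workflow, going from a reconciled entity to CIDOC-CRM statements, can be illustrated with a small TypeScript sketch. It does not reproduce the LINCS mapping or the XTriples configuration. The `ReconciledPerson` shape, the function name `toNTriples`, the example.org URI, and the choice of typing the entity as E21_Person with an rdfs:label are all assumptions made for illustration, and the sketch assumes the name and the idno URI have already been read from the exported TEI.

```typescript
// Hypothetical record for one reconciled entity taken from exported TEI:
// the text of a persName plus the entity URI stored in its idno element.
interface ReconciledPerson {
  name: string;
  uri: string;
}

const CRM = "http://www.cidoc-crm.org/cidoc-crm/";
const RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
const RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label";

// Emit two N-Triples lines: the entity is an E21_Person and has a label.
// JSON.stringify is used as an approximation of N-Triples string escaping.
function toNTriples(p: ReconciledPerson): string[] {
  return [
    `<${p.uri}> <${RDF_TYPE}> <${CRM}E21_Person> .`,
    `<${p.uri}> <${RDFS_LABEL}> ${JSON.stringify(p.name)} .`,
  ];
}

// Placeholder entity, invented for the example only.
const lines = toNTriples({
  name: "Person One",
  uri: "http://example.org/entity/person-one",
});
console.log(lines.join("\n"));
```

A full mapping would need many more statements (appellations, events, provenance), which is the complexity the workflow described above is meant to spare project teams.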

1.

Bodard, Gabriel, Hugh Cayless, Pietro Liuzzo, Chiara Cenati, Alison Cooley, Tom Elliott, Silvia Evangelisti, Achille Felicetti, et al. “Modeling Epigraphy with an Ontology.” Zenodo, March 26, 2021.

Ciotti, Fabio. “A Formal Ontology for the Text Encoding Initiative.” Umanistica Digitale, vol. 2, no. 3, 2018.

Eide, Ø., and C. Ore. “From TEI to a CIDOC-CRM Conforming Model: Towards a Better Integration Between Text Collections and Other Sources of Cultural Historical Documentation.” Digital Humanities, 2007.

Ore, Christian-Emil, and Øyvind Eide. “TEI and Cultural Heritage Ontologies: Exchange of Information?” Literary and Linguistic Computing, vol. 24, no. 2, 2009, pp. 161–72. https://doi.org/10.1093/llc/fqp010.
