TEI Conference and Members' Meeting 2022

September 12 - 16, 2022 | Newcastle, UK

JavaScript is Disabled
Your browser's JavaScript functionality is disabled. It has to be enabled to use this function of ConfTool.
Here you can find information on how to enable JavaScript
If you have any problems, please contact the organisers at tei2022@ncl.ac.uk.

Conference Agenda

Overview and details of the sessions of this conference. Please select a date or location to show only sessions at that day or location. Please select a single session for detailed view (with abstracts and downloads if available).

Session Overview

Session

Import to your local calendar

Poster Slam, Session, and Reception

Time:

Wednesday, 14/Sept/2022:

4:30pm - 6:00pm

Session Chair: Syd Bauman, Northeastern University

Location: ARMB: King's Hall

Armstrong Building's 'King's Hall'. Capacity 440

The Poster Slam and Session will start with a 1 minute - 1 slide presentation by all poster presenters summarising their poster and why you should come see it.

There will be an informal drinks and nibbles reception during the poster session.

Presentations

ID: 115 / Poster Session: 1
Poster
Keywords: Early modern history, Ottoman, Edition

The QhoD project: A resource on Habsburg-Ottoman diplomatic exchange

S. Kurz, M. Mayer, Y. Yılmaz

Austrian Academy of Sciences, Austria

After having started as a cross-disciplinary (Early modern history, Ottoman studies) in 2020, the Digitale Edition von Quellen zur habsburgisch-osmanischen Diplomatie 1500–1918 (QhoD) project has recently gone public with their TEI based source editions related to the diplomatic exchange between the Ottoman and Habsburg empires.

Unique features of QhoD are:

- QhoD is editing sources from both sides (Habsburg and Ottoman archives), giving complimentary views; Ottoman sources are translated into English

- diversity of source genres (e.g. letters, contracts, travelogues, descriptions and depictions of cultural artefacts in LIDO; protocol register entries, Seyahatnâme, Sefâretnâme, newspapers, etc.)

- openness to outside collaboration (bring your TEI data!)

For Ottoman sources, QhoD is adhering to the İslam Ansiklopedisi Transkripsiyon Alfabesi transcription rules (Arabopersian to Latin transliteration). Transcriptions are aided by using Transkribus HTR mainly for German language sources, with ventures into Ottoman HTR together with other projects. Named entity data is curated in a shared instance of the Austrian Prosopographical Information System (APIS), aligned to GND identifiers.

By the time of writing, <https://qhod.net> features

- by language: 60 German, 42 Ottoman language documents

- by genre: 60 letters, 20 protocol register entries, 16 official records, 5 artefacts, 4 travelogues, 4 reports, 3 instructions.

- by embassy/timeframe: 16 sources related to correspondence between Maximilian II and Selim II (1566–1574); 31 sources on Rudolf Schmid zu Schwarzenhorn’s internuntiature (1649); 61 sources on the mutual grand embassies of Virmont and Ibrahim Pasha (1719–1720)

The poster will describe those sources and the TEI-infused reasoning behind their edition, as well as the technical implementation, which uses the GAMS repository software to archive and disseminate data.

QhoD uses state-of-the-art TEI/XML technology to improve availability of archival material essential for understanding centuries of mutual relations between two large imperial entities.

Kurz-The QhoD project-115.odt

ID: 110 / Poster Session: 2
Poster
Keywords: Semantic Web, Mobility Studies, Travelogues

Building a digital infrastructure for the edition and analysis of historical travelogues

S. Balck

IOS Regensburg, Germany

The core of the project stems from the unpublished records of Franz Xaver Bronner's (1758 - 1850) journey from Aarau, via St. Petersburg, to the university in Kazan (1810) and his way back (1817) via Moscow, Lviv and Vienna. A digital edition of these manuscripts will be created, enhanced by Semantic Web and Linked Data technologies.

The project will use the annotated critical text edition of the work above as a Case Study with the aim of developing a modularly expandable digital research infrastructure. This infrastructure will support digital transcription, annotation and visualisation of travelogues.

In the preliminary stages of the project, the first and more extensive part (the outward journey) of Franz Xaver Bronner's travelogue manuscript has already been transcribed with Transkribus. High-quality digital copies were made for Handwritten Text Recognition and training modules were developed on the basis of the manually transcribed texts. These are to be used for the semi-automatic transcription of other related texts. People, places, travel and other events were annotated with XML markup elements using TEI.

In the next step, visualisations and ontology design patterns for travelogues and itineraries will be developed. This includes a new annotation scheme for linking the TEI annotated text passages to associated database entries. The edition will enable the visualisations of textual information and contextual data.

Balck-Building a digital infrastructure for the edition and analysis-110.docx

ID: 122 / Poster Session: 3
Poster
Keywords: scholarly digital editions, conceptual model, digital philology, textual criticism, text modeling

TEI and Scholarly Digital Editions: how to make philological data easier to retrieve and elaborate

C. Martignano

University of Florence, Italy

In the past few decades the number of TEI-encoded scholarly digital editions (SDEs) has risen significantly, which means that a big amount of philologically edited data is now available in a machine-readable form. One could try to apply computational approaches, in order to further study the linguistic data, the information about the textual transmission, etc. contained in multiple TEI-encoded digital editions. The problem is that retrieving philological data through different TEI-encoded SDEs is not that simple.

Every TEI-encoded edition has its own markup model, designed to respond to the philological and editorial requirements of that particular edition. Hence, it is difficult to share a markup model between various editions.

A possible way to bridge multiple digital editions, despite their different markup solutions, is to map them onto a same model that is able to represent SDEs on a more abstract level.

This kind of mapping would be particularly useful when the markup solutions are more prone to various interpretations or more ambiguous. The TEI guidelines, for example, show how the @type attribute can be used with the <rdg> element to distinguish between different types of variants. However, every edition may have its own set of possible values, beyond “orthographic” and “substantive”, to markup a wider range of phenomena of the textual transmission.

To build a model capable of representing different editions is a challenging task, for “scholarly practice in representing critical editions differs widely across disciplines, time periods, and languages.” However there is common ground that can be used to model scholarly editing: what the editor reads in the source/s, how the editor compares different sources and finally what the editor writes in the edition. Around these three concepts I am building a model that aims at making philological data more visible and easier to further elaborate.

Martignano-TEI and Scholarly Digital Editions-122.docx

ID: 117 / Poster Session: 4
Poster
Keywords: Spanish literature, Digital library, Services, TEI-Publisher

Between Data and Interface, Building a Digital Library for Spanish Chapbooks with TEI-Publisher

E. Leblanc, P. Jacsont

University of Geneva, France

This poster will present the project Untangling the cordel (2020-2023) and its experimentations with TEI-Publisher to develop a digital library (DL) that aims at studying and promoting the Geneva collection of Spanish chapbooks (Leblanc and Carta 2021).

Intended for a wide audience and sold in the streets, chapbooks recount fictitious or real events as well as songs, dramas, or religious writings. Although their contents are varied, they are characterised by their editorial form, i.e. small texts (4 to 8 pages), in in-quarto, arranged in columns and decorated with woodcuts. The interest in chapbooks ranges from literature to art and book history, sociology, linguistics, or musicology. This diversity reflects the hybridity of chapbooks, at the frontier between document, text, image, and orality (Botrel 2001; Gomis and Botrel 2019, 127–30).

An editorial workflow based on XML-TEI to display our corpus online was devised. After transcribing texts with HTR tools, they were 1) converted the transcriptions in XML-TEI via XSLT, 2) stored them in eXist-DB, and 3) published them with TEI-Publisher. Images of the documents are displayed with IIIF. Through this workflow, DL can offer services that stress different aspects of chapbooks.

Working with TEI-Publisher has influenced the way we think about our XML-TEI model. If the choices we have made are mainly driven by data, it appears that part of them have been influenced by the functionalities we wanted to implement, such as the addition of image links or keywords. Thus, our ODD reflects not only the nature of our documents but also the DL services. In this context, the use of TEI-Publisher invites us to reconsider a strict distinction between “data over interface” and “interface over data” (Dillen 2018), as data and interface are here mutually influenced.

Leblanc-Between Data and Interface, Building a Digital Library-117.docx

ID: 157 / Poster Session: 5
Poster
Keywords: software, editors, oxygen, frameworks, annotations

oXbytei and oXbytao. A Stack of Configurable oXygen Frameworks

C. Lück

Universität Münster, Germany

Until recently, adapting author mode frameworks for the oXygen XML

editor was rather limited. A framework was either a base framework

like TEI-C's *TEI P5* framework, or it was based on a base

framework. But since version 23.1+, the mechanism of *.framework files

for configuring frameworks is replaced/supplemented with extension

scripts. This allows us to design arbitrarly tall stacks of

frameworks, not only limited to height level 2. It's now possible to

design base and intermediate frameworks with common functions. Only a

thin layer is required for project specific needs.

oXbytei and oXbytao are such intermediate and higher level frameworks.

oXbytei is based on TEI-C's *TEI P5* framework. Its design idea is to get

as much of its configuration as possible from the TEI document's

header. E.g. depending on the variant encoding declared in the header

it produces a parallel segmentation, double end-point attached or

location referenced apparatus. Since not all information for setting

up the editor is available in the header, oXbytei comes with its own

XML configuration. It ships with Java classes for rather complex

actions. It has a plugin interface for aggregating and selecting

information either from local or remote norm data. It also offers actions

for generating anchor-based annotations, either with TEI `<span>` or

in RDF/OWL with OA.

oXbytao is a level 3 framework based on oXbytei. It offers common

actions that are more biased towards a certain kind of TEI usage,

e.g. for `<corr>` and `<choice>` or for encoding multiple recensions

of the same text within a single TEI document. It defines a template

directory on each oXygen project. CSS styles offer a collapsed and an

expanded view and optional views on the header or for editing through

form controls etc. All styles are fully customizable on a project

basis.

https://github.com/SCDH/oxbytei

https://github.com/SCDH/oxbytao

Lück-oXbytei and oXbytao A Stack of Configurable oXygen Frameworks-157.odt

ID: 141 / Poster Session: 6
Poster
Keywords: automation, validation, continuous integration, continuous deployment, error reports, quality control

Automatic Validation, Packaging and Deployment of TEI Documents. What Continuous Integration can do for us

C. Lück

Universität Münster, Germany

Keeping TEI documents in a Git repository is one way to store

data. However, Git does not only excel in robustness against data

losses, downtime of internet connections and enabling collaboration on

a TEI editions. Git-Servers also leverage automation of re-current

tasks: validation of all our TEI documents, generating human-readable

reports about their validity, assembling them in a data package, and

deploying it on a publication environment. These tasks can be

processed automatically in a continuous integration (CI) pipeline. In

software development, CI has established as a key to quality

assurance. It gets its strength from automation by running tests

*regularly* and *uniformly*. For obvious reasons, CI has been

transferred to quality assurance of research data (in life sciences)

by Cimiano et al. (2021). The poster presentation will be on a data

template for TEI editions, that runs the above listed tasks on a

Gitlab server or on Github and even generates and deploys an

Expath-Package on TEI Publisher:

https://github.com/scdh/edition-data-template-cx

The template extends the data template for TEI Publisher.[^1] It uses

Apache Maven as a pipeline driver because Maven only needs a

configuration file and thus enables us to keep our repository free of

software.[^2] It validates all TEI documents against common RNG and

schematron files. Jing's output is parsed and a human-readable report

is created and deployed on the Git-Server's publication environment

(e.g. gitlab pages). On successful validation, a XAR package is

assembled and deployed on a running TEI publisher instance.

References

Cimiano, Ph. et al. (2021): Studies in Analytic Reproducibility. The

Conquaire Project. U Bielefeld Press. doi: 10.4119/unibi/2942780

[^1]:

[https://github.com/eeditiones/tei-publisher-data-template](https://github.com/eeditiones/tei-publisher-data-template)

[^2]: Using GNU Make would not be portable and XProc lacks incremental

builds. Replacing Maven with Gradle is under development.

Lück-Automatic Validation, Packaging and Deployment of TEI Documents What Continuous Integration can do-141.odt

ID: 137 / Poster Session: 7
Poster
Keywords: braille, bibliography, accessibility, book history, publishing

Adapting TEI for Braille

E. Forget

University of Toronto, Canada

Bibliography as a field has undergone rapid changes to adapt for ever-evolving book formats in the digital age. Methods, tools, and techniques originally meant for manuscripts and printed books have now been adjusted to apply to ebooks (Rowberry 2017; Galey 2012 and 2021), audiobooks (Rubery 2011), and other bookish objects (Pressman 2020). However, there is much less work currently available that considers the bibliographical differences of accessible book formats or, more specifically, braille as a book format. Braille lends itself well to bibliographical analysis due to its tactile nature, but traditional bibliographical methods and tools were not developed with braille in mind and must be adapted to work with braille.

As part of a larger braille bibliography-focussed project, I am adapting TEI to work for analyzing braille books—specifically braille editions of illustrated children’s books. Illustrated books offer additional complexity to textual analysis that is compounded by the forced hierarchy of linear-text tools, and working with braille editions of illustrated books further complicates questions of hierarchy and format descriptions.

This poster will showcase the progress I have made so far in adapting TEI to work with braille, specifically using the multilingual prototype book as an example. The poster will touch on questions of textual hierarchy, line length/breaks, illustration descriptions, braille and format descriptions, and how languages are tagged, and it will include a wish list of TEI needs that I have not successfully adapted yet, as this is a work-in-progress project.

Forget-Adapting TEI for Braille-137.docx

ID: 135 / Poster Session: 8
Poster
Keywords: lexicography, Okinawan, endangered language, multiple writing systems, language revitalization

Okinawan Lexicography in TEI: Challenges for Multiple Writing Systems

S. Miyagawa¹, K. Kato², M. Zlazli³, S. Machida⁴, S. Carlino⁵

¹National Institute for Japanese and Linguistics (NINJAL), Japan; ²Tokyo University of Foreign Studies, Japan; ³SOAS University of London, UK; ⁴University of Hawaiʻi at Hilo, US; ⁵Kyushu University/Hitotsubashi University, Japan

Okinawan is classified as one of the Northern Ryukyuan languages in the Japonic language family. It is primarily spoken in the south and central parts of the Okinawa Island of the Ryukyu Archipelago. It was the official lingua franca of the Ryukyu Kingdom and a literary vehicle; e.g., the Omoro Soshi poetry collection, but currently an endangered language. Okinawan has been recorded in various written forms: A combination of Kanji logograms and Hiragana syllabary with archaic spellings (e.g. Omoro Soshi) or modern spelling variations to approximate actual pronunciation, pure Katakana syllabary (e.g., Bettelheim’s Bible translation), Latin alphabet (mostly by linguists), and pure Hiragana (popular).

The Okinawago Jiten (Okinawan Dictionary; OD), published by the National Institute for Japanese Language and Linguistics (NINJAL) in 1963 and revised in 2001[1], uses the Latin alphabet for each lexical entry. We first added the possible writing forms listed above to the data in CSV format. We then converted the CSV into TEI XML using Python. Figure 1 presents a sample encoding of the TEI file for each entry. Here, we solved the multiple writing forms with <orth> tags with corresponding writing systems in @xml:lang attribute following BCP47[2] (e.g., xml:lang=”ryu-Hira'' for Okinawan words written in Hiragana). We added the International Phonetic Alphabet (IPA) and the accent type to make the pronunciation clearer with the <pron> tags.

Fig. 1 TEI of each lexical entry

Using XSLT, we transformed this TEI file into a static webpage with a user-friendly GUI, as shown in Figure 2. It is anticipated that this digitization of OD and its publication under the open license will benefit key stakeholders, such as Okinawan heritage learners and worldwide Okinawan learners, being the largest Okinawan dictionary available online.

Fig. 2 Webpage rendition of TEI

Miyagawa-Okinawan Lexicography in TEI-135.docx

ID: 155 / Poster Session: 9
Poster
Keywords: 3D scholarly editions, annotation, <souceDoc>, Babylon.js

Text as Object: Encoding the data for 3D annotation in TEI

J. Ogawa¹, K. Nagasaki², I. Ohmukai³, Y. Nakamura³, A. Kitamoto¹

¹Center for Open Data in the Humanities, Japan; ²International Institute for Digital Humanities, Japan; ³University of Tokyo, Graduate School of Humanities and Sociology

This poster will present a way of representing the text on a 3D object and its annotations in TEI. Since the concept of 3D scholarly editions has recently been discussed in the field of Digital Humanities, we try to provide experimentally a practical method contributing to the realization of this concept.

Ogawa-Text as Object-155.docx

ID: 158 / Poster Session: 10
Poster
Keywords: Japanese text, close reading, interface, CETEIcean

Building Interfaces for East Asian/Japanese TEI data

K. Nagasaki¹, S. Nakamura², K. Okada³

¹International Institute for Digital Humanities, Japan; ²Historiographical Insutite, The University of Tokyo; ³Hokkai Gakuen University

Over the past several years, East Asian/Japanese (henceforth, EAJ) TEI data have been created in various fields. Under the situation, one of the issues the authors have been working on is constructing an easy-to-use interface. In this presentation, we will report the activity.

Nagasaki-Building Interfaces for East AsianJapanese TEI data-158.docx

ID: 170 / Poster Session: 11
Poster
Keywords: Natural Language Processing, Explainable AI, Computing, Social Media, Hate Speech

Explainable Supervised Models for Bias Mitigation in Hate Speech Detection: African American English

A. Gabriel, M. Sinclair

Northumbria University

Automated hate speech detection systems have great potential in the realm of social media but have seen their success limited in practice due to their unreliability and inexplicability. Two major obstacles they have yet to overcome is their tendency to underperform when faced with non-standard forms of English and a general lack of transparency in their decision-making process. These issues result in users of low-resource languages (those that have limited data available for training) such as African-American English being flagged for hate speech at a higher rate than users of mainstream English. The cause of the performance disparity in these systems has been traced to multiple issues including social biases held by the human annotators employed to label training data, training data class imbalances caused by insufficient instances of low-resource language text and a lack of sensitivity of machine learning (ML) models to contextual nuances between dialects. All these issues are further compounded by the ‘black-box’ nature of the complex deep learning models used in these systems. This research proposes to consolidate seemingly unrelated recently developed methods in machine learning to resolve the issue of bias and lack of transparency in automated hate speech detection. The research will utilize synthetic text generation to produce a theoretically unlimited amount of low-resource language text training data, machine translation to overcome annotation conflicts caused by contextual nuances between dialects and explainable ML (including integrated gradients and instance-level explanation by simplification). We will attempt to show that when repurposed and integrated into a single system these methods can both significantly reduce bias in hate speech detection tasks whilst also providing interpretable explanations of the system’s decision-making process.

Gabriel-Explainable Supervised Models for Bias Mitigation-170.docx

ID: 105 / Poster Session: 12
Poster
Keywords: manucript studies, palaeography, IIIF, cataloguing

A TEI/IIIF Structure for Adding Palaeographic Examples to Catalogue Entries

S. M. Winslow

University of Graz, Austria

The study of palaeography generally relies on either expert testimony with sparse examples or separate, specialist catalogues imaging and documenting the specific characteristics of each hand. Both practices presumably made much more sense due to the cost, difficulty, and space used by printed catalogues in the past, but with modern practice in cataloguing manuscripts via TEI and disseminating images via IIIF, these difficulties have been largely obviated. Accordingly, it is desirable to have a simple, consistent, and searachable way to embed examples of manuscript hands within the TEI, as a companion to elements from msdescription that describe hand features. This poster will demonstrate a simple and re-useable structure for embedding information about the palaeography of manuscript hands in msdescription and associating it with character examples using IIIF. An example implementation, part of the Hidden Treasures from the Syriac Manuscript Heritage project, will be demonstrated and an ODD containing the new elements and strcuture will be made available.

ID: 123 / Poster Session: 13
Poster
Keywords: Digital edition, projects, cooperation, digital texts, infrastructure

From facsimile to online representation. The Centre for Digital Editions in Darmstadt. An Introduction

K. Fischer, S. Kalmer, D. Kampkaspar, S. Müller, M. Scheffer, M. E.-H. Seltmann, K. Wunsch

University and State Library Darmstadt, Germany

The Centre for Digital Editions in Darmstadt (CEiD) covers all aspects of preparing texts for digital scholarly editions from planning to publication. It not only processes the library's own holdings, but also partners with external institutions.

Workflow

After applying various methods for text recognition (OCR/HTR) the output is used as a starting point for the realisation of the digital edition as an online publication. In addition, a variety of transformation tools is used to convert texts from different formats such as XML, JSON, WORD-DOCX or PDF into a wide range of TEI-based formats (TEI Consortium 2022), thus enabling uniformity across different projects. These texts can be annotated and enriched with metadata. Furthermore, entities can be marked up, which are managed in a central index file. This workflow is not static, but can be adapted according to the needs of the project.

Framework

The XML files are stored in eXist-db (eXist Solutions 2021) and presented in various user-friendly ways with the help of the framework wdbplus (Kampkaspar 2018). By default, the corresponding scan and the transcribed text are presented side by side. Additionally, different forms of presentation are available so that the special needs of individual projects can be taken into account. Further advantages of wdbplus are various APIs, which not only allow the retrieval of individual texts, but also of metadata and further information. Full-text search is realised at project level as well as across projects.

CEiD's portfolio includes several projects in which a multitude of texts are processed. The source material ranges from early modern prints and manuscripts to more recent texts and includes early constitutional texts, religious peace agreements, newspapers and handwritten love letters.

Fischer-From facsimile to online representation The Centre-123.docx

ID: 173 / Poster Session: 14
Poster
Keywords: Software Sustainability, Software Development, DH Communities

From Oxgarage to TEIGarage and MEIGarage

P. Stadler, A. Ferger, D. Röwenstrunk

Paderborn University, Germany

Poster proposal for presenting the history and future development of the OxGarage.

Stadler-From Oxgarage to TEIGarage and MEIGarage-173.odt

ID: 176 / Poster Session: 15
Poster
Keywords: marginalia, Old English, mise-en-page, sourceDoc, facsimile

Towards a digital documentary edition of CCCC41: The TEI and Marginalia-Bearing Manuscripts

P. O Connor

University of Oxford, United Kingdom

The specific aim of this case study is to demonstrate how the TEI Guidelines have transformed the representation of an important corollary of the medieval production process; the annotations, glosses and other textual evidence of an interactive engagement with the text. Cambridge, Corpus Christi College MS 41 (CCCC MS 41) best exemplifies the value of the TEI in this respect as this manuscript is noted for containing a remarkable record of textual engagement from early medieval England. CCCC MS 41 is an early-eleventh century manuscript witness of the vernacular translation of Bede’s Historia ecclesiastica, commonly referred to as the Old English Bede. However, in addition to preserving the earliest historical account of early medieval England, the margins of CCCC MS 41 contain numerous Old English and Latin texts. Of the 490 pages of CCCC MS 41, 108 pages contain marginal texts which span several genres of Old English and Latin literature; and thereby provide the potential for substantial evidence of interaction with the manuscript’s central text.

While the marginalia of CCCC MS 41 continue to excite scholarly attention, the representation of this vast body of textual engagement poses certain challenges to editors of print scholarly editions. This poster emphasises the importance of the transcription process in successfully conveying the mise-en-page of marginalia-bearing manuscripts and explains how adopting the <facsimile> or <sourceDoc> approach encourages further engagement with and a deeper understanding of CCCC MS 41’s marginalia.

O Connor-Towards a digital documentary edition of CCCC41-176.docx

ID: 169 / Poster Session: 16
Poster
Keywords: letters, America, France, transnational, networks

Transatlantic Networks - a Pilot: mapping the correspondence of David Bailie Warden (1772-1845)

J. Orr, S. Howard, J. Cummings

Newcastle University, United Kingdom

The scientific revolution of the nineteenth century is often seen as remediating the early modern republic of letters (Klancher) from the pens of learned individuals to learned Institutions. This project aims to map the transatlantic network of one of the most important hubs in the exchange of literary and scientific correspondence, David Bailie Warden (1772-1845). Warden is known as an Irish political asylum seeker, American diplomat, and respected Parisian scientific writer in his own right, authoring and collaborating in foundational statistical works on America, the burgeoning natural sciences, and anti-slavery. More importantly, his correspondence with at least 3000 individuals and learned institutions reframes our perspective on the scientific revolution, its historical context, and its everyday activities. In addition to traditional close reading methods, this project tests methods from the field of scientific network analysis to enable us to identify other important network nodes, enabling the process of continual discovery. This project seeks to compile not only a ‘who’s who’ of the intellectual community in this period but to identify previously hidden facilitative figures whose importance to the fabric of the republic of letters might not be at first obvious due to a range of marginalising factors including: social class, transnationality, gender, religion, or other liminal identities.

Mobile View Print View

Contact and Legal Notice · Contact Address:
Privacy Statement · Conference: TEI 2022