Conference Agenda

Session 7B: Long Papers
Time: Friday, 16/Sept/2022, 9:30am - 11:00am

Session Chair: Gimena del Rio Riande, CONICET
Location: ARMB: 2.16

Armstrong Building: Lecture Room 2.16. Capacity: 100

Presentations
ID: 132 / Session 7B: 1
Long Paper
Keywords: TEI, born-digital heritage, retrocomputing, digitality, materiality

Is it still data? Scholarly Editing of Text from Early Born-Digital Heritage

T. Roeder

Universität Würzburg, Germany

Digital heritage is strongly bound to original devices and displays. Even in today’s standardized environments, text can change its appearance depending on the monitor technology, on the processing software, and on the fonts available on the system: text as data depends heavily on technical interpretation.

Creating a scholarly digital edition from born-digital heritage, especially text, requires considering the original conditions, such as encoding and hardware standards. My question is: Are the encoding guidelines of the TEI suitable for representing born-digital text? How much information is required about the original environment? Can a screenshot serve as a facsimile, or is it necessary to link to emulated states of the display software?

To give an example, I will present a preliminary scholarly TEI-based digital edition of “disk magazines”. These magazines were a special type of periodical that was published mostly on floppy disk, mainly in the 1980s and 1990s. Created by home computer enthusiasts for the community, disk magazines are potentially valuable as a historical resource for studying the experiences of programmers, users and gamers in the early stage of microcomputing.

In the examples (one of them is available at https://diskmags.github.io/md_87-11.html), the digital texts are decompressed byte sequences of PETSCII code, which is only partially compatible with ASCII. The appearance of the characters could be changed completely by the programmer to display foreign characters or alphabets. Further, the text depended on a layout of 40x25 characters, in which it had to be aligned manually by inserting whitespace. The once born-digital text – as data – is transformed into readable text – as image – on a screen. The example demonstrates that the connection between textual data and textual display can be very fragile.
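The dependence of the displayed text on technical interpretation can be illustrated with a minimal Python sketch. It assumes a deliberately simplified reading of PETSCII that covers only space, digits, basic punctuation and letters; the function names `decode_petscii` and `screen_rows` are hypothetical, and the sketch ignores control codes, graphics characters and programmer-redefined glyphs, all of which a real edition would have to handle.

```python
SCREEN_COLUMNS = 40  # line width of the 40x25 layout mentioned in the abstract


def decode_petscii(data: bytes, shifted: bool) -> str:
    """Map bytes to characters under a simplified PETSCII reading.

    shifted=False: upper-case/graphics character set.
    shifted=True:  lower-case/upper-case character set.
    Bytes outside the simplified table become U+FFFD.
    """
    out = []
    for b in data:
        if b == 0x0D:                      # carriage return ends a logical line
            out.append("\n")
        elif 0x20 <= b <= 0x40:            # space, punctuation, digits, '@'
            out.append(chr(b))
        elif 0x41 <= b <= 0x5A:            # letters: case depends on the mode
            out.append(chr(b + 0x20) if shifted else chr(b))
        elif shifted and 0xC1 <= b <= 0xDA:  # upper-case letters in shifted mode
            out.append(chr(b - 0x80))
        else:                              # not covered by this sketch
            out.append("\uFFFD")
    return "".join(out)


def screen_rows(text: str, columns: int = SCREEN_COLUMNS) -> list[str]:
    """Break decoded text into rows as a fixed-width screen would."""
    rows = []
    for line in text.split("\n"):
        # A logical line longer than the screen width wraps onto the next row.
        for start in range(0, max(len(line), 1), columns):
            rows.append(line[start:start + columns])
    return rows
```

The same five bytes `C8 45 4C 4C 4F` decode to "Hello" in the shifted character set, while in the unshifted set the first byte falls outside this simplified table and the rest reads "ELLO". This is one small instance of the fragility described above: the stored data is unchanged, but the text a reader sees is not.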

For TEI encoding, this has some consequences. On the one hand, there is a requirement to preserve as much of the original data as possible. On the other hand, a scholarly edition needs to represent the semantics of the visible document. An interpretative layer is required to mediate between these two levels, and it could be implemented by different markup strategies; however, it needs to be discussed whether classes like “att.global.rendition” are actually suited for this. It also needs to be discussed in which way a digital document (or which instance of it: as stored data, as memory state, as display?) can be interpreted in the same way as a material document, and which implications this would have for TEI encoding of born-digital heritage.



ID: 152 / Session 7B: 2
Long Paper
Keywords: publishing, LOD, TEI infrastructure

Using Citation Structures

H. Cayless

Duke University, United States of America

This paper is really a follow-up to one I gave at Balisage in 2021.[1] Citation structures are a TEI feature introduced in version 4.2.0 of the Guidelines, which provide an alternative (and more easily machine-processable) method for TEI documents to declare their internal structure.[2] This mechanism is important because of the heterogeneity of texts and consequently of the TEI structures used to model them. This heterogeneity necessarily means it is difficult for any system publishing collections of TEI editions to treat their presentation consistently. For example, a citation like “1.2” might mean “poem 1, line 2” in one edition, and “book 1, chapter 2” in another. It might be perfectly sensible to split an edition into chapters, or even small sections, for presentation online, but not at all to split a poem into lines (though groups of lines might be desirable). Without such a declaration, a publication system has to rely on assumptions and guesswork about the items in its purview, and may fail to cope with new material that does not behave as it expects. Worse, there is no guarantee that the internal structures of editions are consistent within themselves. We might consider, for example, Ovid’s ‘Tristia’, in which the primary organizational structure is book, poem, line, but book two is a single, long poem.
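The ambiguity of a citation like “1.2” can be made concrete with a short Python sketch. It is not the mechanism defined by the Guidelines: the `Level` class and the `label_citation` function are hypothetical, and each edition's structure is written out by hand here rather than read from a TEI header. The `unit` and `delim` fields are loosely modelled on the attributes of the same names on <citeStructure>.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Level:
    unit: str        # what this level counts, e.g. "book", "poem", "line"
    delim: str = ""  # delimiter that precedes this level in a citation


def label_citation(citation: str, levels: list[Level]) -> list[tuple[str, str]]:
    """Pair each component of a citation with the unit it denotes."""
    parts = []
    rest = citation
    for i, level in enumerate(levels):
        if i > 0:
            if not rest.startswith(level.delim):
                break  # the citation stops at a higher level
            rest = rest[len(level.delim):]
        # The value runs up to the next level's delimiter, if there is one.
        next_delim = levels[i + 1].delim if i + 1 < len(levels) else None
        if next_delim and next_delim in rest:
            cut = rest.index(next_delim)
            value, rest = rest[:cut], rest[cut:]
        else:
            value, rest = rest, ""
        parts.append((level.unit, value))
    return parts


# Two hypothetical editions that declare different structures.
poetry_edition = [Level("poem"), Level("line", ".")]
prose_edition = [Level("book"), Level("chapter", ".")]
```

With these declarations, `label_citation("1.2", poetry_edition)` yields poem 1, line 2, while `label_citation("1.2", prose_edition)` yields book 1, chapter 2. The point is only that the reading must come from a declaration belonging to the edition, not from the citation string itself.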

Citation structures permit a level of flexibility hard to manage otherwise, by allowing both nested structures and alternative structures at every level. In addition, a key new feature of citation structures over the older reference declaration methods is the ability to attach information that may be used by a processing system to each structural level. The <citeData> element, which makes this possible, allows a structural level to indicate, for example, what name it should be given in a table of contents, or even whether or not it should appear in such a feature.
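Nested and alternative structures, together with per-level information for a table of contents, might be modelled roughly as follows. This Python sketch is an illustration under stated assumptions, not the TEI mechanism itself: `CiteLevel`, `Node` and `build_toc` are hypothetical names, the `toc_label` and `in_toc` fields merely stand in for the kind of information <citeData> can carry, and the sample data only imitates the shape of the ‘Tristia’ case (a book containing poems, and a book containing lines directly).

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CiteLevel:
    unit: str                        # e.g. "book", "poem", "line"
    toc_label: Optional[str] = None  # hypothetical: display name for a TOC
    in_toc: bool = True              # hypothetical: whether to list this level
    children: list["CiteLevel"] = field(default_factory=list)  # alternatives


@dataclass
class Node:
    unit: str
    n: str
    children: list["Node"] = field(default_factory=list)


def build_toc(nodes, levels, prefix="", depth=0):
    """Walk the document and emit one TOC line per listed structural unit."""
    entries = []
    for node in nodes:
        # Pick whichever alternative declared at this level matches the node.
        level = next((lv for lv in levels if lv.unit == node.unit), None)
        if level is None:
            continue  # no declared structure covers this node
        ref = f"{prefix}.{node.n}" if prefix else node.n
        if level.in_toc:
            label = level.toc_label or level.unit
            entries.append("  " * depth + f"{label} {node.n} ({ref})")
        entries.extend(build_toc(node.children, level.children, ref, depth + 1))
    return entries


# A book may contain poems (which contain lines) or lines directly.
structure = [
    CiteLevel("book", "Book", children=[
        CiteLevel("poem", "Poem", children=[CiteLevel("line", in_toc=False)]),
        CiteLevel("line", in_toc=False),
    ])
]

edition = [
    Node("book", "1", [Node("poem", "1", [Node("line", "1"), Node("line", "2")])]),
    Node("book", "2", [Node("line", "1")]),
]
```

Calling `build_toc(edition, structure)` lists both books and the poem in book 1, and omits every line, including the lines that sit directly under book 2. The second alternative under "book" is what lets the irregular book be handled without special-case code.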

I will discuss the mechanics of creating and using citation structures. Finally, I will present a working system in XSLT that can exploit <citeStructure> declarations to produce tables of contents, split large documents into substructures for presentation on the web, and resolve canonical citations to parts of an edition.
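The system described here is written in XSLT and is not reproduced in this abstract. As a rough illustration of what resolving a canonical citation involves, the following Python sketch walks nested <div> elements by their n attribute. It is a strong simplification: the `resolve` function and the sample document are hypothetical, and the selection rule (every level is a <div> identified by @n, joined by ".") is hard-coded, whereas a real processor would take the selection and delimiter for each level from the edition's <citeStructure> declarations.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
NS = {"tei": TEI_NS}

# Hypothetical two-level edition: book 1 with chapters 1 and 2.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <div n="1">
    <div n="1"><p>First chapter.</p></div>
    <div n="2"><p>Second chapter.</p></div>
  </div>
</body>"""


def resolve(root, citation, delim="."):
    """Return the element a citation points to, or None if it does not resolve."""
    node = root
    for n in citation.split(delim):
        # Descend one structural level per citation component.
        node = node.find(f"tei:div[@n='{n}']", NS)
        if node is None:
            return None
    return node


def text_of(element):
    """Collect the text content of an element, trimmed of outer whitespace."""
    return "".join(element.itertext()).strip()


root = ET.fromstring(SAMPLE)
```

Here `text_of(resolve(root, "1.2"))` returns the text of the second chapter, and `resolve(root, "3.1")` returns `None`. Splitting a document for web presentation could reuse the same walk, emitting each element at a chosen level as its own page.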

1. https://www.balisage.net/Proceedings/vol26/html/Cayless01/BalisageVol26-Cayless01.html

2. See https://tei-c.org/release/doc/tei-p5-doc/en/html/CO.html#CORS6 and https://tei-c.org/release/doc/tei-p5-doc/en/html/SA.html#SACRCS.



ID: 160 / Session 7B: 3
Long Paper
Keywords: Manuscript cataloguing, semantic markup, retro-conversion vs. born-digital

Text between data and metadata: An examination of input types and usage of TEI encoded texts

T. Schaßan

Herzog August Bibliothek Wolfenbüttel, Germany

Many texts that have been encoded using the TEI in the past are retro-converted from printed sources. Manuscript catalogues and dictionaries are examples of highly structured texts; drama, verse, and performance texts are usually less structured; editions fall somewhere in between.

Many of the text types for which the TEI offers specialised elements represent both metadata and data, according to the scenarios in which these texts are used.

In the field of manuscript cataloguing, it has long been a question whether the msdescription module is sufficient for representing the retro-converted text of a formerly printed catalogue. One may argue that a catalogue is first of all a visually structured text, a succession of paragraphs, whose semantics are only loosely connected to the main elements the TEI defines, such as <msContents>, <physDesc>, or <msPart>. On the other hand, at the sub-paragraph level, the TEI offers structures which may not align with the actual text of the catalogue, so that the person who carries out the retro-conversion has to decide whether to change the text to fit the TEI schema rules, to encode the text in a semantically wrong way, or to structure the text with much less semantic information than would be possible.

Now that the TEI is increasingly used to store these kinds of texts as born-digital documents, the question is whether the structures offered by the TEI meet all the needs the texts and their authors might have in different scenarios: Is a TEI-encoded text of a given kind equally useful for all search and computational uses, as well as publishing needs? Are the TEI structures flexible enough, or do they privilege some uses over others? How much of the semantic information is encoded in the text, and how much of it might be realised only in the processing of the sources?

In this paper, manuscript catalogues serve as an example for the more general question of what structures, how much markup, and what kind of markup are needed in a time of powerful search engines and artificial intelligence, authority files and Linked Open Data.



 
Conference: TEI 2022