TEI Conference and Members' Meeting 2022
September 12 - 16, 2022 | Newcastle, UK
Conference Agenda
Session 8A: Long Papers
Presentations
ID: 102 / Session 8A: 1
Long Paper
Keywords: medieval studies; medieval literature; xforms; manuscript; codicology
Codex as Corpus: Using TEI to unlock a 14th-century collection of Old French short texts
University of Oxford, United Kingdom
Medieval manuscript collections of short texts are, in a sense, materially discrete corpora, offering data that can help scholarship understand the circumstances of their composition and early readership. This paper will discuss the role played by TEI in an ongoing mixed-method study into a fourteenth-century manuscript written in Old French: Bibliothèque nationale de France, fonds français, 24432. The aim of the project has been to display how fruitful the combination of traditional and more data-driven approaches can be in the holistic study of individual manuscripts. TEI has been critical to the project so far, and has enabled discoveries about the manuscript which have eluded less technologically enabled generations of scholarship. For example, quantitative analysis of scribal abbreviation, made possible through the manuscript’s encoding, has illuminated the contributions of a number of individuals in the production of the codex. Similarly, analysis of the people and places mentioned in the texts allows for greater localisation of the manuscript than was previously considered possible.
As with any project of this nature, the process of encoding BnF fr. 24432 in TEI has not been without difficulty, and so this paper will also discuss the ways in which attempts have been made to streamline the process through automation and UI tools, most notably in the case of this project through the use of XForms.
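The kind of quantitative analysis of scribal abbreviation mentioned in this abstract can be illustrated with a minimal sketch. It is not the project's actual method or encoding. It assumes, purely for illustration, that each short text is a flat (non-nested) `<div>` carrying an `n` attribute, that verse lines are `<l>` elements, and that abbreviations are marked with `<abbr>`; the function name `abbreviation_rates` and the sample document are invented.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Invented placeholder document; not a transcription of BnF fr. 24432.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body>
  <div n="A"><l>word <abbr>q</abbr> word</l><l>word word</l></div>
  <div n="B"><l><abbr>p</abbr> word <abbr>q</abbr></l><l><abbr>9</abbr> word</l></div>
</body></text></TEI>"""


def abbreviation_rates(tei_source):
    """Return abbreviations per verse line for each labelled <div>."""
    root = ET.fromstring(tei_source)
    rates = {}
    for div in root.iter(TEI + "div"):
        label = div.get("n")
        if label is None:
            continue  # assumption: only labelled divs are texts of interest
        lines = sum(1 for _ in div.iter(TEI + "l"))
        abbrs = sum(1 for _ in div.iter(TEI + "abbr"))
        if lines:  # skip divs without verse lines to avoid dividing by zero
            rates[label] = abbrs / lines
    return rates


print(abbreviation_rates(SAMPLE))
```

Marked differences in such rates between texts are only a starting point for closer palaeographical inspection; on their own they do not identify a scribe.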
ID: 149 / Session 8A: 2
Long Paper
Keywords: ODD, ODD chaining, RELAX NG, schema, XSLT Stylesheets
atop: another TEI ODD processor
1 Northeastern University, United States of America; 2 University of Neuchâtel, Switzerland; 3 University of Victoria, Canada; 4 State and University Library Hamburg, Germany
TEI is, among other things, a schema. That schema is written in and customized with the TEI schema language system, ODD. ODD is defined by Chapter 22 of the _Guidelines_, and is also used to _define_ TEI P5. It can also be used to define non-TEI markup languages. The TEI supports a set of stylesheets (called, somewhat unimaginatively, “the Stylesheets”) that, among other things, convert ODD definitions of markup languages (including TEI P5) and customizations thereof into schema languages like RELAX NG and XSD that one can use to validate XML documents.
Holmes and Bauman have been fantasizing for years about re-writing those Stylesheets from scratch. Spurred by Maus’ comment of 2021-03-23,[1] Holmes presented a paper last year describing the problems with the current Stylesheets and, in essence, arguing that they should be re-written.[2] Within a few months the TEI Technical Council had charged Bauman with creating a Task Force for the purpose of creating, from scratch, an ODD processor that reads in one or more TEI ODD customization files, merges them with a TEI language (likely, but not necessarily, TEI P5 itself), and generates RELAX NG and Schematron schemas. It is worth noting that this is a distinctly narrower scope than the current Stylesheets,[3] which, in theory, convert most any TEI into any of a variety of formats including DocBook, MS Word, OpenOffice Writer, Markdown, ePub, LaTeX, PDF, and XSL-FO (and half of those formats into TEI); and convert a TEI ODD customization file into RELAX NG, DTD, XML Schema, ISO Schematron, and HTML documentation.
A different group is working on the conversion of a customization ODD into customized documentation using TEI Publisher.[4]
The Task Force, which began meeting in April, comprises the authors. We meet weekly, with the intent of making slow, steady progress. Our main goals are that the deliverables be a utility that can be easily run on GNU/Linux, macOS, or within oXygen, and that they be programs that can be easily maintained by any programmer knowledgeable about TEI ODD, XSLT, and Ant.
Of course we also want the program to work properly. Thus we are generating test suites and performing unit testing (with XSpec[5]) as we go, rather than creating tests as an afterthought. We have also developed naming and other coding conventions for ourselves and written constraints (mostly in Schematron) to help enforce them. So, e.g., all XSLT variables must start with the letter ‘v’, and all internal parameters must start with the letter ‘p’, or with the letters “tp” for tunnel parameters.
We are trying to tackle this enormous project with a sensible, piecemeal approach. We have (conceptually) completely separated the task of assembling one or more customization ODDs with a source ODD into a derived ODD from the task of converting the derived ODD into RELAX NG, and from converting the derived ODD into Schematron. In order to make testing-as-we-go easier, we are starting with the derived ODD→RELAX NG process, and expect to demonstrate some working code at the presentation.
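The Task Force enforces its naming conventions mostly with Schematron. As a rough illustration only, the prefix rule stated in the abstract could also be checked with a few lines of Python. This sketch is not the Task Force's code, and it rests on assumptions: the function name `naming_violations` and the sample stylesheet are invented, "internal parameters" is read as every `xsl:param` that is not a top-level child of the stylesheet, and a parameter counts as a tunnel parameter when its `tunnel` attribute is `yes`, `true`, or `1`.

```python
import xml.etree.ElementTree as ET

XSL = "{http://www.w3.org/1999/XSL/Transform}"

# Invented sample stylesheet used only to exercise the check.
SAMPLE = """<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:param name="outputDir"/>
  <xsl:variable name="vRoot" select="/"/>
  <xsl:variable name="count" select="1"/>
  <xsl:template name="t">
    <xsl:param name="pMode"/>
    <xsl:param name="tpDepth" tunnel="yes"/>
    <xsl:param name="pDepth" tunnel="yes"/>
    <xsl:param name="mode"/>
  </xsl:template>
</xsl:stylesheet>"""


def naming_violations(xslt_source):
    """Return (kind, name) pairs for declarations that break the prefix rules."""
    root = ET.fromstring(xslt_source)
    violations = []
    for top in root:  # each top-level declaration of the stylesheet
        for el in top.iter():  # the declaration itself and its descendants
            name = el.get("name", "").strip()
            if el.tag == XSL + "variable":
                if not name.startswith("v"):
                    violations.append(("variable", name))
            elif el.tag == XSL + "param" and el is not top:
                # Assumption: top-level params are external, so they are skipped.
                is_tunnel = el.get("tunnel", "no").strip() in ("yes", "true", "1")
                if is_tunnel and not name.startswith("tp"):
                    violations.append(("tunnel-param", name))
                elif not is_tunnel and not name.startswith("p"):
                    violations.append(("param", name))
    return violations


for kind, name in naming_violations(SAMPLE):
    print(kind, name)
```

A Schematron rule has the advantage of running inside the same XML toolchain (for example within oXygen) as the stylesheets it checks, which is presumably why the Task Force chose it; the Python version is only meant to make the rule concrete.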