Conference Agenda
Overview and details of the sessions of this conference.
Session Overview
Virtual Poster Session on Gather.Town
The link to gather.town https://app.gather.town/app/DVLCOOcP1lTL5Zkh/TEI2022 will only work at the time of the session.
Presentations
ID: 142 / Virtual Poster Session: 1
Virtual Poster
Keywords: digital edition, text processing, data management, tei publisher, ocr
From Archives to TEI Publisher: Digital Edition of German Work Regulations in the Project 'Non-state Law of the Economy'
Max Planck Institute for Legal History and Legal Theory, Germany
The aim of the project 'Non-state Law of the Economy' is to build a digital collection of primary sources showing the normative world of industrial relations in the German metal industry of the 19th and 20th centuries. This collection includes various types of textual documents: collective and individual agreements, internal work regulations, rental contracts for company flats, company health or pension insurance, etc. The focus of my poster is on the life cycle of the textual data inside our project, in particular on the stages our data passes through from the archives to publication in TEI Publisher. I believe this poster can make a valuable contribution to the discussion about effective textual data workflow strategies in the TEI community.
The core of the project’s approach to handling the sources is the wide use of computer-assisted processing and various digital humanities tools and methods. The current workflow begins in the archives, where the textual data is collected with the help of our researchers’ smartphones and a scan tent. As a next step, open-source OCR software is applied to the scans, producing output in the XMLPages format, which requires a subsequent XSLT transformation to standard TEI XML. Once a TEI-compliant document is produced, basic structural annotation of the text can begin. The text is then corrected where necessary (where the OCR programme failed to recognise characters successfully) and finally goes to the TEI Publisher platform, where it is further annotated and enriched with more specific metadata.
At present, around 50 sources are published on our instance of TEI Publisher, and around 300 documents are being processed at different stages.
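The OCR-to-TEI conversion step described above can be illustrated with a minimal sketch. It rests on several assumptions not confirmed by the abstract: that the "XMLPages" output is PAGE XML (the namespace below is one published schema version and may differ per OCR tool), that each text region maps to a TEI paragraph, and that each line is marked with `<lb/>`. The project uses XSLT for this step; the sketch swaps in Python's standard library so that it is self-contained, and the function names `page_to_tei` and `tei` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespaces: the PAGE namespace depends on the schema version in use.
PAGE_NS = "http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15"
TEI_NS = "http://www.tei-c.org/ns/1.0"


def tei(name: str) -> str:
    """Return a TEI element name in ElementTree's {namespace}name form."""
    return f"{{{TEI_NS}}}{name}"


def page_to_tei(page_xml: str, title: str) -> ET.Element:
    """Map PAGE TextRegion -> TEI <p> and PAGE TextLine -> <lb/> plus line text."""
    ns = {"pc": PAGE_NS}
    page_root = ET.fromstring(page_xml)

    root = ET.Element(tei("TEI"))
    # Deliberately incomplete header: a valid TEI file also needs
    # publicationStmt and sourceDesc inside fileDesc.
    header = ET.SubElement(root, tei("teiHeader"))
    file_desc = ET.SubElement(header, tei("fileDesc"))
    title_stmt = ET.SubElement(file_desc, tei("titleStmt"))
    ET.SubElement(title_stmt, tei("title")).text = title

    body = ET.SubElement(ET.SubElement(root, tei("text")), tei("body"))
    div = ET.SubElement(body, tei("div"))

    for region in page_root.iterfind(".//pc:TextRegion", ns):
        paragraph = ET.SubElement(div, tei("p"))
        for line in region.iterfind("pc:TextLine", ns):
            line_break = ET.SubElement(paragraph, tei("lb"))
            # Use the line-level transcription, not word-level TextEquiv.
            unicode_el = line.find("pc:TextEquiv/pc:Unicode", ns)
            text = unicode_el.text if unicode_el is not None else None
            line_break.tail = text or ""
    return root
```

The result can be serialised with `ET.tostring(root, encoding="unicode")` and would still need validation against a TEI schema before structural annotation.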
ID: 174 / Virtual Poster Session: 2
Virtual Poster
Keywords: feature structures, character analysis, theater, personography, Alsatian
Feature structures for character social variable annotation and an application to Alsatian theater
1Université de Strasbourg, France; 2Université de Neuchâtel, Switzerland
Several works address the computational treatment of dramatic characters. Zöllner-Weber (2008, 2011) presents a character analysis ontology. Galleron (2017) developed a characteriseme (characterization unit) taxonomy based on character lists in French theater between 1630 and 1810, formalized as a TEI feature structure (FS) library (see Romary, 2015). Following Phelan (1989), the taxonomy includes mimetic features, which give characters traits assimilating them to humans, and synthetic ones, describing their role in the plot.
We believe that characteriseme analysis using a common annotation schema can help comparative drama analysis. We successfully adapted Galleron’s FS approach to model characters in a different language (Alsatian) and period (1870-1940). This can help compare the Alsatian tradition to the hegemonic literatures surrounding it (German and French), one of the goals towards which our ongoing MeThAL project contributes (Ruiz et al., 2022).
The poster’s contributions:
- A character feature (characteriseme) taxonomy using feature structures, inspired by Galleron (2017) but providing an improved, more modular implementation, and enabling the description of more recent drama
- A TEI personography where each of our corpus’ 2386 characters is described according to the feature structure
- First characterization analyses in the corpus based on it
Intermediate levels were added in our FS to better group mimetic features into basic traits (age, gender, origin, language), socioeconomic traits (profession, class) and relation-position traits (where a character stands in personal or professional relations, e.g. spouse or manager). Controlled vocabularies were added, including a list of ca. 350 professions and a taxonomy of socioprofessional groups. Personography compliance was ensured with a schema automatically derived from the FS System Declaration (Bermúdez, 2019).
The annotations have yielded insight into how female authors characterize female characters differently from male authors (more frequent reference to the character’s profession). An interface to navigate the corpus based on the annotations was created.
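The grouping of mimetic features into intermediate levels can be sketched as a nested TEI feature structure. This is an illustration only, not the MeThAL feature structure library: the `type` values, the feature values and the function name `build_character_fs` are hypothetical, while the element names (`fs`, `f`, `symbol`) are the standard TEI feature structure elements and the group and feature names follow the abstract.

```python
import xml.etree.ElementTree as ET


def build_character_fs(groups: dict[str, dict[str, str]]) -> ET.Element:
    """Build one outer <fs> with an inner <fs> per trait group."""
    root = ET.Element("fs", {"type": "mimetic"})  # hypothetical type name
    for group_name, features in groups.items():
        # Intermediate level: a feature whose value is itself a feature structure.
        group_f = ET.SubElement(root, "f", {"name": group_name})
        group_fs = ET.SubElement(group_f, "fs", {"type": group_name})
        for name, value in features.items():
            feature = ET.SubElement(group_fs, "f", {"name": name})
            # Controlled-vocabulary values are encoded as <symbol value="..."/>.
            ET.SubElement(feature, "symbol", {"value": value})
    return root


# Hypothetical character description, for illustration only.
example = build_character_fs({
    "basic": {"age": "adult", "gender": "female"},
    "socioeconomic": {"profession": "teacher"},
    "relation-position": {"relation": "spouse"},
})
print(ET.tostring(example, encoding="unicode"))
```

Nesting one `<fs>` per group keeps each trait family separately addressable, which is one plausible way to read the "more modular" design the abstract mentions.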
ID: 172 / Virtual Poster Session: 3
Virtual Poster
Keywords: language, script, typography, multilingualization
Multilingualism and multiscriptism in TEI publishing: DH2022
1International Institute for Digital Humanities, Japan; 2Graduate school of the University of Tokyo; 3The University of Tokyo
For the DH2022 Tokyo conference, held this July, the book of abstracts was published entirely through an XSL-FO pipeline based on ADHO’s DHConvalidator and TEI to PDF Book Creator. The texts of the abstracts were converted by each author with DHConvalidator, which generates a format not always expected by the original TEI to PDF Book Creator. Moreover, while the text body is mostly written in English and the other languages accepted in the CFP, it also contains a large number of words and phrases in various Asian languages, reflecting the theme of the conference and the authors’ regional backgrounds. We therefore needed to adapt the original stylesheets to multi-script typography through a large expansion of the linguistic and typographical templates as well as extensive annotation.
Our modification involves extraction and annotation of Asian-language fragments in TEI documents, locale-oriented typeface differentiation, adjustment for typographical conventions, and mixed-script typesetting. We will share our methodology and the decisions we applied to the actual book, hoping that it serves as a case study that draws more attention to, and encourages better practices in, non-Latin and/or multi-script publication in the TEI community.
ID: 156 / Virtual Poster Session: 4
Virtual Poster
Keywords: Japanese script, East Asian texts, character variation, hentaigana, digital editions
Celebrating Deviation: Encoding Variant Japanese Phonetic Characters known as Hentaigana
Hosei University, Japan
In digitizing the manuscript heritage of secret writings by the 15th-century Japanese actor, playwright, producer, and teacher Zeami, I am including the premodern script variations known as hentaigana, now available in Unicode. Hentaigana are variant hiragana, phonetic characters that are used to write various Japanese grammatical and function words and often ruby, for which the TEI released elements last year. In 2017, Unicode formally added 285 hentaigana characters in its Kana Supplement and Kana Extended-A code charts. These alternatives are fluid or cursive abbreviations of various “parent characters,” phonetically used Chinese characters (kanji), with varying patterns and degrees of simplification. By encoding both the modern, standardized hiragana and the hentaigana, this project makes the manuscripts more accessible to learners of Japanese premodern script. Comparisons of the variants in different text witnesses using such markup might be useful for future analyses of text genealogy.
In this poster, I will present my methods for systematically including hentaigana, developed while transcribing manuscript witnesses of Zeami’s writings, and explain my inclusion of the “old character forms” (kyūjitai) of kanji using similar markup. I will furthermore share initial orthographical explorations of the texts encoded thus far and consider methods for sharing the project with digitally savvy users, noh theater experts with no IT background, and educated non-specialists.
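As a small illustration of working with the Unicode hentaigana, the sketch below flags them in a transcription by code point. It relies on the 285 characters occupying the contiguous range U+1B002 to U+1B11E across the Kana Supplement and Kana Extended-A blocks; the function names are hypothetical and the sketch is not the author's encoding method.

```python
# The hentaigana added in 2017 occupy one contiguous code point range
# spanning the end of Kana Supplement and the start of Kana Extended-A.
HENTAIGANA_FIRST = 0x1B002
HENTAIGANA_LAST = 0x1B11E


def is_hentaigana(ch: str) -> bool:
    """Return True if the single character ch is a Unicode hentaigana."""
    return HENTAIGANA_FIRST <= ord(ch) <= HENTAIGANA_LAST


def find_hentaigana(text: str) -> list[tuple[int, str]]:
    """List (position, code point label) for every hentaigana in text."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if is_hentaigana(ch)
    ]
```

A list of positions like this could feed a later markup step, for example pairing each variant with its standardized hiragana; how that pairing is encoded is left open here.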
ID: 171 / Virtual Poster Session: 5
Virtual Poster
Keywords: alliteration, XML-TEI, encoding, poetry, translation
Theoretical and practical challenges of automatically identifying and encoding alliteration in texts written in Italian
1Sapienza University of Rome; 2Università Cattolica del Sacro Cuore, Milan
In our proposed presentation, we would like to set out the theoretical and practical challenges posed by the creation of a program aimed at automatically identifying and encoding alliteration in texts written in Italian. We will also reflect on the possibilities offered by the analysis of this phenomenon, from examining the style of a poet to determining if and how this literary device is preserved in translation.
On the theoretical level, alliteration is generally defined as a literary device consisting of the repetition of sounds at the beginning of adjacent words (cf. Beltrami, 2011). But what kind of sounds are we talking about? How long should they be? And what do we mean by ‘adjacent’? These are all crucial questions that must be dealt with at the very beginning of any investigation of alliteration, especially considering that scholars (cf. Valesio, 1967; Lausberg, 1969; Menichetti, 1993; Mortara Garavelli, 1997; Ellero and Residori, 2001; Ghiazza and Napoli, 2007; Mortara Garavelli, 2010; Arduini and Damiani, 2010; Lavezzi, 2017; Motta, 2020) tend to offer slightly different definitions.
On the practical level, once the rule-based program is created for Italian, it can be easily adapted to other languages with a high degree of correspondence between graphemes and phonemes. Given a TXT file, the program should be able to identify the phenomenon automatically. The demanding task, however, is the encoding process: thorough reflection is needed to find a proper way to define an XML-TEI tag that contains all the important information, such as the repeated sound and the number of words involved.
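To make the open definitional questions concrete, here is a minimal rule-based sketch under deliberately narrow assumptions: the "sound" is the first letter only, "adjacent" means directly consecutive words, and spelling stands in for pronunciation. It is not the authors' program; the function name and the `min_words` parameter are hypothetical. It returns the two pieces of information the abstract names (the repeated sound and the words involved) and leaves the TEI encoding open.

```python
import re


def find_alliteration(text: str, min_words: int = 2) -> list[tuple[str, list[str]]]:
    """Return (initial letter, words) for each run of consecutive words
    that share the same first letter."""
    # Letters only; an apostrophe splits elided forms such as "l'amore".
    words = re.findall(r"[^\W\d_]+", text.lower())
    groups: list[tuple[str, list[str]]] = []
    run: list[str] = []
    for word in words:
        if run and word[0] == run[0][0]:
            run.append(word)
        else:
            if len(run) >= min_words:
                groups.append((run[0][0], run))
            run = [word]
    # Flush the last run after the loop ends.
    if len(run) >= min_words:
        groups.append((run[0][0], run))
    return groups
```

Each of the three assumptions is a parameter of the definition rather than a fact about alliteration: Italian digraphs such as "ch", "gn" or "sc", longer repeated sequences, and words separated by articles or prepositions would all need additional rules.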