TCR02: Report of the TEI Council to the Members Meeting 2004


Christian Wittern, 12 March 2005, The Text Encoding Initiative


Converted to P5. Added prose from Alex, Natasha and Chris. Revised lightly. Compiled from some submissions by Council members, but mainly from the meeting minutes and out of the blue.

Report of the TEI Council to the Members Meeting 2004

The TEI Council is in charge of overseeing the technical development of the TEI Guidelines. This year, all energy was devoted to developing P5, the next major revision of the Guidelines. The development takes place mainly in the workgroups charged by the Council and through the work of the TEI editors. The Council held one face-to-face meeting and five telephone conference calls to discuss reports from workgroups and plan further development. In addition, to increase public visibility and to serve the needs of the members more directly, an open workspace was created at sourceforge.net where TEI users can submit and discuss proposals for new features of the Guidelines. A more detailed report on the working groups and task forces active during the past year follows.

Character Encoding Workgroup Chaired by Christian Wittern, charged July 2001

This workgroup was charged by the TEI Council to revise those areas of the TEI Guidelines that deal with representation of characters, languages and writing systems, which includes the current Chapter 4 (Languages and Character Sets) and Chapter 25 (Writing System Declaration).

The last meeting of this workgroup was held immediately prior to the Members Meeting in Nancy, November 2003. One of the more important outcomes of this meeting was the decision to recommend to the Council that language identification in TEI documents should follow other XML vocabularies and use the xml:lang attribute as outlined in the XML recommendation. While there was some discussion afterwards and concerns were raised by members who had not been present in Nancy, the decision was not reversed and was later endorsed by the Council. The workgroup submitted two draft documents to the Council for discussion at its meeting in Ghent. All draft documents of the workgroup are available from the workgroup area on the TEI web site.
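As an illustration of what the xml:lang decision means for software that reads TEI documents, the following minimal Python sketch resolves the language in effect for each element, assuming the usual XML rule that xml:lang is inherited from the nearest ancestor that declares it. The sample fragment, the element names in it, and the function name effective_languages are hypothetical and are not taken from the workgroup's drafts.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is predefined by the XML Namespaces recommendation,
# so xml:lang needs no explicit namespace declaration in the document.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment for illustration only; not from the Guidelines.
SAMPLE = """<text xml:lang="en">
  <p>An English paragraph.</p>
  <p xml:lang="de">Ein deutscher Absatz.</p>
</text>"""


def effective_languages(root):
    """Return (tag, language) pairs in document order.

    An element without its own xml:lang takes the value of its
    nearest ancestor that has one, or None if there is none.
    """
    result = []

    def walk(elem, inherited):
        lang = elem.get(XML_LANG, inherited)
        result.append((elem.tag, lang))
        for child in elem:
            walk(child, lang)

    walk(root, None)
    return result


langs = effective_languages(ET.fromstring(SAMPLE))
for tag, lang in langs:
    print(tag, lang)
```

Because the attribute lives in the standard XML namespace, generic XML tools can read it without any TEI-specific knowledge, which is the practical benefit of following other XML vocabularies here.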

Aspects of the workgroup’s current proposal were presented at the ACH/ALLC conference in Göteborg by Syd Bauman.

Stand-Off Markup Workgroup Chaired by David Durand, charged May 2002

The Workgroup on Stand-Off Markup and Linking has been preparing several working documents covering linking strategy, linguistic annotations, the TEI canonical reference system, and a summary of the rationale for the larger decisions taken in this process. Progress has been slow, but is ongoing. Some portions of the work have been presented to the Council for feedback, while others are still in preparation. Any feedback on the work so far is welcome.

Another activity of the group has been the development of some sample software tools that may prove helpful to projects engaging with the changes in pointing mechanisms. This effort has produced two sample implementations: a Perl script that can translate old-format TEI Extended Pointers in TEI documents into the W3C XPointer syntax, recommended by the workgroup, and an implementation of the W3C XPointer language.

Working documents updated since last year include:

  • TEI SOW09 Basic working decisions on pointing and linking. Discusses and presents some of the major changes proposed by this workgroup.
  • SOW3 (last updated Oct. 2004) is a preliminary revision of chapter 14 of the Guidelines, “Linking, Segmentation and Alignment”, not yet integrated into P5.
  • TEI SO W 08 on Canonical References has also been revamped, with more updates expected.

The changes proposed by this workgroup are likely to affect almost every other chapter in TEI P5. Some work has begun on considering the changes that will need to be made elsewhere, in particular to the discussion of simple pointers and linking in the core chapters.

Progress in this workgroup has been slow, and much remains to be done. The work is carried on by email list and conference call. Energetic volunteers are encouraged to contact the chair if they want to participate.

Taskforce on SGML/XML Conversion Chaired by Christine Ruotolo, charged May 2002

The NEH-funded TEI Task Force on SGML to XML Migration was convened in May 2002 and charged with developing recommendations for migrating existing TEI resources from SGML to XML. The Task Force comprised representatives from projects with significant amounts of TEI SGML data, along with selected technical experts and the TEI editors. It has worked for the past 18 months to diagnose and document the problems, methods, and tools involved in migrating legacy TEI data to XML.

The members of the Task Force met three times altogether: in October 2002 in Chicago (immediately following the TEI Annual Members Meeting), in February 2003 at the University of Maryland, and in June 2003 at the University of Alicante. Minutes from each of the meetings are available from the Task Force activities page, as is a detailed workplan written after the first meeting.

The primary deliverables of the Task Force are two reports, Strategic Considerations in Migration of TEI Documents from SGML to XML and the Practical Guide to Migration of TEI Documents from SGML to XML. The first report, intended for administrators and project managers, emphasizes the planning and decision-making involved in data migration, while the second report describes the mechanics of conversion in greater detail and is written primarily for the technical staff who will implement the conversion. The specific recommendations in the technical report are augmented by a set of Migration Case Study Reports that discuss individual migration efforts undertaken by members of the Task Force.

All documents produced by the Task Force (final reports, along with meeting minutes and other working papers) are available from the TEI web site in the section given above. The information presented there is current as of June 2004; however, the Task Force has fulfilled its charter and does not expect to update these materials after 01 July 2004.

Manuscript Description Task-force Chaired by Matthew Driscoll, charged February 2003

The Manuscript Description Task-force was charged to reconcile or merge the various schemes for encoding manuscript descriptions using TEI-conformant XML (principally MASTER and the TEI-MMSS workgroup, but also the scheme devised for the Repertorium of Old Bulgarian Literature). This task was fulfilled as of October 2004: the task force has produced a preliminary draft of the chapter for P5, which is now available for public comment.

The Council recognized that this work is a first step towards a better encoding for manuscript description within the TEI framework, and that much work remains to be done in formalizing several aspects of the descriptions which are currently given as running text only.

Metalanguage Workgroup Chaired by Sebastian Rahtz, charged March 2003

This report should be read in conjunction with other regular reports from the work group.

The workgroup held a face-to-face meeting in Paris in March 2004, at which the broad structure of the new ODD language for representing the innards of the TEI Guidelines was hammered out, after many revisions during 2003. Some of the details have been further polished over the following months. One of the remaining questions is whether the benefit of the abstraction from the underlying RelaxNG schema language provided by the ODD syntax makes up for the disadvantage of not being able to use all of RelaxNG syntax. The Council conference call in September nevertheless agreed that the ODD language should be frozen for at least 3 months to allow the editors and others time for proper assessment of its usability.

The current draft of TEI P5 is fully conformant with the ODD language.

The ODD processor, Roma, which delivers customized TEI schemas in RelaxNG, DTD and W3C Schema syntax, has been completely rewritten. It is now available as an experimental service from the TEI website, as a package for Debian GNU/Linux systems, and on a TEI customization of the Knoppix CD.

ISO/TEI Workgroup on Feature Structures TEI Liaison: Laurent Romary, charged March 2003

This is a joint activity of the TEI together with the ISO TC37/SC4 workgroup on Feature Structures. The aim is to revise the chapters on Feature Structure Representation (chapter 16) and Feature Structure Declaration Representation (chapter 26) and align them with the current work going on in the ISO WG.

The workgroup met in Nancy in November 2003, in Jeju-do, Korea, in February 2004, and in Paris in August 2004. The current draft of chapter 16 for P5 is available now. This chapter will form part of ISO DIS 24610.

Workgroup on Physical Bibliography Chaired by Terry Catapano, charged May 2004

This is the only new work group charged this past year. It has been charged to develop guidelines for encoding information about the physical structure of printed books: specifically, information about how individual pages are located and identified within the larger structures of signatures and gatherings. The audience being served is fairly narrowly construed as the community of descriptive and analytical bibliographers for whom this information is essential as a way of documenting the construction of the physical book and as the basis for further analysis of printing practices and the history of particular editions.

While this work group is not expected to cover issues specific to manuscripts, its recommendations should not unnecessarily preclude their use to encode similar information about manuscript documents, and to this end this work group is expected to maintain communication with the Manuscript Description Taskforce and to review the materials they have already produced in this area.

Charge, work plan and draft documents for this work group are available at .

Other activities of the Council

The Council was approached this spring by John Smith (Cambridge) with a proposal to develop markup constructs to represent word boundaries in Sanskrit compound words. The Council subsequently charged him with this task and requested that the recommendations be developed in a way that could also be applied to similar problems in other languages. Different proposals have been discussed within the workgroup so far (the latest one is available at ), but no final recommendations have been submitted yet.

Individual Members of the Council reported the following about their TEI-related activities this past year

Syd Bauman

As one of only three people who are paid to do TEI-C work, I am acutely aware that part of my job is devoted to those tasks that it is hard or impossible to get volunteers to do: the scut work, as it were. But it is certainly not the case that all is boring or meaningless; quite the opposite. Some of my activities are listed below.

  • At Council’s request, worked with the FSF to develop a mechanism to apply the GFDL to the Guidelines
  • Assisted the Chair and Executive Director with the creation of our operating procedures
  • Detailed editing and revisions of the Migration Task Force’s deliverables (TEI MI W 02, 03, and 06)
  • Developed the version numbering system for P5 and future releases of the Guidelines (TEI ED W 80)
  • Provided feedback on drafts of several chapters of P5 (including MS, TD, CH & WD, CR)
  • Revamped the structure of ED W 77 (P4 errors reported & fixes made), and wrote a CSS stylesheet to make it easy to read on the web.
  • Applied approximately 30 changes to P4, ranging from fixing minor typographical errors, to significant non-corrigible changes.
  • Published a new (error-fix) release of P4, both HTML and DTDs.
  • Progressed the process of removing textual attributes from P5 (the so-called war on attributes, ED W 79)
  • Assisted with the creation of the new work-group on physical bibliography.
  • Created and actively participated in the choice group
  • At Council’s request, with David Durand developed positions on how linking will work in P5, and began transition to XPointer
  • Assisted CE in coming to decisions on how language identification will work in P5
  • Initiated the discussions and work (now taken over by Sebastian Rahtz) on creating Debian packages for TEI

I have also been active in teaching, promulgating, and promoting TEI, including the following:

  • Participated in a three-day TEI training workshop at Wheaton College (Norton MA, USA)
  • Co-wrote and co-presented a paper on the TEI ODD system at the Extreme Markup Languages conference (Montreal QC, Canada)
  • Wrote and presented a paper on the TEI’s proposed use of xml:lang at ALLC/ACH (Göteborg, Sweden)
  • Continued to address user questions, both on and off TEI-L

Alejandro Bia

In the last few years, I’ve been involved in several efforts of mutual cooperation with Hispanic scholarly groups, and in the organization of courses, seminars, and other training and support activities in Spanish.

In an effort to spread the TEI markup scheme in Spanish-speaking countries, I have:

  • presented courses on XML-TEI markup in Italy, Spain and México: CLIP 2003, JBIDI 2004, 8 SII (Simposium Internacional de Informática). Several similar courses are already scheduled.
  • been cooperating with the TEI Internationalization project by translating the whole TEI tagset and part of the documentation descriptions into Spanish. The goal is to have this new Spanish tagset and documentation completed and integrated into P5.
  • cooperated with the TEI SGML-to-XML Migration Group.
  • participated in several research and development projects involving TEI markup like teiPublisher, automation of the Newsletters System of the MCDL, and others.

Lou Burnard

As a TEI editor the bulk of my time is supposed to be devoted to editing other people’s drafts for P5. In practice, I seem to have devoted more time to requesting, wheedling, demanding, or provoking drafts and constructive suggestions for revision (or facilitating production of same) than anything else much.

This year in particular I have:

  • provided drafting assistance for chapters MS, TD, and FS of P5, as well as parts of CO.
  • proposed and implemented a way of prioritizing feature requests on Source Forge; also provided feedback on all current such requests
  • tested and provided design input to the redesign of the ODD system and associated software developments

In addition, I have been active in promoting the TEI in various forums, notably:

Under AOB, I edited a CD called BNC Baby containing three English language corpora in TEI XML together with a preliminary version of the Xaira software which goes on sale in October. I also spent a substantial amount of time preparing a volume of essays on textual editing for publication by the MLA.

M. J. Driscoll

In addition to chairing the Manuscript Description taskforce mentioned above, and ensuring that its outputs were made available in time for this meeting, Matthew has been responsible for a number of TEI-related training and outreach exercises this year.

  • I held a two-day TEI workshop at Johns Hopkins University, 26 and 27 April 2004.
  • I delivered a 90-minute plenary lecture on the TEI at The Third International Conference on New Technologies and Standards: Digitization of National Heritage, Belgrade, 3-5 June 2004. This lecture will be published in the next volume of the Review of the National Center for Digitization.
  • Anne Mette Hansen and I held a one-day workshop on text encoding here at the institute on 27 March, where we introduced our colleagues (not for the first time) to the wonders of the TEI. Some of them actually now use it.
  • I organised (along with colleagues from Stofnun Árna Magnússonar in Reykjavík, Nordische Abteilung of Universität Tübingen and Abteilung für Nordische Philologie, Universität Zürich) a week-long summer school in manuscript studies, 6th-10th September 2004, one aspect of which was electronic text encoding. We had 26 attendees from all over Europe, principally Germany, Switzerland and the UK. It was a tremendous success.

Sebastian Rahtz

I have worked for the TEI in 9 areas over the last year:

  • I led the Meta workgroup in a deep revision of the ODD language in which the Guidelines are written. I have also been responsible for all the implementation of the processing of the Guidelines to generate schemas and documentation.
  • I wrote a replacement (named Roma) for the TEI Pizza Chef, which generates customized schemas for P5. This was then rewritten from scratch, under my supervision, by Arno Mittelbach and delivered for testing. It has been kept up to date with changes to the ODD language.
  • I reimplemented the work of Alejandro Bia on internationalization of the TEI element, class and attribute names, and worked with Alejandro on subsequent revisions. This system is supported by Roma.
  • I packaged some key TEI deliverables (TEI-Emacs, schemas, dtds, document, stylesheets, Roma) as Debian packages, according to the Debian XML guidelines, and produced an experimental CD for distribution to members
  • I developed a system using an eXist database to deliver the P5 sources via a query system, delivered on the same CD.
  • I taught courses in Oxford on the TEI, and talked about the TEI at several international conferences
  • I did the initial setup of the Sourceforge project for the TEI and have gradually transferred most of my work to it (Roma, stylesheets, internationalization, tei-emacs)
  • I assisted the TEI European editor on day to day work of the TEI web site and Guidelines maintenance.
  • I assisted Julia Flanders and Geoffrey Rockwell in the creation of a proposed new logo for the TEI

Laurent Romary

  • In summer 2004, the Print Dictionary chapter was fully revised to check its consistency with the new ODD specification platform. This work, conducted by LR in conjunction with LB and SR, was also an opportunity to experiment with extension mechanisms in ODD. A call for contributions was made on the TEI list to gather dictionary samples and observe current practices in this domain. On the basis of the dozen answers received, some proposals have been made to improve the expressive power of the PD chapter in the domains of historical notes and morphology.
  • A series of discussions has taken place within the TEI list and offline as to the best strategy to adopt with regard to the outdated P4 chapter on terminology. It is proposed that this chapter be fully rewritten to conform to the TBX syntax as recommended by the Localization Industry Standards Association (www.lisa.org), which in turn conforms to the ISO 16642 standard (Terminological Markup Framework). The future chapter would limit its scope to a set of core descriptors and would exemplify extension mechanisms for anyone who wants an enlarged model.

Natasha Smith

I was part of the SGML to XML conversion task force. I also give presentations on TEI and TEI-related projects at the UNC School of Information and Library Science every semester in different classes.

Edward Vanhoutte

As a new member to the Council, I mainly listened in on the activities, debates, and plans, ready to take action in my second year of service. However, I worked for the TEI in the following areas over the last year:

Christian Wittern

I have worked for the TEI as chair of the TEI Council and as chair of the Character Encoding working group this past year. In addition to that, I have given tutorials and introductions to TEI and markup in general, as well as guidance to TEI projects in Taiwan and Japan.