Notes on the TEI Class System


Licensed under

From TEI Council email March 2006

Converted to P5. Syd Bauman. Add section on dateStruct and timeStruct. Lou. Removed invalid type attribs; added some notes for dispositions agreed at Kyoto council meeting. Syd Bauman. Added phrase-level markup and a couple of notes.

This is a summary of input received from TEI Council Members on additional class-related changes, together with quick comments by LB and a few by SB.

Corpus

  • textDesc and settingDesc and all their children should remain part of this module. However, particDesc and all its children (person-related) should have their inclusion in this module given serious reconsideration by the personography taskforce. At the very least, the descriptions of particDesc and its children need to be rephrased so that they do not say these elements apply only to participants in a linguistic interaction. Comment: for discussion at the PTF: the proposed change in wording is easily effected. Reword descriptions as needed.
  • Where should prosopographic elements go: in CC, in ND, or in their own module? Is there to be a single module for personography? Module dependency issues: should modules have explicit dependencies? Council agreed: not to leave in limbo. Move it into ND: agreed by 10 votes. Move prosopographic material to ND.
  • How should the dependency of msdescription on ND be resolved? Should we introduce mechanisms for supporting module dependency or should the class system do this job? Council agreed we should fix the ms-desc/ND dependency and see if there are any others. Provide a facility for indicating that one module should be used with others. Rethinking of meaning of TEI classes. Revise discussion of classes in TD and test sequential classes with msDesc
  • Around half a dozen of these elements have a content model of “att.datable model.personPart”. I can see people wanting to add things that are like this. I guess it is best for them to add both, but I point it out just in case it comes up elsewhere or a class should be made. Obviously this depends on the outcome of the personography taskforce. In general, we don’t mix attribute and model classes within a single class
  • Similarly, a number of elements are all members of “att.declarable model.profileDescPart”. I think all this really indicates is that the class system is working properly. Phew!
  • A large number of the elements in corpus have a content model of macro.phraseSeq. I really think it is improbable that many of these elements *really* need all/most of the possible content which this enables for them. Of those which are macro.phraseSeq, those which I think could do with a much tighter content model would include: affiliation, channel, constitution, derivation, education, factuality, firstLang, langKnown, locale, nationality, occupation, preparedness, purpose, residence, socecStatus. For most of these one only needs a small bit of limited text without the various in-depth phrase markup that macro.phraseSeq seems to give. I was less sure about wanting to constrict the macro.phraseSeq content model for activity for some reason. Similarly with domain and interaction, but there I vacillated on whether they (as perhaps some of those above) could be made empty or something more constrained (macro.glossSeq didn’t seem right). I want to argue that birth and death (or however they are reformed by the personography taskforce) should have a tighter content model as well. This is probably true of several other things in the header too: do we want to define a restricted subclass of phrase elements? Yes, I think so. We can distinguish data-like things and text-like things, but what about things that can be either? How would we make their content context-dependent? Implement reduced-phrase class proposed in
  • The most complex content models are in: listPerson, particDesc, setting, and textDesc. textDesc does not use classes, so it is possibly a candidate for model.textDescPart. I can see someone wanting to add something to textDesc. Agreed: we should implement model.textDescPart. Otherwise, setting is the only one I would consider re-examining. Its content model is: model.pLike+ | ( model.nameLike.agent | model.dateLike | model.timeLike | activity | locale | rs )*. I looked to see if model.nameLike.agent | model.dateLike | model.timeLike appears elsewhere as a group (it doesn’t really). What’s the proposal here? model.settingPart?

    • Make model.settingPart a superclass of model.nameLike.agent, model.dateLike, and model.timeLike that also contains activity and locale, and forget about rs.
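The module-dependency question raised in the list above (msdescription relying on ND) could be prototyped along the following lines. This is only a sketch under stated assumptions: the table format and the function name `modules_required` are hypothetical, and the only dependency taken from these notes is msdescription on ND.

```python
# Hypothetical dependency table. Only the msdescription -> ND edge comes from
# the notes; the table format itself is an assumption made for illustration.
MODULE_DEPENDENCIES = {
    "msdescription": {"ND"},
    "ND": set(),
}


def modules_required(selected, dependencies=MODULE_DEPENDENCIES):
    """Return the selected modules plus everything they depend on, transitively."""
    required = set()
    pending = list(selected)
    while pending:
        module = pending.pop()
        if module in required:
            continue  # already handled; also guards against circular dependencies
        required.add(module)
        pending.extend(dependencies.get(module, ()))
    return required
```

Selecting msdescription alone would then pull in ND as well, which is the behaviour the "one module should be used with others" facility is meant to provide. Whether this belongs in a separate mechanism or in the class system remains the open question recorded above.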

Notes on Reduced Phrase Class

---------- what we have now ----------

model.phrase =
  model.graphicLike = binaryObject eg egXML formula graphic
  model.hiLike = distinct emph foreign gloss hi mentioned soCalled term title
  model.lPart = caesura rhyme
  model.oddPhr = att code gi ident specDesc specList tag val
  model.pPart.data =
    address
    model.dateLike = date dateRange dateStruct
    model.measureLike = measure num
    model.nameLike = geogName lang placeName rs
    model.nameLike.agent = name orgName persName
    model.timeLike = time timeRange timeStruct
  model.pPart.edit = abbr add app choice corr damage del expan orig reg restore sic space supplied unclear
  model.pPart.msdesc = catchwords dimensions handShift heraldry locus material origDate origPlace secFol signatures watermark
  model.ptrLike = ptr ref
  model.ptrLike.form = oRef oVar pRef pVar
  model.segLike = c cl m phr s seg w

---------- some possible changes ----------

model.pPart.edit = abbr add app choice corr damage del expan orig reg restore sic space supplied unclear
becomes:
model.pPart.edit =
  model.pPart.edit.??? = abbr choice expan
  model.pPart.edit.transcribe = add app corr damage del orig reg restore sic space supplied unclear

----------

model.hiLike = distinct emph foreign gloss hi mentioned soCalled term title
becomes:
model.hilighted =
  model.hiLike = hi
  model.emphLike = distinct emph foreign gloss mentioned soCalled term title

----------

model.oddPhr = att code gi ident specDesc specList tag val
becomes:
model.specDescLike = specDesc specList
model.xmlPhrase = att code gi ident tag val
Arguably <code> and <ident> should be in model.emphLike instead.

---------- summary of possible changes ----------

model.headerPhrase =
  model.graphicLike [1]
  model.xmlPhrase
  model.pPart.data
  model.pPart.msdesc
  model.ptrLike
  model.emphLike
  model.pPart.???

I.e., model.headerPhrase does *not* have:
  hi [2]
  model.lPart
  add app corr damage del orig reg restore sic space supplied unclear [3]
  model.ptrLike.form
  model.segLike

Notes
-----
[1] An argument could be made to divide model.graphicLike so that <binaryObject> and perhaps <formula> and <graphic> were not in model.headerPhrase.
[2] And perhaps not distinct, either?
[3] And perhaps not reg, either?
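To check what the proposed model.headerPhrase would actually admit, the class listing above can be expanded mechanically. The sketch below is illustrative only: the dictionary layout and the `expand` helper are hypothetical, model.nameLike.agent is listed as a direct member of model.pPart.data for simplicity, and the unnamed "model.pPart.???" class is assumed to be the abbr/choice/expan group.

```python
# Class membership copied from the listing above. Names starting with "model."
# that appear as keys are classes; anything else is treated as an element.
CLASSES = {
    "model.graphicLike": ["binaryObject", "eg", "egXML", "formula", "graphic"],
    "model.emphLike": ["distinct", "emph", "foreign", "gloss", "mentioned",
                       "soCalled", "term", "title"],
    "model.xmlPhrase": ["att", "code", "gi", "ident", "tag", "val"],
    "model.dateLike": ["date", "dateRange", "dateStruct"],
    "model.measureLike": ["measure", "num"],
    "model.nameLike": ["geogName", "lang", "placeName", "rs"],
    "model.nameLike.agent": ["name", "orgName", "persName"],
    "model.timeLike": ["time", "timeRange", "timeStruct"],
    "model.pPart.data": ["address", "model.dateLike", "model.measureLike",
                         "model.nameLike", "model.nameLike.agent",
                         "model.timeLike"],
    "model.pPart.msdesc": ["catchwords", "dimensions", "handShift", "heraldry",
                           "locus", "material", "origDate", "origPlace",
                           "secFol", "signatures", "watermark"],
    "model.ptrLike": ["ptr", "ref"],
    # Assumption: the still-unnamed class holds the non-transcriptional
    # editorial elements split out of model.pPart.edit.
    "model.pPart.???": ["abbr", "choice", "expan"],
    "model.headerPhrase": ["model.graphicLike", "model.xmlPhrase",
                           "model.pPart.data", "model.pPart.msdesc",
                           "model.ptrLike", "model.emphLike",
                           "model.pPart.???"],
}


def expand(name, classes=CLASSES):
    """Return the set of element names reachable from a class name."""
    if name not in classes:
        return {name}  # not a known class, so treat it as an element
    elements = set()
    for member in classes[name]:
        elements |= expand(member, classes)
    return elements
```

Calling `expand("model.headerPhrase")` yields the flat element set, which makes it easy to confirm that hi, the transcriptional elements, and the members of model.lPart, model.ptrLike.form and model.segLike are indeed excluded.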

Gaiji

  • char and glyph have desc and gloss children. These should be replaced by macro.glossSeq, which adds equiv, and makes these elements behave like *Spec in the TD chapter. The equiv will be useful one day. Agreed: done
  • char and glyph refer to graphic explicitly. This should be replaced by (model.graphicLike)*, to allow for binaryObject and whatever else comes up in future (eg sound recordings? video clips?). Whatever it takes to represent the character. Agreed: done
  • mapping has (text | g)*. This should be macro.xtext, like value. Agreed: done

Figures/Tables/Graphics:

  • the role, rows, and cols attributes of cell and row should be managed via an attribute class, att.tableDecoration, allowing for future development. Agreed: done
  • figure refers to graphic and binaryObject explicitly. This should be replaced by (model.graphicLike)* Agreed: done

Nets

[no change]

Spoken

  • The elements event, kinesic, pause, vocal, and u are all members of att.ascribed, att.timed, and model.divPart.spoken. Is there any benefit in making a class which provides all of these in case someone wanted to add a similar spoken element? Not a good idea to mix attribute and model classes in a single class
  • Should event and kinesic get att.typed? Possibly. What values would you suggest? Add event and kinesic to att.typed.
  • A number of elements in the spoken module have macro.glossSeq as their content model, but this seems sufficient for them. Is there a “not” missing here? If so, in what ways? If not, what’s the recommendation? LR says we should define a new class model.descLike, containing desc, and replace some occurrences of macro.glossSeq with it.
  • The only element with a more complex content model is u, which contains ( text | model.gLike | model.phrase | model.divPart.spoken | model.global )*. I was wondering if it needs model.phrase or should be model.pLike, but that is just because of the way I think about it. I think model.phrase was included to permit model.pPart.edit or model.choicePart members: certainly there’s scope for simplification here. We should certainly re-examine whether or not model.phrase or something else belongs in there, but I’m not sure model.pPart would be a good idea; at the very least it requires some serious thought. u elements are somewhat pLike themselves already. I.e., although they are not part of model.pLike, they are siblings of p inside div. (They are not pLike because a u can go inside a u.) The implications of putting model.pLike as part of the content of u include having p elements next to u elements that have p elements inside.

Manuscript Description

Overall, this is a module which makes use of the class system fairly well.

  • Content models: We’re concerned that many elements’ contents are macro.specialPara; this really isn’t necessary in all cases. Why, in support for example, do you need: another msDescription, address, app, caesura, castList, cit, eTree, classSpec, elementSpec, gap, eg, egXML, rhyme, etc.? This is a general problem, of course. Would the more limited set of elements proposed here be the same as that needed for header/corpus elements? Use reduced phrase class as for other header elements.
  • There are many elements whose content model contains sets of elements which might be more usefully expressed as a class:

    • msIdentifier has a similar content model to altIdentifier. Create new msIdentifierPart? A class wouldn’t help here, since order is important. There’s a case for a macro though. Define a macro: no, they are insufficiently similar.
    • Other elements with possibilities for new content model classes include: adminInfo (adminInfoPart). Or maybe add recordHist, availability, custodialHist to model.Notelike and change content to model.Notelike+? Use new class mechanism.
    • dimensions (dimensionsPart) What other possible members of the class can you suggest? where else might the class be used?
    • history (historyPart) If you don’t use model.pLike, the child elements here are intended to be supplied in the order given, and it’s hard for me at least to see how you’d extend their number
    • msItemStruct (msItemStructPart) Again, order is significant here
    • msPart (msPartPart (!)). Again, order is significant here.
    • physDesc (physDescPart) Agreed: the components here should definitely be expandable, and the order is completely arbitrary
    • supportDesc (supportDescPart) I’m less sure about this one, but sounds plausible… Define non-repeatable model class.

  • Attributes: Possible new classes:
    • msContents, msItem, msItemStruct all have class and defective. Agreed: I suggest att.classified (for class) and att.state (for defective) could be useful elsewhere.
    • explicit and incipit have defective
    • binding and seal have contemporary. Should this be a part of att.state? No action agreed.

Transcription

Susan and I have continued our discussion on the attribute classes in transcr, specifically looking at the attributes for add, addSpan, del and delSpan. As they stand, these elements all have slightly different attributes assigned to them, but the editors may want to consider adding attributes currently defined for del and delSpan to add and addSpan and creating new classes. Yes, there is definitely scope for improvement here!

Is there such a thing as a “faulty addition”: an addition that contains too little or too much text (as a result of eyeskip, for example)? If so then we may want to define status on add and addSpan. type could be useful on add as it is for del (primary and secondary additions, for example). Agreed: type is needed on addSpan

There are still two differences amongst the attributes for these four elements: addSpan and delSpan share to, and add and addSpan share place. We really don’t think place makes any sense for del. So, we can consider either new attribute classes for:

addSpan, delSpan (hand, type, status, to, with place separate on addSpan) and add, del (hand, type, status, with place separate on add)

OR for

addSpan, add (place, hand, type, status, with to separate on addSpan) and delSpan, del (hand, type, status, with to separate on delSpan).

Or an attribute class with hand, type, status, and declare place separate for the adds and to separate for the spans.

Can we distinguish the functions of resp (from att.editLike) and that of hand? What is status for — how does it differ from type? We should probably add these elements to the att.spanning class so that they all get the same “horse-like” treatment.

While I don’t like how att.spanning works (mostly because it is not “horse-like” enough), I agree completely that addSpan and delSpan belong in that class. I like Dot & Susan’s third suggestion: hand, type, status in new class att.subEditStuff of which all 4 are members; place in new class att.placement of which add and addSpan are members (and can note be a member of this class, too?); and addSpan and delSpan get added to the already existing att.spanning. Implement Syd’s proposal.
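The effect of the third suggestion on each of the four elements can be worked out with a small lookup. This is a sketch, not the ODD implementation: the table layout and the `attributes_for` helper are hypothetical, att.subEditStuff and att.placement are the names floated above, and it is assumed here that att.spanning is the class that supplies to.

```python
# Attribute classes under the third suggestion. That att.spanning carries the
# "to" attribute is an assumption made for this illustration.
ATT_CLASSES = {
    "att.subEditStuff": {
        "attributes": ["hand", "type", "status"],
        "members": ["add", "addSpan", "del", "delSpan"],
    },
    "att.placement": {
        "attributes": ["place"],
        "members": ["add", "addSpan"],
    },
    "att.spanning": {
        "attributes": ["to"],
        "members": ["addSpan", "delSpan"],
    },
}


def attributes_for(element, att_classes=ATT_CLASSES):
    """Collect the attributes an element inherits from every class it is in."""
    inherited = set()
    for spec in att_classes.values():
        if element in spec["members"]:
            inherited.update(spec["attributes"])
    return inherited
```

Under these assumptions del ends up with hand, type and status only, add gains place, delSpan gains to, and addSpan receives all five, so no attribute needs to be declared directly on any of the four elements.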

Linking

This module seemed pretty well classified already. As far as I can see, applying the procedure which Christian gave, there are no changes required.

I saw a mention in edw87 of the ab element being removed - is this still on the cards? It says moved not removed! and it has been…

Linking and Analysis

has an example which mentions the attribute targOrder, but there seemed to be no definition of this attribute. I guess the revision of the examples is another item on a future agenda 🙂 Done

  • character: element c { att.global.attributes, text } should include model.gLike? That is: element c { att.global.attributes, text | model.gLike }. Yes: should be macro.xtext.
  • morpheme: There’s a commonality between m (morpheme) and w (word). Current: element m { att.global.attributes, attribute baseForm { data.word }?, ( text | model.gLike | model.blockLike | c | model.global )* }, with that content to be replaced by model.mPart.
  • word: Current: element w { att.global.attributes, attribute lemma { data.word }?, ( text | model.gLike | model.blockLike | w | m | c | model.global )* }, with that content to be replaced by (model.mPart | w | m)*.
  • I suggest defining m’s content model as model.mPart, and w’s content would then be: (model.mPart | w | m)*. Though this commonality exists, I’m not sure that introducing an additional superclass would simplify things particularly. Where else might it be used? Isn’t this one commonality sufficient excuse for a new class? Make common class of elements shared by w and m; proposals as per Syd.
  • Finally, the element span has attributes from and to (data.pointer type) which might be worth grouping into an attribute class. I thought these might be usable by elements in other modules which attach some metadata to a span of text, though I couldn’t find any such elements myself. Yes, we need to do a thorough overhaul of the way stand-off markup is implemented, and some new classes will be needed for this purpose: see comment above about the att.spanning class. Ask standoff SIG to consider the options and propose something!
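The claim that w's content is simply m's content plus w and m can be checked by treating each content model as a set of alternatives. The sketch below is only that: the constant and function names are hypothetical, and model.mPart is modelled as a plain set of alternatives (including text) rather than as a real TEI model class.

```python
# Alternatives in the current content models of m and w, as quoted above.
M_CONTENT = {"text", "model.gLike", "model.blockLike", "c", "model.global"}
W_CONTENT = {"text", "model.gLike", "model.blockLike", "w", "m", "c",
             "model.global"}

# Proposed shared grouping: exactly what m allows today.
MODEL_MPART = set(M_CONTENT)


def proposed_w_content(m_part=MODEL_MPART):
    """Expand the proposed (model.mPart | w | m)* into its alternatives."""
    return m_part | {"w", "m"}
```

Since `proposed_w_content()` comes out equal to `W_CONTENT`, the refactoring would not change what either element accepts; it only names the part the two models share.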

textcrit

  • Content models: rdg and lem share the same content model; perhaps create a new class.
  • Attributes: Only two elements in this module have explicit attributes, and there is not enough overlap to justify a new class.
  • Class Memberships: no suggestions
  • witness, wit, and witDetail all contain macro.paraContent. This seems very liberal, and the editors might want to consider a more specific class. Use reduced phrase class

transcription

  • Content models: Content models for all transcr elements are by model class.
  • Attributes: hand and handList share @style, @ink, @writing, @resp. New class? (handPart) It is a semi-open question as to whether we should go on supporting both hand and handDesc. It is probably not worthwhile to make attribute classes of one, but anyway:

    addSpan, damage, delSpan, restore, supplied: @hand
    damage, delSpan, fw, restore: @type
    damage, supplied: @agent
  • Class membership: Should fw be a member of model.milestoneLike? It is not empty. I would say not, though there is scope for making it the (sole) member of a class of elements for page decoration

Dictionary Module

Element content models with possible class issues:

  • dicteg: “q | quote | cit” should be replaced by model.qLike (“agree” – LR). Ok
  • entry: “hom | sense” are referenced directly in the content model. Are they likely to be extended with similarly behaving elements? If so, then they may be candidates for a class. (“No. They have to be explicitly stated as part of entry. Leave as-is.” LR)
  • etym: Lots of individual elements in content model: “usg | lbl | def | trans | tr | dicteg | xr”. These are a subset of model.entryParts. Are they likely to be extended with similarly behaving elements? If so, then they may be candidates for a class. (“Yes. Make a class out of those elements.” LR) OK
  • hom: “sense” is referenced directly in the content model. It is also referenced in re. If similar-behaving elements are likely as extensions, perhaps it should be in a class of one. (“cf. above. Leave as-is”. – LR)
  • re: “sense” is referenced directly in the content model. It is also referenced in hom. If similar-behaving elements are likely as extensions, perhaps it should be in a class of one. (“idem.” LR)
  • xr: usg and lbl are referenced directly in the content model. Are they likely to be extended with similarly behaving elements? If so, then they may be candidates for a class. (“Yes.” LR) OK

Drama

Element content models with possible class issues:

  • castGroup: References ( castItem | castGroup ), which are also referenced in castList. Perhaps they should become a class. I think that something within a castList is either an item or a group of items: I can’t imagine wanting to add anything else. So I am less inclined to agree that a class is appropriate here
  • castItem: “role | roleDesc | actor” are part of the content model. They should be grouped together as a class for easy extension and addition of similarly behaved elements. Agreed
  • castList: References ( castItem | castGroup ). See castGroup above.

Names and Dates

  • Is dateStruct necessary?

    <dateStruct> has ( text | model.gLike | model.datePart | model.global )*
    <date> has ( text | model.gLike | model.phrase | model.global )*

    (date’s model is via macro.phraseSeq.) But if we instead give date the union of these two content models, ( text | model.gLike | model.phrase | model.datePart | model.global )*, couldn’t we consolidate by getting rid of dateStruct? Note that the “Struct” in dateStruct does not mean “no PCDATA”, unlike other *Struct elements.

    Council voted to implement this change. The content models have been merged, the dateStruct and timeStruct elements have been deleted; furthermore time was added to class att.editLike, and both date and time were added to att.datePart.

    That last change (using att.datePart) changes the datatype of value from data.temporal to either data.temporal or data.duration, and it has not been decided that this is a good thing.

    The full and zone attributes are under consideration for removal.