Draft circulated to TEI Council 1 April 08
Email from Brett Zamir of , tagged in TEI Lite by LB
While TEI makes no claims about how a document is to be rendered (except to the extent that it allows description of the original formatting and that publishers wish to reflect that formatting), it is a general expectation that the TEI documents a project creates can at some point be rendered in a more human-readable manner. The desired formatting should not prompt one to violate the TEI abstract model by altering the semantics of a document in order to adjust output formatting. Nevertheless, the process of creating formatted output (or even of planning for it well beforehand) may lead a project to reconsider some aspects of its semantic markup, in addition to the more conventional step of adjusting a stylesheet, so the issues raised here go beyond the work of a designer. This chapter discusses various issues related to formatting a TEI document.
In order to fully respect the original formatting of a document, it is
necessary to consider (and thoroughly use) the global @rendition
attribute and/or the @render attribute on
While semantic hooks and reliance on the typical formatting of
specific tags for cases where @rend and @rendition are not needed to
override typical behaviors might be sufficient to create an output
document reflecting the formatting of the original document, if the
original rendering is important information to preserve, this can be
done more explicitly by ensuring that all elements are given a
However, note that
For maximum specificity in encoding formatting details, the content
supplied within
While it might be sufficient in most cases, there is no formal means
at present to express that a particular style rule should target CSS
pseudo-elements (like :before, :after, :first), such as one might wish
to do in specifying the addition of distinct content at the beginning
or end of a tag (e.g., adding left and right curly quotes to a
One might find oneself tempted to force the rendition mechanisms in
TEI (@rend or @rendition or
This might necessitate a different stylesheet or, as is probable in most cases, modifications to the default stylesheets provided by TEI, if the parameter options (assuming XSL stylesheets are used) are insufficient to express a project’s output requirements. It is possible, for example, to have the stylesheets recognize multiple attributes, even at the same time.
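As a rough illustration of recognizing more than one attribute at once, the following Python sketch reads both @rendition and @rend from the same element and merges them into one style string. It is not how the TEI stylesheets work; it assumes, purely for illustration, that the rendition elements contain CSS, that the sample document uses no namespace, and that REND_TO_CSS is a hypothetical project-specific mapping of @rend tokens.

```python
import xml.etree.ElementTree as ET

# The xml: prefix is predeclared, so xml:id parses to this qualified name.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical project mapping from free-form @rend tokens to CSS.
REND_TO_CSS = {"bold": "font-weight: bold", "italic": "font-style: italic"}

# Simplified sample (no TEI namespace; rendition content assumed to be CSS).
SAMPLE = """<TEI>
  <teiHeader>
    <tagsDecl>
      <rendition xml:id="center">text-align: center</rendition>
    </tagsDecl>
  </teiHeader>
  <text>
    <p rendition="#center" rend="bold">Hello</p>
  </text>
</TEI>"""


def css_for(elem, renditions):
    """Collect CSS from @rendition pointers first, then from @rend tokens."""
    rules = []
    for pointer in elem.get("rendition", "").split():
        rule = renditions.get(pointer.lstrip("#"))
        if rule:
            rules.append(rule)
    for token in elem.get("rend", "").split():
        rule = REND_TO_CSS.get(token)
        if rule:
            rules.append(rule)
    return "; ".join(rules)


root = ET.fromstring(SAMPLE)
# Index every rendition element by its xml:id.
renditions = {r.get(XML_ID): (r.text or "").strip() for r in root.iter("rendition")}
p = root.find("./text/p")
style = css_for(p, renditions)
```

Giving @rendition precedence in the ordering is an arbitrary choice here; a project could just as well let @rend come first or let one override the other.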
The TEI stylesheets are evolving along with the TEI project, however, so the project may be open to certain changes (whether optional or required) to its default stylesheets, if the changes are of interest to a wider audience and TEI has the resources to implement them. Given that the code of these stylesheets is open source, it may benefit TEI, its users, and the individual project alike for stylesheet improvements (or improvements to other TEI resources, for that matter) to be contributed back to the community, as this spares the individual project from having to reapply its modifications each time an update occurs.
The more standard (though not standardized) the styling expectations and stylesheets become, the more likely it is that TEI processing applications can be used to render TEI in a familiar format (such as when a document is obtained directly off the web), while still allowing publishers full freedom to deviate from such common conventions if they wish.
While one should not subvert the semantics of a TEI document in order to control formatting, viewing a formatted document might prompt a project to consider changes to the original TEI documents, beyond customizing a stylesheet alone, such as:
- giving a more detailed encoding of the original rendering (as that
information can be used to produce output rendering, assuming again
that the original document being represented indeed possesses this
rendering), using rendition, tagUsage render, @rendition, and @rend
- adding more semantic “hooks”, whether this is the use of hitherto
unused elements or attributes such as @n or @type (potentially with
the adding of generic elements like seg), which can provide more semantic detail about certain text that can in turn be targeted by a stylesheet to provide more granular control in output formatting. This may also have the benefit of providing more semantic richness to the document (ideally using the more specific elements already recommended for this purpose). Such semantic “hooks” can also be of the variety that ensures that the output formatting includes sufficient accessibility features, such as making alternative text available along with any graphics or images that could not otherwise be interpreted by a speech browser.
Besides formatting concerns leading one to add semantic distinctions to a TEI document, one may also wish to encode a certain degree of semantic information in one’s formatted output (to the extent the output formatting language allows), and to consider how far output formatting markup should be separated from more generic structural markup (e.g., creating CSS to hold styles with XHTML used to present the structure, versus encoding structural and formatting markup together). Both are discussed below.
While it may often be the case that TEI will be converted to a formatted output in which semantic information is lost, certain output formats allow some if not all semantic information to be retained in some manner. For example, XHTML can follow the approach of microformats (http://microformats.org), using the global and generic XHTML @class attribute to hold information such as the original TEI tag name. While it would likely be too cumbersome to originate documents in such a format (assuming all TEI semantics could be encoded with such an approach), it offers the advantage that one might, for example, use a web browser to obtain a document already pre-rendered, yet use a microformat processor within the browser (possibly available as a browser extension) to search for semantic information.
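The idea of carrying the TEI tag name in @class can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the "tei-" class prefix, the BLOCK_LEVEL set, and the choice of div and span as output elements are all hypothetical, the sample uses no namespace, and attributes on the TEI elements are simply dropped.

```python
import xml.etree.ElementTree as ET

# Hypothetical: TEI elements treated as block-level in the output.
BLOCK_LEVEL = {"p", "div", "head"}


def to_xhtml(tei_elem):
    """Convert a TEI element to a div/span whose @class records the TEI name."""
    html_name = "div" if tei_elem.tag in BLOCK_LEVEL else "span"
    out = ET.Element(html_name, {"class": "tei-" + tei_elem.tag})
    out.text = tei_elem.text
    for child in tei_elem:
        converted = to_xhtml(child)
        converted.tail = child.tail  # keep the text that follows the child
        out.append(converted)
    return out


tei = ET.fromstring("<p>Visit <placeName>Paris</placeName> soon.</p>")
html = to_xhtml(tei)
result = ET.tostring(html, encoding="unicode")

# A later "semantic" search over the rendered output, by TEI name.
places = [e.text for e in html.iter() if e.get("class") == "tei-placeName"]
```

Here result is the string `<div class="tei-p">Visit <span class="tei-placeName">Paris</span> soon.</div>`, and places recovers the place name from the formatted output alone.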
It has become a generally recommended practice for even XHTML
documents on the web to separate their formatting content (as with
CSS) into a separate file from the structural content (of paragraphs,
generic divisions, etc.). This offers various advantages such as speed
in downloading (by browser caching for repeatedly used stylesheet
files or by those using speech browsers being able to avoid
downloading visually-oriented stylesheets), or flexibility in
subsequent style changes. While one might define an XSL stylesheet to
create specific XHTML @class attribute values which are associated
with those classes targeted in a predefined CSS stylesheet, XSLT 2.0
might be employed to utilize information such as contained within
@rend or
Despite the generally recommended practice of separating styles from
structure and semantic information, given the present absence of a
means of making queries which utilize style information contained in
separate files, it may be conceivable for some to wish to have their
formatting output mixed in with structural output (the @style
attribute might be harder to parse in a query than if specific XHTML
formatting structures were used–even though these may be deprecated
in later versions)–just as one might prefer to encode say italic
emphasis using
Before considering the usual rendering and default rendering options for specific elements (along with any specific attributes), it is worth considering some general issues pertaining to certain types of elements.
Elements differ in the likelihood a project will wish to render them
in a formatted output document. They range from editorial information
which might never be printed out for common viewing, to elements which
will sometimes be printed out (such as a
For the case of those which will always be printed out, one can make
their styling explicit by using
Moreover, a stylesheet might depend on element-specific or global attributes (whether semantic or rendition-related), targeting elements that carry or lack these attributes, or that carry specific values, in order to display or hide them selectively.
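Such attribute-dependent selection might look roughly like the following Python sketch. The rules in HIDE_RULES are hypothetical project decisions, not anything TEI prescribes, and the sample fragment uses no namespace.

```python
import xml.etree.ElementTree as ET

# Hypothetical project rules: (element name, attribute, value) to hide.
HIDE_RULES = [
    ("note", "type", "editorial"),
    ("seg", "type", "internal"),
]


def is_displayed(elem):
    """Return False when the element matches one of the hide rules."""
    for name, attr, value in HIDE_RULES:
        if elem.tag == name and elem.get(attr) == value:
            return False
    return True


doc = ET.fromstring(
    '<p>Text <note type="editorial">check source</note>'
    '<note type="gloss">a gloss</note></p>'
)
shown = [n.text for n in doc.iter("note") if is_displayed(n)]
```

With these rules only the gloss note survives, so shown is `["a gloss"]`; an element lacking the attribute altogether is displayed.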
Elements which occur in the header will generally not be printed out, though for some projects’ purposes it may be useful to include this information (e.g., bibliographic data) in the formatted output.
While other elements that occur within the running text will generally be printed out, it is important to understand that with TEI (which, as cannot be emphasized enough, is not a formatting language) this will not always be the case. A project may have editorial information that should not be printed within the running text (or at least should not appear alongside it), either because the encoder-added information would disrupt the flow of the text (e.g., of a narrative) or because some viewers might even consider it irreverent (such as for scriptural works). In such cases it is important to be aware of all the tags whose content the project might not want printed, so that the stylesheet (possibly in conjunction with special semantic TEI markup, if not markup indicating original rendering) does not display those tags’ content.
Elements which are defined by the following macros are generally not to be displayed:
- macro.limitedContent (http://www.tei-c.org/release/doc/tei-p5-doc/html/ref-macro.limitedContent.html): desc, fDescr, figDesc, fsDescr, meeting, rendition, tagUsage, witness
- macro.phraseSeq.limited (http://www.tei-c.org/release/doc/tei-p5-doc/html/ref-macro.phraseSeq.limited.html): activity, age, authority, channel, classCode, constitution, creation, derivation, domain, factuality, funder, interaction, interp, langKnown, language, locale, metSym, preparedness, principal, purpose, resp, span, sponsor, valDesc
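A processor could keep such elements out of the displayed text along the following lines. This Python sketch is only one possible approach: SUPPRESSED holds a small subset of the names listed above, chosen for illustration, and the sample fragment uses no namespace.

```python
import xml.etree.ElementTree as ET

# Illustrative subset of the element names listed above.
SUPPRESSED = {"desc", "figDesc", "interp", "span", "rendition", "tagUsage"}


def visible_text(elem):
    """Concatenate text content, skipping suppressed elements entirely."""
    if elem.tag in SUPPRESSED:
        return ""
    parts = [elem.text or ""]
    for child in elem:
        parts.append(visible_text(child))
        # Text following a child belongs to the parent, so it is always kept.
        parts.append(child.tail or "")
    return "".join(parts)


doc = ET.fromstring("<p>The king<interp>irony</interp> spoke <hi>loudly</hi>.</p>")
text = visible_text(doc)
```

For this fragment text is "The king spoke loudly.": the content of interp is dropped, while the running text after it is preserved.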
Moreover, there are some elements such as those in model.noteLike
(
Still others include items such as may be contained within
Likewise with elements belonging to model.pPart.transcriptional: add,
app, corr, damage, del, orig, reg, restore, sic, supplied, unclear.
One may or may not wish to indicate
(any others????)
Since there is no way of knowing whether some of the elements
mentioned above such as
Note that despite its being listed above, an element such as
One exception to @rend, @rendition (or their absence) indicating
original formatting is that within a
The addition of a @rend within
Likewise, especially if the default rendering of
Some elements or elements with certain attributes may need special
consideration for output such as @copyOf or
Most attributes are used with coded values, as they are not meant to represent human language or to be displayed. Text attributes are the exception, though it is commonly preferred for an XML language to represent such content as elements, so that nested subelements representing markup at the phrasal level, etc. can be added within as needed.
Text attributes have generally been removed from TEI, and one might not wish to output some of those that remain in a formatted version anyway. If one does wish to include, for example, @reason in one’s output, one will be unable to add styling which depends on child elements for more specific formatting, since the information is expressed within an attribute (though one can style differently depending on the element’s @xml:lang, as that does apply to text attributes, as well as on any other attributes of the element). The same holds for the dictionary attributes @expand, @norm, @split, @value, and @orig, which represent the remaining text attributes????.
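The limitation can be seen in a small Python sketch that writes the value of @reason into formatted output. The output markup (span elements with the class names "unclear" and "reason", and the bracketed presentation) is a hypothetical choice made for illustration. Because the attribute value is a flat string, the span generated for it can only be styled as a whole.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def render_unclear(elem):
    """Render an unclear element, appending its @reason as flat text."""
    text = escape(elem.text or "")
    reason = elem.get("reason")
    if reason is None:
        return f'<span class="unclear">{text}</span>'
    # The attribute value has no internal markup to hang finer styling on.
    return (
        f'<span class="unclear">{text}</span>'
        f'<span class="reason"> [{escape(reason)}]</span>'
    )


elem = ET.fromstring('<unclear reason="faded ink">wrd</unclear>')
rendered = render_unclear(elem)
```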
((((Syd prepared a list of potential text attributes to review to see if they were still text attributes–it’d be nice to be able to give such an exhaustive list here.))))
I think the element-specific details might be logically incorporated as documentation elements within XSL that could be extracted for automatic inclusion within the TEI reference pages, making clear that the formatting discussed is only that of the default behavior used in TEI-provided stylesheets (though also discussing the range of options that the stylesheet makes available through parameters). I really think giving awareness of these formatting issues in the context of considering these elements would be more helpful than waiting for people to discover them separately in the stylesheets.
While, as mentioned earlier, there is no required mapping of TEI elements and attributes to specific output document structures (e.g., XHTML/CSS, LaTeX, etc.), the fact that TEI provides a default set of stylesheets to work with (albeit a parameterized one) and that these are presumably well used [by the number of downloads????] indicates that there is a general set of expectations about how most TEI structures will appear when output. The effort required to create one’s own stylesheets from scratch for a vocabulary as large as TEI’s, or even to modify existing stylesheets significantly (let alone to do so again each time the default files are improved or adjusted), also makes an understanding of how documents will be transformed imperative for many projects. Thus, it becomes necessary to understand how TEI might commonly be transformed (or be understood to be transformed), even beyond the extent to which the stylesheets themselves are documented and express (mostly in technical language) the templates used to transform TEI into a formatting language.
The default stylesheets provided by TEI serve as a good basis for discussion of how formatting can be performed and are documented here for the sake of those who wish to know how each structure they might use in a TEI document might be rendered by default. Neither the stylesheets nor this discussion should be taken as any kind of requirement to use these stylesheets as a base, or even at all.
(to be displayed on reference pages?)
(to be compiled after reference pages have their information fleshed out)