Formulas, Tables, Figures

Tables, figures, and formulas are most frequently found as floating features within the body of technical and scientific documents. They do, however, occur in other forms of writing and are therefore discussed here. They are not formatted in the same way as the running text of a document with respect to page breaks, text alignment, etc., and generally require special coding conventions. Paragraphs are generally written and formatted line by line, whereas these features are written, and frequently formatted, as independent blocks that require special functions of a formatter: for example, alignment of cells in a table or of operators in a formula, or drawing of the elements of a figure.

Three methods may be used to encode these objects:

The first method can take advantage of the SGML SUBDOC feature, which allows a special-purpose Document Type Declaration to be invoked during processing of a text. It also enables the TEI to benefit from work already carried out by other standards bodies and industry groups without the need to attempt an integration of their proposals with ours. Both the reports of the Association of American Publishers (AAP), Markup of Tabular Material, version 2.0, and Markup of Mathematical Formulas, version 2.0, and the SGML standard itself contain detailed proposals for SGML tagging schemes designed specifically for mathematical formulae and tabular material, and it would be foolish to attempt to replicate these.

The second and third methods can take advantage of other, perhaps more appropriate, notations which have been developed outside the context of SGML. Examples include `EQN' for formulas and `Computer Graphics Metafile' for graphics. Another SGML feature, the NOTATION declaration, may be used to specify publicly available encoding schemes, which may then be invoked by elements within a document. The content of such an element, if embedded within a document, is ignored by the parser, within some limits.
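As a minimal sketch of how the NOTATION feature might be used, the declarations below assume two notation names, eqn and cgm, and system identifiers chosen purely for illustration; none of these names is prescribed by these guidelines. Declaring the element content as CDATA is one way of having the parser pass the content through uninterpreted.

```sgml
<!-- Hypothetical notation declarations: the names and system
     identifiers are assumptions made for this illustration -->
<!NOTATION eqn SYSTEM "eqn">
<!NOTATION cgm SYSTEM "cgm">

<!-- CDATA content is not parsed for markup, so the formula text
     reaches the application unchanged -->
<!ELEMENT formula - - CDATA>
<!ATTLIST formula
          notation  NOTATION (eqn)  #REQUIRED>
```

The limits mentioned above show up here: CDATA content ends at the first `</` followed by a name start character, so text in the foreign notation must not contain that sequence.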

The third method is the simplest, in that the object concerned is replaced in the document by an empty element. It also simplifies the task of interchanging documents, in that only one notation will be used in any given object to be transmitted rather than a mixture.

No specific information is provided in these guidelines about the use of either the SUBDOC or the NOTATION feature. Both are, however, standard parts of SGML and may therefore be added to any TEI Document Type Declaration as described in section . Much work remains to be done in assessing the applicability of currently available notations and alternative DTDs to the kinds of material of interest to the TEI; this is left to the second cycle.

Formulas

A formula uses a special notation to express a statement in mathematics, chemistry, linguistics, etc. It may be given inline, that is, embedded in the running text, or displayed, that is, set off from the surrounding text. As discussed above, it may be encoded using SGML tags from some DTD other than that proposed by the TEI; a suitable candidate would be the AAP standard Markup of Mathematical Formulas, version 2.0, 1989. Alternatively, it may be expressed using some other notation entirely.

We propose two tags, formula and ext.formula. The former may be used in simple cases where it is possible to embed the text of the formula within the document. The content of this element should be expressed using an external notation of some kind, as specified by its `notation' attribute.

The formula tag takes the following attributes:

type
    Specifies the type of the formula (inline or display).
notation
    Specifies a notation name which has been declared in the current DTD and which specifies an external coding scheme.

Example:

The ext.formula tag is to be preferred in most cases. It is an empty tag, the content of the formula being expressed either using some external DTD or in some other external notation, and stored in a separate file. The ext.formula tag takes the following attributes:

type
    Specifies the type of the formula (inline or display).
notation
    Specifies a notation name which has been declared in the current DTD and which specifies an external coding scheme to be used with the file.
subdoc
    Specifies the name of an entity declared in the current DTD which has the `SUBDOC' property and specifies an external document declaration to be used with the file.
file
    Specifies the name of a file containing the description of the formula, held externally from the current document.

Note that either `notation' or `subdoc' must be supplied, but not both.

Example:
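By way of illustration only, the two tags might be used as sketched below. The notation name eqn, the formula text, and the file name energy.eqn are assumptions made for this sketch; it supposes that a notation called eqn has been declared in the current DTD.

```sgml
<!-- Embedded formula: the content is written in the assumed
     "eqn" notation and passed through by the parser -->
<formula type="inline" notation="eqn">E = m c sup 2</formula>

<!-- External formula: an empty tag pointing at a hypothetical
     file that holds the same formula in the same notation -->
<ext.formula type="display" notation="eqn" file="energy.eqn">
```

The second form keeps all foreign notation out of the document itself, which is why it simplifies interchange: the document contains only SGML, and each external file contains only one notation.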

Tables

Tables may be encoded in exactly the same way as formulas, using the tags table and ext.table, with attributes as described above.
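The SUBDOC alternative might be sketched for a table as follows. The entity name aaptable, its system identifier, and the file name results.tbl are all hypothetical; the sketch simply follows the attribute descriptions given for ext.formula and should not be read as a tested configuration.

```sgml
<!-- In the DTD: a hypothetical entity with the SUBDOC property,
     naming the external document declaration to be used -->
<!ENTITY aaptable SYSTEM "aaptable.sgm" SUBDOC>

<!-- In the document: "subdoc" is given instead of "notation",
     and "file" names the externally held table -->
<ext.table subdoc="aaptable" file="results.tbl">
```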

In addition to the AAP recommendations on SGML tagging of tables, there is an SGML DTD in Annex E to the standard which might be used. No work on defining a TEI-specific set of tags to cater for tables has been done during the current cycle. More work is needed in this area.

Figures

Figures or graphics may be encoded in exactly the same way as formulas, using the tags figure and ext.figure, with attributes as described above.
