8.2.3 Alignment of Multiple Analyses
In addition to representing isolated linguistic analyses, it is
often necessary to represent multiple analyses of the same text and
relate them to each other, a task referred to in what follows as
alignment. The analyses in question may be:
- at distinct levels of representation, as when a phonological
transcription or the syntactic representation of a text
is to be related to its orthographic transcription
- at the same level of representation, as in the case of
structural ambiguity, where two or more syntactic
analyses are to be related to the same input string.
Linguists, anthropologists, literary scholars, and others who deal with
large corpora of foreign-language text have traditionally used
interlinear annotation as a mechanism both for developing and for
presenting the analysis (including literal translation) of running text.
To deal with the needs of this kind of analysis, we propose a single,
recursive markup element for annotated units, which provides implicit
alignment for different levels of analysis. Such implicit alignment of
levels can be used if each level of analysis divides the text (or
another level of analysis) into identical series of segments, or into
segments which nest cleanly.
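As an illustration only, the idea of a recursive annotated unit can be sketched in Python. The class name Unit, its fields, and the sample annotations are assumptions made for this sketch and are not element or attribute names defined in this section. Each unit carries its own annotations and contains the smaller units into which it divides, so an annotation is aligned with its text by position in the nesting alone.

```python
from dataclasses import dataclass, field


@dataclass
class Unit:
    """Hypothetical annotated unit: text, its annotations, and its sub-units."""
    text: str
    notes: dict = field(default_factory=dict)
    parts: list = field(default_factory=list)


def flatten(unit, depth=0):
    """Yield (depth, text, notes) for a unit and every unit nested inside it."""
    yield depth, unit.text, unit.notes
    for part in unit.parts:
        yield from flatten(part, depth + 1)


# Three levels of analysis (sentence, word, morpheme) that nest cleanly.
sentence = Unit("the dogs", {"translation": "les chiens"}, [
    Unit("the", {"pos": "DET"}),
    Unit("dogs", {"pos": "N"}, [
        Unit("dog", {"gloss": "dog"}),
        Unit("s", {"gloss": "PL"}),
    ]),
])

rows = list(flatten(sentence))
for depth, text, notes in rows:
    print("  " * depth + text, notes)
```

No pointers or identifiers are needed here: the gloss PL belongs to "s" simply because it is stored on the unit that contains "s".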
The alignment must be made explicit whenever the different levels of
analysis require:
- incompatible segmentations of the text (crossing segments,
non-nesting segments)
- re-ordering of the segments of the text
- discontiguous segments (e.g. the separable prefixes of
German verbs) which must be re-combined at some level
of analysis
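The first of these conditions can be tested mechanically. The following Python sketch assumes that each level of analysis is given as a list of half-open character spans (start, end) over the same text; the function names and the sample spans are hypothetical. It detects crossing segments only. Re-ordering and discontiguous segments cannot be expressed as one span per segment, so they call for explicit alignment in any case.

```python
def crosses(a, b):
    """True if spans a and b overlap without either containing the other."""
    (a0, a1), (b0, b1) = a, b
    return a0 < b0 < a1 < b1 or b0 < a0 < b1 < a1


def needs_explicit_alignment(level_a, level_b):
    """True if any segment of one level crosses a segment of the other."""
    return any(crosses(a, b) for a in level_a for b in level_b)


# Spans over the text "ice cream cone" (14 characters).
words = [(0, 3), (4, 9), (10, 14)]
phrases = [(0, 9), (10, 14)]   # each phrase contains whole words
chunks = [(0, 6), (6, 14)]     # a boundary falls inside "cream"

nested = needs_explicit_alignment(words, phrases)
crossing = needs_explicit_alignment(words, chunks)
print(nested, crossing)
```

Here the word and phrase levels nest cleanly, so implicit alignment would suffice, while the second segmentation splits the word "cream" and therefore requires explicit alignment.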
Accordingly, methods for implicit and explicit alignment of multiple
analyses are provided in this section. Any markup using implicit
alignment can be rewritten using explicit alignment.
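The claim that implicit alignment can always be rewritten as explicit alignment can be illustrated with a small Python sketch. It assumes a nested analysis represented as (text, children) tuples and invents identifiers of the form u1, u2, and so on; neither the representation nor the identifier scheme is prescribed by this section. Each unit receives an identifier, and the containment that was implicit in the nesting is restated as a list of explicit links.

```python
from itertools import count


def make_explicit(tree):
    """Turn nested (text, children) tuples into flat units plus explicit links.

    Returns (units, links): units maps a generated id to the unit's text,
    and links lists (parent_id, child_id) pairs.
    """
    ids = count(1)
    units, links = {}, []

    def visit(node, parent):
        text, children = node
        uid = f"u{next(ids)}"          # hypothetical identifier scheme
        units[uid] = text
        if parent is not None:
            links.append((parent, uid))
        for child in children:
            visit(child, uid)

    visit(tree, None)
    return units, links


tree = ("the dogs", [("the", []), ("dogs", [("dog", []), ("s", [])])])
units, links = make_explicit(tree)
print(units)
print(links)
```

The reverse conversion is not always possible: a set of links may describe crossing, re-ordered, or discontiguous segments, which no single nesting can represent.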