This section concerns the encoding of general purpose links
between one text and another. However, because the term
Cross-references must refer to distant documents and
document portions, and so those document portions must have
names, to which the cross-references refer. Once a naming method is
established (see <hdref refid=z6a3>), supporting linkage is
not difficult, although providing optimal data structures and
interfaces is a matter for much continued research.
In some hypertext systems, texts are subdivided in a very simple
way into
A cross-reference is a way of specifying a location in a document.
A cross-reference recorded at its origin location in a document must
somehow specify the location of its destination. If, on the other hand,
a cross-reference is recorded somewhere other than its origin or
destination, as in a database of endpoint pairs, it must specify both.
We do not consider this method of implementing cross-references further
here. In either case, the central problem is to find a uniform way of
identifying target locations in a text. This is discussed in section
Cross-references may be categorised in many different ways and
may be distinguished by creator, date, and other properties.
Users may wish to view or suppress cross-references based on any of these
properties. A processor may wish to act upon different cross-references
in different ways, either generically or individually.
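As an illustration only, the idea of a cross-reference recorded at its origin, carrying properties by which it can be shown or suppressed, might be modelled as below. This is a minimal Python sketch, not part of the proposal: the class `CrossRef`, its field names, the helper `visible`, and the sample values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class CrossRef:
    """A cross-reference recorded at its origin (hypothetical model)."""
    target: str      # name of the destination location
    creator: str     # who made the link
    created: date    # when the link was made
    category: str    # free-form classification, e.g. "gloss"


def visible(refs, *, creator=None, category=None, since=None):
    """Return the references matching every criterion that is given."""
    result = []
    for ref in refs:
        if creator is not None and ref.creator != creator:
            continue
        if category is not None and ref.category != category:
            continue
        if since is not None and ref.created < since:
            continue
        result.append(ref)
    return result


# Invented sample data, for illustration only.
refs = [
    CrossRef("p4", "editor-a", date(1990, 3, 1), "gloss"),
    CrossRef("p9", "editor-b", date(1990, 6, 15), "citation"),
    CrossRef("p2", "editor-a", date(1990, 7, 2), "citation"),
]

by_editor_a = visible(refs, creator="editor-a")
```

A processor that treats categories differently could dispatch on `ref.category` in the same way; a database of endpoint pairs would need an additional field naming the origin.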
We propose the tag
The `target' and `target.end' attributes can only be used to
point to locations in the current document, because their values are
defined as IDREF. If they are intended to point to an arbitrary point or
span, the empty element
External documents may not already contain suitable ID values or
anchor elements, and it may not be possible to insert them
if, for example, the target text is on a read-only medium. A method of
locating parts of a text without changing it is therefore necessary.
A number of such methods can be used, and are listed below. They are,
for the most part,
We list below a number of such methods, in decreasing order of
fragility, together with a recommended name for each. The list is
not intended to be exhaustive, but the names proposed should be used
if the corresponding method is used.
As a further example of how these methods can be used, consider the
following fragment, which is the start of a document with system
identifier `foo':
Methods which depend on the physical structure of the file are
almost guaranteed to fail if the file changes. Also, they fail
perniciously: the system cannot in general report an error,
but will merely return the wrong destination. For archival
data on read-only media, this may be less of a problem, in
that updates are less frequent and more controlled. Recreating
indexes of cross-references is none the less
expensive and tedious. For other data the problem will be a
constant one. The simple path method is essentially similar,
though somewhat less likely to fail.
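The pernicious nature of this failure can be shown with a small Python sketch. It assumes, purely for illustration, that a physical reference is stored as a character offset and a length; the function `resolve_offset` and the sample text are hypothetical.

```python
def resolve_offset(text, start, length):
    """Return the span at a physical offset; no validity check is possible."""
    return text[start:start + length]


# Toy text, invented for this example.
original = "Alpha paragraph. Our synergistic approach works."
start = original.index("synergistic")   # offset recorded when the link was made
length = len("synergistic")

# The file is later edited: text is inserted before the target.
edited = "A new opening sentence. " + original

before = resolve_offset(original, start, length)   # the intended word
after = resolve_offset(edited, start, length)      # some other span, silently
```

Nothing in `after` signals that the reference has gone stale: the lookup succeeds and simply returns different text of the same length.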
Methods depending on the use of either SGML ids or a known
canonical reference scheme have clear advantages as regards
stability. The other methods may have attractions as regards
precision, when used in combination with these, in which
circumstances they are also less likely to fail. For example,
to target the third word (`synergistic') of the second paragraph,
Wherever possible, however, the ID mechanism should be preferred. The
uniqueness of ID attribute values within a document can be enforced
by SGML; they use an SGML mechanism designed for the purpose; and both
intra- and inter-file cross-references are implemented with the same
mechanism. Most importantly, these methods are robust in the face of
file changes. IDs very rarely change as a document is edited.
Therefore cross-references remain valid across edits. If an element
is entirely deleted, then of course its ID is absent and cannot
be retrieved. Crucially, however, this is not a pernicious error;
the system can in general detect and report it, rather than
simply retrieving the wrong destination.
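A minimal Python sketch of this behaviour follows. It is an assumption-laden toy, not an SGML processor: the document is reduced to a mapping from ID to element content, and the names `MissingTarget`, `resolve_id`, and the IDs `p0`, `p1`, `p2` are hypothetical.

```python
class MissingTarget(LookupError):
    """Raised when a referenced ID no longer exists in the document."""


def resolve_id(elements, target_id):
    """Return the element content for target_id, or report that it is gone."""
    try:
        return elements[target_id]
    except KeyError:
        raise MissingTarget(f"no element with ID {target_id!r}") from None


# Toy document: a mapping from ID to element content.
doc = {"p1": "First paragraph.", "p2": "Second paragraph."}

# Editing: a new element is inserted and p1 is deleted.
doc["p0"] = "New opening paragraph."
del doc["p1"]

still_valid = resolve_id(doc, "p2")   # unaffected by the edits

try:
    resolve_id(doc, "p1")
    detected = False
except MissingTarget:
    detected = True                   # the failure is reported, not silent
```

The reference to `p2` survives the insertion of new material, and the reference to the deleted `p1` produces an explicit error that a system can pass on to the user.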
The disadvantage of these methods is that most files do
not have canonical references or unique SGML IDs encoded
on all elements. To refer to any element lacking such,
either an identifier must be added, or an alternate method
of referring must be used. Given the other advantages,
it is desirable to encode element IDs on each element
of new SGML files as they are created, and to have
software ensure IDs are not re-used after deletion.
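One way software might ensure that IDs are not re-used is to issue them from a counter that never decreases. The Python sketch below is only one possible approach, under the assumption that the counter is stored alongside the document; the class `IdAllocator` and the `e` prefix are hypothetical.

```python
class IdAllocator:
    """Hand out element IDs that are never reused, even after deletion."""

    def __init__(self, prefix="e"):
        self.prefix = prefix
        self._next = 1          # this counter only ever increases
        self._live = set()      # IDs currently present in the document

    def new_id(self):
        ident = f"{self.prefix}{self._next}"
        self._next += 1
        self._live.add(ident)
        return ident

    def delete(self, ident):
        # Removing an ID from the live set does not rewind the counter,
        # so the same value can never be issued again.
        self._live.remove(ident)

    def is_live(self, ident):
        return ident in self._live


alloc = IdAllocator()
a = alloc.new_id()   # "e1"
b = alloc.new_id()   # "e2"
alloc.delete(a)
c = alloc.new_id()   # "e3", not a recycled "e1"
```

Because a deleted ID is never reissued, a stale cross-reference to it fails detectably instead of resolving to an unrelated new element.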
So far we have discussed cross-referencing methods which start from
a single point in a text and point either to another single point (an
A further problem is introduced by the possibility that the target
of a reference may not be a continuous segment, and cannot thus be
specified simply by a start target and an end target. One method might
be to use the alignment map mechanism discussed in section
No recommendation is made concerning these two areas, as much &winita;
cross-reference
throughout this section
to specify any form of link which connects non-adjacent portions of
a text. Note also that special tags are proposed for some specific
types of cross-references, notably bibliographic citations, which
are discussed in section
Information constituting a cross-reference
start link
. Like every other element in the
TEI scheme, it may be identified by the value of an `id' or an `n' attribute.
If a processor is to be able to return easily to the cross-reference origin,
obviously at least one of `id' and `n' must be available to it.
Since cross-references may originate from any point in a document,
the target
attribute also) must correspond with the ID
attribute of an
Referring to document locations
4
thus
indicates that the desired paragraph is the fourth child
of the containing element (in this case,
Span-to-span references and discontinuous targets