 |
- The current WSD is broken:
- It requires subdoc, which is not available in XML
- It assumes a glyph registry
- It duplicates documentation (e.g. entity tables
need to be maintained separately)
- It cannot be processed in a viable way by
existing processors
- It bundles language, orthography (script) and
character identification
- Unicode provides properties for its characters,
therefore the WSD does not need to provide them
- It is too inflexible in the character properties
it allows one to define
- It is required, but this requirement is not
enforced by the DTD.
- Requirements for a new Charset Extension Mechanism
to replace the WSD:
- Limited scope: identify only
differences/additions/extensions to the Unicode
Database of properties.
- Make it optional
- Place it in the header, not in a subdoc or an
auxiliary document.
- Ensure that the constructs are actually useful
and usable in the processing of the document.
- Do not assume a straightforward relationship
between characters and glyphs, as in Latin and East
Asian scripts; allow for n-to-n mappings of
characters/strings to glyphs/glyph sequences.
- Do not attempt to produce an imaging model, glyph
description language, or the like. Stay within the
scope of TEI.
- The unbundling of language and writing system
might require a mechanism to identify the writing
system in TEI documents, for example a new global
attribute for this purpose.
|