MULTEXT/EAGLES - Corpus Encoding Standard
Document MUL/EAG-CES 1. Annex 1. Version 0.1. Last Modified 14 December 1995
Copyright (c) Centre National de la Recherche Scientifique, 1995.
This document is only a draft and should be cited as such. Creators of WWW documents pointing to it are warned that its content and location may change without notice. This document is provided as is without any express or implied warranties. While every effort has been taken to ensure the accuracy of the information contained, the authors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.
Permission is granted to make and distribute verbatim copies of this document for non-commercial purposes provided the copyright notice and this permission notice are preserved on all copies.
This is a list of standards referred to in the CES or relevant to text encoding generally.
Information Processing--Text and Office Systems--Standard Generalized Markup Language (SGML)
Information Technology -- Text and Office Systems -- Conformance Testing for Standard Generalized Markup Language (SGML) Systems
Sperberg-McQueen, C.M., Burnard, L. (Eds.) (1994) Guidelines for Electronic Text Encoding and Interchange, Text Encoding Initiative, Chicago and Oxford. Available online at
<URL:http://etext.virginia.edu/TEI.html>
Hypermedia/Time-based Document Structuring Language (HyTime)
Standardized SGML document type definitions for books, articles with tables, formulae, etc.
Representation of dates and times.
"This standard defines a lot of details of the calendar. E.g. the ISO definition of the week numbers is that the first day (day number 1) of a week is Monday and that the first week in a year (week number 1) is the week that includes the first Thursday in January, i.e. the first week that has at least four days in January. Other definitions are, e.g., that hours of a day are counted from 0 to 24 and that the international notation of dates is the Bigendian format year-month-day, e.g. 1993-04-17 and that for time is e.g. 20:36:04 (hh:mm:ss). There are also string formats for computer applications specified that have to represent date and time in files and protocol packets. (See
<URL:ftp://ftp.uni-erlangen.de/pub/doc/ISO/ISO8601.ps.Z>
for a very detailed summary.)"
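The conventions quoted above can be checked with a short sketch. It assumes Python's standard datetime module, which is not part of the CES or of ISO 8601 itself; the variable names are illustrative only.

```python
from datetime import date, time

# The example date and time quoted in the summary above.
d = date(1993, 4, 17)
t = time(20, 36, 4)

iso_date = d.isoformat()  # year-month-day
iso_time = t.isoformat()  # hh:mm:ss

# isocalendar() yields (ISO year, ISO week number, ISO weekday), Monday = 1.
iso_year, iso_week, iso_weekday = d.isocalendar()

# 1 January 1993 was a Friday, so it precedes the week containing the first
# Thursday of January and is counted in the last week of ISO year 1992.
new_year = tuple(date(1993, 1, 1).isocalendar())

print(iso_date, iso_time, iso_year, iso_week, iso_weekday, new_year)
```

The last value illustrates the week-numbering rule: a date early in January can belong to the final week of the preceding ISO year.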
Codes for the representation of currencies and funds
Notation for international telephone numbers (a '+' followed by the country code, followed by a space, ...).
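As a rough illustration of the prefix described above, the sketch below splits a number into its country code and remainder. The pattern, the function name split_international and the sample number are assumptions made for this example; the rest of the notation is elided in the description above and is therefore left unchecked.

```python
import re

# Hypothetical pattern: '+', a country code of 1 to 3 digits, one space,
# then the remainder, which is not validated because the text elides it.
INTL_PREFIX = re.compile(r"^\+(\d{1,3}) (\S.*)$")


def split_international(number):
    """Return (country_code, rest), or None if the prefix does not match."""
    m = INTL_PREFIX.match(number)
    return (m.group(1), m.group(2)) if m else None


# Made-up sample number, used only to show the shape of the result.
print(split_international("+33 1 23 45 67 89"))
```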
Code for the representation of names of languages
Provides two-letter codes for about 140 languages and is intended primarily for use in terminology, lexicography and linguistics.
The list is available online at
<URL:http://www.stonehand.com/unicode/standard/iso639.html>
Code for the representation of names of languages--Alpha-3 code
"Three-letter codes for the representation of names of languages for information interchange", developed by a Joint Working Group of ISO TC37/SC2 and TC46/SC2. Covers a wider range of the world's languages than ISO 639.
The list is available online at
<URL:http://www.stonehand.com/unicode/standard/cd639-2.html>
Codes for the representation of names of countries
This standard defines a 2-letter, a 3-letter and a numeric code for each country (e.g. US/USA/840 = United States, DE/DEU/276 = Germany, GB/GBR/826 = United Kingdom, FR/FRA/250 = France). The 2-letter codes are well known on the Internet as top-level domain names. The 3-letter versions are often used at international sports events.
The current Internet-Draft of HTML 3.0 (29-Mar-95) provides a LANG attribute whose value is composed of the two-letter language code from ISO 639, optionally followed by a period and a two-letter country code from ISO 3166, e.g. "en.uk" for the variety of English spoken in the United Kingdom.
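The three parallel codes can be pictured as one small lookup table. The sketch below holds only the four countries given as examples above; the names COUNTRIES and lookup are hypothetical and are not defined by the standard.

```python
# (2-letter, 3-letter, numeric, name) entries taken from the examples above.
COUNTRIES = [
    ("US", "USA", 840, "United States"),
    ("DE", "DEU", 276, "Germany"),
    ("GB", "GBR", 826, "United Kingdom"),
    ("FR", "FRA", 250, "France"),
]


def lookup(code):
    """Find an entry by 2-letter, 3-letter or numeric code; None if unknown."""
    key = str(code).upper()
    for alpha2, alpha3, numeric, name in COUNTRIES:
        if key in (alpha2, alpha3, str(numeric)):
            return alpha2, alpha3, numeric, name
    return None


print(lookup("de"))
print(lookup(826))
```

A complete table would need every entry of the standard; this one is deliberately limited to the codes quoted in the text.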
<URL:http://www.hpl.hp.co.uk/people/dsr/html/CoverPage.html>
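A value of the form described for the LANG attribute could be split as in the following sketch. The function name parse_lang is an assumption made for this example, and the check covers only the shape of the value (two letters, optionally a period and two more letters), not whether the codes actually appear in ISO 639 or ISO 3166.

```python
def _is_two_letters(part):
    """True if part consists of exactly two ASCII letters."""
    return len(part) == 2 and part.isascii() and part.isalpha()


def parse_lang(value):
    """Split a LANG value into (language, country); country may be None."""
    language, sep, country = value.partition(".")
    if not _is_two_letters(language):
        raise ValueError("expected a two-letter language code: %r" % value)
    if sep and not _is_two_letters(country):
        raise ValueError("expected a two-letter country code: %r" % value)
    return language.lower(), (country.lower() if sep else None)


print(parse_lang("en.uk"))
print(parse_lang("fr"))
```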
<URL:http://www.lpl.univ-aix.fr/projects/multext/LSD/LSD.html>