MASTER

Manuscript Access Through Standards for Electronic Records

Key: DMU-CTA De Montfort; IRHT Paris; NLP Prague; OU Oxford; KB Royal Library, the Hague; EAMSS US partners; VL Vatican Library; AP other associated partners; EG Expert group; BFM Marburg

Fifth Project Meeting. October 6th to October 8th. Netherlands Institute for Advanced Studies, Wassenaar, and the Royal Library, The Hague, Netherlands

Present:

Anne Korteweg (KB), Lou Burnard, Richard Gartner (OU), Zdenek Uhlir (NLP), Elisabeth Lalou (IRHT), Merrilee Proffitt (DS/EAMSS), Peter Robinson (DMU-CTA), Matthew Driscoll (AMI), Natasha Kuchynskaya (DMU-CTA), Malcolm Bothwell (IRHT), Anne Mette Hansen (AMI), Eva Wedervang-Jensen (AMI), Anne Seerup (AMI), Jill Seal (Nottingham), Robert Sanderson (Liverpool), Robert Giel (Berlin), Klaas van der Hoek (KB).

From the Expert Group: Ian Doyle and Gilbert Ouy attended the meetings on October 6 and 7.

Meeting Theme: Using MASTER

Notes: I am grateful once more to Merrilee Proffitt for her excellent notes on the meeting, which I have shamelessly pillaged for this document. All misinterpretations are my own!

Friday 6 October: the Royal Library

After formal greetings, the meeting began at 10.10 am.

AM: reports from the partners

Royal Library: Klaas van der Hoek began by outlining the cataloguing situation in KB. There are several catalogues of parts of the collection, beginning with the 1922 catalogue of theological mss. This was described more elaborately in 1988 but remains incomplete, and was supplemented in 1985 by a catalogue of figurative and non-figurative mss. Thus, no overall view of the mss is available, and the partial catalogues which are available are incomplete and inconsistent. At present, KB is undertaking a digitization project aimed at publishing all information from mss on the internet with ICONCLASS classification. In the 1980s there was a decision to make an stc catalogue of mss ordered by shelfmark. These were to be compiled into booklets, and very pragmatic guidelines were established for them. The descriptions were stored in a computer database in INMAGIC form, and volumes were published from this in 1988 and 1993.

This leads to the question: could the descriptions in INMAGIC be converted into MASTER? What advantages etc are there in MASTER format over INMAGIC?

KVDH recorded problems with using MASTER. He stated that he felt the need to record MORE information with textLang etc than he wanted: that is, to specify the textLang of msItems. He also felt the need to record ALL msItems, even though in a typical manuscript only a very few items might be of any interest. The solution proposed was:

-we use textLang in msHeading once to describe all languages

-we do not need to use textLang for each item

-we need to record information for different texts - scribe for colophon, or incipit

Other points made:

-we need to scale descriptions to allow short descriptions to grow into large descriptions

-we need to state clearly that NOT all items are described and to say this close to the description, within msContents

-Use of name authority: these are very valuable but the process of finding them is very time-consuming. Can we have a speedier way of finding them? ID suggests that it might be better NOT to try to standardize the owners' names.

Problems of bibliography: we need to build or insert a bibliographic reference. This needs to be explained: AMI are doing it and the rest of us need to know how they are doing it.

MJD: points out that <title><q> is actually <rubric>

Should we ALWAYS provide a title in every msItem? To be discussed later.

Czech National Library: Zdenek Uhlir

Most catalogues in the NLP are from the early C20; around 1960 they felt the need for more elaborate descriptions, and the College of Arts and Sciences took on this task. In 1983 they issued a set of rules which are really too elaborate and impractical. This school does not like electronic catalogues and prefers printed catalogues. After four years ZU is able at last to talk to these people!! A second school of study is oriented to the study of mss as mass phenomena: historically oriented, not codicologically interested. This group is interested in digitization. So they have a need for short descriptions only, concentrated on the content of the mss.

Hence: ZU is interested in the actual citation of mss: the incipit rubric colophon elements are ideal for this.

There are problems with the author title summary elements. Summary is an abstract made by the cataloguer in the course of the description: for example, 'Devotional texts': is this a summary or a title? Most of us feel this is a supplied title, not a summary (which ought to be a transcription? Or not?)

Author: often the author's name is used as a title. How do we deal with this? How do we record it? Problem of items and subitems. Example: six collections of quaestiones, each collection containing various combinations of items. Suggestion: a type attribute on msItem to say that this is a collection.

We agree that the <author> content should be given in the form the cataloguer wishes it to appear. We need to deal better with questions of genre type and subject. We should use attested to indicate that the author's name actually appears in the ms. We should use <title type=supplied> for all cases where the cataloguer is supplying a collective description (not a summary). And we need to come back to discussion of the use of name, and of the reg attribute (not available on author!)

IRHT: They are making descriptions from microfilms and books, with different databases for music, binding, illumination, rubrics etc. They are interested in making ONE database only for ALL this material. They needed to face this problem and decided to use MASTER as a possible route to this end. IRHT has experimented with three different routes of input.

There is a problem with msHeading: as KB indicated, we need some means (overview?) for msContents to indicate that not all contents are described.

IRHT use their own authority files to find standard forms of names, and have made guidelines to indicate presence of music. More definition is required for decoration and binding elements. Ian Doyle: for binding the crucial things are period (not date) and region (place) of binding. origDate should do this, though it is not clear from the examples in the guidelines, and origPlace needs a reg attribute.

Gilbert Ouy would like to distinguish decoration/illustration/binding elements in description. We need more examples to show this, and we need consistent examples.

Discussion of decoNote. Made clear this can be included in msItem

Attributes on decoNote. GO suggested that we need an illustrative/decorative distinction. Passed by acclaim. Q: do we need the figurative yes/no switch? Agreed to retain this, but with less conviction. Final decision: add illustrative as an attribute on decoNote, keep figurative.

AMI: The background of their work is the aim to achieve the virtual reunification of the AM collection. MASTER is a key route to this end. They have made 500 descriptions of medieval manuscripts; they are cataloguing the contents much more closely, and undertaking transcription of contents too.

AMI have problems with the markup of fragments: they want to indicate whether the quoted text is deficient. They want to say incipit/explicit type=def to show this.

They want to say that msItem is fragmentary etc: msItem type=frag. We agreed to add a type attribute to msItem. And now we need a typology of msItem: we should discuss this later.

<paraText> came up for discussion. We looked for examples: all we found (rubrics!) could go elsewhere. Nobody knew what <paraText> is for; why did we need it; why did we think of it in the first place. Each blamed the other for its existence. So, we kill it.

Blank pages: should be recorded in extent. We agree.

Should we record dimensions in <support> or <extent>? Does it matter? We recommend that dimensions should go in extent, not in support.

We need a better mechanism for clarifying which hand is which, for referring to recurrent instances of different hands in different manuscripts. There are mechanisms for doing this: we have them, let’s explain how to use them (Document hand, handid idref)

msIdentifier: we can point to other manuscripts using msIdentifier

PM: conclusion of AMI testimony. Lou showed the Oxford system, using mySQL + PHP. PR showed the DMU system. This led to discussion: should all records look the same? Should libraries be able to customize for their own materials?

Jill Seal: the Perdita project.

She stressed the need to indicate the status of the work and the need to foreground gender of the person involved in the creation/otherwise of the work; indication of deletion or otherwise of the work in msItem.

Need key words to indicate class of the item -- genre and subject: where do we put this? How do we do this? We need to integrate physDescription into msItem accounts. We need some way of saying: this is an uncertain date (this is provided for).

Saturday 7 October: at NIAS, Wassenaar

PR demonstrated the forthcoming Hengwrt Chaucer Digital Facsimile of the Canterbury Tales (henceforth: the Book of the Tales of Canterbury!) and the De Montfort version of the MASTER catalog. Decision not to distribute URL for catalog to EAMMS list or a broader unsuspecting world just yet. There was a plea to send updated catalog records to Peter (or rather to the now-working web submission tool). In theory, Lou will get a copy of the records from Peter (but he also has a system for taking in records too). Please also send feedback about how records should appear.

We then moved onto a list of outstanding matters for decision. We all reminded ourselves that We Were NOT Discussing the DTD. So this is what we discussed and decided:

1. We need a way of saying: we are NOT describing all items in this ms. We agreed to do this by having an <overview> element, which should appear first in msContents (ie preceding all msItems). We do not need an attribute on msContents to say we have not described everything: it is enough just to look at the overview element and get the info out of that
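As a rough illustration of decision 1, an msContents that opens with an overview might look like the sketch below. The wording inside <overview> is invented for the example, and the assumption that <overview> holds plain prose is ours, not something the meeting specified.

```xml
<!-- Sketch only: the overview wording is invented; the content model
     of <overview> (plain prose) is an assumption -->
<msContents>
  <overview>Only the items of principal interest are described here;
    the remaining items are not itemized.</overview>
  <msItem>
    <title type="supplied">Devotional texts</title>
  </msItem>
</msContents>
```

A reader or indexing program can then tell from the presence and wording of <overview> that the list of msItems is selective, without any extra attribute on msContents.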

2. Name authority: how do we record:

standard titles

transcribed titles text here titled?

standard authors

standard names wherever they are

the exact relationship of the name to the manuscript

For regularized forms of names: use <name reg="">

A whole series of decisions were made.

a. we should use the NAME element as the universal system of saying: this is a name; it is this kind of name (the type attribute); the role of the person/thing named is (the role attribute); the regularized form of the name is (reg attribute). The name element has proved enormously popular. Cataloguers like it. So let’s give them better tools and guidelines in their use.

b. thus: we have <name type=?? role=?? reg=??>

Examples:

<name type="place">Villingaholt</name>
<name type="person" role="scribe">Hoccleve</name>

c. we established the following typologies

role: scribe, binder, artist, owner, scholar, translator, annotator, compiler, adapter, dedicatee, patron, commissioner (OPEN set but these are recommended)

type: place, person, org, male, female (DEFAULT: person; a CLOSED list)

d. We will add the reg attribute to <author>. We will also add the type attribute to <author>, with a subset of the same values as type for <name>

e. <respStmt> should continue to be used in formal context of msHeading and msItem, where the <respStmt> specifies, and will contain <name> with the same attributes as elsewhere. However, in all other contexts, we recommend the use of <name>

f. <name> is available within prose at all points in the description

g. Establishing index terms, or linking to biographical databases, etc: for this, use the KEY attribute on <name> to refer to a <person> element which needs to be defined elsewhere (the header). In <person> you put the reg form of the name (if desired) and the sex of the person. This list needs to be maintained centrally! We can also use this on <author>.

Taken together, this provides a simple but very powerful set of tools: you can say what kind of name this is, state the role of the thing named, give a regularized form, and link it to a names database. (Some of us worry that this is Too Simple and Too Powerful: overenthusiastic and naïve cataloguers could create mounds of ill-disciplined data with these. But so they could with anything. And good cataloguers will make very good catalogues.)
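To make the combination concrete, here is a hedged sketch of one name carrying all four pieces of information. The key value "hoccleve", the regularized form, and the way <person> is written out (an id attribute, a sex attribute, and a prose paragraph) are assumptions made for illustration; the meeting did not fix these details.

```xml
<!-- Sketch only: key value, reg form and the shape of <person>
     are illustrative assumptions -->
<name type="person" role="scribe" reg="Hoccleve, Thomas"
      key="hoccleve">Hoccleve</name>

<!-- Defined elsewhere (the header), in a centrally maintained list -->
<person id="hoccleve" sex="m">
  <p>Hoccleve, Thomas</p>
</person>
```

Here type says what kind of name it is, role says what the person did, reg gives the regularized form, and key links the name to the shared list of persons.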

3. After this quite astonishing set of agreements, we moved to <title>. We specified that in ALL cases <title> is to contain a uniform title, as it appears or might appear in a name authority file. We should use the type attribute to indicate this is a uniform OR supplied title.

The ACTUAL titles given in the manuscript: should be given within <rubric>. No reg attribute is necessary for rubric (you give the regularized form in <title>!! That’s what it is for!!)

We recommend that EVERY msItem MUST carry a title element. The title could be, for example:

<title type=supplied>C14 sermons</title>

<title type=supplied>Devotional texts</title>

We do NOT need to state an author in every msItem.

4. If the author is part of the title (the NLP situation): give the author in <author> and repeat the author’s name (without tagging) within the title

5. Handling bibliographic references: we will provide documentation of how to do this, just as AMI have done it

6. We agree that the TEI class attribute is appropriate to carry the definition of class of the msItem/msContents. This can be used to provide info on genre, text type, keyword classification (according to a pre-existing or other typology). Also available on msContents. We need to document this.

7. Modified attributes on decoNote: AGREED. That is: we add a decorative/illustrative switch (the Gilbert Ouy memorial attribute!!)

8. We do not need a type attribute on msItem to specify that this item contains a collection: this is not necessary.

We should use a status attribute on msItem to indicate whether it is fragmentary/defective. For this we should use the same values as for msDescription

9. Dimensions should go in extent not support

We need to document the system for identifying hands across manuscripts. Some of us know how to do this! We should tell the others how!

Summary is to be an abstract or brief summary provided by the cataloguer. It is NOT a transcription of a summary in the ms, nor is it a supplied title.

We finally nailed Incipits right down. They are the first words/lines of the text proper, and are ALWAYS transcribed text. The explicit is always the last words/lines of the text proper. The incipit is NOT, repeat NOT, the statement ‘Hic incipit the book of O’: this is a rubric. All titles found in manuscripts are to be transcribed within <rubric>. The type attribute on <rubric> indicates whether this is initial or final.
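The division of labour between <title>, <rubric>, <incipit> and <explicit> can be sketched as follows. The transcribed wording is placeholder text built on the 'book of O' example above, and the attribute values "uniform", "initial" and "final" are assumed spellings: the meeting agreed on the distinctions, not on the exact values.

```xml
<!-- Sketch only: placeholder transcriptions; attribute value
     spellings are assumptions -->
<msItem>
  <!-- regularized title, as in a name authority file -->
  <title type="uniform">The Book of O</title>
  <!-- the title as actually written in the manuscript -->
  <rubric type="initial">Hic incipit the book of O</rubric>
  <!-- first words of the text proper, always transcribed -->
  <incipit>first words of the text proper</incipit>
  <!-- last words of the text proper -->
  <explicit>last words of the text proper</explicit>
  <rubric type="final">Explicit the book of O</rubric>
</msItem>
```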

We have the full weight of the expert group behind this set of definitions.

We also decided what to do with fragmentary incipits. Take the case where the cataloguer judges there is text missing, but wishes to include the missing text in the transcription. Do it this way:

<incipit type="def"><supplied>this incipit is not here</supplied>this is the first real words</incipit>

Typology of incipit: values for type: DEFECTIVE recommended for indicating that in some way this is defective. (DO NOT say incipit type=supplied.)

We then set down rules for the TEXTLANG element:

a. Langcodes to be given in capitals: langKey="LAT" not langKey="lat". New codes: add to the MASTER list at Oxford

b. We have a problem: should we always say <q lang= and <incipit lang=? Or should we have a rule: when only one language is specified in langKey there is no need to specify the language of the <q>? (But this creates problems for indexing of content. But people will not want to put <q lang=> everywhere.)

The solution: use langKey in textLang in msHeading to indicate the MAIN LANGUAGE of the manuscript. By default, all q incipit rubric etc (ie all the specialized transcription elements) will take this language as their language.

Add an OtherLang attribute to textLang: this will list the IDREFS for all the other languages

The language specified in langKey is then the default value of lang in <q> <incipit> <explicit>
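A minimal sketch of the defaulting rule follows. LAT is the code given in these notes; FRE is a hypothetical second code, and OtherLang is written exactly as the attribute is named above, whatever its final spelling turns out to be.

```xml
<!-- Sketch only: FRE is a hypothetical code; OtherLang is spelled
     as in these notes -->
<msHeading>
  <textLang langKey="LAT" OtherLang="FRE">Latin, with some French</textLang>
</msHeading>

<!-- No lang attribute: this incipit takes LAT, the main language -->
<msItem>
  <incipit>first words of a Latin text</incipit>
</msItem>

<!-- Only text that departs from the main language is marked -->
<msItem>
  <incipit lang="FRE">first words of a French text</incipit>
</msItem>
```

The cataloguer therefore writes lang only on the exceptions, while an indexing program can still assign a language to every transcription element.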

14. Morphology, palaeography etc: keep on reserve list. We are not convinced.

15. Watermarks: now a paragraph type containing discussion of watermarks, set out in <p> elements. We suggest we use term or ref to describe actual watermarks (ref to refer to Briquet), but it is far from clear which we should use when! Is a Briquet classification a term or a reference to the classification?

16. We did not discuss controlled access points. In fact, we think we have lots and lots of them now, with this architecture for the key elements <name> <title> <author>

17. We are not keen on inventing our own surrogates system for recording and listing digital images and associated metadata. We do need a way of pointing at those images and this information, especially in the case where we are providing a full set of digital images for the whole manuscript.

For metadata: Richard and Merrilee discussed work done under Harvard-Bodleian project and for the MOA2 DTD (http://sunsite.berkeley.edu/moa2/). We suggested that the TEI/MASTER DTD should point towards other DTDs.

For full facsimiles: these should be contained in a separate document which lists all the images using <div> etc etc to structure them, with the msDescription placed within the teiHeader of this document. Work has been done on this already: see TEI Text Encoding in Libraries: Guidelines for Best Encoding Practices (http://www.indiana.edu/~letrs/tei/).

If we are NOT providing a full digital facsimile, but only giving some images (say):

--could use <figure> in <surrogates> to point at each image:
<p> containing
<figDesc>: text to display when you cannot see the image
<header>: caption for the image
<xptr>: this points at metadata

And: one should use <figure> within the body of the description to refer to illustrative material.
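One possible reading of the partial-image case is sketched below. Several details are assumptions: the notes say <header> for the caption, and the sketch assumes the TEI <head> element is meant; the entity name on <figure> and the doc value on <xptr> are hypothetical; and the description and caption wording is invented.

```xml
<!-- Sketch only: entity and doc names are hypothetical; <head> is
     assumed to be what the notes call <header> -->
<surrogates>
  <p>
    <figure entity="img001">
      <head>caption for the image</head>
      <figDesc>text to display when you cannot see the image</figDesc>
      <xptr doc="img001meta"/>
    </figure>
  </p>
</surrogates>
```

The metadata itself stays outside the description, in whatever external scheme is chosen, and the <xptr> merely points at it.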

Paratext is dead. Long die paratext.

This was a pretty amazing set of decisions. We did all this between 10 am and 4.30 on Saturday, with a good break for lunch. A key reason why we were able to decide so much so quickly was the presence of two members of the Expert Group, Ian Doyle and Gilbert Ouy. Discussions which had gone nowhere for months were suddenly resolved, through the very clear and authoritative explanations from the two experts. We owe them a very considerable debt.

Sunday: we discussed the content and timing of the MASTER dissemination workshops.