Contents: Downloading and Installing NoteTab; Using NoteTab; Notes on using NoteTab Pro; Creating and parsing MASTER records; Using a parser on your own computer
Downloading and Installing NoteTab
The first step in making Master records with NoteTab is to retrieve the software. A good site to download the software from is www.tucows.com:
(A note on NoteTab Light and NoteTab Pro: NoteTab Light is freeware, but it is a cut-down version of NoteTab Pro and lacks tag highlighting and several other features. If you'd like to download NoteTab Pro, go to HTML Tools: HTML Editors Advanced. This will give you a copy of NoteTab Pro that will last for 30 days. After the 30 days the program is disabled, and a copy may be bought at a cost of $19.95.)
Once you have located NoteTab Pro or Light, hit the 'Download now' button on the page and download the program to an appropriate location on your hard drive. This will give you a compressed zip file. (If you do not have a decompression program, Winzip is the most popular, and it can also be downloaded from the tucows site under General Tools: Compression Utilities.)
The next stage is to unzip and install the program to your hard drive. The easiest way to do this is to click on the install icon in Winzip. Follow the installation instructions and close Winzip after the program has been installed successfully. NoteTab should now be installed on your computer, and an icon for NoteTab will have been added to your Start: Programs menu.
Using NoteTab
Once you have started NoteTab you will see a group of tabs running across the bottom of the screen (see fig 1 below).
The next stage is to add the Master specific tab (as developed by Matthew Driscoll, refined by Adrian Welsh and Peter Robinson) to your copy of NoteTab. This will let you access the Master SGML/XML tags in the same way we have just done for the HTML tags on the HTML tab.
The MASTER project has developed two tab files: one for preparing SGML encoded descriptions, one for XML encoded descriptions. We recommend that you download the XML tab file, as the MASTER project (from 1 January 2001) now gives priority to support for XML. You can download the XML tab file, masterx.clb, from the Master site. (If for any reason you would prefer to make SGML descriptions, you can obtain the SGML tab file, msdescription.clb, from the same place.)
You should place this file in the libraries directory, located within the NoteTab directory. If you have done a standard installation of NoteTab you will find the NoteTab directory in the program files directory on your C: drive (see fig 3 below).
Once the masterx.clb file has been placed in the libraries directory, a tab called masterx will appear along the bottom of NoteTab. Click on this, and the list of tags on the left of the screen now shows the tags to be used for creating your Master records according to the current DTD.
Notes on using NoteTab Pro
Highlighting Tags
Tags can be highlighted only in NoteTab Pro. To switch highlighting on, open the View: Options menu item and click on the HTML files tab. In the empty box under HTML file extensions, add the file extension used for your Master files (sgm or xml, or both) and ensure that the highlight HTML tags checkbox is ticked (see fig 4 below).
Creating and parsing MASTER records
Creating and parsing a simple Master record
Below is an example of a very simple manuscript record before it has been marked up in XML:
This manuscript record consists of just two parts:
This can be encoded as follows using the MASTER system. Firstly, we must start a new <msDescription> element to contain the whole description. In NoteTab, double click the msDescription start tag in the left hand window. A dialogue box will appear asking you to enter your status type. Choose one of the following:
Now we are ready to state the manuscript identifier. This is contained in a <msIdentifier> element. Double click on msIdentifier in the left hand window. A dialogue box will ask for details of the number, country code, country, city, repository, collection and shelfmark: all of these except shelfmark are optional. Fill in the number ('1'), country ('Great Britain'), country code ('GB'), city ('Oxford'), repository ('Corpus Christi College') and shelfmark ('MS 198') boxes, then click OK. The following will appear in the right hand window:
Next, we wish to encode the statement of what the manuscript contains and its date. We use the <msHeading> element to do this. Select 'Summary description (msHeading)' in the left hand window and fill in the appropriate boxes, so that you see in the right hand window:
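As a rough sketch only, the heading for this manuscript might look something like the fragment below. The <msHeading> element and the langKey attribute on <textLang> are named in this guide; the <origDate> element and its notBefore and notAfter attributes are assumptions made for illustration and should be checked against the current DTD. The wording of the date is left as a placeholder.

```xml
<msHeading>
  <title>Canterbury Tales</title>
  <!-- origDate, notBefore and notAfter are assumed names: verify them in the DTD -->
  <origDate notBefore="1395" notAfter="1420"><!-- date as worded in your record --></origDate>
  <!-- langKey carries the language identifier used for searching -->
  <textLang langKey="ENM">Middle English</textLang>
</msHeading>
```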
Observe the possibilities for precise searches which this encoding allows. We could find this manuscript as dated between 1395 and 1420; as being written in Middle English (with language key "ENM"), and more. We have now got a complete, though rather short, manuscript description. This should appear as follows:
You can now parse this manuscript description, to make sure that everything is in order:
The parser will then parse the file. If there are errors, it will tell you where the errors are. If there are no errors, you will receive an encouraging message.
Creating and parsing a more complex Master record
The above example is rather simplistic. The real power of the MASTER encoding is its ability to deal with very complex manuscript records. Here is a slightly fuller description of the same manuscript:
We can now give more detail about the manuscript. The <msContents> element allows us to give details of the texts contained by the manuscript, in a series of <msItem> elements (one for each text). This manuscript has only one text (the Canterbury Tales), and so for this description we have only one <msItem>. Note that this msItem is defective, and we can use the 'defective' attribute to indicate this. Within <msItem> we have a <locus> element, to state the folios or pages on which this text is found, and a <title> element to state its title. (You could also use <author> to state the author again). You state the exact text contained in the manuscript (A274-I290) by using a <biblScope> element within a <bibl>, and use a <note> element to state that the text is defective:
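A minimal sketch of such a contents statement, using only the elements named above, might look as follows. The value "yes" for the 'defective' attribute is an assumption, and the folio range and the wording of the note are left as placeholders, to be taken from your own record.

```xml
<msContents>
  <!-- one msItem per text; defective="yes" is an assumed attribute value -->
  <msItem defective="yes">
    <locus><!-- folios or pages on which the text is found --></locus>
    <title>Canterbury Tales</title>
    <!-- the exact extent of text contained in the manuscript -->
    <bibl><biblScope>A274-I290</biblScope></bibl>
    <note><!-- statement that the text is defective --></note>
  </msItem>
</msContents>
```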
We now state the form of the manuscript (<form>), its material (<support>), its dimensions (<dimensions>), its collation (<collation>) and its writing (<msWriting>), all within a <physDesc> element:
Notice the use of the <name> element here to create a formal and indexable statement of the name of the scribe. With this in place, we could include the scribe in indices of all persons mentioned in the descriptions or of all scribes, and find all instances of the scribe labelled 'DPhandD'.
We use the <history> element to give the history of the manuscript, within <origin>, <provenance> and <acquisition> elements:
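In outline, and assuming the three elements appear in this order directly inside <history>, the structure is as sketched below. The content of each element is a placeholder to be replaced with the details from your own record.

```xml
<history>
  <origin><!-- where and when the manuscript was made --></origin>
  <provenance><!-- later ownership and movements of the manuscript --></provenance>
  <acquisition><!-- how the manuscript reached its present repository --></acquisition>
</history>
```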
This example record in its entirety would look thus:
You can now parse this fuller record online, in the same way as before, by double clicking on 'validate document using external validator' in the left hand window (in 'Validation').
This example gives only a brief account of the MASTER encoding. A more detailed description of the tags can be found in the formal reference documentation for Master. Clearly, too, there are many different ways of applying the MASTER encoding. The draft set of cataloguing rules agreed by the MASTER project participants, prepared by Richard Gartner, attempts to define how the tags should be used: see http://www.hcu.ox.ac.uk/TEI/Master/Cataloguing/catrules.html.
Using a parser on your own computer
Setting up the parser
The system outlined above depends on you using the on-line parser maintained by the MASTER partners at De Montfort University. However, under some circumstances this will not be satisfactory. For example, the on-line parser is designed to parse a single record at a time, which is not convenient when you have hundreds of manuscript records. You may also want to include additional materials (bibliographic records, information about particular scribes and other people) alongside the manuscript descriptions, and in this case too the 'single record' parser will not be satisfactory. You will then need to set up a parser on your own machine and parse the records with it.
In order to parse the manuscript records on your own machine, you will need:
You are now ready to parse files on your own machine. You do this as follows:
You may notice some inconsistencies between the results of the on-line parser and the internal parser. This is because the on-line parser makes certain assumptions about your files. For example, it assumes you may want to refer to language and country identifiers, and so includes all ISO language and country identifiers as it parses. Thus, when you refer to <textLang langKey="ENM"> it knows about the langKey ENM and does not indicate any error. The internal parser makes no such assumptions, does not know about language and country identifiers, and so may indicate an error here.
Adding power to the internal parser: referring to entities
The parser installation described above will give you only basic facilities. For example, if you refer to special characters such as é using the standard XML system of encoding them as &eacute;, you will find that the parser sends you an error message. To have the parser process these properly, you will need to configure it further. Information concerning special characters is contained in documents known as 'entity sets'. For the parser to understand these characters, you need to direct it to load the relevant entity set documents. For the character é, SGML/XML entity &eacute;, the relevant entity set is that known as 'ISO Latin 1'. This is a public entity set, and is available in many places. You can download an ISO Latin 1 entity set file from www.cta.dmu.ac.uk/projects/master/ISOlat1.ent. To configure your parser to use this entity set:
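A minimal sketch of a DOCTYPE statement extended in this way is shown below. It assumes that the root element of your file is msDescription and that both masterx.dtd and the downloaded ISOlat1.ent sit in the same directory as the file being parsed; adjust the names if your setup differs.

```xml
<!DOCTYPE msDescription SYSTEM "masterx.dtd" [
<!-- declare a parameter entity that names the entity set file -->
<!ENTITY % ISOlat1 SYSTEM "ISOlat1.ent">
<!-- reference the entity, so that the parser loads ISOlat1.ent -->
%ISOlat1;
]>
```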
The first of these added lines, '<!ENTITY % ISOlat1 SYSTEM "ISOlat1.ent">', gives the name of the entity file and associates it with the entity name 'ISOlat1'. The second added line, '%ISOlat1;', calls this entity: the effect of this is to tell the parser to load the file 'ISOlat1.ent'. Notice that these lines are contained in square brackets, before the '>' which closes the DOCTYPE statement. In SGML/XML language, this is known as the 'document type subset'. By convention, statements contained within the document type subset are processed before the document type definition is read. Thus, you can use this to extend or refine the processing instructions given to the parser before it parses the document itself.
Configuring the parser to use standard dtd and entity files
When you have many different manuscript descriptions, you may need to keep these in many different directories. In the simple system outlined here, you would need to include separate dtd and entity files in each directory. This is inconvenient, and might lead to serious problems with different versions of the DTD and entity files in different directories. You can avoid this problem by configuring the parser so that it always points at one and only one set of dtd and entity files. You do this by altering the statements in the document type subset so that they point at the same files, wherever the file you are parsing happens to be. For example:
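The fragment below is a sketch of such a statement with fixed locations. The root element name msDescription and the exact spelling of the Windows paths are assumptions; some parsers may expect a different path syntax (for example a file: URL), so check your parser's documentation.

```xml
<!DOCTYPE msDescription SYSTEM "C:\Sp\pubtext\masterx.dtd" [
<!-- both files are read from one fixed directory, whatever directory the record is in -->
<!ENTITY % ISOlat1 SYSTEM "C:\Sp\pubtext\ISOlat1.ent">
%ISOlat1;
]>
```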
Here, you are pointing the parser at the masterx.dtd and ISOlat1.ent files contained in the pubtext directory in the Sp directory on the C drive. (You will have to put these files there first!). If your computer is online, you can use files on the internet. This may be useful if you want to be sure (for example) that you are always using the very latest version of the dtd. This form of the doctype statement will include the XML form of the DTD held on the Oxford MASTER partner's website:
Similarly, the public entity file for the ISO Latin 1 entities can also be referenced externally. As this is a 'public' entity set (that is, a set which has undergone a formal registration and acceptance process) this should be referenced slightly differently:
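As a hedged sketch, a public reference combines the registered public identifier with a system identifier that tells the parser where to find a copy of the file. The public identifier shown is the one conventionally used for the XML version of this entity set. The system identifier "ISOlat1.ent" is a placeholder: replace it with the local path or internet address of your own copy.

```xml
<!DOCTYPE msDescription SYSTEM "masterx.dtd" [
<!-- PUBLIC gives the registered name; XML also requires a system identifier after it -->
<!ENTITY % ISOlat1 PUBLIC
  "ISO 8879:1986//ENTITIES Added Latin 1//EN//XML"
  "ISOlat1.ent">
%ISOlat1;
]>
```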
This page was last updated 14 January 2001