Syntax

In Sections 8.3.1 and 8.3.2, the feature structure and tree structure declarations presented in Section 8.2 are illustrated for the domain of syntax. Special conventions are proposed for encoding derivations within the framework of categorial grammar in Section 8.3.3.

Feature Structures

Consider the following feature structure, which is a fragment of a feature structure describing the sentence "She smashed a brick." The analysis appeared in N.N., Feature and Function Grammars, in Blorts and Bleeps and Other Bloopers (New York: Yamaha, 1999), p. 330. The structure has a name, Test Sentence, and the features Category, Patterns, Tense, Voice, Subject, and Actor, along with other features not included here. The feature Subject, in turn, has as its value a feature structure containing other features (some of them the same as those on the sentence itself) and another nested feature structure. The tag /1/ marks the value of Subject so that the feature Actor can share the same value. (Note: this feature structure is intended to be illustrative only; a fully specified feature structure for a sentence would include much more information.)

 __                                                     __
|                                                         |
| TEST-SENTENCE                                           |
| Category = S                                            |
| Patterns = (Subject Predicator Direct-Object)           |
| Tense    = Past                                         |
| Voice    = Active                                       |
|             __                                  __      |
| Subject  = |                                      |     |
|            | Category = NP                        | /1/ |
|            | Patterns = (Head)                    |     |
|            | Head     =  __                   __  |     |
|            |            |                       | |     |
|            |            | Category = Pronoun    | |     |
|            |            | Gender   = Feminine   | |     |
|            |            | Lex      = she        | |     |
|            |            | (etc. ...)            | |     |
|            |            |__                   __| |     |
|            |                                      |     |
|            | (etc. ...)                           |     |
|            |__                                  __|     |
|                                                         |
| Actor    = /1/                                          |
|                                                         |
| (etc. ...)                                              |
|__                                                     __|

Alternate format follows. -Ed.

TEST-SENTENCE
  Category = S
  Patterns = (Subject Predicator Direct-Object)
  Tense = Past
  Voice = Active
  Subj /1/ =
    Category = NP
    Patterns = (Head)
    Head =
      Category = Pronoun
      Gender = Feminine
      Lex = she
  Actor = /1/
  (etc. ...)
This feature structure is encoded as follows:

<![ CDATA [
<f.struct>
  <f.struct.name>TEST-SENTENCE</f.struct.name>
  <feature> <f.name> Category </f.name> <f.struct> S </f.struct> </feature>
  <feature>
    <f.name>Patterns</f.name>
    <f.list>
      <f.struct>Subj</f.struct>
      <f.struct>Pred</f.struct>
      <f.struct>DO </f.struct>
    </f.list>
  </feature>
  <feature> <f.name> Tense </f.name> <f.struct> Past </f.struct> </feature>
  <feature> <f.name> Voice </f.name> <f.struct> Active </f.struct> </feature>
  <feature>
    <f.name>Subj</f.name>
    <f.struct id=fs43>
      <feature> <f.name>Category</f.name> <f.struct>NP</f.struct> </feature>
      <feature>
        <f.name>Patterns</f.name>
        <f.list> <f.struct>Head</f.struct> </f.list>
      </feature>
      <feature>
        <f.name>Head</f.name>
        <f.struct>
          <feature> <f.name> Category </f.name> <f.struct> pronoun </f.struct> </feature>
          <feature> <f.name> Gender </f.name> <word> Feminine </word> </feature>
          <feature> <f.name> Lex </f.name> <word> she </word> </feature>
          (etc. ...)
        </f.struct>
      </feature>
      (etc. ...)
    </f.struct>
  </feature>
  <feature> <f.name>Actor</f.name> <f.ptr target = fs43> </feature>
  etc. ...
</f.struct>
]]>

Let us consider a further simple example: the representation of the German pronoun sie, which translates as either she, her (acc.), they or them (acc.). A possible feature structure representing the relevant information is given below.

 __                                                          __
|                                                              |
| Lex = sie                                                    |
| Cat = NP                                                     |
|  __            __        __                            __    |
| |                |      |                                |   |
| | Num = Sg       |      | Num = Pl                       |   |
| | Gender = Fem   |  OR  | Gender = Masc OR Fem OR Neut   |   |
| |__            __|      |__                            __|   |
|                                                              |
| Case = Nom OR Acc                                            |
|__                                                          __|

<box>
  Lex = sie
  Cat = NP
  <box> Num = Sg  Gender = Fem </box>
  OR
  <box> Num = Pl  Gender = Masc OR Fem OR Neut </box>
  Case = Nom OR Acc
</box>

The encoding of this feature structure is as follows.
<![ CDATA [
<f.struct>
  <feature> <f.name> Lex </f.name> <f.struct> sie </f.struct> </feature>
  <feature> <f.name> Cat </f.name> <f.struct> NP </f.struct> </feature>
  <feature>
    <f.s.OR>
      <f.struct>
        <feature> <f.name> Num </f.name> <f.struct> Sg </f.struct> </feature>
        <feature> <f.name> Gender </f.name> <f.struct> Fem </f.struct> </feature>
      </f.struct>
      <f.struct>
        <feature> <f.name> Num </f.name> <f.struct> Pl </f.struct> </feature>
        <feature>
          <f.name> Gender </f.name>
          <f.s.OR>
            <f.struct> Masc </f.struct>
            <f.struct> Fem </f.struct>
            <f.struct> Neut </f.struct>
          </f.s.OR>
        </feature>
      </f.struct>
    </f.s.OR>
  </feature>
  <feature>
    <f.name> Case </f.name>
    <f.s.OR>
      <f.struct> Nom </f.struct>
      <f.struct> Acc </f.struct>
    </f.s.OR>
  </feature>
</f.struct>
]]>
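
The disjunctions in the sie example can be made concrete by expanding them into the individual readings they stand for. The following Python fragment is only a sketch of that idea and is not part of the proposed encoding: the class `Or`, the function `expand`, and the use of the key `None` for the unnamed feature that holds the first f.s.OR are all assumptions introduced for illustration.

```python
from itertools import product


class Or:
    """Hypothetical stand-in for f.s.OR: a set of alternative values."""

    def __init__(self, *alternatives):
        self.alternatives = alternatives


def expand(value):
    """Return every disjunction-free reading of a value."""
    if isinstance(value, Or):
        readings = []
        for alternative in value.alternatives:
            readings.extend(expand(alternative))
        return readings
    if isinstance(value, dict):
        keys = list(value)
        per_key = [expand(value[key]) for key in keys]
        results = []
        for combination in product(*per_key):
            merged = {}
            for key, reading in zip(keys, combination):
                if key is None:
                    # Unnamed disjunction: splice the chosen features in.
                    merged.update(reading)
                else:
                    merged[key] = reading
            results.append(merged)
        return results
    return [value]  # atomic value such as "sie" or "NP"


# The sie structure from the example above (illustrative transcription).
SIE = {
    "Lex": "sie",
    "Cat": "NP",
    None: Or(
        {"Num": "Sg", "Gender": "Fem"},
        {"Num": "Pl", "Gender": Or("Masc", "Fem", "Neut")},
    ),
    "Case": Or("Nom", "Acc"),
}

READINGS = expand(SIE)
for reading in READINGS:
    print(reading)
```

Under these assumptions the structure expands to eight readings: four number and gender combinations, each with two cases.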

Tree Structures

Consider the following tree structure, which partially describes the structure of the sentence "Who did John persuade Bill to invite?" The analysis appeared in N.N., Name of Paper, in Name of Volume (New York: Yamaha, 1999), p. 14. γ1 and α26 (or: G1 and A26) are alternative names for the tree (used to identify it in discussion and in other tree diagrams), and the dotted-line arrow connecting WH and the terminal node NP represents coindexing. (The node labeled NP points to the node labeled WH.)

γ1 = α26 = G1 = A26 =

          S'
         /  \
   ..> WH    S
   .        / \
   .      NP   VP'
   .      |   /  \
   .     PRO TO   VP
   .             /  \
   .            V    NP.
   .            |    | .
   .         invite  e .
   .                   .
   . . . . . . . . . . .

      who  PRO  to  invite

This tree structure is encoded as follows:

<![ CDATA [
<tree>
  <tree.name> G1 </tree.name>
  <tree.name> A26 </tree.name>
  <f.struct> S' </f.struct>
  <subtrees>
    <tree> <f.struct id = wh1> WH </f.struct> </tree>
    <tree>
      <f.struct> S </f.struct>
      <subtrees>
        <tree>
          <f.struct> NP </f.struct>
          <subtrees> <terminal> PRO </terminal> </subtrees>
        </tree>
        <tree>
          <f.struct> VP' </f.struct>
          <subtrees>
            <tree> <f.struct> TO </f.struct> </tree>
            <tree>
              <f.struct> VP </f.struct>
              <subtrees>
                <tree>
                  <f.struct> V </f.struct>
                  <subtrees> <terminal> invite </terminal> </subtrees>
                </tree>
                <tree>
                  <f.struct> NP </f.struct>
                  <subtrees> <terminal> e </terminal> <n.ptr target = wh1> </subtrees>
                </tree>
              </subtrees>
            </tree>
          </subtrees>
        </tree>
      </subtrees>
    </tree>
  </subtrees>
</tree>
]]>

Derivations in Categorial Grammar

In categorial grammar, syntactic categories are either simple categories (atomic symbols), complex categories formed by recursively combining atomic symbols using operators, or derived categories, that is, categories that result from applying rules of combination. All categories have an index in order to allow the alignment of the derivation with the input string. In the interests of notational economy, input and output categories are not distinguished in the relevant SGML declarations:

<![ CDATA [
<!-- Operators are forward slash, backslash, and vertical bar. -->
<!ENTITY % op "(fslash | bslash | vbar)" >

<!-- Categories -->
<!ELEMENT cat  - - ((#PCDATA) | (cat, %op;, cat) | (cat, conj, cat) | (dcat)) >
<!ELEMENT conj - - (#PCDATA) >
<!ELEMENT dcat - - (cat, rule, cat+) >

<!-- Rules and operations -->
<!ELEMENT rule - - (#PCDATA) >

<!-- All categories can be pointed at with IDref -->
<!ATTLIST (cat | dcat | conj) ID ID #IMPLIED >
]]>

Two sample encodings are given below. The first example is an instance of type-raising, a rule that yields a derived category from a single input category.
   NP
========>type-raise
S/(S\NP)

The SGML encoding is as follows:

<![ CDATA [
<dcat>
  <cat>
    <cat> S </cat>
    <fslash>
    <cat> <cat> S </cat> <bslash> <cat> NP </cat> </cat>
  </cat>
  <rule> type-raise </rule>
  <cat> NP </cat>
</dcat>
]]>

The second example is the derivation of a simple sentence:

   I       cooked    and      ate     the beans
________  _________  ____  _________  _________
   NP     (S\NP)/NP  conj  (S\NP)/NP     NP
          __________________________conjoin
                  (S\NP)/NP
          _____________________________________>apply
                          S\NP
_______________________________________________>apply
                       S

The encoding, which corresponds to reading the derivation from the bottom up, is as follows:

<![ CDATA [
<dcat>
  <cat> S </cat>
  <rule> apply </rule>
  <cat> NP </cat>
  <cat>
    <dcat>
      <cat> <cat> S </cat> <bslash> <cat> NP </cat> </cat>
      <rule> apply </rule>
      <cat>
        <dcat>
          <cat> <cat> <cat> S </cat> <bslash> <cat> NP </cat> </cat> <fslash> <cat> NP </cat> </cat>
          <rule> conjoin </rule>
          <cat>
            <cat> <cat> <cat> S </cat> <bslash> <cat> NP </cat> </cat> <fslash> <cat> NP </cat> </cat>
            <conj> and </conj>
            <cat> <cat> <cat> S </cat> <bslash> <cat> NP </cat> </cat> <fslash> <cat> NP </cat> </cat>
          </cat>
        </dcat>
      </cat>
      <cat> NP </cat>
    </dcat>
  </cat>
</dcat>
]]>
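
The rules used in the two derivations above can be sketched in a few lines of Python. This is an illustrative reconstruction under stated assumptions, not part of the proposed declarations: the names `Complex`, `show`, `forward_apply`, `backward_apply`, `conjoin`, and `type_raise` are hypothetical, and the single rule label apply used in the derivation is split here into a forward and a backward variant.

```python
from typing import NamedTuple, Union


class Complex(NamedTuple):
    """A complex category: result, operator ("/" or "\\"), argument."""
    result: "Cat"
    slash: str
    arg: "Cat"


Cat = Union[str, Complex]  # atomic categories are plain strings


def show(cat, top=True):
    """Render a category in the usual slash notation."""
    if isinstance(cat, str):
        return cat
    text = show(cat.result, False) + cat.slash + show(cat.arg, False)
    return text if top else "(" + text + ")"


def forward_apply(functor, argument):
    # X/Y followed by Y gives X.
    if isinstance(functor, Complex) and functor.slash == "/" and functor.arg == argument:
        return functor.result
    raise ValueError("forward application does not apply")


def backward_apply(argument, functor):
    # Y followed by X\Y gives X.
    if isinstance(functor, Complex) and functor.slash == "\\" and functor.arg == argument:
        return functor.result
    raise ValueError("backward application does not apply")


def conjoin(left, right):
    # X conj X gives X (the conjunction itself is not modeled here).
    if left == right:
        return left
    raise ValueError("only like categories can be conjoined")


def type_raise(cat, target="S"):
    # X gives T/(T\X); with the default target, NP gives S/(S\NP).
    return Complex(target, "/", Complex(target, "\\", cat))


# "I cooked and ate the beans"
TV = Complex(Complex("S", "\\", "NP"), "/", "NP")   # (S\NP)/NP
coordinated = conjoin(TV, TV)                        # cooked and ate
verb_phrase = forward_apply(coordinated, "NP")       # ... the beans
sentence = backward_apply("NP", verb_phrase)         # I ...

print(show(type_raise("NP")))
print(show(coordinated), show(verb_phrase), show(sentence))
```

Each function corresponds to one rule element in a dcat: its return value plays the role of the first cat, and its arguments play the role of the cat elements that follow the rule.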

Syntactic and Lexical Ambiguity

This section discusses some of the special problems for linguistic markup posed by lexical and structural ambiguity, giving as an example a feature-structure representation of the English sentence Wash sinks. Each word of this sentence can be considered either a noun or a verb each with different senses; as a sentence it is structurally ambiguous, being either a declarative or an imperative sentence; in addition, the word Wash can also be considered a proper noun. For purposes of this illustration, we assume that this sentence has four interpretations: two as a declarative sentence, and two as an imperative sentence. The four interpretations are paraphrasable as follows.

  1. Laundry goes to the bottom.
  2. Someone named Wash goes to the bottom.
  3. I order you to clean basins used for washing.
  4. I order you to clean depressions in a land surface.

The actual structure of the example is given in the f.struct whose id is f1. This f.struct contains one f.s.OR, which encloses pointers to the analyses of the sentence as declarative and imperative, respectively. The f.struct for the declarative interpretation does not itself contain an f.s.OR, but it does have a pointer to one that does, namely the f.struct identified as f3. The f.s.OR in that f.struct encloses subanalyses for Wash as a proper noun phrase and as a common noun phrase. Similarly, the f.struct for the imperative interpretation does not contain an f.s.OR. To find the alternative analyses, one has to follow a chain of pointers from the f.struct identified as f2 to f5 and from f5 to f9. The f.struct f9 contains an f.s.OR which encloses the two glosses for the noun sinks. Note that no single f.struct directly reveals the four-way ambiguity assumed to be associated with the sentence, but the ambiguity is nevertheless represented by the collection of f.structs in the markup.

The illustration also makes use of a path on the f.ptr tag. Its purpose is to express the disambiguation of an ambiguous subpart of a larger structure. The basic idea is that if an f.ptr points to an ambiguous structure (a structure that has an f.s.OR in it), and one of its interpretations is to be selected, then a path attribute is specified for it, whose value is a list of ID attributes of feature structures within the ambiguous structure. These IDs indicate which of the alternative f.structs inside the f.s.OR are the interpretations of the structure being pointed to by the enclosing f.ptr.

First the linguistic analysis itself is described, and then the sample markup is given. The illustration is broken down into the text, a series of f.structs, and the alignment. The markup of the text has not been checked against the recommendations of chapter 6 and may not conform to them. --TL

<![ CDATA [
<!-- Here is the text markup. -->
<text id=sample1>
  <sentence id=s1>
    <word id=w1>
      <character id=c1> W </character>
      <character id=c2> a </character>
      <character id=c3> s </character>
      <character id=c4> h </character>
    </word>
    <character id=c5> &blank; </character>
    <word id=w2>
      <character id=c6> s </character>
      <character id=c7> i </character>
      <character id=c8> n </character>
      <character id=c9> k </character>
      <character id=c10> s </character>
    </word>
    <character id=c11> &period; </character>
  </sentence>
</text>

<!-- Here begins the series of f.structs. -->
<f.struct id=f1>
  <f.s.name> Analysis of sentence 'Wash sinks.' </f.s.name>
  <feature> <f.name> Category </f.name> <f.struct> Sentence </f.struct> </feature>
  <feature>
    <f.name> Alternative subanalyses </f.name>
    <f.s.OR id=f1o> <f.ptr target=f1a1> <f.ptr target=f1a2> </f.s.OR>
  </feature>
</f.struct>

<!-- The next f.struct -->
<f.struct id=f1a1>
  <f.s.name> One subanalysis of 'Wash sinks.' </f.s.name>
  <feature> <f.name> Mood </f.name> <f.struct> Indicative </f.struct> </feature>
  <feature> <f.name> Voice </f.name> <f.struct> Stative </f.struct> </feature>
  <feature> <f.name> Subject </f.name> <f.ptr target=f3> </feature>
  <feature> <f.name> Predicate </f.name> <f.ptr target=f6> </feature>
</f.struct>

<!-- The next f.struct -->
<f.struct id=f1a2>
  <f.s.name> Another subanalysis of 'Wash sinks.' </f.s.name>
  <feature> <f.name> Mood </f.name> <f.struct> Imperative </f.struct> </feature>
  <feature> <f.name> Voice </f.name> <f.struct> Active </f.struct> </feature>
  <feature>
    <f.name> Subject </f.name>
    <f.struct id=f1a2f1>
      <feature> <f.name> Category </f.name> <f.struct> NP </f.struct> </feature>
      <feature> <f.name> Number </f.name> <f.struct> Unspecified </f.struct> </feature>
      <feature> <f.name> Person </f.name> <f.struct> 2 </f.struct> </feature>
    </f.struct>
  </feature>
  <feature> <f.name> Predicate </f.name> <f.ptr target=f2> </feature>
</f.struct>

<!-- The next f.struct -->
<f.struct id=f2>
  <f.s.name> Analysis of VP 'Wash sinks' </f.s.name>
  <feature> <f.name> Category </f.name> <f.struct> VP </f.struct> </feature>
  <feature>
    <f.name> Head </f.name>
    <f.ptr target=f8>
      <f.s.choice target=f8o> <f.ptr target=f8a2> </f.s.choice>
    <!-- </f.ptr> See <fnref refid=choice> -->
  </feature>
  <feature> <f.name> Direct Object </f.name> <f.ptr target=f5> </feature>
</f.struct>

<!-- The next f.struct -->
<f.struct id=f3>
  <f.s.name> Analysis of NP 'wash' </f.s.name>
  <feature> <f.name> Category </f.name> <f.struct> NP </f.struct> </feature>
  <feature>
    <f.name> Alternative subanalyses of NP 'wash' </f.name>
    <f.s.OR id=f3o>
      <f.struct id=f3a1>
        <feature> <f.name> Proper </f.name> <plus> </feature>
        <feature> <f.name> Number </f.name> <f.struct> Singular </f.struct> </feature>
        <feature> <f.name> Object-type </f.name> <f.struct> Person </f.struct> </feature>
      </f.struct>
      <f.struct id=f3a2>
        <feature> <f.name> Mass </f.name> <plus> </feature>
        <feature> <f.name> Number </f.name> <f.struct> Singular </f.struct> </feature>
        <feature> <f.name> Head </f.name> <f.ptr target=f7> </feature>
      </f.struct>
    </f.s.OR>
  </feature>
</f.struct>

<!-- The next f.struct -->
<f.struct id=f4>
  <f.s.name> Analysis of VP 'wash' </f.s.name>
  <feature> <f.name> Category </f.name> <f.struct> VP </f.struct> </feature>
  <feature>
    <f.name> Head </f.name>
    <f.ptr target=f8>
      <f.s.choice target=f8o> <f.ptr target=f8a1> </f.s.choice>
    <!-- </f.ptr> See <fnref refid=choice> -->
  </feature>
</f.struct>

<!-- The next f.struct -->
<f.struct id=f5>
  <f.s.name> Analysis of NP 'sinks' </f.s.name>
  <feature> <f.name> Count </f.name> <plus> </feature>
  <feature> <f.name> Number </f.name> <f.struct> Plural </f.struct> </feature>
  <feature> <f.name> Head </f.name> <f.ptr target=f9> </feature>
</f.struct>

<!-- The next f.struct -->
<f.struct id=f6>
  <f.s.name> Analysis of VP 'sinks' </f.s.name>
  <feature> <f.name> Category </f.name> <f.struct> VP </f.struct> </feature>
  <feature>
    <f.name> Head </f.name>
    <f.ptr target=f10>
      <f.s.choice target=f10o> <f.ptr target=f10a1> </f.s.choice>
    <!-- </f.ptr> See <fnref refid=choice> -->
  </feature>
</f.struct>

<!-- The next f.struct -->
<f.struct id=f7>
  <f.s.name> Analysis of Noun 'wash' </f.s.name>
  <feature> <f.name> Category </f.name> <f.struct> Noun </f.struct> </feature>
  <feature> <f.name> Inflected </f.name> <minus> </feature>
</f.struct>

<!-- The next f.struct -->
<f.struct id=f8>
  <f.s.name> Analysis of Verb 'wash' </f.s.name>
  <feature> <f.name> Category </f.name> <f.struct> Verb </f.struct> </feature>
  <feature> <f.name> Inflected </f.name> <minus> </feature>
  <feature>
    <f.name> Alternative subanalyses of Verb 'wash' </f.name>
    <f.s.OR id=f8o>
      <f.struct id=f8a1>
        <feature> <f.name> Subcat </f.name> <f.struct> Intransitive </f.struct> </feature>
      </f.struct>
      <f.struct id=f8a2>
        <feature> <f.name> Subcat </f.name> <f.struct> Transitive </f.struct> </feature>
      </f.struct>
    </f.s.OR>
  </feature>
</f.struct>

<!-- The next f.struct -->
<f.struct id=f9>
  <f.s.name> Analysis of Noun 'sinks' </f.s.name>
  <feature> <f.name> Category </f.name> <f.struct> Noun </f.struct> </feature>
  <feature>
    <f.name> Gloss </f.name>
    <f.s.OR id=f9o>
      <f.struct id=f9a1> basins for washing </f.struct>
      <f.struct id=f9a2> depressions in a land surface </f.struct>
    </f.s.OR>
  </feature>
  <feature> <f.name> Inflected </f.name> <plus> </feature>
  <feature> <f.name> Number </f.name> <f.struct> Plural </f.struct> </feature>
</f.struct>

<!-- The next f.struct -->
<f.struct id=f10>
  <f.s.name> Analysis of Verb 'sinks' </f.s.name>
  <feature> <f.name> Category </f.name> <f.struct> Verb </f.struct> </feature>
  <feature> <f.name> Inflected </f.name> <plus> </feature>
  <feature> <f.name> Number-of-subject </f.name> <f.struct> Singular </f.struct> </feature>
  <feature> <f.name> Person-of-subject </f.name> <f.struct> 3 </f.struct> </feature>
  <feature>
    <f.name> Alternative subanalyses of Verb 'sinks' </f.name>
    <f.s.OR id=f10o>
      <f.struct id=f10a1>
        <feature> <f.name> Subcat </f.name> <f.struct> Intransitive </f.struct> </feature>
      </f.struct>
      <f.struct id=f10a2>
        <feature> <f.name> Subcat </f.name> <f.struct> Transitive </f.struct> </feature>
      </f.struct>
    </f.s.OR>
  </feature>
</f.struct>

<!-- The alignment follows. -->
<alignment>
  <al.map>
    <al.ptr id=s1>
    <al.list> <al.ptr id=f0> <al.ptr id=f1> </al.list>
  </al.map>
  <al.map>
    <al.range al.start=w1 al.end=w2>
    <al.ptr id=f2>
  </al.map>
  <al.map>
    <al.ptr id=w1>
    <al.list> <al.ptr id=f3> <al.ptr id=f7> <al.ptr id=f8> </al.list>
  </al.map>
  <al.map>
    <al.ptr id=w2>
    <al.list> <al.ptr id=f6> <al.ptr id=f9> <al.ptr id=f10> </al.list>
  </al.map>
</alignment>
]]>
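
The claim that the four-way ambiguity is spread across the collection of f.structs can be checked by following the pointers mechanically. The Python sketch below does this over a hand-made abstraction of the markup: the table `STRUCTS` and the function `count_readings` are hypothetical, each entry keeps only the alternatives of an f.s.OR ("alts") and the outgoing pointers ("ptrs"), and a pointer's second member records the alternative selected by an enclosed f.s.choice, if any.

```python
# Hand-made abstraction of the f.structs above (an assumption of this
# sketch, not a parse of the SGML): "alts" lists the members of an
# f.s.OR, "ptrs" lists (target id, selected alternative or None).
STRUCTS = {
    "f1":    {"alts": ["f1a1", "f1a2"]},
    "f1a1":  {"ptrs": [("f3", None), ("f6", None)]},   # declarative
    "f1a2":  {"ptrs": [("f2", None)]},                 # imperative
    "f2":    {"ptrs": [("f8", "f8a2"), ("f5", None)]},
    "f3":    {"alts": ["f3a1", "f3a2"]},
    "f3a1":  {},
    "f3a2":  {"ptrs": [("f7", None)]},
    "f5":    {"ptrs": [("f9", None)]},
    "f6":    {"ptrs": [("f10", "f10a1")]},
    "f7":    {},
    "f8":    {"alts": ["f8a1", "f8a2"]},
    "f8a1":  {},
    "f8a2":  {},
    "f9":    {"alts": ["f9a1", "f9a2"]},
    "f9a1":  {},
    "f9a2":  {},
    "f10":   {"alts": ["f10a1", "f10a2"]},
    "f10a1": {},
    "f10a2": {},
}


def count_readings(fs_id, choice=None):
    """Count the readings reachable from one f.struct.

    Alternatives add up, pointers multiply, and a choice restricts
    an ambiguous target to the single selected alternative.
    """
    node = STRUCTS[fs_id]
    alts = node.get("alts", [])
    if choice is not None:
        if choice not in alts:
            raise ValueError(f"{choice} is not an alternative of {fs_id}")
        alts = [choice]
    total = sum(count_readings(alt) for alt in alts) if alts else 1
    for target, selected in node.get("ptrs", []):
        total *= count_readings(target, selected)
    return total


print(count_readings("f1a1"), count_readings("f1a2"), count_readings("f1"))
```

Under this abstraction the declarative branch contributes two readings (through f3) and the imperative branch two (through f9), while the choices on f8 and f10 keep the verb alternatives from multiplying the count.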