Reducing the Complexity of Parsing by a Method of Decomposition.

Lyon, C. and Dickerson, R. (1997) Reducing the Complexity of Parsing by a Method of Decomposition. In: UNSPECIFIED.


The complexity of parsing English sentences can be reduced by decomposing the problem into three subtasks. Declarative sentences can almost always be segmented into three concatenated sections: pre-subject, subject, and predicate. Other constituents, such as clauses, phrases, and noun groups, are contained within these segments and do not normally cross the boundaries between them. A constituent in one section may have dependent links to elements in other sections, such as agreement between the head of the subject and the main verb. Nevertheless, once the three sections have been located, they can be partially processed separately and in parallel. An information-theoretic analysis is used to support this approach. If sentences are represented as sequences of part-of-speech tags, then modelling them with the tripartite segmentation reduces the entropy. This indicates that some of the structure of the sentence has been captured. The tripartite segmentation can be produced automatically using the ALPINE parser, which is then described. This is a hybrid processor in which neural networks operate within a rule-based framework. It has been developed using corpora from technical manuals. Performance on unseen data from the manuals on which the processor was trained is over 90%. On data from other technical manuals, performance is over 85%.
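The entropy argument in the abstract can be illustrated with a small sketch. This is not the paper's analysis or the ALPINE parser: it assumes a simple unigram model over part-of-speech tags, and the tag names, the toy sentences in `TOY`, and the function names are all hypothetical. It only shows the general effect that conditioning the tag distribution on the segment (pre-subject, subject, predicate) cannot increase the entropy compared with pooling all tags together.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy in bits of the distribution given by raw counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def pooled_entropy(sentences):
    """Entropy of tags when segment boundaries are ignored."""
    counts = Counter(tag for sent in sentences for seg in sent for tag in seg)
    return entropy(counts)


def segmented_entropy(sentences):
    """Conditional entropy H(tag | segment) for the three segments."""
    per_segment = [Counter() for _ in range(3)]
    for sent in sentences:
        for i, seg in enumerate(sent):
            per_segment[i].update(seg)
    total = sum(sum(c.values()) for c in per_segment)
    # Weight each segment's entropy by the share of tags that fall in it.
    return sum(
        (sum(c.values()) / total) * entropy(c) for c in per_segment if c
    )


# Hypothetical toy data: each sentence is (pre-subject, subject, predicate),
# and each segment is a list of illustrative part-of-speech tags.
TOY = [
    (["ADV"], ["DET", "NOUN"], ["VERB", "DET", "NOUN"]),
    ([], ["PRON"], ["VERB", "ADJ"]),
    (["PREP", "DET", "NOUN"], ["DET", "ADJ", "NOUN"], ["VERB", "ADV"]),
]

if __name__ == "__main__":
    print("pooled bits per tag:   ", pooled_entropy(TOY))
    print("segmented bits per tag:", segmented_entropy(TOY))
```

A unigram model is used here only to keep the sketch short. The same comparison could be made with any sequence model that can be conditioned on the segment label.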

Item Type	Conference or Workshop Item (UNSPECIFIED)
Date Deposited	26 Jul 2024 11:01
Last Modified	26 Jul 2024 11:01

Downloads

901892.pdf: Available under Creative Commons: 4.0
