The widely adopted definition of the maximal (longest) noun phrase picks out only a subset of phrases on the basis of syntactic function, leading to nearly 30% of boundary identification errors. Moreover, because there is no shared understanding of the basic properties of such phrases, the definitions used in different studies also differ. This paper discusses the length constraint, the determination of nominal status, the extensional scope, and the hierarchical structure of maximal noun phrases. It proposes delimiting the complete set of maximal noun phrases by syntactic position, defining a maximal noun phrase as a noun phrase in a sentence that is not directly contained in another noun phrase; the set includes single-word structures, nominal phrases, and exocentric noun phrases. Maximal noun phrases under the new definition are functionally consistent and similar in distribution, which reduces boundary ambiguity. They are distributed across multiple levels, but their concentrated distribution across those levels suggests that efficient identification is feasible.
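The position-based definition can be illustrated with a minimal sketch. It assumes that "not directly contained" means the immediately dominating node is not itself a noun phrase. The paper may define direct containment more precisely. The nested-tuple tree encoding, the labels `S`, `NP`, `VP` and `CP`, the example sentence, and the function names `words` and `maximal_nps` are all hypothetical and chosen only for illustration.

```python
def words(node):
    """Collect the leaf words under a node, left to right."""
    if isinstance(node, str):
        return [node]
    out = []
    for child in node[1:]:
        out.extend(words(child))
    return out


def maximal_nps(node, parent_label=None, level=0):
    """Return (level, text) for every NP whose parent is not an NP.

    Assumption: "directly contained" is read as "the immediate parent
    is an NP". `level` counts how many maximal NPs enclose the phrase,
    including the phrase itself, so 1 is the outermost level.
    """
    if isinstance(node, str):
        return []
    label = node[0]
    found = []
    if label == "NP" and parent_label != "NP":
        level += 1
        found.append((level, " ".join(words(node))))
    for child in node[1:]:
        found.extend(maximal_nps(child, label, level))
    return found


# Hypothetical parse: a node is (label, child, ...) and a leaf is a word.
tree = (
    "S",
    ("NP",
     ("NP", "the", "student"),
     ("CP", "who", ("VP", "read", ("NP", "the", "book")))),
    ("VP", "left"),
)

for level, text in maximal_nps(tree):
    print(level, text)
```

In this toy tree, "the student" is excluded because its parent is an NP, while "the book" qualifies at a second level because a verb phrase separates it from the enclosing noun phrase. This is one way the multi-level distribution described above can arise.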