Why are people able to acquire so much knowledge on the basis of the sparse information they receive? This question, known as Plato's problem, has been answered in many ways. Latent Semantic Analysis (LSA) applies singular value decomposition, a method from linear algebra, to show that reducing dimensionality helps reveal latent semantic relationships. This paper illustrates the point with two examples. The first analyzes the titles of nine articles covering two topics, human-computer interaction and mathematical graph theory: two words that originally had little connection turn out to be highly correlated (.90) after processing. The second analyzes the relationships among errors in the English of Chinese students: after dimensionality reduction, the developmental trends in spelling errors, lexical errors, and syntactic structure across learners at five proficiency levels can be explained better. LSA has a wide range of applications in text processing.
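
The effect of dimensionality reduction on term relationships can be illustrated with a minimal Python sketch using NumPy. The term-by-document matrix here is a hypothetical toy example invented for illustration; it is not the nine-title data set analyzed in the paper, and the helper names `truncate_svd` and `term_correlation`, as well as the choice of two retained dimensions, are assumptions.

```python
import numpy as np

# Hypothetical toy term-by-document count matrix (rows: terms, columns: titles).
# These counts are invented for illustration; they are not the paper's data.
terms = ["human", "user", "interface", "graph", "tree"]
X = np.array([
    [1, 0, 0, 0],  # human
    [0, 1, 0, 0],  # user
    [1, 1, 0, 0],  # interface
    [0, 0, 1, 1],  # graph
    [0, 0, 1, 1],  # tree
], dtype=float)


def truncate_svd(matrix, k):
    """Return the rank-k approximation of matrix built from its k largest singular values."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]


def term_correlation(matrix, i, j):
    """Pearson correlation between the row vectors of terms i and j."""
    return float(np.corrcoef(matrix[i], matrix[j])[0, 1])


# Keep two dimensions (an assumed value chosen for this toy matrix).
X_k = truncate_svd(X, k=2)

h, u = terms.index("human"), terms.index("user")
raw_r = term_correlation(X, h, u)        # correlation in the original space
reduced_r = term_correlation(X_k, h, u)  # correlation after dimension reduction
print(f"raw: {raw_r:.2f}, rank-2: {reduced_r:.2f}")
```

In this toy matrix "human" and "user" never occur in the same title, so their raw correlation is negative (about -0.33). After only the two largest singular values are kept, both rows are rebuilt from the context they share with "interface", and the correlation rises to about 1.00. This is the same kind of shift the first example in the abstract reports, though the numbers here come from invented data.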