English translation: This question will introduce you to a method for compari

Question description:

English translation
This question will introduce you to a method for comparing documents
based on ideas in linear algebra.
Suppose you are given a set of n documents numbered 1 through n. Suppose there is
a list of m words numbered 1 through m that are of interest to you. Define vectors $x_1$
through $x_n$ as follows: the jth entry of $x_i$ is the number of times the jth word appears
in the ith document. You can think of $x_i$ as a summary of the information in the ith
document.
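The definition above can be sketched in code. This is a minimal illustration, not the method used to produce the counts in the question: the function name `count_vector`, the sample sentence, and the tokenization rule (lowercase, exact word matches only, so "feathers" would not count as "feather") are all assumptions made for the example.

```python
import re
from collections import Counter

def count_vector(text, words):
    """Return a list whose jth entry is how often words[j] occurs in text.

    Assumption: matching is case-insensitive and exact, so plurals or
    other inflected forms are not counted.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter(tokens)
    # Counter returns 0 for words that never appear.
    return [counts[w] for w in words]

words = ["fur", "blood", "bone", "feather"]
# Hypothetical sample text, not taken from any Wikipedia article.
sample = "Feather after feather: a bird has bone and blood, but no fur."
x = count_vector(sample, words)
print(x)
```

Each document would get its own vector of length m, built against the same word list so that the entries line up.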
I selected three articles from Wikipedia and counted the number of times the words
fur, blood, bone, feather appeared in each. Here are the results:
Article  fur  blood  bone  feather
Mammal   6    4      15    0
Reptile  1    5      0     3
Bird     0    7      5     43
You are encouraged to use a calculator or a computer for the following questions.
a. Write down the vectors $x_1$, $x_2$, $x_3$ for the words given.
b. We define the similarity of documents i and j as
$$\frac{x_i \cdot x_j}{\|x_i\| \, \|x_j\|}$$
Compute the similarity between documents 1 and 2, between documents 2 and 3,
and between documents 1 and 3. According to this measure, is document 2 more similar to
document 1 or to document 3?
c. Is it possible to have negative similarity between two documents? Why or why not?
1 answer  Category: English  2014-11-21

Answer:

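This question introduces a method for comparing documents using ideas from linear algebra.
Suppose you are given a set of n documents numbered 1 through n, and a list of m words of interest numbered 1 through m. Define vectors $x_1$ through $x_n$ as follows: the jth entry of $x_i$ is the number of times the jth word appears in the ith document. You can think of $x_i$ as a summary of the ith document.
I selected three articles from Wikipedia and counted how many times the words fur, blood, bone, and feather appeared in each. Here are my results: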
Article  fur  blood  bone  feather
Mammal   6    4      15    0
Reptile  1    5      0     3
Bird     0    7      5     43
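You may use a calculator or a computer to help with the following questions.
a. Write down the vectors $x_1$, $x_2$, $x_3$ corresponding to the given words.
b. We define the similarity of documents i and j as
$$\frac{x_i \cdot x_j}{\|x_i\| \, \|x_j\|}$$
Compute the similarity between documents 1 and 2, between documents 2 and 3, and between documents 1 and 3. Based on your results, is document 2 more similar to document 1 or to document 3?
c. Can the similarity between two documents be negative? If so, briefly explain how; if not, explain why not.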
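For readers who want to check part b on a computer, the similarity formula can be sketched as follows. The function name `similarity` and the variable names are assumptions made for this example; the three vectors are read directly from the rows of the table (Mammal, Reptile, Bird). The sketch assumes neither vector is all zeros, since the formula would then divide by zero.

```python
import math

def similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Assumption: both vectors are nonzero, so the denominator is positive.
    return dot / (norm_u * norm_v)

# Rows of the table, in the order fur, blood, bone, feather.
x1 = [6, 4, 15, 0]   # Mammal
x2 = [1, 5, 0, 3]    # Reptile
x3 = [0, 7, 5, 43]   # Bird

sim_12 = similarity(x1, x2)
sim_23 = similarity(x2, x3)
sim_13 = similarity(x1, x3)

print(f"documents 1 and 2: {sim_12:.3f}")
print(f"documents 2 and 3: {sim_23:.3f}")
print(f"documents 1 and 3: {sim_13:.3f}")
```

With these counts the three values come out to roughly 0.264, 0.632, and 0.141, so by this measure document 2 is closer to document 3 than to document 1. Because every entry is a word count and therefore nonnegative, every term in the dot product is nonnegative as well, which is the observation part c is asking about.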