Problem description:
English translation
This question will introduce you to a method for comparing documents
based on ideas in linear algebra.
Suppose you are given a set of n documents numbered 1 through n. Suppose there is
a list of m words numbered 1 through m that are of interest to you. Define vectors x_1
through x_n as follows: the jth entry of x_i is the number of times the jth word appears
in the ith document. You can think of x_i as a summary of the information in the ith
document.
I selected three articles from Wikipedia and counted the number of times the words
fur, blood, bone, feather appeared in each. Here are the results:
Article   fur  blood  bone  feather
Mammal      6      4    15        0
Reptile     1      5     0        3
Bird        0      7     5       43
You are encouraged to use a calculator or a computer for the following questions.
a. Write down the vectors x_1, x_2, x_3 for the words given.
b. We define the similarity of documents i and j as
$$\frac{x_i \cdot x_j}{\|x_i\| \, \|x_j\|}$$
Compute the similarity between documents 1 and 2, between documents 2 and 3,
and between documents 1 and 3. According to this measure, is document 2 more similar to
document 1 or to document 3?
c. Is it possible to have negative similarity between two documents? Why or why not?
Answer:
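One way to check parts (a) and (b) by computer is sketched below. It takes the word counts from the table in the question and applies the similarity formula from part (b) directly. The names `counts` and `similarity` are chosen here for illustration and are not part of the original problem.

```python
import math

# Word counts in the order (fur, blood, bone, feather), copied from the table.
# Document 1 = Mammal, document 2 = Reptile, document 3 = Bird.
counts = {
    "Mammal": [6, 4, 15, 0],
    "Reptile": [1, 5, 0, 3],
    "Bird": [0, 7, 5, 43],
}


def similarity(x, y):
    """Dot product of x and y divided by the product of their Euclidean norms."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


sim_12 = similarity(counts["Mammal"], counts["Reptile"])
sim_23 = similarity(counts["Reptile"], counts["Bird"])
sim_13 = similarity(counts["Mammal"], counts["Bird"])

print(f"documents 1 and 2: {sim_12:.3f}")
print(f"documents 2 and 3: {sim_23:.3f}")
print(f"documents 1 and 3: {sim_13:.3f}")
```

With these counts the three values come out to roughly 0.264, 0.632 and 0.141, so by this measure document 2 (Reptile) is more similar to document 3 (Bird) than to document 1 (Mammal). For part (c), note that every entry of every vector is a word count and so cannot be negative. The dot product of two such vectors is therefore a sum of non-negative terms, and the similarity can never be negative.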