Research Issues

Netflix data set : 有好幾十億的資料,若要轉成 user-item matrix

做分析,則會因為 scale 太大而難以處理。

Solution : Apply SVD ( Singular Vector Decomposition) to the matrix.

SVD 可以把原始的 matrix 降至更低維度,並且保持一些性質 (待check)。

想法 : 幫每個movie定義一個 feature vactor,vector中的每個tuple代表

此movie某方面的性質,比如動作成分佔多高要素。相對的每個user也有

preference vector,vector中的每個tuple代表對movie中的每個性質之喜

好程度。

So let user A’s preference vector is = ( 1 , 2 , -1 )

let Movie M’s feature vector is = ( 1 , 4 , -1 )

then user A’s rating on Movie M is 1*1 + 2*4 + -1*-1 = 1 + 8 + 1 = 10

Paper Recommendation :

讓使用者把自己電腦中的 paper 交給 bibagent去取得 bibtex 後,並紀

錄每篇 paper在使用者電腦中的save time。如此就可以apply time weighted

技術來 identify目前user的working set是哪些領域的paper。為了不讓user

親自去對每篇paper做rating的動作,我們可以由網路上抓此篇paper的citation

number來當做客觀的rating,or we can do it in this way :

rating of a paper = citation number / published date

So , we can designe a web for user to upload the result of the retriving data made

by bibagent , then also let user to create the FOAF-like data sheet so that we can

use those infomation to apply CF-Algo .

Another point : 針對某篇 paper,我們是否能找出對此篇paper有興趣的user set,

so that we can get recommendations more specific to the paper from the user set .

類似找專家的概念。

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s