Time issue in designing a RS

I recently read a senior labmate's time-weighted CF algorithm, which successfully improved the accuracy of the traditional CF algorithm, but I still have some doubts. His argument is that a user's early rating data is no more informative than the user's current rating data, so he designed a curve along which the weight of a rating decreases as time passes: the closer a rating is to the present, the more it counts, and the older it is, the less. For example, if a rating a user makes today has weight 1, then a rating the same user made yesterday might have weight 0.9; multiplying each weight by the actual rating value yields numbers that are more informative. Modifying the core of the CF algorithm with this idea did raise accuracy.

But the main blind spot in improving accuracy for recommender systems today is that everyone invents a new method to change the old algorithm and then claims the method improves accuracy, yet no one can prove that applying such new methods will actually achieve that goal in general. As with "if p then q" (p -> q), we cannot guarantee that q -> p. This is closely related to overfitting: even if we use a very complex model to squeeze the best possible training results out of the eternal training data set (MovieLens), that does not mean we would do equally well given a completely different data set. In the end, only the users themselves can serve as an objective standard for judging accuracy. But people are subjective creatures: even two people with very similar personalities do not necessarily think alike, and they probably judge accuracy differently too.
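A minimal sketch of the decay idea (the exponential shape and the 30-day half-life are my illustrative assumptions, not the actual curve from the thesis):

```python
def time_weight(age_days: float, half_life: float = 30.0) -> float:
    """Exponential decay: a rating made today has weight 1.0,
    and the weight halves every `half_life` days."""
    return 0.5 ** (age_days / half_life)

def weighted_rating(rating: float, age_days: float) -> float:
    """Discount a raw rating by its time weight before it enters
    the CF similarity / prediction formulas."""
    return rating * time_weight(age_days)

# A rating of 4.0 given today keeps its full value,
# while the same rating given 30 days ago counts half as much.
print(weighted_rating(4.0, 0))    # 4.0
print(weighted_rating(4.0, 30))   # 2.0
```

Any monotonically decreasing curve would fit the argument; the half-life form just makes "yesterday counts a bit less than today" easy to tune with one parameter.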


Mail terminology, and analyzing the social networks built from mail

TO: the primary recipient(s) of the mail.

CC: Carbon Copy; recipients of a visible copy.

BCC: Blind Carbon Copy; recipients of a hidden copy.

When analyzing the social networks built from mail, the importance of each sender-receiver link can be read off the TO, CC, and BCC fields. For example, a mail sent TO many people matters less to each of them than a mail sent TO a single person. Conversely, BCC'ing a mail to someone suggests the mail is especially important to that recipient, or rather that the sender uses the BCC field only for recipients he trusts.
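A hypothetical weighting scheme along these lines for building edge weights in the mail network (the specific numbers 1.0 / 0.5 and the split-among-recipients rule are illustrative assumptions, not from any particular paper):

```python
def recipient_weights(to, cc, bcc):
    """Importance of one mail to each recipient: a mail addressed TO
    many people matters less to each one, CC counts less than TO,
    and a BCC recipient is singled out by the sender's trust."""
    weights = {}
    for addr in to:
        weights[addr] = 1.0 / len(to)   # TO weight is split among recipients
    for addr in cc:
        weights[addr] = 0.5 / len(cc)   # a copy matters less than being TO'd
    for addr in bcc:
        weights[addr] = 1.0             # hidden copy: a strong trust signal
    return weights

edges = recipient_weights(to=["alice", "bob"], cc=["carol"], bcc=["dave"])
print(edges)  # alice and bob get 0.5 each, carol 0.5, dave 1.0
```

Summing these per-mail weights over a whole mailbox would give the sender-to-recipient edge weights of the social network.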

The goal of visualizing a social network is to let users see the relationships between people more intuitively; for example, the distance between two nodes can encode their similarity, intimacy, or relatedness. But when the network's scale is large, visualization runs into presentation problems: showing a network of several thousand people on a small screen, while still letting the user tell which person each node represents, is certainly not easy to do.

Recommender Systems – Trust

How to make users trust our recommender system is a crucial design goal. Imagine a RS that only recommends items a user has never seen: the items will all feel strange, and the RS will not seem friendly to the user. So making the RS appear to understand the user's psychology is a very important point; it is as if the RS plays the role of the user's friend. In fact, there is a simple way to do this: also recommend some items the user has seen before, so that the user feels as comfortable with our system as with a friend. Accuracy alone is therefore not enough when designing a RS; we must consider other factors, such as HCI perspectives, so that users can easily control the RS.

PR Notes

Pattern recognition — the act of taking in raw data and taking an action based on the "category" of the pattern.

Segmentation — the operation in which the images of different fish are somehow isolated from one another and from the background. For example, isolating one fish from the other fish and removing the background is segmentation.

Feature quality: when feature f1 and feature f2 are correlated (say, positively correlated), selecting both f1 and f2 as features is pointless and will not improve classification performance.

Using an overly complex model to make decisions is not necessarily good: it rarely generalizes, i.e., it causes overfitting. Simple is good.

Given that there truly are differences between the population of sea bass and that
of salmon, we view them as having different models — different descriptions, which
are typically mathematical in form. Sea bass and salmon obviously have different characteristics, such as appearance, length, and color. Such a set of characteristics is what we call a model, and we often describe a model with mathematical expressions, e.g. length > 50 means sea bass and <= 50 means salmon. This is a simple model, so the decision boundary it produces for separating sea bass from salmon is also very simple.
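The toy length-threshold model above can be written directly; the decision boundary is the single point length = 50:

```python
def classify(length_cm: float) -> str:
    """One-feature model from the text: length > 50 -> sea bass,
    otherwise salmon."""
    return "sea bass" if length_cm > 50 else "salmon"

print(classify(63.0))  # sea bass
print(classify(42.0))  # salmon
```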

Syntactic pattern recognition, where rules or grammars describe our decision. For example we might wish to classify an English sentence as grammatical or not, and here statistical descriptions (word frequencies, word correlations, etc.) are inappropriate. The syntactic approach is more like defining rules to describe our decision.

Patterns can be represented as vectors.

CLIPS vs. JESS

It turns out that JESS's predecessor is CLIPS: JESS is a version derived from the CLIPS core, so the two syntaxes are almost entirely compatible, and in theory a program that runs on CLIPS should also run on JESS. The Expert System course I took in my last undergraduate semester was almost entirely CLIPS practice. Every class ended with an exciting hands-on quiz, from sort to permutation, from tree hierarchy relationships to grammar parsing; anything an ordinary language (C/C++) can do, CLIPS dispatches in an instant. The hardest part of writing rule-based programs is that we often cannot trace the execution flow in our heads, for example the uncertainty over which rule will be activated first. But that is also the new mindset of rule-based languages: instead of designing the program flow sequentially, we design it with a different way of thinking (hard to describe: a rule-design mindset).

CLIPS vs. JESS references:

http://www.comp.lancs.ac.uk/~kristof/research/notes/clipsvsjess/index.html

HMM structure

Imagine that we close our eyes and choose a coin from two biased coins we have prepared. Assume the first coin gives us heads 80% of the time and tails 20%, while the second coin gives us heads 70% of the time and tails 30%. Having chosen the first (second) coin, the next time we have a 30% (50%) chance of choosing the same coin again and a 70% (50%) chance of choosing the other coin. We can then use a Hidden Markov Model to describe this phenomenon of choosing a coin and tossing it to see what we get (head or tail).

The model described above can be drawn as below:

HMM

Example:

If we observe the sequence "HHHTTTTHHTTTTTTTHHHH" while doing the activity above, we can use this model to predict the coin (first or second) behind each toss in the sequence. Since our eyes are closed, we cannot see which coin we choose for each toss: the states are hidden from us, which is why it is called a Hidden Markov Model.
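A small Viterbi sketch for this two-coin model, which recovers the most likely hidden coin sequence for the observed tosses. The uniform 50/50 start distribution is my assumption, since the text does not say how the first coin is picked:

```python
import math

# Two-coin HMM from the example above.
states = ("coin1", "coin2")
start = {"coin1": 0.5, "coin2": 0.5}   # assumed uniform initial choice
trans = {"coin1": {"coin1": 0.3, "coin2": 0.7},
         "coin2": {"coin1": 0.5, "coin2": 0.5}}
emit = {"coin1": {"H": 0.8, "T": 0.2},
        "coin2": {"H": 0.7, "T": 0.3}}

def viterbi(obs):
    """Most likely hidden coin sequence for an observed H/T string,
    computed with log probabilities to avoid numeric underflow."""
    V = [{s: math.log(start[s]) + math.log(emit[s][obs[0]]) for s in states}]
    back = []
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            # best predecessor state for ending in s at this step
            best = max(states, key=lambda p: V[-1][p] + math.log(trans[p][s]))
            col[s] = V[-1][best] + math.log(trans[best][s]) + math.log(emit[s][o])
            ptr[s] = best
        V.append(col)
        back.append(ptr)
    # follow the back-pointers from the best final state
    state = max(states, key=lambda s: V[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path

print(viterbi("HHHTTTTHHTTTTTTTHHHH"))
```

This answers exactly the question in the example: given only the H/T sequence, which coin was most plausibly behind each toss.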

Example 2: In activity recognition, the activities can be modeled as the states, and the things we touch while performing a specific activity can be modeled as the observations. Once we have collected enough data, we can use it to train the HMM and iteratively learn its parameters (transition probabilities, etc.).

Anytime Algorithm

Concept:

An ordinary program runs to completion (say, in 5 seconds) and only then gives its output to the user. But what happens if the program does not run to completion and stops in the middle (say, at 3 seconds)? The usual answer is that it simply quits and produces nothing. With the anytime-algorithm concept, even a program that stops in the middle can give the user a partial output. For example, if the program runs to completion we get a 100% answer; if it runs only half of its execution time, we still get a 50% answer. Hint: think of a student taking an exam, where the score depends on how much time he/she has for it.
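A sketch of the idea using the Leibniz series for pi (my choice of example problem): every iteration refines the running estimate, so interrupting at any deadline still yields a usable partial answer, and more time buys a better one.

```python
import time

def anytime_pi(budget_s):
    """Refine an estimate of pi (Leibniz series: 4 - 4/3 + 4/5 - ...)
    until the time budget runs out, then return the partial answer."""
    estimate, sign, k = 0.0, 1.0, 0
    deadline = time.monotonic() + budget_s
    while time.monotonic() < deadline:
        estimate += sign * 4.0 / (2 * k + 1)
        sign = -sign
        k += 1
    return estimate, k  # partial answer plus the work completed so far

approx, iterations = anytime_pi(0.05)  # "interrupted" after ~50 ms
print(approx, iterations)
```

Unlike an ordinary program, there is no single point where the answer suddenly appears; the quality of the output degrades gracefully with less time, like the exam score in the hint above.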