Progress report (8/17 ~ 9/21)
Written by Yu-Chung Shen
Since my recent presentation at the group meeting, I have spent some time thinking about my research topic. The research I would like to do is on information searching and recommendation. By information I mean technical papers, which I will therefore use as the experimental data for my research. The system requirements I propose are as follows:
1. The overall system is built on a distributed environment: each user has an agent that helps manage the user’s technical papers for later sharing and recommendation.
2. Each agent communicates with its acquaintance agents so that they can exchange and share their repositories.
3. When a user issues a query on some topic (e.g. recommendation systems), the user’s agent must automatically propagate the query to the acquaintance agents that have high information provision ability.
4. Each agent maintains a profile that describes the interests of its acquaintances. Based on this profile, the agent can recommend items to the acquaintances who are interested in them.
I think that finding the right agents in a large and dynamic network to provide the needed information in a timely fashion is an ambitious task. I would like to propose a method that enables agents to search effectively and, furthermore, to identify their acquaintances’ interests and actively make recommendations to them. The following briefly describes some of the implementation details I have considered so far.
1. First, when a user issues a query, the user’s agent must propagate the query to the acquaintances that are most likely to provide answers. Here are some candidate methods for defining information provision ability. (1) Item vector method: we define an item vector for each agent’s repository and use cosine similarity to measure the interest similarity between two agents. When a user issues a query, the agent propagates it to the most similar agents, on the assumption that greater similarity implies higher information provision ability. (2) Content-based method: we use the well-known TF-IDF technique to construct a vector for the query and for each agent’s repository, and then calculate the similarity between the query and each repository. The query is propagated to the agents whose repositories have high similarity to it. (3) Routing indices method: each agent maintains a global ontology that represents the categories of information (here, technical papers). When a user issues a query for information in a specific category, the agent propagates the query to the agents that hold the most information in that category, or in similar categories according to the predefined ontology.
2. Second, in order to provide information recommendation capability, each agent should maintain an interest profile for some of its acquaintances. Agents construct these profiles as they receive queries. For example, when agent A receives a query from agent B, agent B is obviously interested in that query at the moment, so we update agent A’s profile for agent B to record that agent B is interested in the query with some probability. The next time agent A receives the same query from agent B, agent A updates the profile again, raising the probability to reflect that agent B is more likely to be interested in the query. When the probability exceeds a predefined value, agent A automatically recommends to agent B items that are similar to the query agent B has issued before.
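The profile update in step 2 could be sketched roughly as follows. This is only an illustration under my own assumptions: the class name InterestProfile, the update rule (each repeated query moves the estimate a fixed fraction of the way toward 1), and the values of step and threshold are hypothetical and have not been decided for the actual system.

```python
from collections import defaultdict


class InterestProfile:
    """Hypothetical per-acquaintance interest profile kept by one agent."""

    def __init__(self, step=0.3, threshold=0.5):
        self.step = step            # assumed increment factor per received query
        self.threshold = threshold  # assumed "predefined value" for recommending
        # scores[acquaintance][topic] -> estimated interest, always below 1
        self.scores = defaultdict(lambda: defaultdict(float))

    def record_query(self, acquaintance, topic):
        """Raise the interest estimate; return True once it exceeds the threshold."""
        old = self.scores[acquaintance][topic]
        new = old + self.step * (1.0 - old)  # moves toward 1 but never reaches it
        self.scores[acquaintance][topic] = new
        return new > self.threshold


# Agent A keeps a profile of agent B's queries.
profile = InterestProfile()
first = profile.record_query("B", "recommendation systems")   # estimate 0.3
second = profile.record_query("B", "recommendation systems")  # estimate about 0.51
```

With these assumed values, the first query leaves the estimate below the threshold, and the second identical query pushes it above, at which point agent A would start recommending similar items to agent B.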
The system described above exploits users’ queries to derive their interests for recommendation purposes, and it uses an information provision ability measure to route queries to the right agents. In short, I want to design agents that can effectively route queries to the right agents and that can learn each user’s interest profile in order to actively make recommendations.
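As a rough illustration of the query routing in step 1, the sketch below ranks acquaintances by the cosine similarity between a query and their repositories. It is a simplified stand-in for the content-based method: it uses raw term counts rather than TF-IDF weights, represents each paper only by its title, and the function names (item_vector, cosine, select_targets) and the cutoff k are my own assumptions.

```python
import math
from collections import Counter


def item_vector(repository):
    """Build a term-count vector from the paper titles in a repository."""
    counts = Counter()
    for title in repository:
        counts.update(title.lower().split())
    return counts


def cosine(u, v):
    """Cosine similarity between two term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def select_targets(query, acquaintances, k=2):
    """Return up to k acquaintance names whose repositories best match the query."""
    q = item_vector([query])
    scored = [(cosine(q, item_vector(repo)), name)
              for name, repo in acquaintances.items()]
    scored.sort(key=lambda s: (-s[0], s[1]))  # best score first, ties by name
    return [name for score, name in scored[:k] if score > 0]


# Hypothetical acquaintances and repositories.
acquaintances = {
    "B": ["recommendation systems survey",
          "collaborative filtering for recommendation"],
    "C": ["peer to peer routing"],
    "D": ["image segmentation"],
}
targets = select_targets("recommendation systems", acquaintances)
```

Here only agent B shares terms with the query, so it is the only agent selected. Acquaintances with zero similarity are dropped even if fewer than k targets remain, which is one possible design choice for limiting query traffic.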