Deriving Concept-Based User Profiles from Search Engine Logs

In this paper, we focus on search engine personalization and develop several concept-based user profiling methods that are based on both positive and negative preferences. We evaluate the proposed methods against our previously proposed personalized query clustering method. Experimental results show that profiles that capture and use both the user's positive and negative preferences perform best.

An important result from the experiments is that profiles with negative preferences can increase the separation between similar and dissimilar queries. This separation provides a clear threshold at which an agglomerative clustering algorithm can terminate, which improves the overall quality of the resulting query clusters.
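
The role of the threshold can be illustrated with a minimal sketch. It assumes that each query profile is a sparse concept-weight vector, that similarity is cosine similarity, and that merged clusters sum their weights; these choices, the function names, and the sample weights are illustrative assumptions, not necessarily the measures used in the paper.

```python
from itertools import combinations

def cosine(u, v):
    """Cosine similarity between two sparse concept-weight vectors (dicts)."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = sum(w * w for w in u.values()) ** 0.5
    norm_v = sum(w * w for w in v.values()) ** 0.5
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def merge(u, v):
    """Combine two profiles by summing weights (an assumed merge rule)."""
    out = dict(u)
    for c, w in v.items():
        out[c] = out.get(c, 0.0) + w
    return out

def cluster_queries(profiles, threshold):
    """Merge the most similar pair of clusters until no pair reaches `threshold`."""
    clusters = {frozenset([q]): vec for q, vec in profiles.items()}
    while len(clusters) > 1:
        (a, b), best = max(
            ((pair, cosine(clusters[pair[0]], clusters[pair[1]]))
             for pair in combinations(clusters, 2)),
            key=lambda item: item[1],
        )
        if best < threshold:
            break  # remaining clusters are too dissimilar: terminate
        clusters[a | b] = merge(clusters.pop(a), clusters.pop(b))
    return sorted(sorted(c) for c in clusters)

# Hypothetical profiles: negative weights mark concepts the user skipped.
profiles = {
    "apple iphone": {"phone": 1.0, "fruit": -0.5},
    "apple store": {"phone": 0.8, "fruit": -0.4},
    "apple pie": {"fruit": 1.0, "phone": -0.5},
}
print(cluster_queries(profiles, threshold=0.5))
```

In this toy data the negative weights push the similarity between "apple iphone" and "apple pie" below zero, so any threshold between the two groups of similarity values stops the merging at the same point.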

Existing System:

User profiling is a fundamental component of any personalization application, and a good user profiling strategy is essential to search engine personalization. Most existing user profiling strategies are based on the objects that users are interested in. We studied various user profiling strategies for search engine personalization and observed the following problems in existing strategies.

Existing clickthrough-based user profiling strategies can be categorized into document-based and concept-based approaches. Both assume that user clicks can be used to infer users' interests, although their inference methods and the outcomes of the inference differ. Document-based profiling methods try to estimate users' document preferences (i.e., users are interested in some documents more than others).

On the other hand, concept-based profiling methods aim to derive the topics or concepts that users are highly interested in. These two approaches will be reviewed in Section 2. While there are document-based methods that consider both users' positive and negative preferences, to the best of our knowledge there are no concept-based methods that consider both positive and negative preferences in deriving users' topical interests. Most existing user profiling strategies consider only documents that users are interested in (i.e., users' positive preferences) and ignore documents that users dislike (i.e., users' negative preferences).

Proposed System:

We propose and study seven concept-based user profiling strategies that are capable of deriving both the user's positive and negative preferences. All of the user profiling strategies are query-oriented, meaning that a profile is created for each of the user's queries. The strategies are evaluated and compared with our previously proposed personalized query clustering method. Experimental results show that user profiles that capture both the user's positive and negative preferences perform best among all of the profiling strategies studied.
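
A query-oriented profile with both kinds of preference can be sketched as one concept-weight vector per query. The sketch below assumes a deliberately simple rule: concepts from a clicked result gain one unit of weight and concepts from a result that was shown but not clicked lose one unit. This rule, the record layout, and the function name are assumptions for illustration; the seven strategies themselves are not specified here.

```python
from collections import defaultdict

def build_profiles(log):
    """Build one concept-weight profile per query from click records.

    `log` is an iterable of (query, concepts, clicked) tuples, where
    `concepts` lists the concepts extracted from one result snippet
    (a hypothetical record layout).
    """
    profiles = defaultdict(lambda: defaultdict(float))
    for query, concepts, clicked in log:
        delta = 1.0 if clicked else -1.0  # assumed unit update
        for concept in concepts:
            profiles[query][concept] += delta
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {q: dict(p) for q, p in profiles.items()}

# Hypothetical click log for a single query.
log = [
    ("apple", ["iphone", "store"], True),
    ("apple", ["fruit"], False),
    ("apple", ["store"], True),
]
print(build_profiles(log))
# {'apple': {'iphone': 1.0, 'store': 2.0, 'fruit': -1.0}}
```

Keeping the profile keyed by query means that the same user can hold different preferences for different queries, which is what makes the profiles usable as input to query clustering.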

Moreover, we find that negative preferences improve the separation of similar and dissimilar queries, which helps an agglomerative clustering algorithm decide whether the optimal clusters have been obtained. We show by experiments that the termination point and the resulting precision and recall are very close to the optimal results. Most concept-based methods automatically derive users' topical interests by exploring the contents of the users' browsed documents and search histories.

One previously proposed user profiling method is based on users' search history and the Open Directory Project (ODP). The user profile is represented as a set of categories and, for each category, a set of keywords with weights. The categories stored in the user profiles serve as a context to disambiguate user queries. If a profile shows that a user is interested in certain categories, the search can be narrowed down by providing suggested results according to the user's preferred categories.
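
The category-based representation described above can be sketched as a mapping from category to weighted keywords. The category names, weights, and scoring rule (summing the weights of matching query terms) are hypothetical and chosen only to show how such a profile could disambiguate a query.

```python
def rank_categories(profile, query):
    """Rank profile categories by the summed weight of matching query terms.

    `profile` maps a category name to a {keyword: weight} dict.
    Categories with no matching keyword are omitted.
    """
    terms = query.lower().split()
    scores = {
        category: sum(keywords.get(t, 0.0) for t in terms)
        for category, keywords in profile.items()
    }
    return sorted(
        (c for c, s in scores.items() if s > 0),
        key=lambda c: scores[c],
        reverse=True,
    )

# Hypothetical profile with ODP-style category paths.
profile = {
    "Computers/Hardware": {"apple": 0.9, "mac": 0.7},
    "Home/Cooking": {"apple": 0.2, "pie": 0.8},
}
print(rank_categories(profile, "apple"))
# ['Computers/Hardware', 'Home/Cooking']
print(rank_categories(profile, "apple pie"))
# ['Home/Cooking', 'Computers/Hardware']
```

For this user the ambiguous query "apple" leans toward the hardware category, while the extra term in "apple pie" shifts the ranking toward cooking.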

Modules:

  • Transaction table

  • Support count

  • Frequent item set

  • Deriving concept-based transaction

  • Analysis graph result
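
The module names suggest a frequent-itemset style of analysis over a transaction table, although the source does not describe the modules further. As a rough sketch under that assumption, support counts can be computed by enumerating small itemsets in each transaction; this brute-force enumeration is an illustrative stand-in, not Apriori and not necessarily the method used in the project. The function name, the sample transactions, and the `max_size` parameter are hypothetical.

```python
from collections import Counter
from itertools import combinations

def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support count} for itemsets seen in at least
    `min_support` transactions, considering itemsets up to `max_size` items."""
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))  # count each item once per transaction
        for size in range(1, max_size + 1):
            counts.update(combinations(unique, size))
    return {itemset: n for itemset, n in counts.items() if n >= min_support}

# Hypothetical transaction table: one row of concepts per search session.
transactions = [
    ["apple", "iphone"],
    ["apple", "store"],
    ["apple", "iphone", "store"],
]
for itemset, support in sorted(frequent_itemsets(transactions, 2).items()):
    print(itemset, support)
```

With a minimum support of 2, the pair ("iphone", "store") is dropped because it appears in only one transaction, while ("apple", "iphone") and ("apple", "store") are kept.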