Details
Original language | English |
---|---|
Title of host publication | Proceeding of the 17th International Conference on World Wide Web 2008, WWW'08 |
Publisher | Association for Computing Machinery (ACM) |
Pages | 1067-1068 |
Number of pages | 2 |
ISBN (Print) | 9781605580852 |
Publication status | Published - 21 Apr 2008 |
Event | 17th International Conference on World Wide Web 2008, WWW'08 - Beijing, China Duration: 21 Apr 2008 → 25 Apr 2008 |
Publication series
Name | Proceeding of the 17th International Conference on World Wide Web 2008, WWW'08 |
---|
Abstract
Although most existing research detects events by analyzing the content or structural information of Web documents, a recent direction is to study usage data. In this paper, we focus on detecting events from Web click-through data generated by Web search engines. We propose a novel approach that effectively detects events from click-through data based on robust subspace analysis. We first transform click-through data to the 2D polar space. Next, an algorithm based on Generalized Principal Component Analysis (GPCA) is used to estimate subspaces of the transformed data such that each subspace contains query sessions of similar topics. Then, we prune uninteresting subspaces, which do not contain query sessions corresponding to real events, by considering both the semantic certainty and the temporal certainty of the query sessions in each subspace. Finally, events are detected from the interesting subspaces using a nonparametric clustering technique. Experimental results on real-life click-through data show that, compared with existing approaches, the proposed approach is more accurate in detecting real events and more effective in determining the number of events.
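The geometric idea behind the polar transform can be illustrated with a small sketch. It is not the authors' implementation and it is not GPCA: it assumes, purely for illustration, that the angle of a query session encodes its topic and the radius encodes its time, which the abstract does not state. Under that assumption, sessions with the same angle fall on one line through the origin, that is, a one-dimensional subspace of the 2D plane. The function names and the sample values are hypothetical, and the sketch fits one line per already-known group instead of estimating several subspaces at once.

```python
import math


def to_cartesian(theta, r):
    """Map a polar point (angle theta in radians, radius r) to (x, y)."""
    return (r * math.cos(theta), r * math.sin(theta))


def principal_direction(points):
    """Return the angle in [0, pi) of the best-fit line through the origin.

    Uses the closed-form major axis of the 2x2 scatter matrix. The points
    are not mean-centered, because the subspaces of interest pass through
    the origin.
    """
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return angle % math.pi


def residual(point, angle):
    """Perpendicular distance from a point to the line at the given angle."""
    x, y = point
    return abs(-math.sin(angle) * x + math.cos(angle) * y)


# Hypothetical query sessions: (topic angle in radians, time as radius).
topic_a = [to_cartesian(0.5, r) for r in (1.0, 2.0, 3.5)]
topic_b = [to_cartesian(2.0, r) for r in (0.5, 1.5, 4.0)]

dir_a = principal_direction(topic_a)
dir_b = principal_direction(topic_b)

# Assign a new, slightly off-topic session to the closest estimated line.
new_session = to_cartesian(0.52, 2.5)
nearest = min(
    (("a", dir_a), ("b", dir_b)),
    key=lambda item: residual(new_session, item[1]),
)[0]
```

In this toy setting the recovered directions match the angles used to generate each group, and the new session is assigned to the line it lies closest to. GPCA addresses the harder case in which the group memberships are unknown and all subspaces have to be estimated jointly.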
ASJC Scopus subject areas
- General Computer Science
- Computer Networks and Communications
Cite this
Proceeding of the 17th International Conference on World Wide Web 2008, WWW'08. Association for Computing Machinery (ACM), 2008. p. 1067-1068 (Proceeding of the 17th International Conference on World Wide Web 2008, WWW'08).
Research output: Chapter in book/report/conference proceeding › Conference contribution › Research › peer review
TY - GEN
T1 - Using subspace analysis for event detection from web click-through data
AU - Chen, Ling
AU - Hu, Yiqun
AU - Nejdl, Wolfgang
PY - 2008/4/21
Y1 - 2008/4/21
N2 - Although most existing research detects events by analyzing the content or structural information of Web documents, a recent direction is to study usage data. In this paper, we focus on detecting events from Web click-through data generated by Web search engines. We propose a novel approach that effectively detects events from click-through data based on robust subspace analysis. We first transform click-through data to the 2D polar space. Next, an algorithm based on Generalized Principal Component Analysis (GPCA) is used to estimate subspaces of the transformed data such that each subspace contains query sessions of similar topics. Then, we prune uninteresting subspaces, which do not contain query sessions corresponding to real events, by considering both the semantic certainty and the temporal certainty of the query sessions in each subspace. Finally, events are detected from the interesting subspaces using a nonparametric clustering technique. Experimental results on real-life click-through data show that, compared with existing approaches, the proposed approach is more accurate in detecting real events and more effective in determining the number of events.
KW - Click-through data
KW - Event detection
KW - GPCA
KW - Subspace estimation
UR - http://www.scopus.com/inward/record.url?scp=57349100301&partnerID=8YFLogxK
U2 - 10.1145/1367497.1367659
DO - 10.1145/1367497.1367659
M3 - Conference contribution
AN - SCOPUS:57349100301
SN - 9781605580852
T3 - Proceeding of the 17th International Conference on World Wide Web 2008, WWW'08
SP - 1067
EP - 1068
BT - Proceeding of the 17th International Conference on World Wide Web 2008, WWW'08
PB - Association for Computing Machinery (ACM)
T2 - 17th International Conference on World Wide Web 2008, WWW'08
Y2 - 21 April 2008 through 25 April 2008
ER -