Finding Significant Web Pages with Lower Ranks by Pseudo-Clique Search

Files in This Item:
DSP3735.pdf (236.45 kB, PDF)
Please use this identifier to cite or link to this item: http://hdl.handle.net/2115/5773

Title: Finding Significant Web Pages with Lower Ranks by Pseudo-Clique Search
Authors: Okubo, Yoshiaki; Haraguchi, Makoto; Shi, Bin
Authors(alt): 大久保, 好章
Issue Date: 8-Oct-2005
Publisher: Springer
Journal Title: Discovery Science
Volume: 3735
Start Page: 346
End Page: 353
Publisher DOI: 10.1007/11563983_30
Abstract: In this paper, we discuss a method of finding useful clusters of web pages that are significant in the sense that their contents are similar or closely related to those of higher-ranked pages. Since we usually pay little attention to pages with lower ranks, such pages are unconditionally discarded even if their contents are similar to those of some pages with high ranks. We try to extract such hidden pages together with significant higher-ranked pages as a cluster. In order to obtain such clusters, we first extract semantic correlations among terms by applying Singular Value Decomposition (SVD) to the term-document matrix generated from a corpus on a specific topic. Based on these correlations, we can evaluate potential similarities among the web pages from which we try to obtain clusters. The set of web pages is represented as a weighted graph G based on the similarities and the pages' ranks. Our clusters can be found as pseudo-cliques in G. We present an algorithm for finding Top-N weighted pseudo-cliques. Our experimental result shows that valuable clusters can actually be extracted by our method.
Rights: The original publication is available at www.springerlink.com
Type: article (author version)
URI: http://hdl.handle.net/2115/5773
Appears in Collections: Graduate School of Information Science and Technology / Faculty of Information Science and Technology > Peer-reviewed Journal Articles, etc

Submitter: Okubo, Yoshiaki
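
The first steps described in the abstract (applying SVD to a term-document matrix, then building a weighted graph from the resulting page similarities) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy matrix, the rank k, the similarity threshold, and the function names lsi_document_similarity and similarity_graph are all assumptions made for illustration. The paper's edge weights also take page ranks into account, and its Top-N pseudo-clique search is not reproduced here.

```python
import numpy as np


def lsi_document_similarity(term_doc, k):
    """Cosine similarity between documents in a rank-k latent space.

    term_doc has shape (n_terms, n_docs). Hypothetical helper, not from the paper.
    """
    # Thin SVD: term_doc = U @ diag(s) @ Vt, singular values in descending order.
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    # Each document becomes a row of V_k scaled by the top-k singular values.
    doc_vecs = vt[:k, :].T * s[:k]
    norms = np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    norms[norms == 0.0] = 1.0  # guard against all-zero documents
    unit = doc_vecs / norms
    return unit @ unit.T


def similarity_graph(sim, threshold):
    """Keep an undirected edge (i, j) when the similarity reaches the threshold.

    The threshold is an assumed parameter; the edge weight here is the
    similarity alone, without the rank information the abstract mentions.
    """
    n = sim.shape[0]
    return {
        (i, j): float(sim[i, j])
        for i in range(n)
        for j in range(i + 1, n)
        if sim[i, j] >= threshold
    }


# Toy term-document matrix (rows: terms, columns: pages), assumed for illustration.
# Pages 0 and 1 share one vocabulary, pages 2 and 3 share another.
term_doc = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 2.0],
    [0.0, 0.0, 2.0, 1.0],
])

sim = lsi_document_similarity(term_doc, k=2)
edges = similarity_graph(sim, threshold=0.9)
print(edges)
```

In this toy setting, pages whose raw term vectors only partly overlap end up close together in the latent space, which is the kind of potential similarity the abstract refers to. Dense subgraphs of the resulting graph would then be the candidates for clusters.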
