Improving Website Hyperlink Structure Using Server Logs
Paranjape A; West R; Zia L; Leskovec J
Proc Int Conf Web Search Data Min. 2016 Feb; 2016: 615-624
PMID: 28345077
Good websites should be easy to navigate via hyperlinks, yet maintaining a
high-quality link structure is difficult. Identifying pairs of pages that should
be linked may be hard for human editors, especially if the site is large and
changes frequently. Further, given a set of useful link candidates, the task of
incorporating them into the site can be expensive, since it typically involves
humans editing pages. In the light of these challenges, it is desirable to
develop data-driven methods for automating the link placement task. Here we
develop an approach for automatically finding useful hyperlinks to add to a
website. We show that passively collected server logs, beyond telling us which
existing links are useful, also contain implicit signals indicating which
nonexistent links would be useful if they were to be introduced. We leverage
these signals to model the future usefulness of yet nonexistent links. Based on
our model, we define the problem of link placement under budget constraints and
propose an efficient algorithm for solving it. We demonstrate the effectiveness
of our approach by evaluating it on Wikipedia, a large website for which we have
access to both server logs (used for finding useful new links) and the complete
revision history (containing a ground truth of new links). As our method is based
exclusively on standard server logs, it may also be applied to any other website,
as we show with the example of the biomedical research site Simtk.
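
The abstract's two steps, mining logs for signals about missing links and then choosing links under a budget, can be illustrated with a minimal sketch. This is not the paper's model or algorithm. It assumes that sessions have already been extracted from the server logs as ordered lists of page names, uses a raw count of indirect visits as a stand-in for predicted usefulness, and applies a plain greedy selection with a per-page cap. The function names, the per-page cap, and the toy data are all hypothetical.

```python
from collections import Counter


def candidate_link_counts(sessions, existing_links):
    """Count sessions in which a visitor saw `source` and later reached
    `target` although no direct link (source, target) exists.

    Assumption: each session is an ordered list of page names.
    """
    counts = Counter()
    for path in sessions:
        pairs_in_session = set()
        for i, source in enumerate(path):
            for target in path[i + 1:]:
                if source != target and (source, target) not in existing_links:
                    pairs_in_session.add((source, target))
        # Each pair is counted at most once per session.
        counts.update(pairs_in_session)
    return counts


def place_links(counts, budget, max_per_source):
    """Greedily pick up to `budget` candidate links by descending count,
    adding at most `max_per_source` new links to any one page.

    Assumption: a simple stand-in for budget-constrained link placement.
    """
    chosen = []
    added_per_source = Counter()
    # Sort by count (descending), then by pair name for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    for (source, target), _ in ranked:
        if len(chosen) >= budget:
            break
        if added_per_source[source] >= max_per_source:
            continue
        chosen.append((source, target))
        added_per_source[source] += 1
    return chosen


# Toy data (hypothetical): four sessions over pages A to D.
sessions = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "D", "C"],
    ["B", "A", "D"],
]
existing_links = {("A", "B"), ("B", "C"), ("A", "D"), ("D", "C"), ("B", "A")}

counts = candidate_link_counts(sessions, existing_links)
print(place_links(counts, budget=2, max_per_source=1))
```

In the toy data, three sessions travel from A to C through an intermediate page, so the missing link from A to C ranks first. A realistic system would replace the raw count with an estimate of how often the new link would actually be clicked.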