MapAffil: A Bibliographic Tool for Mapping Author Affiliation Strings to Cities
and Their Geocodes Worldwide
Torvik VI
Dlib Mag 2015[Nov]; 21(11-12): ? PMID27170830
Bibliographic records often contain author affiliations as free-form text
strings. Ideally one would be able to automatically identify all affiliations
referring to any particular country or city such as Saint Petersburg, Russia.
That introduces several major linguistic challenges. For example, Saint
Petersburg is ambiguous (it refers to multiple cities worldwide and can be part
of a street address) and it has spelling variants (e.g., St. Petersburg,
Sankt-Peterburg, and Leningrad, USSR). We have designed an algorithm that
attempts to solve these types of problems. Key components of the algorithm
include a set of 24,000 extracted city, state, and country names (and their
variants plus geocodes) for candidate look-up, and a set of 1.1 million extracted
word n-grams, each pointing to a unique country (or a US state) for
disambiguation. When applied to a collection of 12.7 million affiliation strings
listed in PubMed, ambiguity remained unresolved for only 0.1%. For the 4.2
million mappings to the USA, 97.7% were complete (included a city), 1.8% included
a state but not a city, and 0.4% did not include a state. A random sample of 300
manually inspected cases yielded six incompletes, none incorrect, and one
unresolved ambiguity. The remaining 293 (97.7%) cases were unambiguously mapped
to the correct cities, better than all of the existing tools tested: GoPubMed got
279 (93.0%) and GeoMaker got 274 (91.3%) while MediaMeter CLIFF and Google Maps
did worse. In summary, we find that incorrect assignments and unresolved
ambiguities are rare (< 1%). The incompleteness rate is about 2%, mostly due to a
lack of information, e.g., the affiliation simply says "University of Illinois",
which can refer to one of five different campuses. A search interface called
MapAffil has been developed at the University of Illinois; it displays the
longitude and latitude of the geographic city center when a city is
identified. This not only helps improve geographic information retrieval but also
enables global bibliometric studies of proximity, mobility, and other geo-linked
data.
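
The two components named in the abstract, candidate look-up against a list of place-name variants and disambiguation through word n-grams that each point to one country, can be illustrated with a minimal Python sketch. This is not the MapAffil implementation: the tiny tables, the function names (`tokens`, `ngrams`, `map_affiliation`), the majority-vote rule, and the three outcome labels are all assumptions made for illustration, and the geocodes are approximate city-center values.

```python
import re
from collections import Counter

# Toy gazetteer (assumption): normalized name variant -> candidate cities
# as (city, country, latitude, longitude). The real list has ~24,000 names.
SPB_RU = ("Saint Petersburg", "Russia", 59.94, 30.31)
SPB_US = ("St. Petersburg", "USA", 27.77, -82.64)
GAZETTEER = {
    "saint petersburg": [SPB_RU, SPB_US],
    "st petersburg": [SPB_RU, SPB_US],
    "sankt peterburg": [SPB_RU],
    "leningrad": [SPB_RU],
}

# Toy n-gram table (assumption): word n-gram -> the one country it points to.
# The real table has ~1.1 million n-grams.
NGRAM_COUNTRY = {
    "russia": "Russia",
    "ussr": "Russia",
    "russian academy": "Russia",
    "florida": "USA",
    "fl": "USA",
    "university of south florida": "USA",
}


def tokens(text):
    """Lowercase the string and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())


def ngrams(words, max_n=4):
    """Yield every word n-gram of length 1..max_n."""
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def map_affiliation(affiliation):
    """Return (status, candidate); status is 'mapped', 'ambiguous', or 'incomplete'."""
    grams = list(ngrams(tokens(affiliation)))

    # Step 1: candidate look-up against the name variants.
    candidates = []
    for gram in grams:
        for cand in GAZETTEER.get(gram, []):
            if cand not in candidates:
                candidates.append(cand)
    if not candidates:
        return ("incomplete", None)
    if len(candidates) == 1:
        return ("mapped", candidates[0])

    # Step 2: disambiguation. Each matching n-gram votes for its country
    # (a simple majority vote, chosen here for illustration only).
    votes = Counter(NGRAM_COUNTRY[g] for g in grams if g in NGRAM_COUNTRY)
    ranked = votes.most_common()
    if ranked and (len(ranked) == 1 or ranked[0][1] > ranked[1][1]):
        matches = [c for c in candidates if c[1] == ranked[0][0]]
        if len(matches) == 1:
            return ("mapped", matches[0])
    return ("ambiguous", None)
```

Calling `map_affiliation("Ioffe Institute, St. Petersburg, Russia")` in this sketch resolves the ambiguous city name through the country n-gram, whereas a string such as "University of Illinois" yields no city candidate and is reported as incomplete, mirroring the kind of case the abstract describes.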