Gene Ontology synonym generation rules lead to increased performance in
biomedical concept recognition
Funk CS; Cohen KB; Hunter LE; Verspoor KM
J Biomed Semantics 2016[Sep]; 7(1): 52
PMID: 27613112
BACKGROUND: Gene Ontology (GO) terms represent the standard for annotation and
representation of molecular functions, biological processes and cellular
compartments, but a large gap exists between the way concepts are represented in
the ontology and how they are expressed in natural language text. The
construction of highly specific GO terms is formulaic, consisting of parts and
pieces from more simple terms. RESULTS: We present two different types of
manually generated rules to help capture the variation of how GO terms can appear
in natural language text. The first set of rules takes into account the
compositional nature of GO and recursively decomposes the terms into their
smallest constituent parts. The second set of rules generates derivational
variations of these smaller terms and compositionally combines all generated
variants to form the original term. By applying both types of rules, new synonyms
are generated for two-thirds of all GO terms and an increase in F-measure
performance for recognition of GO on the CRAFT corpus from 0.498 to 0.636 is
observed. Additionally, we evaluated the combination of both types of rules over
one million full text documents from Elsevier; manual validation and error
analysis show we are able to recognize GO concepts with reasonable accuracy (88%)
based on random sampling of annotations. CONCLUSIONS: In this work we present
a set of simple synonym generation rules that utilize the highly compositional
and formulaic nature of the Gene Ontology concepts. We illustrate how the
generated synonyms aid in improving recognition of GO concepts on two different
biomedical corpora. We discuss other applications of our rules for GO ontology
quality assurance, explore the issue of overgeneration, and provide examples of
how similar methodologies could be applied to other biomedical terminologies.
Additionally, we provide all generated synonyms for use by the text-mining
community.
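
The two rule types described in the RESULTS section (recursive decomposition of a term into smaller parts, then recombination of the variants of those parts) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the single "regulation of" pattern, the variant tables, and the function name synonyms are hypothetical and are not the rules or the synonyms published by the authors.

```python
import re
from itertools import product

# Toy decomposition pattern (hypothetical, not a rule from the paper):
# "[positive|negative] regulation of <X>" splits into a regulation part
# and a nested term X, which may itself be decomposable.
REGULATION = re.compile(r"^(?:(positive|negative) )?regulation of (.+)$")

# Toy variants for atomic terms (hypothetical examples).
ATOMIC_VARIANTS = {
    "apoptotic process": ["apoptotic process", "apoptosis"],
    "cell migration": ["cell migration", "migration of cells"],
}

# Toy phrasings for the regulation part; "{}" is filled with a variant of X.
REGULATION_VARIANTS = {
    None: ["regulation of {}", "{} regulation"],
    "positive": ["positive regulation of {}", "activation of {}"],
    "negative": ["negative regulation of {}", "inhibition of {}"],
}


def synonyms(term):
    """Recursively decompose `term`, then recombine variants of its parts."""
    match = REGULATION.match(term)
    if match is None:
        # Base case: an atomic term. Unknown terms fall back to themselves.
        return list(ATOMIC_VARIANTS.get(term, [term]))

    kind, inner = match.groups()
    templates = REGULATION_VARIANTS[kind]
    inner_synonyms = synonyms(inner)  # recursive step on the nested term

    # Combine every phrasing of the outer part with every inner variant,
    # keeping first-seen order and dropping duplicates.
    combined = []
    for template, inner_synonym in product(templates, inner_synonyms):
        candidate = template.format(inner_synonym)
        if candidate not in combined:
            combined.append(candidate)
    return combined


if __name__ == "__main__":
    for synonym in synonyms("negative regulation of apoptotic process"):
        print(synonym)
```

Because variants multiply at each level of nesting, even this toy version grows quickly on deeply composed terms, which is one way to see why the abstract raises overgeneration as an issue; a practical rule set would need filtering that this sketch does not attempt.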