Learning regular expressions for clinical text classification
Bui DD; Zeng-Treitler Q
J Am Med Inform Assoc 2014 Sep; 21(5): 850-7
PMID: 24578357
OBJECTIVES: Natural language processing (NLP) applications typically use regular
expressions that have been developed manually by human experts. Our goal is to
automate both the creation and utilization of regular expressions in text
classification. METHODS: We designed a novel regular expression discovery (RED)
algorithm and implemented two text classifiers based on RED. The RED+ALIGN
classifier combines RED with an alignment algorithm, and RED+SVM combines RED
with a support vector machine (SVM) classifier. Two clinical datasets were used
for testing and evaluation: the SMOKE dataset, containing 1091 text snippets
describing smoking status; and the PAIN dataset, containing 702 snippets
describing pain status. We performed 10-fold cross-validation to calculate
accuracy, precision, recall, and F-measure metrics. In the evaluation, an SVM
classifier was trained as the control. RESULTS: The two RED classifiers achieved
80.9-83.0% in overall accuracy on the two datasets, which is 1.3-3% higher than
SVM's accuracy (p<0.001). Similarly, small but consistent improvements have been
observed in precision, recall, and F-measure when RED classifiers are compared
with SVM alone. More significantly, RED+ALIGN correctly classified many instances
that were misclassified by the SVM classifier (8.1-10.3% of the total instances
and 43.8-53.0% of SVM's misclassifications). CONCLUSIONS: Machine-generated
regular expressions can be effectively used in clinical text classification. The
regular expression-based classifier can be combined with other classifiers, like
SVM, to improve classification performance.
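
The evaluation setup described in the abstract (a classifier built from regular expressions, scored by k-fold cross-validation on accuracy, precision, recall, and F-measure) can be illustrated with a minimal Python sketch. This is not the RED algorithm, whose details the abstract does not give: here the candidate patterns are hand-written and a pattern is kept only if its matches in the training folds are dominated by one label. The names `select_patterns`, `classify`, `cross_validate`, `CANDIDATES`, and `TOY_DATA` are hypothetical, and the snippets are invented for illustration rather than taken from the SMOKE or PAIN datasets.

```python
import re
from collections import Counter

# Hypothetical toy snippets; NOT drawn from the SMOKE or PAIN datasets.
TOY_DATA = [
    ("Patient denies smoking.", "non-smoker"),
    ("Current smoker, 1 pack per day.", "smoker"),
    ("Never smoked.", "non-smoker"),
    ("Smokes 10 cigarettes daily.", "smoker"),
    ("Denies tobacco use.", "non-smoker"),
    ("Continues to smoke half a pack daily.", "smoker"),
    ("She has never used tobacco.", "non-smoker"),
    ("Current tobacco use: 2 packs per day.", "smoker"),
    ("Non-smoker.", "non-smoker"),
    ("He smokes cigars on weekends.", "smoker"),
]

# Hand-written candidate patterns (an assumption; RED discovers its own).
CANDIDATES = [r"\bdenies\b", r"\bnever\b", r"\bnon-smoker\b",
              r"\bcurrent\b", r"\bsmokes\b", r"\bpacks?\b", r"\bdaily\b"]


def select_patterns(train, candidates, min_precision=0.9):
    """Keep (regex, label) pairs whose training matches mostly share one label."""
    rules = []
    for pattern in candidates:
        regex = re.compile(pattern, re.IGNORECASE)
        hits = Counter(label for text, label in train if regex.search(text))
        if not hits:
            continue  # pattern never fires on the training data
        label, count = hits.most_common(1)[0]
        if count / sum(hits.values()) >= min_precision:
            rules.append((regex, label))
    return rules


def classify(text, rules, default):
    """Majority vote over matching rules; fall back to `default` if none match."""
    votes = Counter(label for regex, label in rules if regex.search(text))
    return votes.most_common(1)[0][0] if votes else default


def precision_recall_f1(gold, pred, positive):
    """Precision, recall, and F1 for one class treated as positive."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def cross_validate(data, candidates, k=10, positive="smoker"):
    """k-fold cross-validation; assumes 2 <= k <= len(data)."""
    folds = [data[i::k] for i in range(k)]  # simple interleaved split
    gold, pred = [], []
    for i, test in enumerate(folds):
        train = [ex for j, fold in enumerate(folds) if j != i for ex in fold]
        rules = select_patterns(train, candidates)
        # Default prediction: the majority label of the training folds.
        default = Counter(label for _, label in train).most_common(1)[0][0]
        for text, label in test:
            gold.append(label)
            pred.append(classify(text, rules, default))
    accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    return (accuracy, *precision_recall_f1(gold, pred, positive))
```

Calling `cross_validate(TOY_DATA, CANDIDATES, k=5)` returns a tuple of accuracy, precision, recall, and F1 pooled over all held-out folds. Pooling predictions before computing the metrics is one common convention; averaging per-fold metrics is another, and the abstract does not say which the authors used.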
MeSH terms:
*Algorithms
*Natural Language Processing
Artificial Intelligence
Electronic Data Processing
Humans
Medical Records Systems, Computerized/*classification