Checking Questionable Entry of Personally Identifiable Information Encrypted by
One-Way Hash Transformation
Chen X; Fann YC; McAuliffe M; Vismer D; Yang R
JMIR Med Inform 2017 Feb; 5(1): e2
PMID: 28213343
BACKGROUND: As one of several effective solutions for personal privacy
protection, a global unique identifier (GUID) is linked with hash codes that are
generated from combinations of personally identifiable information (PII) by a
one-way hash algorithm. The GUID server stores only GUIDs and hash codes; no PII
may be stored on it. The quality of PII entry is critical to the
GUID system. OBJECTIVE: The goal of our study was to explore a method of checking
questionable entry of PII in this context without using or sending any portion of
PII while registering a subject. METHODS: Following the principles of the GUID
system, all possible combination patterns of PII fields were analyzed and used to
generate hash codes, which were stored on the GUID server. Based on the matching
rules of the GUID system, an error-checking algorithm was developed using set
theory to check PII entry errors. We selected 200,000 simulated individuals with
randomly planted errors to evaluate the proposed algorithm. These errors were
placed in either the required or the optional PII fields. The performance of the
proposed algorithm was also tested in the system used to register study subjects.
RESULTS: Of the 127,700 error-planted subjects, 114,464 (89.64%) could still be
identified as the previously registered subject, and the remaining 13,236
(10.36%) were classified as new subjects. As expected, 100% of nonidentified
subjects had errors within the required PII fields. The probability that a
subject is identified is related to the number and type of incorrect PII fields.
For all identified subjects, the proposed algorithm can find their errors. The
scope of questionable PII fields is also associated with the number and type of
incorrect PII fields. In the best case, the algorithm pinpoints the exact
incorrect PII fields; in the worst case, it narrows the questionable scope only
to a set of 13 PII fields. In application, the proposed algorithm can flag
questionable PII entries and serve as an effective tool. CONCLUSIONS: The GUID
system has high error tolerance and may correctly identify and associate a
subject even with a few PII field errors. Correct data entry, especially of
required PII fields, is critical to avoiding false
splits. In the context of one-way hash transformation, the questionable input of
PII may be identified by applying set theory operators based on the hash codes.
The count and the type of incorrect PII fields play an important role in
identifying a subject and locating questionable PII fields.
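
The set-theory idea described in METHODS and CONCLUSIONS can be illustrated with a minimal sketch. It is not the authors' algorithm: the field names, the four combination patterns, the normalization step, and the use of SHA-256 are all assumptions made for illustration, since the abstract does not specify them. The sketch assumes that a matching hash code implies every field in that combination was entered correctly, so the questionable fields are those that appear only in non-matching combinations.

```python
import hashlib

# Hypothetical combination patterns; the abstract does not list the real ones.
COMBINATIONS = [
    ("first_name", "last_name", "birth_date"),
    ("first_name", "birth_date", "birth_city"),
    ("last_name", "birth_date", "birth_city"),
    ("first_name", "last_name", "birth_city"),
]


def hash_combination(record, fields):
    """One-way hash (SHA-256, an assumption) of the normalized field values."""
    joined = "|".join(f"{name}={record[name].strip().upper()}" for name in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def hash_codes(record):
    """Map each combination pattern to its hash code for this record."""
    return {fields: hash_combination(record, fields) for fields in COMBINATIONS}


def questionable_fields(entry, stored_hashes):
    """Fields that occur only in combinations whose hash code did not match."""
    codes = hash_codes(entry)
    matched = [set(f) for f, h in codes.items() if h in stored_hashes]
    unmatched = [set(f) for f, h in codes.items() if h not in stored_hashes]
    trusted = set().union(*matched)    # fields covered by a matching hash code
    suspect = set().union(*unmatched)  # fields covered by a non-matching one
    return suspect - trusted


# The server side keeps only hash codes, never the PII itself.
registered = {
    "first_name": "Ana",
    "last_name": "Silva",
    "birth_date": "1980-01-02",
    "birth_city": "Lisbon",
}
stored = set(hash_codes(registered).values())

# A later entry for the same person with a typo in one field.
typo_entry = dict(registered, last_name="Sliva")
print(questionable_fields(typo_entry, stored))  # {'last_name'}
```

With these assumed patterns, a single wrong field is pinpointed exactly. When two fields are wrong and no combination matches, nothing can be trusted and all four fields remain questionable, which mirrors the abstract's point that the questionable scope depends on the number and type of incorrect fields.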