De-identification of Protected Health Information: How to Anonymize PHI

By | January 1, 2023

Healthcare organizations and their business associates that want to share protected health information (PHI) must do so in accordance with the HIPAA Privacy Rule, which limits the permitted uses and disclosures of PHI. Once protected health information has been de-identified, however, those HIPAA Privacy Rule restrictions no longer apply.

HIPAA Privacy Rule restrictions only cover individually identifiable protected health information. If you de-identify PHI so that the identity of individuals cannot be determined and the risk of re-identification is very small, the data can be freely shared.

The de-identification of protected health information enables HIPAA covered entities to share health data for large-scale medical research studies, policy assessments, comparative effectiveness studies, and other studies and assessments without violating the privacy of patients or requiring authorizations to be obtained from each patient prior to data being disclosed.

HIPAA-Compliant De-identification of Protected Health Information

HIPAA-compliant de-identification of protected health information is possible using two methods: Safe Harbor and Expert Determination. Neither method of de-identification of protected health information will remove all risk of re-identification of patients, but both methods will reduce risk to a very low and acceptable level. Use either of the two methods below and PHI will no longer be considered ‘protected health information’ and will therefore not be subject to HIPAA Privacy Rule restrictions.

1. Safe Harbor – The Removal of Specific Identifiers

The first HIPAA-compliant way to de-identify protected health information is to remove specific identifiers of the individual, and of the individual’s relatives, employers, and household members, from the data set. The identifiable data that must be removed are:

  • Names
  • Geographic subdivisions smaller than a state
  • All elements of dates (except year) directly related to an individual, including birth date, admission date, discharge date, and date of death; plus all ages over 89 and all elements of dates (including year) indicative of such an age, although these may be aggregated into a single category of age 90 or older
  • Telephone, cellphone, and fax numbers
  • Email addresses
  • IP addresses
  • Social Security numbers
  • Medical record numbers
  • Health plan beneficiary numbers
  • Device identifiers and serial numbers
  • Certificate/license numbers
  • Account numbers
  • Vehicle identifiers and serial numbers including license plates
  • Website URLs
  • Full face photos and comparable images
  • Biometric identifiers (including finger and voice prints)
  • Any unique identifying numbers, characteristics or codes
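
As a rough illustration of the list above, the following Python sketch drops direct-identifier fields from a single record, reduces dates to the year, and groups ages over 89 into a single "90+" category. The field names and record layout are hypothetical and are not taken from HIPAA. For simplicity the sketch drops zip codes outright rather than keeping the permitted first three digits, and it does not inspect free-text fields, which would also need review in a real data set.

```python
from datetime import date

# Hypothetical field names for illustration; real schemas will differ.
DIRECT_IDENTIFIER_FIELDS = {
    "name", "street_address", "city", "county", "zip_code", "phone", "fax",
    "email", "ip_address", "ssn", "medical_record_number",
    "health_plan_number", "account_number", "license_number", "vehicle_id",
    "device_serial", "url", "photo", "biometric_id",
}
DATE_FIELDS = {"birth_date", "admission_date", "discharge_date", "death_date"}


def strip_identifiers(record: dict) -> dict:
    """Return a copy of the record with direct identifiers removed,
    dates reduced to the year, and ages over 89 grouped as "90+"."""
    cleaned = {}
    age = record.get("age")
    over_89 = isinstance(age, int) and age > 89
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # drop the field entirely
        if field in DATE_FIELDS:
            if field == "birth_date" and over_89:
                continue  # a birth year would reveal an age over 89
            if isinstance(value, date):
                cleaned[field.replace("_date", "_year")] = value.year
            continue  # values that are not dates are dropped, not passed through
        if field == "age" and over_89:
            cleaned["age"] = "90+"
            continue
        cleaned[field] = value
    return cleaned


record = {
    "name": "Jane Doe",
    "age": 93,
    "birth_date": date(1930, 4, 2),
    "admission_date": date(2023, 1, 15),
    "diagnosis": "hypertension",
}
print(strip_identifiers(record))
# {'age': '90+', 'admission_year': 2023, 'diagnosis': 'hypertension'}
```

Using a list of fields to drop keeps the sketch short, but it only works when every identifier lives in a known column; a list of fields allowed to remain is the more cautious design when the schema is not fully understood.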

In the case of zip codes, covered entities are permitted to use the first three digits provided the geographic unit formed by combining all zip codes with the same first three digits contains more than 20,000 individuals. When that geographic unit contains 20,000 or fewer individuals, the first three digits should be changed to 000. According to the Bureau of the Census, that means 17 three-digit zip code prefixes must be changed to 000:

036, 692, 878, 059, 790, 879, 063, 821, 884, 102, 823, 890, 203, 830, 893, 556, 831

Covered entities should note that the above list of zip codes may change after future censuses. The list is based on 5-digit zip codes from the 2000 census.
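
The zip code rule can be sketched in a few lines of Python. The set below is copied from the list in this article, so it reflects the 2000 census data and should be treated as configuration that may need updating. The function name and the choice to return 000 for malformed input are assumptions made for this example.

```python
# Three-digit zip prefixes listed in this article (2000 census data).
# The list may change after later censuses, so treat it as configuration.
RESTRICTED_ZIP3 = frozenset({
    "036", "059", "063", "102", "203", "556", "692", "790", "821",
    "823", "830", "831", "878", "879", "884", "890", "893",
})


def generalize_zip(zip_code: str) -> str:
    """Return the three-digit form of a zip code, or "000" when the
    prefix is on the restricted list."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    if len(digits) < 5:
        # Assumption: malformed input falls back to the most general value.
        return "000"
    prefix = digits[:3]
    return "000" if prefix in RESTRICTED_ZIP3 else prefix


print(generalize_zip("10001"))       # 100
print(generalize_zip("82101-1234"))  # 000
```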

For further information on de-identification of protected health information using the safe harbor method see 45 CFR § 164.514(b)(2).

2. Expert Determination

The expert determination method carries a small risk that an individual could be identified, although the risk is so low that it meets HIPAA Privacy Rule requirements.

This method of de-identification of protected health information requires a HIPAA covered entity or business associate to obtain an opinion from a qualified statistical expert that the risk of re-identifying an individual from the data set is very small. In such cases, the methods used to make that determination and justification of the expert’s opinion must be documented and retained by the covered entity or business associate and made available to regulators in the event of an audit or investigation.

The expert must be a person with appropriate knowledge and experience of using generally accepted statistical and scientific principles and methods for removing or altering information to ensure that it is no longer individually identifiable.

When those methods and principles have been applied, the expert must determine that the risk of re-identification of an individual is very small. The risk must be very small when the information is used alone, and must remain very small should the data be combined with other reasonably available information by an anticipated recipient to identify an individual who is a subject of the information.

HIPAA does not define the level of risk of re-identification other than to say it should be ‘very small’. The expert should define ‘very small’ in relation to the context of the data set, the specific environment, and the ability of an anticipated recipient to re-identify individuals.
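
HIPAA does not prescribe a metric for this assessment. One measure an expert might look at, offered here purely as an illustrative assumption, is how many records share the same combination of indirectly identifying values, which is the idea behind k-anonymity. The Python sketch below reports the size of the smallest such group; the column names are hypothetical, and a low or high result on its own does not establish whether the ‘very small’ standard is met.

```python
from collections import Counter


def smallest_group_size(records, quasi_identifiers):
    """Size of the smallest group of records sharing the same values
    for the given quasi-identifier fields (the k in k-anonymity)."""
    groups = Counter(
        tuple(rec.get(field) for field in quasi_identifiers)
        for rec in records
    )
    return min(groups.values()) if groups else 0


records = [
    {"zip3": "100", "birth_year": 1980, "sex": "F"},
    {"zip3": "100", "birth_year": 1980, "sex": "F"},
    {"zip3": "100", "birth_year": 1975, "sex": "M"},
]
k = smallest_group_size(records, ["zip3", "birth_year", "sex"])
print(k)  # 1: the third record is unique on these fields
```

A result of 1 means at least one person is unique on the chosen fields, which would usually prompt further generalization or suppression before release.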

Experts may come from a number of different fields and do not require any specific qualifications. What is important is that experts have experience in de-identifying data. It is that experience that regulators will look at in the event of an audit, not specific qualifications or certifications.

For further information on de-identification of protected health information by expert determination see 45 CFR § 164.514(b)(1).

The U.S. Department of Health and Human Services’ Office for Civil Rights has issued guidance on the de-identification of protected health information.

De-identification of Protected Health Information FAQs

Why is the list of Safe Harbor identifiers the same as many definitions of PHI?

This is because the Privacy Rule defines Protected Health Information as individually identifiable health information, with the only further guidance about what individually identifiable health information consists of being “a subset of health information, including demographic information collected from an individual” when it is created or received by a Covered Entity and when it relates to a past, present, or future condition, treatment, or payment for treatment.

As the Safe Harbor method of de-identification lists what identifiers need to be removed from a designated data set before the data set is no longer subject to Privacy Rule protections, many compliance experts use this list as an example to answer the question “what is PHI?” The final item on the list (“unique identifying numbers, characteristics, and codes”) is open to interpretation, but generally includes (for example) occupations, familial relationships, and social media usernames.

However, this does not mean that identifiable information such as occupations, familial relationships, and social media usernames by themselves are PHI. It is only when these types of identifiable information are maintained in a data set that includes health data that they become individually identifiable health information and subject to the provisions of the Privacy Rule.

Do doctors’ names have to be removed from a data set for PHI to be de-identified?

This depends on the relationship between the doctor and the patient. If – for example – a doctor attends only one patient, and the patient could be identified by disclosing the name of the doctor, then the doctor’s name must be removed. If there is very little chance of a patient being identified by a doctor’s name, then the name can remain in the de-identified data set subject to any state laws or confidentiality concerns.

Generally, with regard to the removal of names from designated data sets, the name of the patient (including nicknames, pet names, and any other names they may be known by) has to be removed, along with the names of relatives, employers, and household members. There is no requirement in HIPAA to remove the names of healthcare providers or any workforce members of the covered entity or business associate.

Must a Business Associate Agreement or Data Use Agreement be in place before disclosing de-identified health data to a business partner?

The Privacy Rule does not require a Business Associate Agreement or a Data Use Agreement (as required for disclosing a limited data set) when disclosing de-identified health data. However, covered entities can, if they wish, enter into a Data Use Agreement with the recipient of the data to specify how the recipient can use the data and prohibit its re-identification.

What is considered “appropriate knowledge and experience” for expert determination?

There is no specific qualification or certification required to be an “expert”; however, in the event of a HIPAA compliance audit, the Department of Health & Human Services’ Office for Civil Rights would review the expert’s professional experience and academic training, and the processes used in the de-identification of the data set, to assess the expert’s capabilities.

Is there an expiration date for de-identified health data?

Although the Privacy Rule does not specify an expiration date for de-identified data, the Department of Health & Human Services recognizes that “technology, social conditions, and the availability of information changes over time” and has suggested that covered entities periodically review the chosen de-identification method to ensure the risk of re-identification remains very small.

The post De-identification of Protected Health Information: How to Anonymize PHI appeared first on HIPAA Journal.