by Jordan Harrod
figures by Dan Utter

In the modern era, maintaining the privacy of your personal information has become more challenging than ever. Between cyberattacks and social media, the average person now exposes more information than ever before, often in ways they are not aware of. One area of data privacy that isn’t discussed often, however, is health data. In the past, laws were put in place to keep your health information from being disclosed to the public, but advances in technology have left openings for that information to leak out or be stolen. The main law that currently protects health information is the Health Insurance Portability and Accountability Act (HIPAA), which was signed into law by President Bill Clinton in 1996. However, in the twenty-three years since then, technology has advanced to the point where your health information is no longer truly private.

What is HIPAA?

HIPAA was initially designed to allow Americans to move more easily between doctors and to receive affordable treatment for pre-existing conditions through their health insurance coverage. Pre-existing conditions encompassed any medical condition that a person had before enrolling in a new medical insurance plan. In the decade that followed, additional regulations were added to HIPAA to protect the average person from having their medical data shared or stolen without their permission (Figure 1). These measures also prevented insurance companies and potential employers from discriminating against patients with certain medical conditions.

Figure 1: History of US health data privacy law. HIPAA has been updated several times since it was initially passed in 1996.

To start, even though HIPAA was passed in 1996, entities that were subject to its regulations had until 2003 to comply with the rules. Even after that deadline passed, many health care entities were still not complying because there were no repercussions. While HIPAA established the rules that health care entities were required to follow, it did not establish clearly defined penalties for failures of compliance. It wasn’t until 2006 that this lack of enforcement was remedied with the Enforcement Rule, which gave the Department of Health and Human Services the authority to investigate entities that were not complying with HIPAA and to impose civil financial penalties on them.

In the meantime, the HIPAA Privacy Rule took effect in 2003. This addition defined Protected Health Information (PHI) as “any information held by a covered entity which concerns health status, the provision of healthcare, or payment for healthcare that can be linked to an individual” and created protocols for obtaining patients’ permission to use or share their PHI. While this addition did help to protect a patient’s medical records by providing clear steps towards security, it did not solve one of the major challenges of medical recordkeeping: What if a patient has to move between healthcare systems?

Until this point, it was difficult to transfer a person’s medical information between doctors due to the lack of a unified record-keeping format across healthcare systems. To remedy this, two more measures were added in 2009: the Health Information Technology for Economic and Clinical Health (HITECH) Act and the Breach Notification Rule. HITECH was primarily designed to encourage healthcare providers to start using Electronic Health Records (EHRs), making it easier to share a patient’s health information between healthcare providers and reducing the dependence on paper records. Additional incentives to adopt EHRs came in the form of the Medicare and Medicaid EHR Reimbursement program in 2011.

The Breach Notification Rule acted as a kind of counterpoint to HITECH by requiring that any breach of EHRs affecting 500 or more people be promptly reported to the federal government. In the years since HIPAA originally became law, cyberattacks and corporate data breaches had become more prevalent, so this new reporting requirement was designed to help affected individuals protect themselves in the event that their information was compromised.

Catching up to present day, the most recent addition to HIPAA was the Final Omnibus Rule of 2013. This addition filled some of the gaps left by the Breach Notification Rule and HITECH by specifying encryption standards for EHRs and clearing up the definitions of the entities protected and regulated under HIPAA. It also accounted for the increased use of mobile devices in healthcare by introducing new policies for healthcare professionals who used their phones or tablets to access and send PHI.

Why doesn’t HIPAA keep our data safe anymore?

Based on this history, it may seem like HIPAA should protect Americans from having their confidential health information shared or stolen. Unfortunately, this is not the case. New methods of storing and sharing data have created gaps in the regulatory framework that those with malicious intent can exploit. Federal and state laws designed to protect PHI, such as HIPAA, are only enforced on “covered entities” – health care providers, health care plans, and research institutions. They are not enforced internationally or on the Internet. International transmission of health information relies on voluntary agreements to adhere to terms of service, and private companies can solicit health data from users without having to conform to HIPAA regulations.

As a result, more and more of our personal information, including health data, is being collected by Internet Service Providers and third-party analytics companies to be sold to marketing agencies. Gaining access to much of the Internet is now contingent on agreeing to “Terms of Service,” which are typically dense, confusing documents that users neither read nor understand. A new culture of social media and data sharing has encouraged Americans to willingly share personal information on Internet forums, which are not regulated under HIPAA. This information may or may not be medical in nature, but it can be used to tie anonymized medical data back to specific individuals.
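To make the idea of tying anonymized data back to individuals concrete, here is a minimal Python sketch of a so-called linkage attack. Everything in it is made up for illustration: the records, the names, the choice of ZIP code, birth year, and sex as linking fields, and the helper `link_records`. It is not drawn from any real dataset or from a specific published attack.

```python
# Hypothetical, made-up records for illustration only.
# The health records carry no names, but they do carry ordinary demographic fields.
anonymized_health = [
    {"zip": "02139", "birth_year": 1985, "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_year": 1990, "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02446", "birth_year": 1985, "sex": "F", "diagnosis": "migraine"},
]

# Information people might share publicly, for example on social media.
public_profiles = [
    {"name": "Alice Example", "zip": "02139", "birth_year": 1985, "sex": "F"},
    {"name": "Bob Example", "zip": "02139", "birth_year": 1990, "sex": "M"},
    {"name": "Carol Example", "zip": "02446", "birth_year": 1985, "sex": "F"},
    {"name": "Dana Example", "zip": "02446", "birth_year": 1985, "sex": "F"},
]

# Assumed linking fields; real attacks may use many more.
QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")


def link_records(health_rows, profiles, keys=QUASI_IDENTIFIERS):
    """Return {health record index: name} for rows that match exactly one profile."""
    names_by_key = {}
    for profile in profiles:
        key = tuple(profile[k] for k in keys)
        names_by_key.setdefault(key, []).append(profile["name"])

    linked = {}
    for index, row in enumerate(health_rows):
        candidates = names_by_key.get(tuple(row[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the record
            linked[index] = candidates[0]
    return linked


reidentified = link_records(anonymized_health, public_profiles)
for index, name in reidentified.items():
    print(name, "->", anonymized_health[index]["diagnosis"])
```

In this toy example, the first two health records each match exactly one public profile and are re-identified, while the third matches two people and stays ambiguous. The more details a person shares publicly, the more likely it is that their combination of details is unique.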

Figure 2: Identifying people via their health data. Scientists have shown that it is possible to match anonymous health data back to patients using machine learning.

Scientists showed in 2018 that they could take a large set of health data, remove Protected Health Information, and use machine learning to re-identify 95% of individual adults and 80% of individual children (Figure 2). They were able to do this by training an algorithm on demographic and physical activity data (from wearable fitness trackers) from the National Health and Nutrition Examination Survey, so that the algorithm would be able to predict data for a given person. They then reversed the algorithm and tried to predict the person based on their data, with considerable success. It is important to note that they did not use a test set – that is, they trained the initial algorithm with all of the data they had and then tested it on that same dataset – so the algorithm was more likely to be able to predict a person from their data than if it had never seen their information before. However, with the amount of identifying information available to the public on a given person, this method could be extended to predict the identity of a person using seemingly anonymous information based on their known past social media data.
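The caveat about the missing test set can be illustrated with a small Python sketch. This is not the algorithm used in the study: it swaps in a simple nearest-neighbor matcher and entirely synthetic "daily steps" and "hours of sleep" values, and all names and numbers (`make_person`, `observe`, the noise levels) are assumptions chosen for illustration. It only shows why scoring a model on the data it was trained on tends to overstate how well it would identify people it has not seen.

```python
import random

random.seed(0)  # fixed seed so the synthetic data is reproducible


def make_person(person_id):
    # Hypothetical "typical" daily step count and hours of sleep for one person.
    return {
        "id": person_id,
        "steps": random.uniform(3000, 12000),
        "sleep": random.uniform(5.0, 9.0),
    }


def observe(person):
    # One noisy day of tracker data: (person id, steps, sleep hours).
    return (
        person["id"],
        random.gauss(person["steps"], 1500),
        random.gauss(person["sleep"], 0.7),
    )


people = [make_person(i) for i in range(50)]
train = [observe(p) for p in people for _ in range(5)]  # days the model has seen
test = [observe(p) for p in people for _ in range(5)]   # new, unseen days


def predict(sample, reference):
    """Guess who produced `sample` by finding the closest day in `reference`."""
    _, steps, sleep = sample

    def distance(ref):
        # Scale each feature by its assumed noise level before comparing.
        return ((ref[1] - steps) / 1500) ** 2 + ((ref[2] - sleep) / 0.7) ** 2

    return min(reference, key=distance)[0]


def accuracy(samples, reference):
    hits = sum(predict(s, reference) == s[0] for s in samples)
    return hits / len(samples)


train_acc = accuracy(train, train)  # scored on the same data it memorized
test_acc = accuracy(test, train)    # scored on days it has never seen

print("accuracy on training data:", train_acc)
print("accuracy on held-out data:", test_acc)
```

Because every training day is its own nearest neighbor, the matcher scores perfectly on the training data, and that number says little about how it would do on new data. The held-out score is the more honest estimate.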

What is next for health data privacy?

Although gaps in HIPAA regulations have left PHI vulnerable to attack and misuse, there are legislative avenues to prevent this from happening. Additional statutes have been periodically added to HIPAA to improve regulations, and further adjustments to HIPAA may address current threats to our health data privacy. New laws from Congress or new regulations from federal agencies might expand HIPAA-regulated entities to include any and all entities that gather personal health information, including companies like Facebook and Google. Encryption and anonymization protocols could be updated to combat the threat of machine learning re-identification. In addition, these entities could be required to disclose in clear language how that data will be used. The European Union’s General Data Protection Regulation (GDPR), which protects the data of all EU citizens, is one legislative model that the US might follow in pursuit of improved health data privacy. GDPR applies to all companies that handle an individual’s data, imposes significant financial penalties on companies that fail to comply, and requires all terms and conditions to be written using clear language. In fact, US companies operating in the EU restructured their data privacy protocols in response to GDPR when it was implemented in 2018.

While there may not be a legislative option that all lawmakers will agree with or that will prevent all health data from being compromised, continued discussion and progress towards updated statutes or regulations have the potential to make significant strides towards combating the challenges of data privacy in the 21st century.


Jordan Harrod is a Ph.D. student in Medical Engineering through the Harvard-MIT Health Sciences and Technology program. You can find her on YouTube at everydAI or on Twitter at @jordanbharrod.

Dan Utter is a fourth year Ph.D. student in Organismic and Evolutionary Biology at Harvard University.

For more information:

  • Information about HIPAA from the U.S. Department of Health and Human Services.
  • An article discussing HIPAA-compliant voice assistants and the consumerization of healthcare.
  • An overview of GDPR, including a timeline of key events.
