Feb 4, 2015

The Dossier: Anthem's 78 Million Records

Anthem: The Dossier

On February 4, 2015, Anthem Inc. — the second-largest health insurer in the United States, providing coverage to roughly 40 million Americans through its Blue Cross Blue Shield subsidiaries — notified the public that it had suffered a massive data breach.

The company’s CEO, Joseph Swedish, sent a personal email to members:

“I want to personally apologize to each of you for the concern this news may cause. Data security is one of our most critical responsibilities, and we take that responsibility very seriously.”

What had been taken was breathtaking in scale: the personal records of 78.8 million individuals. Names. Social Security numbers. Dates of birth. Home addresses. Email addresses. Employment information. Income data.

But something critical had not been taken. In a breach of a major health insurer that processed the medical records of tens of millions of Americans, no clinical data was stolen. No diagnoses. No prescription histories. No treatment records. No lab results. None of the clinical information most people associate with HIPAA.

This detail, so counterintuitive that many early news reports got it wrong, was the first indication that whoever had breached Anthem was not a criminal gang seeking to sell medical data on dark web markets. The attackers had taken exactly what they came for. And what they came for had nothing to do with healthcare.

The Anthem breach was not cybercrime. It was an intelligence collection operation.

Threat Actor Profile: APT10 / Deep Panda / Black Vine

Designation: Black Vine (Symantec); Deep Panda (CrowdStrike); grouped here with APT10 (FireEye/Mandiant), Stone Panda (CrowdStrike), BRONZE RIVERSIDE (Secureworks), and menuPass, a cluster that most vendors track as a separate MSS-linked actor
Attribution: People’s Republic of China; assessed to operate under direction of China’s Ministry of State Security (MSS), specifically the Tianjin State Security Bureau
Origin: China; primary assessed location Tianjin
Primary Mission: Long-term strategic intelligence collection; workforce intelligence targeting US government and private sector employees; industrial espionage; systematic exfiltration of population-level personal data
Known Tradecraft: Spear-phishing with weaponized documents, custom RAT families (Sakula, Derusbi, PlugX), long dwell times, systematic exfiltration of structured databases, targeting of managed IT service providers to reach downstream clients

Notorious Operations:

Anthem Health Insurance (2015): Exfiltration of 78.8 million records — names, Social Security numbers, employment data, income, dates of birth — from America’s second-largest health insurer. The largest known healthcare data breach at the time of disclosure; notable for what was not stolen — no clinical or medical data.
OPM Breach (2015): Exfiltration of SF-86 security clearance application files for 21.5 million current and former federal employees and contractors — the most consequential intelligence breach of the US national security workforce in history. Attribution is contested between APT10 and a related Chinese MSS unit.
Operation Cloud Hopper (2016–2018): A global campaign that compromised managed IT service providers (MSPs) and used those providers’ access to reach downstream clients in at least 12 countries. The DOJ indicted two Chinese nationals for Cloud Hopper in December 2018; the same indictment described a separate campaign against more than 45 technology companies and US government agencies.
Marriott/Starwood (2018): Exfiltration of up to 500 million guest records by Marriott's initial estimate (later revised downward), including passport numbers, travel histories, and payment data, from the Starwood Hotels database acquired by Marriott. Attribution to APT10 or a closely related Chinese intelligence unit is widely assessed.

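Note: The DOJ indicted Chinese nationals Zhu Hua and Zhang Shilong for APT10 activities, including Cloud Hopper, in December 2018. Neither has been arrested. A separate indictment unsealed in May 2019 charged Chinese national Wang Fujie and an unnamed co-defendant with the Anthem intrusion; it did not name a sponsoring government agency.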

The Value Proposition: Why a Health Insurer?

The question that observers asked immediately after the Anthem breach: why steal health insurance records and ignore the medical data?

A criminal operation would answer this differently than an intelligence operation. For a dark web carding ring, medical records have value — they enable insurance fraud, fraudulent prescription schemes, and identity theft built on healthcare billing. But the Anthem attackers ignored all of that. They took the administrative profile: the enrollment data that every insured person provides when they sign up for coverage through their employer.

The population covered by employer-sponsored insurance is not a random cross-section of America. It is the working population — employed adults, typically enrolled through large organizations. Technology companies. Defense contractors. Federal agencies. Financial institutions. Manufacturing firms. This is precisely the population an intelligence service would want to map.

Consider what the stolen data enabled. With 78.8 million records containing name, Social Security number, date of birth, home address, email address, employer, and income, a foreign intelligence service could:

Cross-reference against cleared employees: Identify individuals who work at defense contractors, government agencies, or intelligence-adjacent organizations — potential recruitment targets, potential surveillance subjects, potential leverage points.
Build targeting lists for spear-phishing: The combination of full name, employer, email address, and home address is sufficient to craft highly personalized phishing emails that appear to come from insurance providers, employers, or government agencies.
Map workforce composition at scale: Understand the structure of the American workforce — who works for whom, at what income level, in what locations — at a granularity no public data source provides.
Create persistent identity dossiers: The data remains accurate for years, unlike financial account numbers that can be changed. A Social Security number and date of birth are fixed identifiers for life.

Viewed in this context, the Anthem breach was population-level intelligence collection — the systematic acquisition of workforce data at a scale that would require years of individual surveillance to replicate through traditional intelligence methods.

The Intrusion: Spear-Phish, Sakula, and DISCO

Anthem’s internal investigation, led in part by Mandiant, reconstructed the likely attack chain based on forensic evidence gathered after the breach was discovered.

The intrusion almost certainly began with a spear-phishing email — the standard initial access mechanism for APT campaigns of this sophistication. An Anthem employee received a carefully crafted email designed to appear legitimate: a lure containing either a malicious document or a link to a site that delivered a drive-by download. The employee interacted with it.

The payload installed Sakula, a custom Remote Access Trojan linked exclusively to the Black Vine / APT10 cluster. Sakula established an encrypted command-and-control channel to attacker-controlled infrastructure, provided persistent remote access to the compromised workstation, and enabled the attackers to begin internal reconnaissance.

Forensic evidence suggests the initial compromise occurred in mid-2014, approximately six months before the breach was discovered. During that period, the attackers:

Mapped Anthem’s internal network architecture
Escalated privileges progressively toward domain administrator access
Identified the target: DISCO, Anthem’s internal data warehouse

DISCO was Anthem’s centralized repository for member enrollment data — the database that aggregated records for all current and former Anthem members across its Blue Cross Blue Shield subsidiaries. It was designed as an analytical platform, optimized for large-scale data queries by actuarial teams. That optimization — its ability to efficiently return large datasets — made it ideal for the attackers’ purposes.

Lateral movement from the initial foothold to DISCO took weeks or months. The attackers obtained credentials for database administrator accounts with the access privileges necessary to query DISCO directly. The exfiltration then proceeded systematically: structured database queries returning compressed data in manageable chunks, routed to external attacker-controlled servers.
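Extraction in "manageable chunks" from a warehouse built for bulk queries is hard to spot one query at a time, but it can still stand out against an account's own history. The sketch below is a minimal illustration of that idea, not a description of any tooling Anthem used. The log shape (daily row counts per account), the account names, and the `factor` and `floor` thresholds are all assumptions made for the example.

```python
from statistics import median


def flag_volume_spikes(history, today, factor=10.0, floor=1000):
    """Return accounts whose rows returned today far exceed their own baseline.

    history: dict mapping account -> list of past daily row counts (hypothetical log)
    today:   dict mapping account -> rows returned so far today
    factor:  how many times the median baseline counts as a spike (assumed value)
    floor:   minimum threshold so new or idle accounts do not alert on tiny reads
    """
    flagged = []
    for account, rows_today in today.items():
        past = history.get(account) or [0]  # accounts with no history get a zero baseline
        threshold = max(floor, factor * median(past))
        if rows_today > threshold:
            flagged.append(account)
    return sorted(flagged)


# Synthetic data: an admin account that normally reads little, an actuarial
# account that legitimately reads a lot, and a brand-new account.
history = {
    "dba_7": [1200, 900, 1500],
    "actuary_2": [50000, 60000, 55000],
}
today = {"dba_7": 400000, "actuary_2": 58000, "new_acct": 500}

spikes = flag_volume_spikes(history, today)
```

Comparing each account with its own median, rather than with one global limit, lets heavy actuarial users keep working while an administrator account that suddenly reads hundreds of thousands of rows stands out.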

Anthem’s security team detected the breach on January 27, 2015 — when a monitoring alert identified a data query running under the credentials of a database administrator who was not at work. The query was pulling data from DISCO. Mandiant was engaged within days; the public disclosure followed on February 4.
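The signal described above, a large query running under the credentials of someone who was not at work, can be expressed as a simple rule over a database audit log. The following is a hedged sketch under stated assumptions: the `QueryEvent` and `Shift` record shapes, the `min_rows` cutoff, and the sample timestamps are hypothetical and do not represent Anthem's actual monitoring.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class QueryEvent:
    """One row of a hypothetical database audit log."""
    account: str
    started_at: datetime
    rows_returned: int


@dataclass(frozen=True)
class Shift:
    """A window during which the account owner is known to be working."""
    account: str
    start: datetime
    end: datetime


def is_on_shift(event: QueryEvent, shifts: list[Shift]) -> bool:
    """True if the query started inside any recorded shift for its account."""
    return any(
        s.account == event.account and s.start <= event.started_at <= s.end
        for s in shifts
    )


def flag_absent_owner_queries(events, shifts, min_rows=10_000):
    """Return large queries that ran while the account owner was not on shift."""
    return [
        e for e in events
        if e.rows_returned >= min_rows and not is_on_shift(e, shifts)
    ]


# Synthetic example: one recorded shift, three queries under the same account.
shifts = [Shift("dba_7", datetime(2015, 1, 26, 9), datetime(2015, 1, 26, 17))]
events = [
    QueryEvent("dba_7", datetime(2015, 1, 26, 10), 50_000),   # large, owner at work
    QueryEvent("dba_7", datetime(2015, 1, 27, 3), 250_000),   # large, owner absent
    QueryEvent("dba_7", datetime(2015, 1, 27, 4), 12),        # small, owner absent
]
flagged = flag_absent_owner_queries(events, shifts)
```

In practice the shift data might come from badge records, VPN sessions, or an HR calendar; the rule only works if that presence signal is kept separate from the credentials an attacker may already control.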

The attackers had maintained access for approximately six months before detection.

The Mosaic: Anthem in Context

The Anthem breach did not exist in isolation. It was one piece of a multi-year, multi-target intelligence collection operation that the US intelligence community would later assess as the most systematic effort to acquire population-level data on American citizens ever conducted by a foreign government.

In 2014 and 2015, the same period as the Anthem breach, Chinese-attributed actors were simultaneously executing:

The OPM Breach (2015): Exfiltration of SF-86 security clearance application files for 21.5 million individuals — the detailed questionnaires that every person applying for a US security clearance must complete, covering ten years of residential history, foreign contacts, financial obligations, family relationships, and personal history. The OPM data gave a comprehensive profile of the entire US national security workforce.

Premera Blue Cross (2015): In March 2015, Premera disclosed a breach affecting 11 million members, including Social Security numbers and some clinical information. Premera provided health insurance to major Pacific Northwest employers including Boeing, Microsoft, and Amazon — a significantly different employment profile from Anthem, providing a complementary population slice.

CareFirst BlueCross BlueShield (2015): In May 2015, CareFirst disclosed a breach of 1.1 million records with similar characteristics.

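Health Net (2010): An earlier incident at another major health insurer, predating the 2014–2015 window; it is sometimes described as having APT characteristics, though attribution has not been established.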

The intelligence mosaic these breaches assembled, together, was staggering. Anthem provided the employment and administrative profile of 78.8 million working Americans. OPM provided detailed security clearance files for 21.5 million government employees and contractors — a subset who represented the most sensitive workforce in the country. The combination enabled cross-referencing: identify individuals who appear in both datasets, and you have a detailed profile of cleared government workers plus their broader administrative records.

Add Equifax (breached in 2017 by a separate PLA unit, yielding detailed financial data on 147 million Americans) and Marriott/Starwood (500 million international travelers including passport numbers), and the picture of what China’s intelligence services were building becomes clear.

They were constructing a comprehensive national intelligence database on the American population, not through mass surveillance but through systematic exfiltration of the databases to which Americans routinely hand over personal information as a condition of working, traveling, and obtaining insurance.

Attribution and Response

Symantec’s post-breach analysis identified the intrusion as the work of Black Vine through the distinctive Sakula malware — a RAT family linked exclusively to this threat actor cluster — and through C2 infrastructure patterns consistent with prior Black Vine campaigns. CrowdStrike attributed the breach to Deep Panda, their designation for the same actor. Mandiant assessed APT10.

The FBI opened an investigation. The Department of Health and Human Services conducted its own review. Criminal charges specific to the Anthem breach came only in May 2019, when the DOJ unsealed an indictment against Chinese national Wang Fujie and an unnamed co-defendant; the indictment did not name a sponsoring government agency. The December 2018 indictment of Zhu Hua and Zhang Shilong for Operation Cloud Hopper covered APT10 activity more broadly, and neither of those defendants has been arrested.

The $115 million class action settlement Anthem reached in 2017 was the largest healthcare data breach settlement in US history at the time. The settlement provided affected individuals with credit monitoring services and identity protection — useful against financial identity theft, but largely irrelevant against an intelligence collection operation interested in workforce mapping rather than credit card fraud.

China denied involvement, as it always does.

The Legacy: Healthcare as an Intelligence Target

The Anthem breach reframed how the national security community assessed the healthcare sector’s strategic importance — and exposed a profound blind spot in the regulatory framework designed to protect it.

HIPAA, the Health Insurance Portability and Accountability Act, was built to protect clinical health information — diagnoses, treatment records, prescription histories, the data that carries personal medical stigma and regulatory protection. HIPAA’s security standards focused healthcare organizations on protecting medical records. They were oriented almost entirely around the wrong threat model.

The Anthem attackers had no interest in medical records. They were after the administrative layer — names, SSNs, employers, addresses — the enrollment data that sits outside HIPAA’s most stringent protections and was stored in a large, queryable data warehouse designed for actuarial analysis. The data most valuable to an intelligence collection operation was precisely the data that healthcare security programs had treated as less sensitive.

For healthcare organizations, Anthem drove a forced reckoning with the difference between regulatory compliance and security. HIPAA compliance had oriented security programs around the data regulators explicitly protected. The adversary had targeted the data regulators had not. The result was that enormous databases of structured, queryable personal information had received less defensive investment than the clinical records stored beside them.

For US intelligence agencies, Anthem confirmed what OPM had already suggested and what analysts had begun to assess: China’s intelligence services had embarked on a deliberate, multi-year strategy to acquire population-level intelligence on the American workforce and government through systematic commercial data exfiltration. The goal was not financial gain or disruption. It was persistent, comprehensive human intelligence collection at a scale that no traditional espionage operation could achieve.

For the 78.8 million individuals whose data was stolen, the consequences were diffuse and potentially permanent. There was no credit card to cancel, no password to reset. The data — names, Social Security numbers, home addresses, employers, dates of birth — would remain accurate for years or decades. Its potential use — in targeted recruitment operations, in spear-phishing campaigns, in background intelligence for future operations against individuals who might someday hold clearances or influence — was essentially unlimited in time horizon.

The records did not expire.

No one received a phone call when their dossier was opened.

Attack Chain: Anthem — APT10 / Black Vine Intelligence Collection

graph TD
    A["🇨🇳 APT10 / Deep Panda / Black Vine\n(China MSS — Tianjin State Security Bureau)"] --> B["Strategic Intelligence Mission\nBuild comprehensive dossier on\nAmerican workforce population\nIdentify intelligence assets\nMap employer–employee relationships\nat scale"]

    B --> C["Target: Anthem Inc.\nSecond-largest US health insurer\n~40 million members\nEmployer-sponsored health plans\nDefense, government, tech workforce"]

    C --> D["Initial Access\nSpear-Phishing Email\nTargeted Anthem employee\nMalicious document or drive-by\ndelivery (~mid-2014, assessed)"]

    D --> E["Sakula RAT Installed\n(APT10-exclusive malware)\nEncrypted C2 channel established\nPersistent remote access granted"]

    E --> F["Reconnaissance Phase\n(Months)\nNetwork mapping\nIdentify privileged accounts\nLocate DISCO data warehouse"]

    F --> G["Lateral Movement\nPrivilege Escalation\nDatabase Administrator\nCredentials Obtained"]

    G --> H["DISCO Targeted\nAnthem's Central Data Warehouse\n78.8 million member records\nOptimized for large-scale queries"]

    H --> I["Exfiltration\nWeeks of structured extraction\nCompressed database dumps\nExternal attacker-controlled servers\nJuly–December 2014 (assessed)"]

    I --> J["Exfiltrated Data"]
    J --> J1["78.8 million records:\n• Full names\n• Social Security Numbers\n• Dates of birth\n• Home addresses\n• Email addresses\n• Employer details\n• Income information"]
    J --> J2["Deliberately NOT taken:\n• Medical diagnoses\n• Treatment records\n• Prescription histories\n• Clinical data\n(Intelligence op, not financial crime)"]

    G --> K["Detection\nJan 27, 2015\nDISCO query running under\ncredentials of absent admin\nMandiant engaged"]

    K --> L["Public Disclosure\nFeb 4, 2015\nCEO personal email to members\n(Dwell time: ~6 months)"]

    J1 --> M["Intelligence Mosaic Operation"]
    M --> M1["OPM Breach (same period)\n21.5M SF-86 clearance files\nUS national security workforce\nCross-reference with Anthem = full dossier\non cleared government employees"]
    M --> M2["Premera Blue Cross (Mar 2015)\n11M records, Pacific NW employers\n(Boeing, Microsoft, Amazon)"]
    M --> M3["Equifax (2017, PLA 54th Research Institute)\n147M financial profiles\nAdded to the mosaic"]
    M --> M4["Marriott/Starwood (2018)\n500M records + passport numbers"]

    L --> N["Attribution"]
    N --> N1["Symantec: Black Vine\n(Sakula RAT — exclusive to\nthis threat cluster)"]
    N --> N2["CrowdStrike: Deep Panda\nMandiant: APT10\nFBI: China MSS"]

    L --> O["Legal Response\n(Limited)"]
    O --> O1["Anthem-specific indictment\nMay 2019: Wang Fujie and\nunnamed co-defendant"]
    O --> O2["DOJ APT10 Indictment\nDec 2018: Zhu Hua,\nZhang Shilong (Cloud Hopper)\nNeither arrested"]

    L --> P["💰 $115M Class Action\nSettlement (2017)\nLargest healthcare breach\nsettlement at time of filing\nCredit monitoring for victims"]

    L --> Q["🏛️ Policy Consequences"]
    Q --> Q1["HIPAA Exposed:\nAdministrative data outside\nstrongest protections —\nregulatory framework\ntargeted wrong threat model"]
    Q --> Q2["Intelligence Community:\nChinese population-level\ndata collection assessed as\nnational security threat\nNo precedent in scope"]
    Q --> Q3["Healthcare Investment:\nDatabase activity monitoring\nPrivileged access management\nNetwork segmentation\nData warehouse isolation"]
