Att 2021

Aug 20, 2021

Records: 156,750,195
Files: 5
Added: Jun 2, 2026

In March 2024, approximately 70 million records allegedly breached from AT&T were posted to BreachForums by ShinyHunters. The data originally dates to August 2021 and was previously offered for sale before being freely released. AT&T initially denied a breach, then later acknowledged that the data contained fields specific to its systems. The dataset contains AT&T customer records including full names, physical addresses, email addresses, phone numbers, dates of birth, US Social Security Numbers (encrypted), government-issued IDs, and account passcodes. The data is pipe-delimited and includes both current and billing address information for US consumers.

Data found in this dataset

Email, Address, City, State, Gender, skip, fullName, phone, ssn, zip, dob, address2

Source files

Each file below lists its column headers and the LLM's field-mapping reasoning, recorded during ingestion.

Breached_Info.txt
0 rows

File structure

Notes: No actual data rows are present in the provided 50 lines — the entire sample is the BreachForums distribution preamble/advertisement block. No PII columns can be mapped from this sample alone. Based on the breach context (AT&T 2021, pipe-delimited), the known schema reportedly includes: firstName, lastName, address1, city, state, zip, email, phone, dob, ssn (encrypted), government ID, and account passcode fields — but these cannot be responsibly assigned column indices without seeing actual data rows. Provide data rows for accurate column mapping.

DOB1.csv
2 rows

File structure

Notes: File contains only encrypted/decrypted value pairs with no delimited column structure. This is NOT a structured data export. The data shown is a reference table or lookup listing encryption mappings for dates (1900-01-01 through 1900-07-14), not customer PII records. No customer names, emails, phones, addresses, SSNs, or other PII fields are present in this sample. This does not match the AT&T breach context description (which mentions full names, addresses, emails, phones, DOB, SSNs as structured columns). This appears to be an encryption key reference document rather than the actual customer data file. No importable PII columns can be identified.

Info.txt
28 columns, 12 rows

File structure

Source column | Mapped field | Confidence | LLM assessment
0skiphighNumeric account/customer ID, auto-increment style AT&T internal identifier
1fullNamehighValues are uppercase full names (e.g. 'JESSICA KEUREN', 'MARY KEYS') — current/service name
2phonehigh10-digit numeric values matching US phone number format (e.g. '3174428521')
3phonehighSecond 10-digit phone number, likely alternate/secondary phone (e.g. '3173531767'); empty in some rows
4skipmediumSmall integer values (1, 2, 3) — likely account type or service tier code, non-PII
5skiphighConstant value 'CONSUMER' across all rows — account segment/category label, non-PII
6skiphighConstant value 'CONSUMER' across all rows — duplicate segment label, non-PII
7passwordhighEncrypted/encoded strings beginning with '*1' pattern (e.g. '*1UexozmUvT7E=') — AT&T account passcodes, base64-encoded encrypted values; AT&T confirmed passcode reset after breach
8ssnhighLonger encrypted/encoded strings beginning with '*0' pattern (e.g. '*0Um4ZYEfz7NgDgk8rwbd7fQ==') — consistent with encrypted SSN per breach disclosure; AT&T confirmed SSNs were present in encrypted form
9address1highStreet address values (e.g. '30 MARIN AV', '5305 E 10TH ST') — current/service address line 1
10cityhighCity name values, sometimes abbreviated (e.g. 'NATCHEZ', 'FMT' for Fremont, 'ANH' for Anaheim) — current address city
11statehighTwo-letter US state abbreviations (e.g. 'MS', 'IN', 'GA') — current address state
12ziphigh5-digit US ZIP codes (e.g. '39120', '46219') — current address ZIP
13dobhighEncrypted/encoded strings beginning with '*0' pattern, shorter than SSN column — consistent with encrypted date of birth per breach disclosure
14skipmediumBinary flag values (0 or 1) — likely account status, opt-in flag, or paperless billing indicator, non-PII
15skipmediumSingle character values ('T' or 'C') — likely account type code (e.g. T=tablet, C=consumer), non-PII
16emailhighEmail address values where present (e.g. 'RONDYM@PBP1.COM', 'JERILEE2010@GMAIL.COM'); empty in some rows
17fullNamehighUppercase full names (e.g. 'JESSICA KEUREN', 'RONALD DYMOND') — billing/account name, mirrors column 1 in most rows
18address1highBilling address line 1 or address line 2 where unit/apt present (e.g. '30 MARTIN RD', 'APT C', 'APT 212') — billing address first line
19address2mediumSome rows contain a second address line (e.g. '1148 N CITRON ST', '39800 FREMONT BLVD'); other rows contain city+state+ZIP formatted string (e.g. 'NATCHEZ MS 39120-9199') — billing address line 2 or formatted city-state-zip continuation
20address2mediumValues appear to be city+state+ZIP formatted billing address strings (e.g. 'INDIANAPOLIS IN 46219-4311') or additional address line — billing address overflow field
21skipmediumMostly empty in sample rows — possible additional address or notes field, insufficient data to classify confidently
22cityhighCity name values for billing address (e.g. 'INDIANAPOLIS', 'MARIETTA', 'FREMONT') — billing address city, full form vs abbreviated column 10
23statehighTwo-letter US state abbreviations — billing address state
24ziphigh5-digit US ZIP codes — billing address ZIP
25skipmediumMostly empty in sample rows — possible secondary ZIP+4 or extension field
26skipmediumSingle character value 'W' consistent across all rows — likely rate code, market segment, or internal flag, non-PII
27skipmediumEmpty in all sample rows — unknown trailing field, insufficient data; likely padding or reserved column

Notes: File has a 6-line metadata/header block before data rows begin (date, description, compromised data summary, record count, HIBP link, forum thread link). Actual data starts at the 'SAMPLES:' marker, with the 'SAMPLES:' prefix on the first data row needing to be stripped. Delimiter is pipe ('|'), no column headers in data rows. Columns 7 and 8 contain AT&T-specific encrypted values: column 7 ('*1...' prefix) maps to encrypted account passcodes (AT&T confirmed resets post-breach), column 8 ('*0...' longer strings) maps to encrypted SSNs. Column 13 ('*0...' shorter strings) maps to encrypted dates of birth. Billing address spans columns 17-24 with some variation in field population depending on whether apt/unit info is present, causing a one-column shift in some rows. Column 10 contains abbreviated city names in current address (e.g. 'FMT'='Fremont', 'ANH'='Anaheim', 'ORM BCH'='Ormond Beach', 'TWN HRT'='Twain Harte', 'MRETA'='Marietta') while column 22 contains full city names for billing address. This dataset matches the AT&T August 2021 breach profile disclosed in March 2024 by ShinyHunters on BreachForums, confirmed to contain ~70M US consumer records.

MASTER.csv
26 columns, 73,481,539 rows

File structure

Source column | Mapped field | Confidence | LLM assessment
0skiphighnumeric identifiers, appears to be internal customer/account IDs
1fullNamehighcontains full names in uppercase format (e.g., JESSICA KEUREN, MARY KEYS)
2phonehigh10-digit phone numbers in standard format
3phonehighsecondary phone number field, also 10-digit format
4skiphighnumeric codes, appears to be account type or status flags
5skiphightext codes (CONSUMER, SMALL OFFICE), account classification
6skiphightext codes (CONSUMER, SMALL OFFICE), account classification duplicate
7ssnhighencrypted values prefixed with *, AT&T context indicates encrypted SSN per breach documentation
8passwordhighencrypted values prefixed with *, AT&T context indicates encrypted account passcode
9address1highstreet addresses (e.g., 30 MARIN AV, 5305 E 10TH ST)
10cityhighcity names (abbreviated and full format)
11statehighUS state abbreviations (MS, IN, GA, TX, CA, IL, etc.)
12ziphigh5-digit ZIP codes
13skiphighencrypted values prefixed with *, unknown AT&T internal data field
14skiphighbinary flags (0 or 1), internal status indicator
15skiphighsingle character flags (C, T, O), account status or type codes
16emailhighemail addresses with @ symbol or empty values
17fullNamehighfull names, duplicate/canonical entry of field 1
18address2mediumsecondary address lines (APT designations, PO BOX, UNIT numbers) or full street name
19address1highexpanded street addresses with full street names
20skiphighcombined city-state-zip or partial address components
21cityhighfull city names, canonical entry
22statehighUS state abbreviations, canonical entry
23ziphigh5-digit ZIP codes, canonical entry
24skiphighappears to be duplicate/alternate ID field or timestamp
25genderhighsingle character values (W, U, M, etc.) representing gender indicators

Notes: AT&T 2021 data breach from August 2021. Pipe-delimited format with 26 fields. Contains current and billing address information. Fields 7 and 8 are encrypted using AT&T's encryption (prefix *). Multiple fields appear duplicated for data redundancy/validation. Field 25 gender values: W=likely female, U=unknown, M=likely male. Records include both individual consumers and small office/business accounts.

PRODUCTION.csv
43,998,287 rows

File structure

Notes: Pre-LLM auto-detection: free-form text with visible emails / phones