substack.com
Oct 23, 2025
In October 2025, the publishing platform Substack suffered a data breach attributed to threat actor @w1kkid. The breach exposed approximately 689,756 records (663,145 unique email addresses) containing user account data including names, email addresses, bios, profile photos, usernames/handles, phone numbers, Stripe customer IDs, account creation timestamps, and various account settings. The data was subsequently circulated more widely in February 2026 and added to Have I Been Pwned.
Source files
Each file entry lists its column headers and the LLM's field-mapping reasoning, recorded during ingestion.
Substack__Info.txt · 50 rows
File structure
Notes: Pre-LLM auto-detection: free-form text with visible emails / phones
Substack__data__substack.csv · 4 columns · 662,745 rows
File structure
Format: CSV·Delimiter: comma·Has header: yes·Quote: "
| Source column | Mapped field | Confidence | LLM assessment |
|---|---|---|---|
| 1 | fullName | high | [1] header 'name', values are full person names |
| 2 | email | high | [2] header 'email', values contain @ and are valid email addresses |
| 27 | username | high | [27] header 'handle', values are alphanumeric usernames/handles (searchable identifiers) |
| 28 | phone | high | [28] header 'phone', values are international phone numbers with + prefix and 10-15 digits |
Notes: Substack breach (October 2025). The file has 32 columns in total, of which 4 contain PII. The remaining columns are internal IDs, timestamps (created_at, updated_at, etc.), internal flags (is_global_admin, is_ghost, etc.), URLs, and transactional data; all are mapped to skip.
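
The mapping in the table can be illustrated with a minimal sketch. It assumes a comma-delimited file with a header row and `"` as the quote character, as stated under File structure, and keeps only source columns 1, 2, 27, and 28. The names `COLUMN_MAP`, `PHONE_RE`, `map_rows`, and `make_sample` are hypothetical, the sample row is synthetic, and the phone pattern is only one reading of "+ prefix and 10-15 digits". It is not the ingestion pipeline's actual code.

```python
import csv
import io
import re

# 1-based source column -> mapped field, as listed in the table.
# All other columns are skipped.
COLUMN_MAP = {1: "fullName", 2: "email", 27: "username", 28: "phone"}

# Assumed check mirroring the table's description of the phone column:
# a "+" prefix followed by 10 to 15 digits.
PHONE_RE = re.compile(r"^\+\d{10,15}$")


def map_rows(text):
    """Yield one dict per data row containing only the mapped fields."""
    reader = csv.reader(io.StringIO(text), delimiter=",", quotechar='"')
    next(reader)  # skip the header row ("Has header: yes")
    for row in reader:
        yield {
            field: row[idx - 1]
            for idx, field in COLUMN_MAP.items()
            if idx - 1 < len(row)  # tolerate short rows
        }


def make_sample():
    """Build a synthetic 32-column CSV with made-up values (not real data)."""
    header = [f"col{i}" for i in range(1, 33)]
    header[0], header[1], header[26], header[27] = "name", "email", "handle", "phone"
    row = [""] * 32
    row[0], row[1] = "Doe, Jane", "jane@example.com"
    row[26], row[27] = "janedoe", "+15555550100"
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerow(row)
    return buf.getvalue()


records = list(map_rows(make_sample()))
```

Indexing by column position rather than header name follows the table, which identifies source columns by number. The quoted name containing a comma shows why the quote character matters when splitting rows.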