streeteasy.com
Jun 28, 2016
In June 2016, the New York real estate website StreetEasy suffered a data breach affecting approximately 990,000 user records. The compromised data included email addresses, names, usernames, and SHA-1 hashed passwords. The data was listed for sale on a dark web marketplace in February 2019 and has since been indexed by Have I Been Pwned.
Data found in this dataset
Source files
Each file below lists its column headers and the LLM's field-mapping reasoning, recorded during ingestion.
StreetEasy__Info.txt · 5 columns · 22 rows
File structure
Format: CSV · Delimiter: comma · Has header: no · Quote: "
| Source column | Mapped field | Confidence | LLM assessment |
|---|---|---|---|
| 1 | username | high | [1] short alphanumeric identifiers (ms, sd, cj, msz, luis, nycnat, etc.) consistent with usernames/handles |
| 2 | skip | high | [2] SHA-1 hashes (40 hex chars), likely password hash or verification token |
| 3 | password | high | [3] SHA-1 hashes (40 hex chars) matching breach description of SHA-1 hashed passwords |
| 4 | email | high | [4] values contain @ signs and email domains (tribeca.com, arepalabs.com, cantv.net, etc.) |
| 5 | fullName | high | [5] full person names (Michael Smith, Sebastian Delmont, Corey Johnson, Mary Scary, Luis Miguel Romero Varela, etc.) |
Notes: StreetEasy 2016 breach. File contains 51 rows (the header row is a comment). Only 5 PII columns identified; the remaining columns are timestamps, internal flags, YAML-serialized settings, and numeric IDs (all skipped). Password column [3] contains SHA-1 hashes per the breach description.
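The assessments for columns 2 and 3 rest on a shape check: a value of exactly 40 hexadecimal characters is consistent with a SHA-1 digest. A minimal sketch of that heuristic follows; the function name `looks_like_sha1` is hypothetical and not taken from the ingestion pipeline. A match shows only that a string has the right length and alphabet, not that it is a password hash.

```python
import hashlib
import re

# A SHA-1 digest is 160 bits, which is 40 hexadecimal characters.
_SHA1_HEX = re.compile(r"[0-9a-fA-F]{40}")


def looks_like_sha1(value: str) -> bool:
    """Return True if value has the shape of a hex-encoded SHA-1 digest.

    Hypothetical helper: it checks format only and cannot tell a password
    hash from any other 40-character hex token.
    """
    return _SHA1_HEX.fullmatch(value.strip()) is not None


# Synthetic example value, generated locally rather than taken from the dataset.
sample = hashlib.sha1(b"example").hexdigest()
print(looks_like_sha1(sample))
print(looks_like_sha1("not-a-hash"))
```

Because the check is purely structural, it cannot separate the two hash columns from each other, which is in line with column 2 being described only as a likely password hash or verification token.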
StreetEasy__data__streeteasy.sql · 6 columns · 990,288 rows
File structure
Format: CSV · Delimiter: comma · Has header: yes · Quote: "
| Source column | Mapped field | Confidence | LLM assessment |
|---|---|---|---|
| 1 | username | high | [1] header 'anon', values are short alphanumeric handles (ms, sd, cj, msz, luis) consistent with usernames |
| 2 | password | high | [2] values are 40-character hex strings consistent with SHA-1 hashes of passwords from StreetEasy breach context |
| 3 | password | high | [3] values are 40-character hex strings, second password hash column (may be alternate/backup hash) |
| 4 | email | high | [4] header '1', values contain @ symbol and email addresses (examples redacted) |
| 5 | fullName | high | [5] header 'anon', values are full names (Michael Smith, Sebastian Delmont, Corey Johnson, Mary Scary, Luis Miguel Romero Varela) |
| 23 | username | high | [23] values are usernames/screen names (ms0, sd0, PowerBroker, MaryScary, natalie) — searchable identifiers |
Notes: 45 total columns. StreetEasy 2016 breach: columns 2–3 are SHA-1 password hashes, column 4 is email, column 5 is full name, column 23 is secondary username. Columns 0, 6–22, 24–44 are internal IDs, timestamps, flags, settings, or transaction data → skipped. No firstName/lastName parsed separately; fullName used.
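The mapping described in these notes can be sketched as a lookup from column index to field name, applied to each parsed row. This is an illustration only: `FIELD_MAP`, `map_row`, and `read_records` are hypothetical names, and the indices are assumed to be zero-based because the notes mention a column 0. Two fields appear twice in the mapping, so each field collects a list of values.

```python
import csv
import io
from collections import defaultdict

# Assumed zero-based column index -> mapped field, per the table above.
# Every column not listed here is skipped.
FIELD_MAP = {
    1: "username",
    2: "password",
    3: "password",
    4: "email",
    5: "fullName",
    23: "username",
}


def map_row(row):
    """Collect the mapped columns of one parsed row into field -> list of values."""
    record = defaultdict(list)
    for index, field in FIELD_MAP.items():
        # Skip rows that are too short and cells that are empty.
        if index < len(row) and row[index] != "":
            record[field].append(row[index])
    return dict(record)


def read_records(text):
    """Parse comma-delimited text with a header row and map every data row."""
    reader = csv.reader(io.StringIO(text), delimiter=",", quotechar='"')
    next(reader, None)  # the file structure above reports a header row
    return [map_row(row) for row in reader]


# Synthetic 45-column input; the values are placeholders, not breach data.
header = ",".join(f"h{i}" for i in range(45))
data = ",".join(f"v{i}" for i in range(45))
records = read_records(header + "\n" + data + "\n")
print(records[0]["username"])
```

Keeping both hash columns under `password` mirrors the table as written; a stricter pipeline might keep only one of them once it is known which column holds the actual password hash.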