
DataCamp (datacamp.com)

Jan 27, 2017

760,598 records · 1 file · Added Apr 24, 2026

This dataset originates from a breach of DataCamp (datacamp.com), an online learning platform for data science and programming education. It contains user account records with the following fields: numeric user IDs, email addresses, bcrypt-hashed passwords, password reset tokens, sign-in counts, last sign-in timestamps and IP addresses, account creation timestamps, authentication tokens, names, locations, education, biography, avatar file information, Coursera integration flags, Stripe/PayPal/Braintree customer IDs, payment method tokens, company names, group membership hashes, inviter IDs, first names, last names, and anonymous email flags. The earliest account creation dates are from May 2013, and the latest activity timestamps are from January 2017. The data includes the accounts of DataCamp co-founders Dieter De Mesmaeker ([email protected]) and Jonathan Cornelissen ([email protected]). The archive was distributed via BreachForums.

Data found in this dataset

Email · First name · Last name · Username · fullName


Source files

Expand any file to inspect its column headers and the LLM's field-mapping reasoning, recorded during ingestion.

codecamp.txt
5 columns · 760,598 rows

File structure

Format: CSV · Delimiter: comma · Has header: yes · Quote: "

| Source column | Mapped field | Confidence | LLM assessment |
| --- | --- | --- | --- |
| 2 | email | high | [2] header 'email', values are clearly email addresses ([email protected], [email protected]) |
| 15 | fullName | medium | [15] header 'name', maps to fullName; values are NA in sample but header and context indicate full name field |
| 30 | username | high | [30] header 'slug', values are URL-safe username slugs (dieter, jonathancornelissen, dieterdm90-test), searchable user identifiers |
| 41 | firstName | high | [41] header 'first_name', values are given names (Dieter, Jonathan) |
| 42 | lastName | high | [42] header 'last_name', values are family names (De Mesmaeker, Cornelissen) |

Notes: 44 columns total, 5 contain PII. encrypted_password is excluded per rules (password_hash pattern). IP address columns (current_sign_in_ip, last_sign_in_ip) are skipped per exclusion rules. company_name is skipped per company exclusion rule. authentication_token, reset_password_token are internal tokens — skipped. location, education, biography, marketing_biography are all NA in sample and too vague to map confidently. All timestamp, counter, flag, and payment ID columns are skipped.

Articles about this breach