bhinneka.com
Apr 1, 2020
Data breach of Bhinneka.com, a major Indonesian e-commerce platform. The dataset contains user profile and account data, including names, email addresses, birthdates, mobile/phone numbers, gender, addresses, hashed passwords with salts, and account metadata. The breach was reportedly disclosed on BreachForums; the dataset contains both profile and member/user tables.
Source files
For each file, the column headers and the LLM's field-mapping reasoning, recorded during ingestion, are listed below.
Bhinneka_BF__data__bhinneka_profiles__profile.csv · 7 columns · 1,176,634 rows
File structure
Format: CSV · Delimiter: comma · Has header: yes · Quote: "
| Source column | Mapped field | Confidence | LLM assessment |
|---|---|---|---|
| 1 | firstName | high | [1] header 'first_name', values are common given names (Hendrik, felix, Lia, Tiar, Arief) |
| 2 | lastName | high | [2] header 'last_name', values are surnames (Tio, da cats, Herlia, Saja, Artanto) |
| 3 | email | high | [3] header 'email', values contain @ symbol and are valid email addresses |
| 4 | dob | high | [4] header 'birth_date', values match YYYY-MM-DD date pattern |
| 5 | phone | high | [5] header 'mobile', values are 10-11 digit Indonesian mobile phone numbers |
| 6 | phone | high | [6] header 'phone', values are 10-11 digit Indonesian phone numbers |
| 7 | gender | high | [7] header 'gender', values are single letters (M/F) |
Notes: 11 columns total, 7 contain PII. Columns 0 (member_id), 8 (status), 9 (created_at), 10 (modified_at) are non-PII metadata and automatically skipped. Two phone columns (mobile and phone) both map as phone; this is common in e-commerce platforms where users may provide both mobile and landline numbers.
Bhinneka_BF__data__bhinneka_users__member.csv · 12 columns · 1,262,083 rows
File structure
Format: CSV · Delimiter: comma · Has header: yes · Quote: "
| Source column | Mapped field | Confidence | LLM assessment |
|---|---|---|---|
| 1 | firstName | high | [1] header 'firstName', values are given names (adrian, Kusmiadi, rachmad putra, David, Budi) |
| 2 | lastName | high | [2] header 'lastName', values are family names (oentoro, Jaelani, bahari, Maroval, Setiawan) |
| 3 | email | high | [3] header 'email', values contain @ and are valid email addresses |
| 4 | gender | high | [4] header 'gender', values are MALE (gender codes) |
| 5 | phone | high | [5] header 'mobile', values are 11-digit Indonesian mobile numbers |
| 6 | phone | medium | [6] header 'phone', values are numeric phone numbers (some all-zeros or minimal data) |
| 10 | state | high | [10] header 'province', values are Indonesian province names (Jawa Timur) |
| 12 | city | high | [12] header 'city', values are city names (Bojonegoro, Lamongan) |
| 18 | zip | high | [18] header 'zipCode', values are 5-digit postal codes |
| 19 | address1 | high | [19] header 'address', values are street addresses |
| 22 | facebookId | high | [22] header 'facebookId', standard Facebook identifier field |
| 35 | dob | high | [35] header 'birthDate', values match YYYY-MM-DD date pattern for dates of birth |
Notes: 37 columns total, 12 contain PII. Columns [0] (id), [7] (ext), [8] (password, hashed rather than plaintext), [9] (salt), [20] (jobTitle), [21] (department), [23] (googleId), [24] (azureId), [25–27] (flags), [28] (status), [29] (token), [30–34] (timestamps/metadata), [36] (ldapId) are skipped. The [8] password values appear to be hashed or encrypted (long base64-like strings prefixed with an iteration count of 15000), so they are not directly usable as plaintext passwords.