yatra.com

Jan 1, 2019

10,647,315 records · 108 files · Added Apr 24, 2026

A breach of Yatra (yatra.com), a major Indian online travel booking platform. The archive contains 109 numbered CSV files of user account records: numeric user IDs, email addresses, salutation/title, first and last names, physical addresses (street, city, state, country, PIN code), and mobile and alternate phone numbers. The data is overwhelmingly Indian, as shown by Indian addresses, Indian phone numbers, Indian email providers (rediffmail.com, yahoo.co.in), and references to Indian cities and states; a few records carry non-Indian addresses (the sample for 27.csv includes a United States address). One record explicitly references 'Yatra Office' in Gurgaon as an address, which supports the attribution. The archive was distributed via BreachForums.

Data found in this dataset

Email · First name · Last name · Address · City · State · Country · Suffix · address2 · zip · phone · skip · fullName

Source files

Each file below lists its column headers and the LLM's field-mapping reasoning, recorded during ingestion.

1.csv
12 columns · 98,991 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is an email; all values contain @ and are valid email addresses
2suffixhigh[2] header 'Mr.' and values are salutations/titles (Mr)
3firstNamehigh[3] header 'MOHAMED'; values are common given names (Shubham, Wilbert, NABI, Mithilesh, Sajmon)
4lastNamehigh[4] header 'RAFIQ'; values are surnames (Chadravanshi, Vaz, ANSARI, kumar, Rajamani)
5address1high[5] header 'METTUPALAYAM'; values are street addresses (81/G Risali Sector, Swan, 11 A Ashoka road new delhi, mariapuram, dispur)
6address2medium[6] header empty; values mostly empty or duplicate address info; secondary address field
7cityhigh[7] header 'COIMBATORE'; values are Indian city names (Bhilai, Siwan, New delhi, coimbatore, guwahati)
8statehigh[8] header 'Tamilnadu'; values are Indian states (Chhattisgarh, Bihar, Delhi, Tamil Nadu, Assam)
9countryhigh[9] header 'India'; all values are 'India' or 'IND'
10ziphigh[10] header '641301'; values are 6-digit Indian postal codes (490006, 841436, 110001, 641104, 781005)
11phonehigh[11] header '9943834335'; values are 10-digit Indian mobile numbers (9885332000, 8602184719, 9821238286, 9971268719, 9868301865)
12phonehigh[12] header empty; values are 10-digit numbers (23782736, 9500974345, 9864970759); secondary/alternate phone field

Notes: Yatra 2019 Indian travel booking breach. Column [0] is numeric user ID (skip). All PII identified: email, name components (firstName, lastName, suffix), full address (address1, address2, city, state, zip), country, and dual phone fields. Indian phone numbers (10 digits starting with 7-9) and Indian postal codes (6 digits) confirmed.

10.csv
12 columns · 98,981 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] All values contain @ signs; headers and sample values are clearly email addresses (yahoo.com, gmail.com, yahoo.co.in)
2suffixhigh[2] Header 'Mr'; all sample values are salutations/titles (Mr, consistent with Indian naming conventions)
3firstNamehigh[3] Values are common given names (Shashidhar, Amit, Dinesh, prabhakarkumar, aditya, syedfaiz)
4lastNamehigh[4] Values are surnames (mahavadi, Rishi, Namjoshi, singh, garg, hasan)
5address1high[5] Street-level address components (vishal nagar, B1/402 Supernal Garden, 7 d b gupta road, kharghar, raj nagar)
6address2high[6] Secondary address details (Kolshet RD., paharganj, sector23raj nagar gzb.)
7cityhigh[7] Indian city names (pune, THANE, new delhi, mumbai, ghaziabad, CHENNAI)
8statehigh[8] Indian state abbreviations and names (maharashtra, delhi, Maharashtra, Uttar Pradesh, TN)
9countryhigh[9] Country codes (IN, IND) — India
10ziphigh[10] Indian postal codes/PIN codes (411027, 110055, 410210, 201002, 600032)
11phonehigh[11] Indian mobile numbers, 10 digits (9850832696, 9810678244, 8128992377, 7498081429)
12phonehigh[12] Alternate/secondary phone numbers, 10 digits or variable length (9869266264, 9868208976, 666666 appears to be invalid/test data, 9958691421)

Notes: 13 columns total. Column [0] is a numeric user ID (skip). All other columns map to PII. Indian travel booking platform context confirms address/phone/email structure. Columns [11] and [12] are both phone fields (primary and alternate mobile/contact numbers, common in Indian databases).

11.csv
13 columns · 98,964 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] numeric user IDs (3623523, 3623524, etc.), auto-generated internal identifiers
1emailhigh[1] all values contain @ symbol, valid email addresses (yahoo.com, rediffmail.com, gmail.com, hotmail.com)
2suffixhigh[2] header 'Mr', 'Mrs' — salutation/title values, matches suffix field type
3firstNamehigh[3] single given names (Nitesh, umesh, SEEMA, Sharmila, Nikhil, SAIF), typical first names
4lastNamehigh[4] single family names (Sahni, patil, PANDEY, Jain, Agarwal, ALI), typical last names
5address1high[5] street addresses (Camp road, 12-1-334/17/18 katariya niwas, a-62 new agra, 29 BAJRANG VIHAR COLONY,JAITPURA)
6address2medium[6] secondary address component (Lalapet), appears to be locality/area designation
7cityhigh[7] Indian city names (Amravati, secunderabad, agra, JAIPUR)
8statehigh[8] Indian states (Maharashtra, Andhra Pradesh, Uttar Pradesh, Rajasthan)
9countryhigh[9] country codes/names (IND, India)
10ziphigh[10] Indian postal codes/PIN codes (444605, 500017, 282005, 303704), 6-digit format
11phonehigh[11] Indian mobile phone numbers (9675850123, 7798055456, 9930015853, 9553906777, 9869033220, 9923453994), 10-digit format
12phonehigh[12] alternate phone numbers (9052989121, 9458815226), 10-digit Indian format

Notes: Yatra.com 2019 breach — 13 columns total, 12 contain PII (names, email, addresses, phone). Column [0] is auto-generated user_id (skip). Breach context confirms Indian travel booking platform with Indian address/phone data.

12.csv
12 columns · 98,931 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and match email format (Gmail, Yahoo.co.in); clearly email addresses
2suffixhigh[2] values are 'Mr', 'Mr.', 'Ms' — salutation/title prefixes
3firstNamehigh[3] values are common given names (Subbiah, rajit, Rohit, saloni)
4lastNamehigh[4] values are surnames (Ramiah, venkataramanamurthy, singh, Kashyap, sinha)
5address1high[5] values are street addresses (371 SFS FLATS HAUZ KHAS, Flat No. 100, etc.)
6cityhigh[6] values are Indian city names (chembur — Chembur, Mumbai)
7cityhigh[7] values are Indian city names (NEW DELHI, Ranchi, mumbai)
8statehigh[8] values are Indian states (Delhi, Jharkhand, Maharashtra)
9countryhigh[9] values are 'India' and 'IND' — country codes/names
10ziphigh[10] values are Indian postal codes (110016, 834002, 400074) — 6-digit PIN codes
11phonehigh[11] values are 10-digit Indian mobile numbers (9830658682, 9979862072, etc.)
12phonehigh[12] values are 10-digit Indian mobile numbers (8797758664, 9386751758) — alternate/secondary phone

Notes: Yatra 2019 travel booking breach. Column [0] is a numeric user ID (skip). Columns [6] and [7] both contain city data — likely city appears twice or [6] is a secondary city field. All records are Indian with Indian addresses, phone formats (+91 country code implicit in 10-digit numbers), and Indian email providers (yahoo.co.in, rediffmail.com). Breach context confirms Indian travel platform data.

13.csv
12 columns · 99,169 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] Values contain @ signs and match standard email format (gmail.com, rediffmail patterns typical of Indian users)
2suffixhigh[2] Header 'Mr.' and values are salutations/titles (Mr, Mr.)
3firstNamehigh[3] Values are common given names (Tapeesh, Abhishek, ardhendu, sandeep, Shashank, KRISHAN)
4lastNamehigh[4] Values are surnames (Gupta, panda, patil, Chansoria, GOSWAMI)
5address1high[5] Values are street addresses and building identifiers (3043 Pocket B4 Vasant Kunj, plot 21, T7 Imperial Residency, D-486)
6address2high[6] Values are secondary address components (HSR Layout, TAGORE GARDEN EXTN) or empty
7cityhigh[7] Values are Indian city names (New Delhi, mau, bhubaneswar, baroda, Bangalore)
8statehigh[8] Values are Indian states/provinces (Delhi, Uttar Pradesh, Orissa, Gujarat, Karnataka)
9countryhigh[9] Values are country codes/names (India, IND, IN)
10ziphigh[10] Values are 6-digit Indian PIN codes (110070, 275101, 751012, 391775, 560034, 110027)
11phonehigh[11] Values are 10-digit Indian mobile numbers (9818657776, 9999979468, 9919661666, 8763216420)
12phonehigh[12] Values are 10-digit alternate/secondary phone numbers or empty (2667266608, 7381195997)

Notes: Yatra 2019 breach — Indian travel booking platform. 13 columns total, 12 contain PII (names, emails, addresses, phones, location). Column [0] contains numeric user IDs and is skipped as internal identifier. All addresses, phone numbers, and email providers confirm Indian origin. Record structure matches expected user account data from travel booking platform.

14.csv
13 columns · 98,992 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] numeric sequence (3713924–3713930), internal user IDs
1emailhigh[1] all values contain @ symbol, email addresses from providers popular in India (yahoo.com, rediffmail.com, gmail.com)
2suffixhigh[2] values are titles: 'Ms', 'Mr', 'Mr.' — salutation/suffix field
3firstNamehigh[3] common given names (SHEETAL, VINOD, SHASHI, ashok, Rohit, manish)
4lastNamehigh[4] surnames (RAJPUT, SHARMA, SHEKHAR, jain, Shinde, Chavan)
5address1high[5] street addresses (916 MAHALAXMI NAGAR, G137A Sector 10 DLF, 754 GULABI BAGH)
6address2medium[6] secondary address component, appears to be building/landmark (NEW NKS HOSP, vaiduwadi)
7cityhigh[7] Indian city names (Faridabad, DELHI, Pune, Indore)
8statehigh[8] Indian states (MADHYA PREDESH, Haryana, Delhi, Maharashtra)
9countryhigh[9] country codes/names: 'IN', 'India', 'IND'
10ziphigh[10] 6-digit Indian postal codes (452010, 121006, 110007, 411007, 411013)
11phonehigh[11] 10-digit Indian mobile numbers (9893091064, 9446917384, 9350447442, 9609802185)
12phonehigh[12] 10-digit Indian phone number (8959848488), alternate/secondary phone

Notes: Yatra 2019 breach: 13 columns, 12 of which contain PII; column [0] is an internal user ID (skip). The file is a CSV without a header row, but column positions match travel booking user records (ID, email, name components, full address with Indian states/cities/postal codes, dual phone numbers). All values are consistent with Indian user accounts from yatra.com.

15.csv
11 columns · 97,611 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is an email; all values contain @ and are email addresses
2suffixhigh[2] header 'Mr'; values are titles (Mr, Mr.); salutation/suffix field
3firstNamehigh[3] header 'Bhoj'; values are given names (Bhoj, Pulin Das, anurag, mahendra, Sidharth, sameer)
4lastNamehigh[4] header 'Sao'; values are family names (Sao, bhardwaj, Patel, Handa, chauhan, SEKARAN)
5address1high[5] street/primary address lines (Chakardhar Nagra Bangla Para, vijaynagar, C/33 Rameshwar 3rd Flr S V Road, ghasiyawas radhanpur, apartment addresses, Yatra.Com office)
6address2high[6] secondary address component (delhi, Mumbai, cross streets like '10th cross BTM 1st stage')
7cityhigh[7] header 'Raigarh'; values are Indian cities (Raigarh, delhi, Mumbai, radhanpur, TRICHY, Gurgoan)
8statehigh[8] header 'Chhattisgarh'; values are Indian states (Chhattisgarh, Maharashtra, Gujarat, Tamilnadu, karnatka)
9countryhigh[9] header 'IN'; values are country codes/names (IN, IND, India)
10ziphigh[10] header '496001'; values are 5-6 digit Indian postal codes (496001, 400054, 385340, 621216, 560029)
11phonehigh[11] header '9762162480'; values are 10-digit Indian mobile numbers starting with 9

Notes: 13 columns total, 11 contain PII (email, name components, full address, phone). Column [0] is numeric user ID (skip). Column [12] contains sparse numeric values — appears to be an internal reference or secondary ID (skip). Breach context confirms Indian travel booking platform with personal account data including contact info and residential addresses.

16.csv
12 columns · 98,904 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and match email format (yahoo.com, gmail.com, etc.)
2suffixhigh[2] header 'Mr', values are salutations/titles (Mr, Mr.)
3firstNamehigh[3] values are common given names (VIKASH, SAAGAR, Anubinda, BASANTA, JAHNAVI)
4lastNamehigh[4] values are surnames (KUMAR, Banshkar, Patra, NAYAK, MARATHE)
5address1high[5] values are street addresses (AT+ PO - JALPURA, Barjhai post panagar, D 601 REGENCY, etc.)
6address2high[6] values are secondary address components (PS +DIST- ARWAL, spaces suggesting address line 2)
7cityhigh[7] values are Indian city names (ARWAL, jabalpur, THANE, Thane)
8statehigh[8] values are Indian states (BIHAR, Madhya Pradesh, Maharashtra)
9countryhigh[9] values are country codes/names (IN, India, IND)
10ziphigh[10] values are 6-digit Indian PIN codes (804401, 483220, 401202, 400607)
11phonehigh[11] values are 10-digit Indian mobile numbers (7305510321, 8699750340, 9898321401, 9755852574)
12phonehigh[12] values are 10-digit Indian phone numbers (9324532709, 9920696417), alternate contact number

Notes: Yatra 2019 breach of an Indian travel booking platform. Column [0] is numeric user_id (skip). Columns [11] and [12] both map to phone (primary and alternate contact numbers). All addresses, names, and contact details are Indian in origin.

17.csv
12 columns · 99,257 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] All values contain @ symbol and match email address format (domains include sbi.co.in, yahoo.com, gmail.com)
2suffixhigh[2] Values are 'Mr' and 'Mr.'; salutation/title prefix
3firstNamehigh[3] Values are common Indian given names (Sumit, Arun, Raunak, KESHAV)
4lastNamehigh[4] Values are Indian surnames (Lakhotiya, DASARI, Rana, Agrawal, SHENDE)
5address1high[5] Street-level addresses (2b Rajkamal Complex, B-22 East Uttam Nagar, 11 SOMWAR PETH)
6address2high[6] Secondary address components (Panchsheel Square Dhantoli, sri nagar colony)
7cityhigh[7] Indian city names (Nagpur, Hyderabad, New Delhi, KARAD)
8statehigh[8] Indian states/provinces (Maharashtra, Andhra Pradesh, Delhi)
9countryhigh[9] Country indicators (IN, IND, India)
10ziphigh[10] Indian postal codes (440012, 500045, 110059, 415110 are valid PIN formats)
11phonehigh[11] Indian mobile phone numbers (10 digits starting with 9: 9448993063, 9869345389, 9703580777, 9876644001, 9311056668)
12phonehigh[12] Secondary phone numbers including landlines (040-44430777 is Hyderabad area code, 9960196954 is mobile)

Notes: 13 columns total. Column [0] is numeric user_id (skip). Columns [1]–[12] contain personal PII. Data structure confirms Yatra travel platform breach with Indian user records containing contact and address information.

18.csv
12 columns · 99,136 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbols and match email address patterns (gmail.com, yahoo.com, accenture.com, etc.)
2suffixhigh[2] header 'Mr', consistent salutation/title prefix values
3firstNamehigh[3] values are common given names (SHANKAR, Mohamed Saifulla, Khawar, sushant)
4lastNamehigh[4] values are surnames (MISHRA, Shakeel, Hussain, Nethala, shukla)
5address1high[5] street/mailing address values (marble mkt trikuta nagar jammu, H No: 9-7-8/7, Shivajinagar, etc.)
6address2high[6] secondary address line values (P O Lawsons Bay, banda(u.p.), main road)
7cityhigh[7] Indian city names (jammu, Visakhapatnam, lucknow, VIJAYAWADA)
8statehigh[8] Indian state names (Jammu and Kashmir, Uttar Pradesh, Andhra Pradesh)
9countryhigh[9] country codes and names (IND, India)
10ziphigh[10] Indian PIN codes (180012, 226001, 520001) — numeric postal codes
11phonehigh[11] 10-digit Indian mobile phone numbers (9003698543, 7305591205, 9960517609, etc.)
12phonehigh[12] alternate 10-digit Indian phone numbers (9815926631, 9450169787) — secondary contact

Notes: 13 columns total, 12 contain PII. Column [0] is numeric user ID — skipped as internal identifier. Breach context confirms Indian travel booking platform (Yatra.com) with Indian addresses, states, PIN codes, and phone numbers. No header row present in data.

19.csv
10 columns · 96,370 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and email domains (yahoo.in, rediffmail.com, gmail.com)
2suffixhigh[2] header pattern matches salutation/title; values are 'Mr.' and 'Mr'
3firstNamehigh[3] values are common Indian given names (siddhartha, santhosh, ARIJEET, nongthombam)
4lastNamehigh[4] values are Indian surnames (singh, RAIKAR, kangleinganba)
5address1high[5] values contain street addresses and landmarks (varuna, sunrise town, chowkaghat)
7cityhigh[7] values are Indian city names (varanasi, chennai, Imphal)
8statehigh[8] values are Indian state names (Uttar Pradesh, Manipur)
9countryhigh[9] values are 'India' and 'IN' country codes
10ziphigh[10] values are 6-digit Indian postal codes (211002, 795001)
11phonehigh[11] values are 10-digit Indian mobile numbers (9415015092, 8756538111, etc.)

Notes: Yatra 2019 breach: Indian travel booking platform. Column [0] is numeric user ID (skip). Columns [6] and [12] are empty (skip). Total PII columns: 10 of 13.

2.csv
12 columns · 98,870 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header 'email', values contain @ symbols and match email format (gmail.com, rediffmail.com, yahoo.co.in)
2suffixhigh[2] header 'salutation/title', values are 'Mr', a salutation stored in the suffix field
3firstNamehigh[3] header 'first_name', values are common given names (Sandhya, sahil, prabhat, NITIN, Samidurai)
4lastNamehigh[4] header 'last_name', values are surnames (Gunjote, gaikwad, kumar, GUPTA, Nadarajan)
5address1high[5] header 'address', values are street addresses (1/7 ram nager, K61/119 SAPTSAGAR, 7 1st Floor Kamatchiamman Colony)
6address2high[6] header 'address2', values are secondary address components (wagle estate behind r p compnay thane w, Sriramapuram)
7cityhigh[7] header 'city', values are Indian city names (thane, VARANASI, Chennai, Bangalore)
8statehigh[8] header 'state', values are Indian state names (Maharashtra, Uttar Pradesh, Tamilnadu, Karnataka)
9countryhigh[9] header 'country', values are country codes/names (IND, India, IN)
10ziphigh[10] header 'PIN/postal_code', values are 6-digit Indian postal codes (400604, 221002, 600026, 560021)
11phonehigh[11] header 'mobile_phone', values are 10-digit Indian phone numbers (9873323693, 8886801515, 9619222942, 9950674190)
12phonehigh[12] header 'alternate_phone', values are 10-digit Indian phone numbers (67341600, 9198418199) — secondary phone field

Notes: Yatra-2019 breach. Column [0] is numeric user_id (skip). All 13 columns analyzed. 12 PII columns identified. Data is entirely Indian travel platform user records with standard contact information and address fields.

20.csv
13 columns · 95,946 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] header '4066017', values are sequential numeric IDs (4066018, 4066019, etc.) — internal customer/user identifiers
1emailhigh[1] header '[email protected]', all values are valid email addresses with @ symbol (yahoo.co.in, gmail.com, hotmail.com, etc.)
2suffixhigh[2] header empty, values are titles: 'Mr', 'Mr.'; salutation/suffix field
3firstNamehigh[3] header 'reetagupta', values are given names: 'RAHUL', 'DRTG', 'megh', 'mohankumar', 'Geeta' — first names
4lastNamehigh[4] header empty, values are surnames: 'CHAUDHARY', 'CHANDRAMOHAN', 'singh', 'N', 'kolkar' — last names
5address1high[5] header '1112seergovardhanpur', values are street addresses: 'MPHASIS LTD', '3 CITY PARK SAMA', 'AKE Road NO3 Tiruchengode', 'shri venkateshwara residency' — primary address lines
6address2medium[6] header 'Enter Address 2', mostly empty with occasional secondary address data ('Namakkal', '313/7 B&C colony R&D Pashan')
7cityhigh[7] header 'varanasi', values are Indian city names: 'MANGALORE', 'VADODARA', 'koderma', 'Namakkal', 'bangalore'
8statehigh[8] header empty, values are Indian states: 'Karnataka', 'Gujarat', 'Jharkhand', 'Tamil Nadu' — mailing_state equivalent
9countryhigh[9] header empty, all values are 'IND' or 'India' — country code/name field
10ziphigh[10] header empty, values are 6-digit Indian PIN codes: '575001', '390008', '825413', '637211', '562157' — postal codes
11phonehigh[11] header '9795510893', values are 10-digit Indian mobile numbers: '9901121130', '9212412662', '9567228210' — primary phone
12phonehigh[12] header empty, values are 10-digit Indian mobile numbers: '9430314543', '9171224815', '9341704489' — alternate/secondary phone

Notes: 13 columns total. Yatra 2019 breach: Indian travel booking platform. Data includes user IDs, emails, names with titles, complete Indian addresses (street, city, state, country, PIN), and primary + secondary mobile phone numbers. All addresses and phone numbers confirm Indian origin. Column [0] is internal customer ID (skipped). Columns [11] and [12] are both phone fields (primary and secondary mobile contacts).

21.csv
13 columns · 99,318 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] numeric user IDs (4174487, 4174489, etc.) — internal identifiers, not PII
1emailhigh[1] all values contain @ symbol and are valid email addresses ([email protected], [email protected], etc.)
2suffixhigh[2] salutation/title values: 'Ms', 'Mr', 'Mr.'; honorifics that precede the name, stored in the suffix field
3firstNamehigh[3] given names (sivakumar, Nisha, AYUSH, SANGAY, Amit) — first name values in Indian context
4lastNamehigh[4] family names (Sharma, KHURANA, LAMA, EMANI) — last name values, though some rows contain anomalies
5address1high[5] street addresses (Vrajbhumi soc, A-1382 BAPU NAGAR BHILWAR, 6/6 mahant layout) — primary address line
6address2medium[6] mixed secondary address/contact data ([email protected], bull temple road, chola block, phone 9963785987) — appears to be secondary address or overflow field
7cityhigh[7] Indian city names (Vadodara, bhilwara, New Delhi, bangalore) — city field
8statehigh[8] Indian state names (Gujarat, Rajasthan) and locality descriptors — state/region field
9countryhigh[9] country code 'IND' and empty values — country field
10ziphigh[10] Indian postal codes (390021, 311001) — PIN/zip code field
11phonehigh[11] 10-digit Indian mobile numbers (9442333343, 9558724370, 8058757575) — primary phone field
12phonehigh[12] 10-digit Indian mobile numbers (9558724370, 8058757575) — alternate/secondary phone field

Notes: 13 columns total. Data structure matches Yatra breach context: Indian user account records with numeric user IDs, email addresses, names, Indian addresses (street, city, state, PIN), and mobile phone numbers. No header row is present. Column 6 has data-quality issues: it mixes field types (emails, addresses, phones), suggesting possible data corruption or use as a secondary-contact field.

22.csv
12 columns · 97,711 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is email; all values contain @ and are valid email addresses
2suffixhigh[2] header 'Mr'; all values are titles/salutations (Mr, etc.)
3firstNamehigh[3] header 'Nitin'; values are given names (Amos, Tathagata, sainmi, Manoj)
4lastNamehigh[4] header 'Gupta'; values are family names (Samson, Sureddy, Ghosh, Kumar)
5address1high[5] header '45 Old Ashoka Garden'; values are street addresses (NSK Street, Mohammad Villa, etc.)
6address2high[6] header ' '; values are secondary address components (apt/area names: koperkhairene)
7cityhigh[7] header 'Bhopal'; values are Indian cities (Chromepet Chennai, Hyderabad, navi mumbai, PATHANKOT)
8statehigh[8] header 'Madhya Pradesh'; values are Indian states/provinces (Tamil Nadu, Andhra Pradesh, Punjab, Maharashtra)
9countryhigh[9] header 'IND'; all values are 'IND' (India country code)
10ziphigh[10] header '462023'; values are 6-digit Indian postal codes (600044, 500072, 145001, 401202)
11phonehigh[11] header '9893084352'; values are 10-digit Indian mobile numbers
12phonehigh[12] header '9893084352'; values are 10-digit Indian mobile numbers (alternate/secondary phone)

Notes: 13 columns total. Column [0] is numeric user ID (skip). Columns [11] and [12] both contain phone numbers; [12] appears to be alternate/secondary mobile. Indian travel booking platform breach with complete address and contact information.

23.csv
12 columns · 99,132 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] All values contain @ signs and match email format (gmail.com, yahoo.com, rediffmail patterns)
2suffixhigh[2] Values are 'Mr', 'Ms', 'Mr.' — salutation/title indicators
3firstNamehigh[3] Values are common given names (jagdish, yashmeen, SUNIL, anil, Ravi, venkat)
4lastNamehigh[4] Values are surnames (mamgai, kaur, KUMAR, satish, Sonule)
5address1high[5] Values are street addresses (E 499 sc 11 pratap vihar, flat no-240 dwarka delhi, b-84 indupuram aurangabad)
6address2medium[6] Secondary address component; mostly empty/spaces but contains some location data (kothapet guntur)
7cityhigh[7] Values are Indian city names (ghaziabad, delhi, Visakhapatnam, mumbai, guntur)
8statehigh[8] Values are Indian states (Uttar Pradesh, Delhi, Andhra Pradesh, Maharashtra, Punjab)
9countryhigh[9] All values are 'IND' — country code for India
10ziphigh[10] Values are 6-digit Indian PIN codes (201001, 110075, 530013, 400072, 522001, 143001)
11phonehigh[11] Values are 10-digit Indian mobile numbers (9910629882, 8826955689, 9502889933, 9819382329)
12phonehigh[12] Values are 10-digit Indian mobile numbers — alternate/secondary phone field

Notes: 13 columns total, 12 contain PII (email, suffix, firstName, lastName, address1, address2, city, state, country, zip, phone×2). Column [0] appears to be user_id (numeric sequential identifiers like 4384367-4384371) and is skipped as internal ID. No header row present in data.

24.csv
13 columns · 99,047 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] header '3674926', values are sequential numeric IDs (3674927, 3674928, etc.) — internal user/customer IDs
1emailhigh[1] header contains email address, all values contain @ sign and match email format ([email protected], [email protected], etc.)
2suffixhigh[2] header 'Mr', values are salutations/titles (Mr, Mr., etc.)
3firstNamehigh[3] header 'sunil', values are given names (Prabhjot, chongkholun, Najmul, pratik)
4lastNamehigh[4] header 'kuriakose', values are surnames (Gill, haokip, Khan, amar)
5address1high[5] header 'Puthiyadathuparambil', values are street addresses (17/4 zorawar enclave, C-53 New Raipur Rd, etc.)
6address2high[6] header 'Puthiyadathuparambil', values are secondary address components (mall road near church, sadar hills dist.)
7cityhigh[7] header 'Kannur', values are Indian city names (Jalandhar, manipur, Kolkata)
8statehigh[8] header 'Kerala', values are Indian states/provinces (Punjab, manipur, West Bengal)
9countryhigh[9] header 'IND', values are country codes/names (IND, IN, India)
10ziphigh[10] header '670632', values are 6-digit postal codes (144005, 795001, 700084) — Indian PIN codes
11phonehigh[11] header '9.19496E+11', values are 10-digit Indian mobile numbers (8123150766, 8872489814, 9415050816, 9957570748, 8657109081)
12phonehigh[12] header 'Kerala', values are 10-digit phone numbers (9041660921, 8861797747) — alternate/secondary phone field

Notes: 13 columns total, 12 contain PII. Column 0 is internal user ID (skip). Columns 11 and 12 both contain phone numbers, likely mobile and alternate phone as noted in breach context. Data confirms Indian travel booking platform with Indian addresses, phone numbers, and email providers.

25.csv
12 columns · 98,808 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] Numeric user IDs (3781279, 3781280, etc.) — internal identifiers, not usernames
1emailhigh[1] Values contain @ symbol and email domains (gmail.com, sify.com, hotmail.com, zyduscadila.com)
2suffixhigh[2] Values are 'Mr.' and 'Mr' — salutation/title indicators
3firstNamehigh[3] Values are given names (Biju, Ignatius, Felix, Abhishek, PRASAD, NITISH)
4lastNamehigh[4] Values are family names (George, Richard, Dupont, Dani, KAPPAZHASANKARA, AMIN)
5address1high[5] Values are street addresses (26 GANDHI NAGAR DINDIGUL ROAD TRICHY, 21-C Vrindavan 2, IPSIT,21B, 80/1 A+B)
6address2high[6] Values are secondary address components (Panchavati, Pashan Rd, Varanasi soc, warje)
7cityhigh[7] Values are Indian city names (trichy, Pune)
8statehigh[8] Values are Indian states (Tamilnadu, Mah/Maharashtra)
9countryhigh[9] Values are 'India' and 'IN' — country code and name
10ziphigh[10] Values are 6-digit Indian PIN codes (620001, 411008, 411058)
11phonehigh[11] Values are 10-digit Indian mobile numbers (9974051970, 9841322640, 9969352104, etc.)

Notes: 13 columns total. Column 12 is empty. Breach context confirms Indian travel booking platform with Indian addresses, phone numbers, and email providers. All PII fields identified and mapped.

26.csv
12 columns · 99,260 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] numeric user IDs (3886583, 3886584, etc.) — internal identifiers, auto-generated
1emailhigh[1] all values contain @ symbol, standard email addresses (gmail.com, yahoo.com, rediffmail.com)
2suffixhigh[2] salutation/title values ('Mr') — matches suffix field type
3firstNamehigh[3] personal given names (Deokumar, hitendra, RAKESH, venkatasi)
4lastNamehigh[4] personal family names (Singh, pawar, TIWARI, bommarajupeta)
5address1high[5] street addresses (Vighnharta Colony, H.NO. 107, 2ND FLOOR, 19-4-121/e3)
6address2high[6] secondary address lines (panchavati Gas Agency Road, GALI NO. 2, ASHOK VIHAR, RAILWAY ROAD)
7cityhigh[7] Indian city names (Dhule, GURGAON)
8statehigh[8] Indian states (maharashtra, HARYANA)
9countryhigh[9] country code 'IN' (India), consistent with breach context
10ziphigh[10] Indian postal codes/PIN codes (424002, 122001) — 6-digit format
11phonehigh[11] Indian mobile phone numbers (9962266133, 9246647888, etc.) — 10 digits starting with 7-9

Notes: 12 columns total, 11 contain PII. Breach context (Yatra 2019, Indian travel platform) confirmed by Indian addresses, phone numbers, email providers (rediffmail.com, yahoo.co.in), and city/state data. No header row present in sample.

27.csv
12 columns · 96,519 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is sample data; values contain @ signs and are valid email addresses
2suffixhigh[2] header 'Mr.' is sample data; values are salutations (Mr., Mrs)
3firstNamehigh[3] header 'Prince' is sample data; values are common given names (vijaylakshmi, Pradyumna, ramchandraprasad, AndrewWilson, Balu)
4lastNamehigh[4] header 'Chakma' is sample data; values are surnames (srikanth, Mohapatra, kalavala, Cornforth, Pathangey)
5address1high[5] header is sample data; values are street addresses (Bodhicariya, Maitree Nagar; ashok nagar; a-504,plot no -23)
6address2high[6] header is sample data; values are secondary address components (PO:Khandagiri, Bilandpur, 110colleg road)
7cityhigh[7] header 'Kolkata' is sample data; values are city names (chennai, Bhubaneswar, navi mumbai, Scottsdale, Gorakhpur)
8statehigh[8] header 'West Bengal' is sample data; values are state/province codes and names (Tamilnadu, Odisha, Maharashtra, AZ, UP)
9countryhigh[9] header 'India' is sample data; values are country names/codes (IN, India, United States of America)
10ziphigh[10] header '700135' is sample data; values are postal codes (600008, 751030, 400706, 85259, 273001)
11phonehigh[11] header '9038432966' is sample data; values are 10-digit Indian mobile numbers (9940074243, 9937563105, 9619394141)
12phonehigh[12] header is sample data; values are phone numbers (9436121525), appears to be alternate/secondary phone field

Notes: File contains Indian travel platform user records. Column [0] (numeric user IDs like 3991230) is skipped as internal identifier. All columns except [0] map to PII fields. Context confirms Indian addresses, Indian phone numbering, and Indian email providers.

28.csv
11 columns · 99,455 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] All values contain @ symbol and match email format (gmail.com, yahoo.com, yahoo.co.in domains)
2suffixhigh[2] Header 'Mr' and all sample values are salutations/titles (Mr)
3firstNamehigh[3] Values are common given names (sahil, ajay, Nitesh, Sajad, Tarun)
4lastNamehigh[4] Values are surnames (singla, jain, Mishra, bhat, Chanana)
5address1high[5] Street addresses with building/plot numbers (466 sector 31, 357 tagan street, samanvay nagar, JG1-105 B, Yatra.Com office)
6address2high[6] Neighborhood/area names (Vikas Puri, ameerpet)
7cityhigh[7] Indian city names (gurgaon, khatauli, bhopal, new delhi)
8statehigh[8] Indian state/territory names (haryana, Uttar Pradesh, Madhya Pradesh, delhi)
9countryhigh[9] Country codes (IN, IND)
10ziphigh[10] Indian PIN codes (122001, 251201, 462023, 110018)
11phonehigh[11] Indian mobile phone numbers (10-digit format: 9819996726, 9811667166, 9412711594, etc.)

Notes: 13 columns total, 11 contain PII. Column [0] is numeric user ID (skip). Column [12] is empty (skip). All addresses are Indian, consistent with Yatra.com breach context. Salutation field mapped as suffix.

29.csv
12 columns · 95,992 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is an email address; all sample values contain @ signs and are valid email addresses
2suffixhigh[2] values are 'Mr', 'Mr.', 'Dr' — salutation/title suffixes
3firstNamehigh[3] header 'ARUNDABIR'; values are given names (Sadanand, PEEYUSH, SAMI, ASEM, Pradipta)
4lastNamehigh[4] values are surnames (Gharat, SAJJAD, SUNILKUMARSINGH, taj, KDas)
5address1high[5] street/building-level address component (Growel House, SAI SECURITIES, A-162/32, Hostel No.2)
6address2high[6] secondary address component (Dapodi, SIMALGAIR BAZAAR, Nandalaya,Subachani Road,Tinsukia,Assam)
7cityhigh[7] values are Indian city names (Pune, DELHI, THOUBAL, Silchar)
8statehigh[8] values are Indian state names (Delhi, Manipur, Maharashtra, Madhya Pradesh)
9countryhigh[9] values are 'India' and 'IND' — country designator
10ziphigh[10] values are 6-digit Indian postal codes (110053, 795138, 400706, 452001)
11phonehigh[11] values are 10-digit Indian mobile phone numbers
12phonehigh[12] alternate phone numbers; 10-digit Indian mobile format (9953262896, 9004795001)

Notes: Yatra 2019 breach. 13 columns total, 12 contain PII. Column [0] is numeric user_id (skipped). Addresses span Indian cities/states with postal codes. Phone numbers are Indian mobile format. Breach context confirms Indian travel booking platform data.

3.csv
13 columns · 99,171 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] numeric user IDs (3116178, 3116179, etc.), internal identifier pattern
1emailhigh[1] values contain @ signs, email addresses ([email protected], [email protected], etc.)
2suffixhigh[2] salutation/title values (Mr), maps to suffix field
3firstNamehigh[3] given names (Pardeep, thangjam, Rohit, MOHAN, VelluMadhom)
4lastNamehigh[4] family names (Khatri, bishwarjitsingh, Sharma, KUMAR, Rajen)
5address1high[5] street addresses (V. % P. O. Ismaila Haryana 9Beswa, c-43 Ganga CHS, PCDA(NC)JAMMU)
6address2high[6] secondary address/locality lines (Sane Guruji Nagar Mulund E), often empty
7cityhigh[7] Indian city names (Distt. Rohtak, Mumbai, JAMMU)
8statehigh[8] Indian states (Haryana, Maharastra, Jammu and Kashmir)
9countryhigh[9] country codes and names (India, IN, IND)
10ziphigh[10] Indian PIN codes (124517, 400081, 180003)
11phonehigh[11] 10-digit Indian mobile numbers (9416206295, 9716555414, 9780323332, etc.)
12phonehigh[12] alternate 10-digit Indian phone numbers (9469000923), often empty

Notes: Yatra travel booking platform breach (2019). 13 columns total, 12 contain PII. No header row present in data. All addresses are Indian, phone numbers follow Indian format (10 digits starting with 7-9), email providers include Indian domains (yahoo.co.in, rediffmail.com). Column 0 is user_id (skip). Columns 5-6 are multi-part address. Columns 11-12 are primary and alternate phone numbers.

30.csv
9 columns · 97,592 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and email domains (.yahoo.co.in, @gmail.com, @yahoo.com), clearly email addresses
2suffixhigh[2] header empty but values are 'Mrs', 'Mr' — salutation/title suffixes
3fullNamehigh[3] values are full names like 'Anju Puri', 'jayantha sanjeeva shetty', 'nirmala saagar' — full name field
4lastNamehigh[4] values are surnames: 'Shelat', 'Balram', 'Verma' — last name field
5address1high[5] values are street addresses like 'pari house, bunglow no.:4' and 'No,1 Nolambur main road'
6address2medium[6] values include apartment/suite details like 'A/2-C/4 golden fortune mogappair w' — secondary address
7cityhigh[7] header empty but values are Indian cities: 'mumbai', 'chennai', 'Ahemdabad', 'BANGALORE', 'delhi'
11phonehigh[11] values are 10-digit Indian mobile phone numbers like '9870593838', '9930240545', '9980673487'
12phonehigh[12] values are 10-digit Indian phone numbers like '8826749428' — alternate/secondary phone

Notes: 13 columns total. Yatra 2019 breach — Indian travel booking platform. Columns [0], [8], [9], [10] are empty or non-PII (internal IDs, flags) and excluded. Columns [5] and [6] both map to address fields per Indian address structure (street + apt/suite). No state, zip, country, DOB, or SSN fields present in this file sample.

31.csv
12 columns · 99,256 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbols and match email format (gmail.com, yahoo.com, yahoo.co.in)
2suffixhigh[2] header 'Mr', values are salutations/titles
3firstNamehigh[3] values are common given names (Ashraf, Vishu, ANIL, Sandeep, manoj, RAKESH)
4lastNamehigh[4] values are surnames (Shaikh, Kapoor, SURI, Pradhan, singh, MANJUNATH)
5address1high[5] values are street addresses (139 Green Avenue, 22,1st Floor,Sena Vihar, sultanpuri, #58, plot no 287 rawatpur)
6address2medium[6] column contains only spaces or empty values, consistent with optional address2 field
7cityhigh[7] values are Indian cities (Amritsar, Bangalore, delhi, BANGALORE, kanpur)
8statehigh[8] values are Indian states (Punjab, Karnataka, Delhi, Uttar Pradesh)
9countryhigh[9] all values are 'IND' (India country code)
10ziphigh[10] values are 6-digit Indian PIN codes (143001, 560043, 110086, 560078, 208019)
11phonehigh[11] values are 10-digit Indian mobile numbers (9699925818, 7893414806, 9845062142, 9096333326, 8604092136)
12phonehigh[12] values are 10-digit Indian mobile numbers, alternate/second phone number

Notes: 13 columns total, 12 contain PII. Column [0] is user_id (numeric identifier, skipped). Breach context confirms Indian travel booking platform with Indian addresses, phone numbers, and email providers. Address fields other than address2 (column [6], empty or spaces only) are populated with valid Indian data.

32.csv
12 columns · 99,407 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbol and match email format ([email protected], [email protected], etc.)
2suffixhigh[2] values are salutations/titles ('Mr.', 'Mr')
3firstNamehigh[3] values are given names (ganesh, kamal, Anisha, Parikshit, venkat, PRAKASH)
4lastNamemedium[4] values appear to be surnames (krishnan, gulati, Parse, rangan, PATIL) though some entries are unclear or may be abbreviations
5address1high[5] values contain street addresses (c-15 orange block orchrad, SRI RAM NAGAR COLONY, etc.) or are empty/NA
6address2medium[6] secondary address field, mostly empty or 'NA', consistent with optional address2 column
7cityhigh[7] values are Indian city names (chennai, BAGALKOT, RAICHUR)
8statehigh[8] values are Indian states (Tamil Nadu, Karnataka)
9countryhigh[9] values are 'IND' or 'India', indicating country field
10ziphigh[10] values are 6-digit Indian postal codes (600026, 587101, 534001)
11phonehigh[11] values are 10-digit Indian mobile numbers (9160126226, 9810484932, 9920079979, etc.)
12phonehigh[12] secondary phone field with 10-digit Indian mobile numbers, consistent with alternate phone number column

Notes: Yatra-2019 breach file: Indian travel booking platform. Column [0] is numeric user ID (skip). All address, phone, and personal identifiers are Indian. Two phone columns present ([11] and [12]) mapping to primary and alternate phone numbers.

33.csv
13 columns · 99,276 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] Sequential numeric IDs (4484483, 4484484, etc.) — internal user_id identifiers
1emailhigh[1] Values contain @ signs, standard email format ([email protected], [email protected], etc.)
2suffixhigh[2] Salutation/title values: 'Mr', 'Ms' — standard name suffix field
3firstNamehigh[3] Given names: 'mohd', 'HANIF', 'Rajat', 'hari', 'Rajesh', 'pooja' — first name component
4lastNamehigh[4] Family names: 'idrees', 'CHHATRIWALA', 'Singh', 'babu', 'Yadav', 'hagawane' — last name component
5address1high[5] Street addresses: 'SHANTI NIKETAN SCTY', 'Chandni Agar, Sangam Naga' — primary address line
6address2medium[6] Column appears empty in sample but positioned as secondary address field
7cityhigh[7] City names: 'MUMBAI', 'pune', 'Mumbai' — Indian cities
8statehigh[8] State values: 'Maharashtra', 'Maharashtra' — Indian state codes/names
9countryhigh[9] Country codes: 'IND', 'India' — standardized to India
10ziphigh[10] Indian postal codes: '400061', '411043', '400037' — PIN code format (6 digits)
11phonehigh[11] 10-digit Indian mobile numbers: '9997579189', '9768378694', '9999012953' — primary phone
12phonehigh[12] 10-digit Indian mobile numbers: '9768378694', '9604420117', '9619875434' — alternate/secondary phone

Notes: Yatra.com 2019 breach. 13 columns, 11 contain PII. Full Indian user records including contact details, address, and phone numbers. Two phone columns ([11] and [12]) both map to phone field as they represent primary and alternate contact numbers.

34.csv
11 columns · 96,873 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is an email address; all values contain @ and follow email format
2suffixhigh[2] header 'Mr', values are salutations/titles (Mr, Mrs)
3firstNamehigh[3] header 'sunil', values are common given names (anupma, vicky, shubham)
4lastNamehigh[4] header 'kumar', values are common surnames (bhatnagar, rajak, arora)
5address1high[5] header 'hno1b/49bstcollny', values are street addresses (hno patterns, 'a 1604 3rd floor', 'VIVEKANAND')
7cityhigh[7] header 'ganaur', values are Indian city names (delhi, sonepat, SULTANPUR, Bangalore)
8statehigh[8] header 'Haryana', values are Indian state names (Delhi, Haryana, Uttar Pradesh)
9countryhigh[9] header 'IND', values are country codes/names (IND, India)
10ziphigh[10] header '131101', values are 6-digit Indian postal codes (110033, 131001, 228001)
11phonehigh[11] header '9812860304', values are 10-digit Indian mobile numbers
12phonehigh[12] header '9812860304', values are 10-digit Indian mobile numbers (alternate phone)

Notes: 13 columns total, 11 contain PII. Column [0] is numeric user ID (skip). Column [6] is empty (skip). Yatra-2019 Indian travel booking breach; all addresses, phone numbers, and email providers confirm Indian origin.

35.csv
12 columns · 99,230 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, match email format (gmail.com, rediffmail.com, bheltry.co.in)
2suffixhigh[2] values are salutations/titles: 'Mr', 'Mrs', 'Mr.'
3firstNamehigh[3] values are given names: 'shikher', 'priyanka', 'lalit', 'ashish', 'KARUNANIDHY'
4lastNamehigh[4] values are family names: 'verma', 'sharma', 'luthra', 'modi', 'S'
5address1high[5] values are street addresses: 'Mayur Vihar-III', 'kalander chowk', '19 meghna arcade', 'EZHIL NAGAR'
6phonehigh[6] values are 10-digit Indian mobile numbers: '9898067155'
7cityhigh[7] values are Indian city names: 'Delhi', 'ahmendnagar', 'panipat', 'TRICHY', 'Ahmedabad'
8statehigh[8] values are Indian states: 'Delhi', 'Maharashtra', 'Haryana', 'Tamil Nadu', 'Gujarat'
9countryhigh[9] all values are 'IND' (India country code)
10ziphigh[10] values are Indian postal codes: '110096', '414003', '132103', '620014', '380008'
11phonehigh[11] values are 10-digit Indian mobile numbers: '9212201486', '8446688741', '9812020026'
12phonehigh[12] alternate phone column, values are 10-digit Indian mobile numbers matching [11] pattern

Notes: Yatra 2019 breach (Indian travel booking platform). Column [0] contains user IDs (skip). Columns [11] and [12] appear to be primary and alternate phone numbers — both mapped as phone. File lacks header row (hasHeader: false). All records contain Indian addresses, phone numbers, and email providers consistent with breach context.

36.csv
11 columns · 96,059 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is an email address; all sample values contain @ and are valid email addresses
2suffixhigh[2] header 'Mr' and all values are 'Mr', 'Mr.' — salutation/title field
3firstNamehigh[3] header 'jothi' and sample values 'Namboodiri', 'abhishek', 'Hunijandi', 'bhupendra Singh', 'Mahendra' are given names; context confirms first name column
4lastNamehigh[4] header 'lakshmi' and sample values 'AGV', 'garg', 'Anthena', 'ranawat', 'Punmia' are family names; positioned after firstName
5address1high[5] sample values '3-13-102/1, Madhuranagar', '1695 / room no 5,', 'Yatra Office', '118-H,Thiru flats,lakshmi' are street addresses; breach context confirms physical addresses included
7cityhigh[7] sample values 'HYDERABAD', 'Gurgaon', 'Gurgaon', 'chennai' are Indian city names
8statehigh[8] sample values 'Andhra Pradesh', 'Haryana', 'Haryana', 'Tamil Nadu' are Indian state names
9countryhigh[9] sample values 'IND', 'India', 'India', 'IND', 'India' are country indicators
10ziphigh[10] sample values '500013', '122002', '122003', '600116' are 6-digit Indian postal codes (PIN codes)
11phonehigh[11] sample values '9094864296', '8886000022', '9953535282', '8009123', '7428814684', '9860203638' are 10-digit Indian mobile numbers
12phonehigh[12] sample values '8886000022', '7428814684', '9445191807' are 10-digit Indian phone numbers; alternate/secondary phone column

Notes: 13 columns total; 11 contain PII (email, suffix, firstName, lastName, address1, city, state, country, zip, phone, alternate phone). Column [0] is numeric user_id (skipped). Column [6] is empty (skipped). Breach confirmed as Yatra 2019 with Indian user records containing contact and address information.

37.csv
6 columns · 99,377 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] All values contain @ symbol and match email format (yahoo.com, gmail.com domains)
2suffixhigh[2] Values are 'Mr', 'Mr.' — salutation/title prefix
3firstNamehigh[3] Values are common Indian given names (jinu, cijoy, Sangeetha, Pramod, karthik)
4lastNamehigh[4] Values are surnames (garg, varghese, Kumar, mudaliar)
9countryhigh[9] Value is 'India'
11phonehigh[11] All values are 10-digit Indian mobile numbers (8893212763, 9894939142, etc.)

Notes: File appears to be headerless CSV from Yatra 2019 breach. Columns [0], [5], [6], [7], [8], [10], [12] contain sparse/empty data and are treated as skip. Column [0] appears to be user_id (skip). Columns [5], [6] appear to be incomplete address/zip fragments. Remaining empty columns skipped. Total 13 columns, 6 contain PII.

38.csv
12 columns · 96,183 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header missing but values are email addresses with @ signs ([email protected], [email protected], [email protected])
2suffixhigh[2] values are salutations/titles (Mr, Mr.) matching breach description of 'salutation/title'
3firstNamehigh[3] values are common given names (Jasmail, Christy, Anurag, Satyajitkumar, bhavesh)
4lastNamehigh[4] values are family names (Sidhu, Fernandez, Singh, kala)
5address1high[5] street/house addresses (ST NO 2 SEC NO 13, A-68, mokram manzil, b.b ganj)
6address2high[6] secondary address lines (South Extension Part 1) matching address2 pattern
7cityhigh[7] Indian city names (Malout, Gurgoan, cuttack, muzaffarpur)
8statehigh[8] Indian state abbreviations/names (Punjab, Orissa, Bihar)
9countryhigh[9] country code/name (IND, India)
10ziphigh[10] Indian PIN codes (152107, 753001, 842001) matching postal_code pattern
11phonehigh[11] Indian mobile numbers (9317766144, 9711062778, 9431218870, 10-digit format)
12phonehigh[12] alternate/secondary phone number (9317766144, duplicate pattern)

Notes: 13 columns total, 12 contain PII. Column [0] contains numeric user IDs (4804981, etc.) and mixed junk data, and is treated as skip. Breach context confirms Indian travel platform with user account records including addresses, phones, names matching observed data.

39.csv
11 columns · 99,330 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, email addresses from common providers (gmail.com, outlook.com, yahoo.com)
2suffixhigh[2] header value 'Dr.' and sample values 'Mr', 'Mr.', 'Mr' are common name suffixes/salutations
3firstNamehigh[3] header 'Tarundeep' with samples 'KISHOR', 'jigyasa', 'ABDUL', 'sudesh', 'yogesh' are given names
4lastNamehigh[4] header 'kaur' with samples 'KUNAL', 'srivastava', 'KF', 'subba', 'bhanushali' are surnames
5address1high[5] values are street addresses (e.g., '1548,pushpac complex,sector-49-b,chandigarh', 'RAHAM KHAN', 'murmah tea estate')
7cityhigh[7] values are Indian city names: 'chandigarh', 'DARBHANGA', 'mirik'
8statehigh[8] values are Indian states: 'Chandigarh', 'Bihar', 'West Bengal'
9countryhigh[9] values are 'India' or 'IND', country identifier
10ziphigh[10] values are 6-digit Indian postal codes: '160047', '846004', '734214'
11phonehigh[11] values are 10-digit Indian mobile numbers: '9872876818', '9716688666', '7838594981'
12phonehigh[12] values are 10-digit Indian mobile numbers, alternate/secondary phone field

Notes: Yatra.com travel booking platform breach. Column [0] is numeric user ID (skipped). Column [6] is empty/blank (skipped). All PII fields identified: email, name components (suffix, first, last), full address (street, city, state, country, ZIP), and dual phone numbers.

4.csv
12 columns · 99,017 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is an email; all values contain @ and are valid email addresses
2suffixhigh[2] header 'Mr'; all values are salutations/titles (Mr)
3firstNamehigh[3] values are common given names (ujjwal, vivek, prashant, priyanka, Sunil)
4lastNamehigh[4] values are surnames (gupta, tiwari, sangwan, vardhan, Sajnani)
5address1high[5] values are street addresses (168,A.P.R.,Colony Katanga, 772,main market, etc.)
6address2medium[6] values appear to be secondary address components (sec 27d, Brook Road Gadag Road, Nagar)
7cityhigh[7] values are Indian city names (Jabalpur, katni, chandigarh, Hubli, Ajmer)
8statehigh[8] values are Indian states (Madhya Pradesh, Chandigarh, Karnataka, Rajasthan)
9countryhigh[9] all values are 'IND' (India country code)
10ziphigh[10] values are Indian PIN codes (482001, 483770, 160019, 580020, 305001)
11phonehigh[11] header '9811348731'; all values are 10-digit Indian mobile numbers
12phonehigh[12] values are 10-digit phone numbers (alternate/secondary phone number)

Notes: 13 columns total, 12 contain PII. Column [0] is a numeric user ID (skip). Breach context confirms Indian travel platform with Indian addresses, phone numbers, and email providers. Address broken into components: street (address1), secondary (address2), city, state, country, postal code. Two phone columns: primary [11] and alternate [12].

40.csv
11 columns · 99,299 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and match email format ([email protected], [email protected], etc.)
2suffixhigh[2] values are titles/salutations (Mr, Mrs)
3firstNamehigh[3] header position and values are common given names (Santosh, Prashant, deepa, harmohan, NAMAN, Daya)
4lastNamehigh[4] header position and values are surnames (Gupta, Sharma, kabra, bhatia, KAUSHIK, Nand)
5address1high[5] values are street addresses (champasari more, Sangam Vihar New Delhi, bougen villa aundh pune, h no 548 st no 5 guru nan, 31-SHIVAJI ENCLAVE)
7cityhigh[7] values are Indian cities (Siliguri, New Delhi, pune, patiala, LUCKNOW)
8statehigh[8] values are Indian states (West Bengal, Delhi, Maharashtra, Punjab, Uttar Pradesh)
9countryhigh[9] values are country codes/names (IND, India)
10ziphigh[10] values are Indian postal codes (734001, 110062, 411007, 147001, 226016)
11phonehigh[11] values are 10-digit Indian mobile numbers (9832043310, 9968238272, 9552570365, 9569585244, 9674304005)
12phonehigh[12] values are 10-digit Indian phone numbers, appears to be alternate phone field (9832043310, 29913128, 9552570365, 9569585244, 9674304005)

Notes: Yatra 2019 breach of Indian travel booking platform. Column [0] is numeric user ID (skip). Column [6] is empty/blank (skip). Columns [1], [11], [12] represent email and two phone number fields. All addresses confirmed as Indian with Indian postal codes and states.

41.csv
12 columns · 99,313 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbols and match email format (gmail.com, rediffmail.com, yahoo.com)
2suffixhigh[2] header 'Mr', values are salutations/titles (Mr, Mrs.)
3firstNamehigh[3] values are common given names (zeeshana, Manas, SUKHENDU, James, Shiv)
4lastNamehigh[4] values are surnames (mushtaq, Bhattacharjee, CHANDA, Dillon, Mani)
5address1high[5] values are street addresses (New Rausa Patna Colony, kulgachia, K-509 Near Mata)
6cityhigh[6] values are Indian city names (howrah, Cuttack, kolkata, Delhi)
7cityhigh[7] values are Indian city names (Cuttack, kolkata, Delhi) — duplicate city field or city variant
8statehigh[8] values are Indian state names (Orissa, Delhi)
9countryhigh[9] values are country codes (IND = India)
10ziphigh[10] values are Indian postal codes/PIN codes (753001, 110037)
11phonehigh[11] values are 10-digit Indian mobile numbers (9906816378, 9777171714, 9477032166)
12phonehigh[12] values are 10-digit Indian phone numbers, likely alternate/secondary phone

Notes: Yatra.com travel booking platform breach. Column [0] is numeric user ID (skip). Columns [6] and [7] both contain city data — may represent primary/alternate city or data duplication. All phone numbers are Indian format. Addresses, cities, states, and postal codes are consistent with Indian geography. Column [2] contains salutation/title data mapped as suffix field.

42.csv
10 columns · 99,346 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] All values contain @ signs and are email addresses (yahoo.in, gmail.com, rediffmail.com)
2suffixhigh[2] Header 'Mr.' and values are salutations (Mr., Mr)
3firstNamehigh[3] Values are common given names (siddhant, VINEETA, Chetan, dicky, sayyed, BRAJESWAR)
4lastNamehigh[4] Values are surnames (surana, SINGH, Kawatia, hora, khalid, MISHRA)
5address1high[5] Sample contains 'Yatra.Com 1101-03', typical street/building address format
7cityhigh[7] Values are Indian city names (kolkata, Gurgoan, mumbai)
8statehigh[8] Values are Indian state names (West Bengal, Maharashtra)
9countryhigh[9] All values are 'India'
10ziphigh[10] Sample '400089' is an Indian PIN code format
11phonehigh[11] All values are 10-digit Indian mobile numbers starting with 9

Notes: 13 columns total. Column [0] is numeric user ID (skip). Columns [6] and [12] are empty (skip). Breach context confirms Indian travel platform with Indian addresses, phone numbers, and email providers.

43.csv
11 columns · 99,397 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, clear email addresses from Indian and international providers (rediffmail.com, gmail.com, yahoo.com)
2suffixhigh[2] header 'Mr.', values are salutations/titles (Mr., Mr)
3firstNamehigh[3] values are given names (Rajneesh, HIMALAYA, akansha, Mapuia, NAVINRAMJI)
4lastNamehigh[4] values are family names (Bhimte, GAMOT, bilolikar, pant, Hmar, PARMAR)
5address1high[5] values are street addresses (Ward no 01 banjar coloney, 208 center plaza, 1/17 vivekanand path)
7cityhigh[7] values are Indian city names (balaghat, mumbai, patna, amravati)
8statehigh[8] values are Indian states (Madhya Pradesh, Maharashtra, Bihar)
9countryhigh[9] values are country identifiers (India, IND)
10ziphigh[10] values are 6-digit Indian PIN codes (481111, 400097, 800013, 444809)
11phonehigh[11] values are 10-digit Indian mobile numbers (9406766015, 9833406888, 9422165920)
12phonehigh[12] values are 10-digit Indian phone numbers, alternate/second phone field (9798423128, 9404339108)

Notes: 13 columns total. Column [0] is numeric user ID (skip). Column [6] is empty (skip). Yatra-2019 Indian travel booking breach with complete user profiles including names, emails, addresses, and phone numbers.

44.csv
12 columns · 99,358 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and match email format ([email protected], [email protected], etc.)
2suffixhigh[2] header 'Mr', values are salutations/titles (Mr, Mr, Mr)
3firstNamehigh[3] values are common given names (MURTAZA, DHRUVA, Ankan, vinoth, madhu)
4lastNamehigh[4] values are surnames (TRAVADI, REDDY, Sarma, kumar, venkatesh)
5address1high[5] values are street addresses (Raj paints, 17-1-478/19 krishna nagar, Lakhinagar, no 93 otthavadai st)
6address2medium[6] mostly empty/spaces, secondary address line field
7cityhigh[7] values are Indian city names (porbandar, hyderabad, Guwahati, arakkonam, hyd)
8statehigh[8] values are Indian states (Gujarat, Andhra Pradesh, Assam, Tamil Nadu)
9countryhigh[9] values are 'IND', country code for India
10ziphigh[10] values are 6-digit Indian postal codes (360575, 500059, 781005, 631002, 534275)
11phonehigh[11] values are 10-digit Indian mobile numbers (9825281392, 9789985126, 9706217852, 9196771286)
12phonehigh[12] values are 10-digit Indian mobile numbers, alternate/secondary phone number

Notes: Yatra-2019 breach. 13 columns total. Column [0] is numeric user_id (skipped). Columns [11] and [12] both contain phone numbers — likely primary and secondary/alternate mobile. All addresses, names, phones, emails, and postal codes are Indian in origin, consistent with Yatra.com platform.

45.csv
12 columns · 99,364 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, clearly email addresses (gmail.com, rediffmail.com pattern)
2suffixhigh[2] header 'Mr.', values are salutations/titles (Mr, Mr.)
3firstNamehigh[3] values are given names (SUBHASISH, DEVENDRA, Chandan, Tapi, senthil, Ankit)
4lastNamehigh[4] values are family names (SAHA, WALDE, Dheeraj, Nalo, kumar, Murarka)
5address1high[5] values are street addresses and apartment numbers (202,195/34, A6/203, azhuvalappil)
6address2high[6] values are secondary address components (Shrinath Soceity, thalakap — building/locality names)
7cityhigh[7] values are Indian city names (Tiruchirappalli, lucknow, Pune, kottakkal)
8statehigh[8] values are Indian states (Tamilnadu, Uttar Pradesh, Kerala)
9countryhigh[9] values are country indicators (India, IND)
10ziphigh[10] values are 6-digit Indian PIN codes (620024, 226018, 676503)
11phonehigh[11] values are 10-digit Indian mobile numbers (9994593516, 9766576353, 8130808225)
12phonehigh[12] values are 10-digit Indian phone numbers (8081863718, 4832706593) — alternate/secondary phone

Notes: Yatra.com 2019 breach — Indian travel platform. Column [0] is numeric user_id (skipped). Complete address structure present: address1, address2, city, state, country, zip. Two phone columns capture mobile and alternate numbers. Suffix field captures Mr./Ms. salutations common in Indian records.

46.csv
12 columns · 99,358 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] All values contain @ signs, clearly email addresses from various providers (gmail.com, yahoo.co.in, rediffmail.com, etc.)
2suffixhigh[2] Values are Mr., Mrs, Ms, consistent with salutation/title field
3firstNamehigh[3] Values are common given names (Sujit, ANKIT, Mamta, venu, INZAMAMUL, etc.)
4lastNamehigh[4] Values are surnames (Jaiswal, GUPTA, Jha, dorna, HOQUE, etc.)
5address1high[5] Values are street addresses (Room No:344 Narmada Hostel, H.N.66 CHRIAG DELHI, flat.no.102 reddy apartment, etc.)
6address2medium[6] Sparse column, appears to be secondary address/apartment info, mostly empty
7cityhigh[7] Values are Indian city names (Dharwad, NEW DELHI, secunderabad, Chennai, Patna, Darbhanga, Delhi, etc.)
8statehigh[8] Values are Indian states (Karnataka, Delhi, Andhra Pradesh, Bihar, Tamil Nadu, Maharashtra, Uttar Pradesh, etc.)
9countryhigh[9] Values are consistently India or IND
10ziphigh[10] Values are 6-digit Indian PIN codes (580002, 110017, 500003, 600117, 800004, 846001, 110092, 411019, 522626, 452008, 209724, 110009, 424108, etc.)
11phonehigh[11] Values are 10-digit Indian mobile numbers (8123472948, 875062562, 7738560101, 9959224411, 9775890989, 9474283010, 8977385909, 9900968850, etc.)
12phonehigh[12] Alternate/secondary phone field, also 10-digit Indian mobile numbers where populated

Notes: Yatra-2019 breach data. 13 columns total (0-12), 12 contain PII. Column 0 is user_id (skip). Records are Indian travel booking platform user accounts with complete personal information including addresses, phone numbers, email addresses, names, and location data.

47.csv
10 columns99,392 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbol and match email patterns ([email protected], [email protected], etc.)
3suffixhigh[3] values are salutations/titles (Mr., Mrs.) indicating suffix field
4firstNamehigh[4] values are common given names (Nikhil, Varun, parshuram, Apratim)
5lastNamehigh[5] values are family names (Gupta, Sharma, Yadavalli, yadav, Agrawal)
6address1high[6] values are street addresses and building identifiers (A-3, 9930919568 appears to be phone in this position, Viman Nagar, 632/308 Sanatan Nagar)
7address1high[7] continuation of address data (Bhushan Chs - apartment/building name)
8address1high[8] street/road names (V.P Road, Pune, Lucknow - mixed address components)
9address2high[9] landmark/area descriptors (Near Andhra Bank, Maharashtra, Uttar Pradesh)
10countryhigh[10] values are 'India' and 'IND' indicating country field
11ziphigh[11] values are postal codes (421201, 411014, 226028) and city-PIN combinations (Dombivli(East)-421201)

Notes: File has no header row. Column [0] contains numeric user IDs (customer_id) - treated as skip. Column [2] contains city names but mostly empty in sample - appears to be city field but sparse. Columns [6-11] contain fragmentary address data distributed across multiple fields (typical of parsed address storage). Indian context confirmed by address patterns, PIN codes, and city names (Thane, Dombivli, Pune, Lucknow). Total 12 columns, 11 contain PII.

48.csv
11 columns99,235 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, standard email format (gmail.com, yahoo.co.in, rediffmail patterns)
2suffixhigh[2] values are 'Mr', 'Mr.', 'Mr' — salutation/title field
3firstNamehigh[3] values are given names: Ramakrishnarao, ZHANG, govind, Shanmuga, JITENDRA, pintu
4lastNamehigh[4] values are family names: Achanta, YUANCHAO, singh, Chitipiralla, VYAS, kumar
5address1high[5] values are street addresses: 'PLOT NO.176, WARD 12-C', 'model house ldh', 'singjamei thongam leikai'
7cityhigh[7] values are Indian city names: delhi, GANDHIDHAM, ludhiana, hoshangabad, Imphal
8statehigh[8] values are Indian states: Delhi, Gujarat, Punjab, Madhya Pradesh, Manipur
9countryhigh[9] values are 'IND', 'India' — country codes/names
10ziphigh[10] values are Indian PIN codes: 110044, 370201, 141003, 461001, 795008 (6-digit postal codes)
11phonehigh[11] values are 10-digit Indian mobile numbers: 9448066554, 9597227977, 8765730865, 9490843701, 9898038586
12phonehigh[12] alternate phone numbers, same format as [11]: 8765730865, 9560759461

Notes: 13 columns total. Column [0] is numeric user_id (skipped). Column [6] is entirely empty (skipped). Breach context confirms Indian travel platform with Indian addresses, phone numbers, email providers, and cities. All PII columns identified and mapped.

49.csv
10 columns99,300 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] Values contain @ signs and match email format ([email protected], [email protected], etc.)
2suffixhigh[2] Values are salutations/titles: 'Ms', 'Mr.', 'Mr'
3firstNamehigh[3] Values are given names: 'Lisa', 'k james', 'ravi', 'XAVIER', 'Rajasridhar', 'deepi'
4lastNamehigh[4] Values are family names: 'Stenmark', 'Mathai', 'mallian', 'SHERMEILA', 'Lankapalli', 'kaur'
5address1high[5] Values are street addresses: '38 sector A , Ambedkar Colony Govindp...'
7cityhigh[7] Values are city names: 'Bhopal'
8statehigh[8] Values are Indian states: 'Madhya Pradesh'
9countryhigh[9] Values are 'India'
10ziphigh[10] Values are Indian PIN codes: '462023'
11phonehigh[11] Values are 10-digit Indian mobile numbers: '9815825646', '9826361390', etc.

Notes: Yatra.com travel booking breach. Column [0] is numeric user ID (skip). Columns [6] and [12] are empty (skip). Ten PII columns mapped: [1] to [5] and [7] to [11].

5.csv
12 columns99,271 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, match email format with domains (yahoo.com, gmail.com, yahoo.co.in, yahoo.in)
2suffixhigh[2] header values are titles: 'Mr', 'Mrs.', 'Ms' — salutations/suffixes
3firstNamehigh[3] values are given names (vivek, krushna, SENTHIL, SK, preethy, Harpreetsingh)
4lastNamehigh[4] values are family names (mahajan, sahoo, MURGAN, Joshi, gopaal, Punjabi)
5address1high[5] values are street addresses (railwayrod, IOCL, suryasoma, Heliconia,Magarpatta city,Hadapsar)
6address2medium[6] secondary address component, sparse values (thirunageswaram, JAMMU)
7cityhigh[7] values are Indian city names (kumbakonam, JAMMU, munnar, Pune)
8statehigh[8] values are Indian states (Karnataka, Tamilnadu, Jammu and Kashmir, Kerala, Maharashtra)
9countryhigh[9] values are country codes/names (India, IND, IN)
10ziphigh[10] values are 6-digit Indian PIN codes (612204, 180015, 685612, 411028)
11phonehigh[11] values are 10-digit Indian mobile numbers (8085660666, 9611672288, 8233586063, 9426422681, 9086088447, 9447004303)
12phonehigh[12] alternate phone numbers, 10-digit format (9571003885, 1912460038) — secondary phone field

Notes: Yatra-2019 breach, Indian travel booking platform. Column [0] is numeric user ID (skip). All 13 columns present; 12 contain PII. Data is entirely Indian: addresses, phone numbers, email providers (yahoo.co.in, rediffmail.com), cities, and states confirm Indian origin.

50.csv
11 columns99,367 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, clear email addresses (gmail.com, yahoo.co.in, hotmail.com)
2suffixhigh[2] header 'Mr', values are salutations (Mr, Mr.)
3firstNamehigh[3] values are given names (julius, Babyrani, jaydeepnarayan, rahul, Ganesh)
4lastNamehigh[4] values are family names (amstrong, Arambam, srivastava, dey, Sahoo)
5address1high[5] values contain street addresses and building numbers (107classic business center, Yatra.Com 1101-03, #503)
6address2medium[6] values appear to be secondary address components (4th Main) or building details
7cityhigh[7] values are Indian cities (bengaluru, Gurgoan)
8statehigh[8] values are Indian states (Karnataka)
9countryhigh[9] values indicate country (India)
10ziphigh[10] values are 6-digit Indian postal codes (560001, 560019)
11phonehigh[11] values are 10-digit Indian mobile numbers (8281255381, 8750788331, 9835021326)

Notes: 13 columns total, 11 contain PII. Column [0] is numeric user_id (skip). Column [12] is empty (skip). Breach context confirms Indian travel booking platform with Indian user data.

51.csv
11 columns99,387 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbols, standard email format with Indian domains (gmail.com, yahoo.com)
2suffixhigh[2] values are 'Mr', 'Mr.' — salutation/title prefix
3firstNamehigh[3] header pattern and values are common given names (ganesh, vijay, sreenivasapavankumar, Rushik, KNIRMALA, RS)
4lastNamehigh[4] values are surnames (pillai, bhushan, surey, Patel, NIRMALA, Sharma)
5address1high[5] street addresses with house numbers and locality names (5/89 A PRIYABHAVAN, House No.3977)
7cityhigh[7] values are Indian city names (CHENNAI, Rewari, RAJAHMUNDRY, Gurgoan)
8statehigh[8] values are Indian state/province names (Tamil Nadu, Haryana, Andhra Pradesh, Delhi)
9countryhigh[9] values are 'IND' and 'India' — country code/name
10ziphigh[10] 6-digit postal codes matching Indian PIN code format (629704, 123401, 533101, 110043)
11phonehigh[11] 10-digit numbers matching Indian mobile phone format (9469732301, 9355763377, 9848156621)
12phonehigh[12] 10-digit numbers matching Indian mobile phone format — alternate/secondary phone number

Notes: Yatra.com 2019 breach — Indian travel booking platform. Column [0] is user_id (skipped). Column [6] is empty (skipped). All phone numbers are Indian mobile format (10 digits starting with 9). Addresses and location data all Indian. Two phone columns ([11] and [12]) represent primary and alternate contact numbers.

52.csv
11 columns96,890 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and are valid email addresses (yahoo.co.in, gmail.com, hotmail.com)
2suffixhigh[2] header empty but values are 'Mr.' and 'Mr', which are salutation titles
3firstNamehigh[3] header empty but values are given names (abubakkar, surendra, Anil, diptakshi, Kamal)
4lastNamehigh[4] header empty but values are family names (siddique, kumar, Kumar, mondal, Bhardwaj)
5address1high[5] header empty but values are street addresses (No.302 moolai streeet, vill-silaich pos-balapur, Yatra.Com 1101-03, 6/1 J M LANE, 12/92 geeta colony)
7cityhigh[7] header empty but values are Indian city names (thiruvannamalai, ghazipur, Kolkata, Gurgoan, Bangalore)
8statehigh[8] header empty but values are Indian states (Tamilnadu, Uttar Pradesh, West Bengal, Karnataka)
9countryhigh[9] header empty but values are 'India' and 'IND', country codes/names
10ziphigh[10] header empty but values are Indian postal codes (606708, 273227, 700008, 560053, 110031)
11phonehigh[11] header empty but values are 10-digit Indian mobile phone numbers (9994420065, 9790434352, 9721771231, 9818363222, 9883078522)
12phonehigh[12] header empty but values are 10-digit Indian phone numbers (9538383882), appears to be alternate/secondary phone

Notes: Yatra 2019 breach: 13 columns total. Column [0] is user_id (skipped). Column [6] contains no header and no sample values (skipped). All other columns contain valid PII. Breach context confirms Indian travel booking platform with Indian addresses, phone numbers, and email providers.

53.csv
11 columns99,489 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and email domains (gmail.com, yahoo.co.in, rediffmail.com)
2suffixhigh[2] header shows salutation patterns: 'Mr', 'Mr.', 'Mrs'
3firstNamehigh[3] values are common given names (Nishabh, Rahul, ravindra, umesh, Antara, sujan)
4lastNamehigh[4] values are family names (Jauhari, Jadon, limay, sharma, Bhattacharjee, saha)
5address1high[5] values contain street addresses and building numbers (Yatra.Com 1101-03, gazna,east bishnupur,24pg)
7cityhigh[7] values are Indian cities (Gurgoan, kolkata)
8statehigh[8] values are Indian states (West Bengal)
9countryhigh[9] values are country codes/names (IND, India, Other)
10ziphigh[10] values are 6-digit Indian PIN codes (743273)
11phonehigh[11] values are 10-digit Indian mobile numbers (9003226678, 9329294999, 9224582575, 9158989735, 9830595722)
12phonehigh[12] values are 10-digit Indian phone numbers (alternate/secondary phone)

Notes: Yatra.com travel booking platform breach (2019). File contains 13 columns; 11 contain PII. Column [0] is user_id (skip), column [6] is empty (skip). Data is entirely Indian: Indian addresses, phone numbers, email providers (rediffmail.com, yahoo.co.in), cities (Kolkata, Gurgaon), and states (West Bengal). One record confirms 'Yatra Office' in Gurgaon.

54.csv
7 columns99,436 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and match email format (yahoo.com, gmail.com, ingvysyabank.com)
2suffixhigh[2] header 'Mr', all values are salutations/titles
3firstNamehigh[3] values are common given names (Anil, Ravindra, Shiladitya, Ashish, Nikhil)
4lastNamehigh[4] values are common surnames (Sharma, Mate, Nag, Pahuja, Ahire)
5address1medium[5] sample value 'Yatra.Com 1101-03' matches street/building address pattern
7citymedium[7] sample value 'Gurgoan' (variant of Gurgaon) is an Indian city name
11phonehigh[11] values are 10-digit numbers matching Indian mobile phone format

Notes: 13 columns total. Column [0] is user_id (skip). Columns [6], [8], [9], [10], [12] are empty across samples (skip). Breach context confirms Indian travel platform with Indian addresses, phone numbers, and email providers.

55.csv
11 columns98,005 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is an email address; all values contain @ signs and are valid email addresses
2suffixhigh[2] values are 'Mr', 'Mrs.', 'Mr.' — salutation/title suffixes
3firstNamehigh[3] values are given names: 'Akshaynidhi', 'Nima', 'shipinder', 'Mastinder Nath'
4lastNamehigh[4] values are family names: 'Rathore', 'Galjan', 'kaur', 'Yadav', 'AUGUSTIN'
5address1high[5] values are street addresses: 'NISHA RESIDENCY 1001', '34,anantya apartments,naththam link r...'
7cityhigh[7] values are Indian cities: 'jalandhar', 'GoregaonMumbai', 'chennai'
8statehigh[8] values are Indian states: 'Punjab', 'Maharashtra', 'Tamil Nadu'
9countryhigh[9] values are country codes/names: 'India', 'IND'
10ziphigh[10] values are Indian PIN codes: '144002', '400062', '603103'
11phonehigh[11] values are 10-digit Indian mobile numbers: '9234667050', '8684012345', '9036721822'
12phonehigh[12] alternate phone number; values are 10-digit Indian mobile numbers: '9867673042'

Notes: Yatra.com 2019 breach. Column [0] is numeric user ID (skip). Column [6] is empty (skip). All PII columns identified: email, name components (firstName, lastName, suffix), full address (address1, city, state, country, zip), and dual phone numbers.

56.csv
11 columns99,444 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] Values contain @ signs and match email format (yahoo.com, yahoo.co.in, rediffmail.com, gmail.com)
2suffixhigh[2] Values are 'Mr', 'Mr.', 'Mr' — salutation/title prefix
3firstNamehigh[3] Values are common given names (Hardik, surajit, PARTH, Nalin, riyaz)
4lastNamehigh[4] Values are surnames (Kapadi, singh, VERMA, Shah, Khan)
5address1high[5] Values contain street addresses and building references (Yatra.Com 1101-03, C/ 803 Shree Niketan, Guwahati assam)
7cityhigh[7] Values are Indian city names (Gurgoan, guwahati, Mumbai)
8statehigh[8] Values are Indian states (Assam, Maharashtra)
9countryhigh[9] Values are 'IND' and 'India' — country code/name
10ziphigh[10] Values are 6-digit postal codes (788736, 400067) — Indian PIN codes
11phonehigh[11] Values are 10-digit numbers matching Indian mobile phone format
12phonehigh[12] Values are 10-digit numbers — alternate/secondary phone number

Notes: Yatra 2019 breach. 13 columns total, 11 contain PII. Column [0] is numeric user ID (skip). Column [6] is empty (skip). Data contains Indian user records with email, names, addresses, and phone numbers from a major Indian travel booking platform.

57.csv
13 columns99,499 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] Numeric sequential IDs (5437935, 5437936, etc.) — internal user/record identifiers
1emailhigh[1] All values contain @ symbol and match email format (gmail.com, infosys.com, yahoo.co.in)
2suffixhigh[2] Values are 'Mr', 'Mr.', 'Ms.' — salutation/title indicators
3firstNamehigh[3] Values are given names: Rajendra, srinivasa, KUNAL, idrish, vinit, Komal
4lastNamehigh[4] Values are surnames: Shende, karanam, SINHA, khan, kumar, Tak
5address1high[5] Values are street addresses: 'gandhidham', 'Bhaironath varanasi', 'Ramdas Colony,Ram nagar vistar, Sodala'
6skiphigh[6] All values empty/blank across sample
7cityhigh[7] Values are Indian cities: bhuj, varanasi, Jaipur
8statehigh[8] Values are Indian states: Uttar Pradesh, Rajasthan
9countryhigh[9] Values are 'IND' and 'India' — country codes/names
10ziphigh[10] Values are 6-digit Indian PIN codes: 370110, 221001, 302006
11phonehigh[11] Values are 10-digit Indian mobile numbers: 8975657235, 9987303997, 9930566628, etc.
12phonehigh[12] Values are 10-digit Indian phone numbers (alternate/secondary phone): 9099908462, 8795822111

Notes: Yatra 2019 breach — Indian travel platform. 13 columns total, 11 contain PII (email, names, full address, two phone fields). Columns 0 and 6 are non-PII (user_id and empty field). Breach context confirms Indian addresses, phone numbers, and email providers.

58.csv
6 columns99,502 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and email domains (yahoo.com, gmail.com, rediffmail patterns); matches email format
2suffixhigh[2] values are 'Mr.' and 'Mr' — salutation/title fields typical in breach context
3firstNamehigh[3] header 'Abha' with values like 'abhik', 'Karthik', 'Jaz', 'ABDUL', 'Ram' — all common given names
4lastNamehigh[4] header 'Bhatnagarsaluja' with values like 'bajaj', 'Gurumoorthy', 'Manak', 'LEYMAN', 'Unrivaled' — surname pattern
9countryhigh[9] value 'India' matches country field; breach context confirms all Indian records
11phonehigh[11] values are 10-digit numbers (9412522621, 858606047, 8884182631, 9415131305, 6593847985, 9820384897) — Indian mobile phone format

Notes: Yatra-2019 Indian travel booking breach. Column [0] is numeric user_id (skip). Columns [5–8], [10], [12] are empty across all samples. Column [2] contains salutation/prefix (suffix field). 6 PII columns identified out of 13 total.

59.csv
5 columns99,367 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and valid email addresses (Gmail, Yahoo, corporate domains)
2suffixhigh[2] header/values are salutations: 'Mr', 'Miss' — standard title/suffix indicators
3firstNamehigh[3] values are common given names (Sushil, Shirish, Ashish, Shubham, Chandreyi, satyabrata)
4lastNamehigh[4] values are surnames (Gautam, Chandavarkar, Agarwal, Singh, daklai, Das sharma)
11phonehigh[11] values are 10-digit Indian mobile numbers (8294222522, 8587078399, 9833554423, etc.)

Notes: File contains 13 columns. Column [0] is a numeric user ID (skip). Columns [5-10] and [12] are empty in all sample rows (skip). This is a structured PII breach from Yatra.com containing Indian user records with email, name, title, and phone data.

6.csv
12 columns99,321 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header 'email', values contain @ signs and valid email addresses (gmail.com, yahoo.com, rediffmail.com)
2suffixhigh[2] header 'salutation', values are titles: 'Mr', 'Mrs'
3firstNamehigh[3] header 'first_name', values are common given names: Vidhu, binod, Sanjay, madderla, abiskar
4lastNamehigh[4] header 'last_name', values are surnames: Sood, kejriwal, Shinde, sathaiah, sinha
5address1high[5] header 'address1', values are street addresses with building numbers and street names
6address2high[6] header 'address2', values are secondary address components (apartment/area details)
7cityhigh[7] header 'city', values are Indian city names: Ghaziabad, muzaffarpur, Vasai, viziyanagaram, kolkata
8statehigh[8] header 'state', values are Indian state names: Andhra Pradesh, Uttar Pradesh, bihar, Maharashtra, west bengal
9countryhigh[9] header 'country', values are country codes: 'IND', 'IN'
10ziphigh[10] header 'zip', values are 6-digit Indian postal codes: 516001, 201005, 842001, 401208
11phonehigh[11] header 'phone', values are 10-digit Indian mobile numbers: 9966785475, 9971471113, 9431238778
12phonehigh[12] header 'alternate_phone', values are Indian phone numbers (mobile and landline formats): 0120-4134190, 8922223222, 9830325718

Notes: 13 columns total, 12 contain PII. Column [0] is user_id (numeric identifier, skipped). Breach context confirms this is Yatra 2019 Indian travel booking platform data with Indian addresses, phone numbers, and email providers.

60.csv
12 columns99,302 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] Numeric user IDs (5547528, 5547529, etc.) — internal customer identifiers
1emailhigh[1] Values contain @ symbol and are valid email addresses ([email protected], [email protected], etc.)
2suffixhigh[2] Salutation/title values: 'Ms', 'Mr', 'Mr.' — formal address prefixes
3firstNamehigh[3] Common given names (ranjitha, Reinhard, Abdul, ELAKKIYAVENDAN, OZA, Sameer)
4lastNamehigh[4] Common family names (nayak, Buck, Jabbar, RAJENDRAN, PUNAMCHAND, Belusonti)
5address1high[5] Street address values (e.g., '18-7-312/1d/43 Aman Nagar') — mailing street/first address line
7cityhigh[7] Indian city names (Hyderabad, srikakulam)
8statehigh[8] Indian state names (Andhra Pradesh, Andhra Pradesh)
9countryhigh[9] Country code/name values (IND, India)
10ziphigh[10] Indian PIN/postal codes (500023, 532001)
11phonehigh[11] 10-digit Indian mobile/phone numbers (8105003571, 9059060173, 9944420624, etc.)
12phonehigh[12] Alternate phone numbers — 10-digit Indian mobile format (9059060173)

Notes: Yatra 2019 breach — Indian travel booking platform. Column 6 is empty/unused in all samples. 11 PII columns identified (email, suffix, firstName, lastName, address1, city, state, country, zip, phone, phone). Columns 0 and 6 are skipped (user_id and empty column respectively).

61.csv
11 columns99,376 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and email domains (gmail.com, rediffmail patterns)
2suffixhigh[2] header/values are 'Mr', 'Mr.', 'Ms' — salutation titles
3firstNamehigh[3] values are common given names (Dhananjay, Ananthula, Pankaj, Veera, Dhairya, swetha)
4lastNamehigh[4] values are surnames (Kumar, Giridhar, kumar, manikandan, Ashar, s)
5address1high[5] values are street addresses (Yatra Office, A-7 Zavernagar Society)
7cityhigh[7] values are Indian cities (Gurgaon, vadodara)
8statehigh[8] values are Indian states (Haryana, Gujarat)
9countryhigh[9] values are 'India', 'IND' — country codes/names
10ziphigh[10] values are Indian PIN codes (122003, 390022)
11phonehigh[11] values are 10-digit Indian mobile numbers (9002135246, 9866559484, etc.)
12phonehigh[12] alternate/secondary phone field, 10-digit Indian mobile numbers

Notes: Yatra 2019 breach dataset. Column [0] is numeric user_id (skip). No header row present. All addresses, phone numbers, and email domains are Indian. 11 of 13 columns contain PII.

62.csv
10 columns98,134 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is an email address; all sample values contain @ and are valid email addresses
2suffixhigh[2] values are 'Mr', 'Mr.', which are common salutation/title suffixes
3firstNamehigh[3] values are 'Amit', 'sakib', 'jahir', 'Aun' — common given names
4lastNamehigh[4] values are 'kumar', 'shaikh', 'abbas', 'Alia' — common family names
5address1high[5] values are street addresses like 'vill+post.=saryan', 'House no1 & 2 Gali no 1', '110052' (mixed with postal codes)
6phonehigh[6] value '8588851512' is a 10-digit number matching Indian mobile phone format
8statehigh[8] value 'Uttar Pradesh' is an Indian state name
9countryhigh[9] value 'India' is a country name
10ziphigh[10] value '277121' is a 6-digit Indian PIN/postal code
11phonehigh[11] values are 10-digit numbers ('9699545608', '9812195202', etc.) matching Indian mobile phone format; alternate phone number

Notes: 13 columns total. Column [0] contains user IDs (numeric, skip). Column [7] appears to be a username or login handle ('baliia', skip as non-standard). Column [12] is empty. Breach context confirms Indian travel booking platform with Indian addresses, phone numbers, and email providers.

63.csv
11 columns98,283 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is an email; all values are email addresses with @ signs
2suffixhigh[2] header 'Mr.' contains salutation/title; values are 'Mr', 'Mr.' — common suffixes/titles
3firstNamehigh[3] header 'rambhajan' is a given name; values are common first names (kranthi, Omkar, veer, dilshad)
4lastNamehigh[4] header 'kumawat' is a surname; values are family names (Chennamsetty, Mohite, sharma, alam, Gholap)
5address1high[5] header is a street address; values contain street addresses with house numbers, lane names, and building details
7cityhigh[7] header 'pune' is an Indian city; values are Indian cities (Gurgoan, NEW DELHI, Vapi)
8statehigh[8] header 'Maharashtra' is an Indian state; values are Indian states (Maharashtra, Gujarat)
9countryhigh[9] header 'India' is a country; all values are 'India' or 'IND'
10ziphigh[10] header '411033' is an Indian postal code (PIN); values are 6-digit Indian PIN codes (110019, 396191)
11phonehigh[11] header '9049494641' is a 10-digit phone number; all values are 10-digit Indian mobile numbers
12phonehigh[12] values are 10-digit Indian mobile numbers or empty; alternate/secondary phone field

Notes: 13 columns total, 11 contain PII. Column [0] is numeric user ID (skip). Column [6] is empty (skip). Breach confirmed as Yatra 2019, an Indian travel booking platform with Indian addresses, phone numbers, email providers, and city/state references.

64.csv
11 columns99,469 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, clearly email addresses (gmail.com, live.com, northwestern.edu)
2suffixhigh[2] values are 'Mr', 'Mr.', 'Mr' — salutation/title field
3firstNamehigh[3] values are given names (Murali, SUBHASIS, Edmond, Sandeep, venkat)
4lastNamehigh[4] values are family names (Mohan, BOSE, Ferrao, Goel, sundar)
5address1high[5] header context + sample values show street addresses (chakdaha kanthalpuli, kannppa bilding)
7cityhigh[7] values are Indian city names (kolkata, banglore)
8statehigh[8] values are Indian states (West Bengal, Karnataka)
9countryhigh[9] values are 'IND', 'India' — country codes/names
10ziphigh[10] values are 6-digit Indian PIN codes (741222, 560036)
11phonehigh[11] values are 10-digit Indian mobile numbers (9676438881, 7407535001, 9642525252)
12phonehigh[12] values are 10-digit Indian mobile numbers — alternate/second phone number

Notes: 13 columns total. Breach context confirms Yatra (Indian travel platform) with Indian addresses, Indian phone numbers (10-digit format), Indian cities/states, Indian PIN codes. Column [0] is numeric user ID (skip). Column [6] is empty. Columns [5], [7], [8], [9], [10] together form complete Indian postal address. Columns [11] and [12] are both phone numbers.

65.csv
10 columns99,397 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbols and email domains (gmail.com, rediffmail.com patterns consistent with Indian travel booking)
2suffixhigh[2] values are 'Mr.' / 'Mr' — salutation/title prefix
3firstNamehigh[3] header position and values are common given names (vaibhav, Parag, john, Mohd, CEENA, DULAL)
4lastNamehigh[4] values are surnames (dubey, Mital, thapa, Ahmad, JOSEPH, SHIL)
5address1high[5] values are street addresses ('Srikona daily bazar', 'Yatra.Com 1101-03' — Gurgaon office reference)
7cityhigh[7] values are Indian city names (silchar, Gurgoan/Gurgaon)
8statehigh[8] values are Indian states (Assam)
9countryhigh[9] values are 'India' — country field
10ziphigh[10] values are 6-digit Indian PIN codes (788001)
11phonehigh[11] values are 10-digit numbers consistent with Indian mobile phone format

Notes: 13 columns total; 10 contain PII (email, suffix, firstName, lastName, address1, city, state, country, zip, phone). Column [0] is an internal identifier and columns [6] and [12] are empty (skip). Breach context confirms Indian travel platform (Yatra.com) with user account records including addresses, emails, names, and phone numbers.

66.csv
10 columns99,299 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, match email pattern (gmail.com, yahoo.co.in, etc.)
2suffixhigh[2] values are 'Mr', 'Mrs', 'Mr.' — salutation/title indicators
3firstNamehigh[3] header position and values are common Indian given names (Nagaraj, Ramaswamy, Abhijit, geeta, Sudhir, viswanath)
4lastNamehigh[4] values are surnames following first names (Sitaram, Ragu, nair, singh, Malik, ravikumar)
5address1medium[5] mostly empty but sample value 'Yatra Office' indicates street/mailing address line
7cityhigh[7] mostly empty but sample 'Gurgaon' is Indian city name
8statehigh[8] mostly empty but sample 'Haryana' is Indian state
9countryhigh[9] mostly empty but sample 'India' confirms country field
10ziphigh[10] mostly empty but sample '122003' matches Indian PIN code format (6 digits)
11phonehigh[11] values are 10-digit numbers matching Indian mobile phone format (9739166243, 9873422360, etc.)

Notes: 13 columns total. Yatra.com travel booking platform breach from 2019 containing Indian user account records. Column [0] contains numeric user IDs (skip). Columns [6] and [12] are empty (skip). Breach context confirms Indian addresses, phone numbers, email providers consistent with India.

67.csv
5 columns98,098 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbols and match email format (gmail.com, rediffmail.com, etc.)
2suffixhigh[2] header 'Mr', values are salutations/titles (Mr, Ms)
3firstNamehigh[3] values are common given names (Sanjay, UNNIKRISHNA, Vicky, monika, sk, Atul)
4lastNamehigh[4] values are surnames (Sanjay, PANICKER, Hans, chandra, azijul, Tandon)
11phonehigh[11] values are 10-digit Indian mobile numbers (9431092293, 9869167114, 9213262144, etc.)

Notes: File contains 13 columns; 5 contain PII. Column [0] is user_id (skip); columns [5-10] and [12] are empty (skip). No address fields are populated in this sample despite the breach context mentioning addresses; they may appear in other files or in rows outside the sample.

68.csv
10 columns99,548 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbols and are email addresses ([email protected], [email protected], etc.)
2suffixhigh[2] values are titles/salutations (Mr, Ms, Mr., etc.)
3firstNamehigh[3] values are given names (Kapur, GEORGE, Susannah, Liyaqat, ankit, Samyak)
4lastNamehigh[4] values are family names (Shukla, WILLIAM, Robinson, Khan, prajapati, Jain)
5address1high[5] values are street addresses (Yatra Office, kadipur sultanpur uttar pradesh, Yatra.Com 1101-03)
7cityhigh[7] values are Indian city names (Gurgaon, sultanpur, Gurgoan)
8statehigh[8] values are Indian states (Haryana, Uttar Pradesh)
9countryhigh[9] values are country (India)
10ziphigh[10] values are Indian postal codes (122003, 228145)
11phonehigh[11] values are 10-digit Indian mobile phone numbers (7709460812, 9709230221, 7841631742, etc.)

Notes: Breach context confirms Yatra.com Indian travel platform. File contains user account records with names, email addresses, addresses (street, city, state, country, PIN), and mobile phone numbers. Column [0] is numeric user ID (skip). Columns [6] and [12] are empty (skip).

69.csv
10 columns · 99,358 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbol and match email address format ([email protected], [email protected], etc.)
2suffixhigh[2] values are salutations/titles: 'Mrs', 'Mr', 'Mrs.', 'Miss', 'Mr.' — standard name suffixes
3firstNamehigh[3] values are given names (MALATHY, KANDURU, Meenu, Mehak, surya, Mary) positioned after salutation
4lastNamehigh[4] values are family names (PILLAI, MAJHI, Bhullar, Sharma, khatri, kerketta) positioned after first name
5address1high[5] values are street addresses (F-21 GALI NO. 61 LAKSHMI NAGAR, 685/8 nanda nagar, a-13 Coral Block Vatika Green City)
7cityhigh[7] values are Indian city names (indore, Jamshedpur) — city field in address sequence
8statehigh[8] values are Indian states (Uttar Pradesh, Madhya Pradesh, Jharkhand) — state field after city
9countryhigh[9] values are 'India' — country field in address hierarchy
10ziphigh[10] values are Indian PIN codes (452001, 831018) — numeric postal codes
11phonehigh[11] values are 10-digit Indian mobile numbers (9427623326, 9496559666, 9873653307, 7534078000, 9165077070)

Notes: Yatra 2019 breach (Indian travel booking platform). Column [0] is numeric user ID (skip). Columns [6] and [12] are empty across all samples (skip). All remaining columns contain valid PII: email, name components (title, first, last), complete Indian address (street, city, state, country, PIN), and mobile phone. No passwords, SSNs, or DOB fields present in this extract.

7.csv
11 columns · 99,473 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbols and match email address format ([email protected], [email protected], etc.)
3suffixhigh[3] values are 'Mr' and 'Ms' — salutation/title indicators
4firstNamehigh[4] header context + values are common given names (Parag, sachin, Sudhir, gopal, Ajit, kalpana)
5lastNamehigh[5] values are surnames (Shah, sirohi, Singh, solanki, Gandhi, devi)
6address1high[6] values match street address format (193, h-8 , iitb , mumbai; flat no 106 sector 29; etc.)
7address2high[7] values are secondary address components or locality names (m.r.nagar,kodungaiyur; Sukchar; Lower Parel West; etc.)
8cityhigh[8] values are Indian city names (mumbai, noida, Mumbai, chennai, Kolkata)
9statehigh[9] values are Indian state names (Maharashtra, Uttar Pradesh, Tamil Nadu, West Bengal)
10countryhigh[10] all values are 'IND' — country code for India
11ziphigh[11] values are 6-digit Indian PIN codes (400076, 201301, 400056, 600118, 700115)
12phonehigh[12] values are 10-digit Indian mobile numbers (9924013901, 9920740849, 9910013923, etc.)

Notes: 13 columns total. Column [0] is a numeric user_id (skip — internal identifier). Column [2] appears to be a password or authentication token (values are alphanumeric strings like 'Trg9EQl', '96385274', 'ramjanee', etc.) — mapped as password. Breach context confirms Indian travel platform with Indian addresses, phone numbers, and email providers.

70.csv
8 columns · 98,503 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and email domains (gmail.com, yahoo.com, rediffmail patterns)
2suffixhigh[2] values are 'Mr.', 'Dr.' — salutation/title indicators
3firstNamehigh[3] values are given names (davinder, bhargavi, PR RAVI, Saket)
4lastNamehigh[4] values are family names (vashisht, ravi, SHANTHI, Jha)
5address1high[5] values contain street addresses and location data (Yatra Office address, street address placeholders)
6phonehigh[6] value '8870538888' is 10-digit Indian mobile number
9countryhigh[9] value is 'India'
11phonehigh[11] values are 10-digit Indian mobile numbers (9448614523, 9041134608, etc.)

Notes: 13 columns total. Column [0] appears to be user_id (skip). Columns [7], [8], [10], [12] are empty or non-PII. Column [2] contains title/suffix mixed with some geographic data ('Tamil Nadu', 'Gurgaon') but primary values are salutations. Two phone columns detected ([6] and [11]) — both valid Indian mobile numbers. Indian context confirmed by address formats, phone patterns (10-digit mobile), and country field.

71.csv
10 columns · 99,400 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and are clearly email addresses (rediffmail.com, gmail.com, icicibank.com domains)
2suffixhigh[2] values are 'Mr.', 'Mr' — salutation/title indicators
3firstNamehigh[3] values are given names (Geetha, RAMANUJA, seetharaman, teja, Raqib)
4lastNamehigh[4] values are surnames or family names (Murugaiah, P, AVR, ananthakrishnan, rayasam, Baba)
5address1high[5] values contain street addresses (e.g., '117/H-1/255,Model town,Pandu Nagar', 'Kukatpally')
7cityhigh[7] values are Indian city names (hyderabad, Kanpur)
8statehigh[8] values are Indian states (Andhra Pradesh, Uttar Pradesh)
9countryhigh[9] values are 'India' — country field
10ziphigh[10] values are 6-digit Indian PIN codes (500072, 208005)
11phonehigh[11] values are 10-digit Indian mobile phone numbers

Notes: 13 columns total. Column [0] contains numeric user IDs (skip). Columns [6] and [12] are empty (skip). Column [5] may also contain address2 data (apartment/suite info embedded), but mapping as address1. File has no header row.

72.csv
11 columns · 99,584 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and match email format (gmail.com, aol.pk, rediffmail patterns)
2suffixhigh[2] header 'Mr.' / values are salutations (Mr, Mr.)
3firstNamehigh[3] header 'abhijit', values are common given names (Shoueeb, Paras, Dileep, gaurav, Andrea)
4lastNamehigh[4] header 'mukherjee', values are surnames (Dar, Banka, Gunda, kumar, ketherton)
5address1high[5] values are full street addresses (35/6 kayastha para main road kol-700078, etc.)
7cityhigh[7] header 'kolkata', values are Indian city names (Gurgoan, Gurgaon, delhi)
8statehigh[8] header 'West Bengal', values are Indian states/territories (Haryana, National Capital Territory of Delhi)
9countryhigh[9] header 'India', all values are 'India'
10ziphigh[10] header '700078', values are 6-digit Indian PIN codes (121002, 110059)
11phonehigh[11] values are 10-digit Indian mobile numbers (9433852491, 9643313902, 8013120406)
12phonehigh[12] alternate phone field, values are 10-digit numbers (8970353472) or empty

Notes: Yatra 2019 breach. 13 columns total, 11 contain PII. Column [0] is numeric user_id (skipped). Column [6] appears empty across all samples (skipped). Data is entirely Indian: Indian addresses, phone numbers, states, PIN codes, and email providers confirm Indian user base.

73.csv
5 columns · 99,584 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbols and match email format ([email protected], [email protected], etc.)
2suffixhigh[2] values are 'Mr', 'Mr.' — salutation/title indicators
3firstNamehigh[3] header context and values are common given names (Sai, Rajib, Manjesh, Monindersingh, VIJAY, Animesh)
4lastNamehigh[4] header context and values are surnames (Varma, Banerjee, Thomas, Vasant, KUMAR, Dey)
11phonehigh[11] values are 10-digit Indian mobile phone numbers (9494774499, 8986880570, 9946031800, etc.)

Notes: 13 columns total, 5 contain PII. Column [0] is numeric user_id (skip). Columns [5-10] and [12] appear empty in samples and are skipped. Breach context confirms Indian travel platform with user account data including emails, names, phone numbers.

74.csv
9 columns · 99,550 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and match email address format ([email protected], [email protected], etc.)
2suffixhigh[2] values are 'Mr.' — standard salutation/title prefix
3firstNamehigh[3] values are common given names (Mohmmed, Vikas, Misato, bipinbhai, sarita)
4lastNamehigh[4] values are surnames (Muazzam, Gupta, Higashi, padhiyar, rewri)
5address1high[5] value 'B-188/2' matches street address format common in Indian addresses
8statehigh[8] value 'National Capital Territory of Delhi' is an Indian state/territory
9countryhigh[9] values are 'India', 'Other' — country field
10ziphigh[10] value '110009' is a 6-digit Indian postal code (PIN)
11phonehigh[11] values are 10-digit Indian mobile numbers (9940940671, 9827122704, etc.)

Notes: 12 columns total. Column [0] is numeric user ID (skip). Columns [6] and [7] are empty in sample and skipped. Breach context confirms Indian travel booking platform with user account data including addresses, phones, and emails.

75.csv
5 columns · 98,894 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header is email address, values contain @ symbol and match email format (gmail.com, jd.com domains)
2suffixhigh[2] header is empty/blank but values contain 'Ms.' which is a salutation/title suffix
3firstNamehigh[3] header is empty/blank but values are common given names (YASIN, Nitesh, Anuradha, mohammed, Yati)
4lastNamehigh[4] header is empty/blank but values are surnames (MOHIDEEN, ABDUL RAZAK, Marjara, Tiwari, khaliq, Bawri)
11phonehigh[11] header is empty/blank but values are 10-digit Indian mobile phone numbers (7598208158, 9438262776, 8447963651, 9177485105, 6392581470)

Notes: 13 columns total. Columns [0] (numeric user ID), [5-10] (empty), [12] (empty) are skipped as non-PII or unpopulated. Yatra 2019 Indian travel booking breach; phone numbers and email addresses confirm Indian origin.

76.csv
7 columns · 99,445 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] Values contain @ symbols and are clearly email addresses (gmail.com, yahoo domains, corporate emails)
2suffixhigh[2] Values are titles: 'Mr', 'Mrs.', 'Mr.' — salutation/suffix field
3firstNamehigh[3] Values are given names: Balaji, Neha, SHIV, ravi, Pranav, Amrita
4lastNamehigh[4] Values are family names: Subramanian, Maney, YADAV, kumar, Punjabi, Rai
5address1medium[5] Sample value 'Yatra.Com 1101-03' suggests address/building information; mostly empty but when populated contains address data
7citymedium[7] Sample value 'Gurgoan' (variant spelling of Gurgaon, Indian city); city field mostly empty but populated values are city names
11phonehigh[11] Values are 10-digit Indian mobile numbers: 8451900852, 9743666660, 9950595563, 7396560040, 9538106042, 8238391804

Notes: Yatra-2019 breach (Indian travel booking platform). Column [0] appears to be user/record ID (skip). Columns [6], [8], [9], [10], [12] are entirely empty in sample and unmapped. No header row provided; analysis based on value patterns. Data is consistently Indian (mobile numbers, cities, email providers).

77.csv
11 columns · 99,358 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] Values contain @ signs and email domains (gmail.com, rediffmail.com, yahoo.com, ymail.com)
2suffixhigh[2] Values are 'Mr.', 'Mr', 'Ms' — salutation/title indicators
3firstNamehigh[3] Values are common given names (daulat, SHIVASHANKAR, Parkash, abhishek, GOVIND)
4lastNamehigh[4] Values are surnames (bhansali, MUGALI, Shankar, sharma, JAGIRDAR)
5address1high[5] Values are street addresses (e.g., '1/21 champa nagar beawar')
6address2medium[6] Column typically follows address1, though values appear empty in sample
7cityhigh[7] Values are city names (beawar)
8statehigh[8] Values are Indian states (Rajasthan)
9countryhigh[9] Values are country names (India)
10ziphigh[10] Values are 6-digit Indian postal codes (305901)
11phonehigh[11] Values are 10-digit Indian mobile phone numbers (9414009587, 8586968292, etc.)

Notes: 13 total columns. Column [0] is numeric user ID (skip). Column [12] is empty (skip). Breach context confirms Indian travel booking platform with Indian addresses, phone numbers, and email providers.

78.csv
10 columns · 98,468 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and match standard email format (Gmail, Hotmail, Yahoo)
2suffixhigh[2] header 'Mr.' and values are salutations/titles (Mr, Mr)
3firstNamehigh[3] values are common given names (naramalla, Tribhuwannath, Yuthika, APARNA, Bunty)
4lastNamehigh[4] values are surnames (purushotham, Srivastava, Kathuria, RAWAL, Thokchom)
5address1high[5] values are street addresses with house numbers and road names (h.no.22-118/3,saikunta road, chunnambattiwada,mancherial)
7cityhigh[7] values are Indian city names (mancherial)
8statehigh[8] values are Indian state names (Andhra Pradesh)
9countryhigh[9] values are country names (India)
10ziphigh[10] values are Indian postal codes/PIN codes (504208)
11phonehigh[11] values are 10-digit Indian mobile phone numbers (9989145950, 8197557799, 9001714000, 9538916585)

Notes: Yatra.com travel booking platform breach. Column [0] is numeric user ID (skip). Columns [6] and [12] are empty (skip). This is a clean Indian user database with addresses, contact info, and names. All PII columns identified.

79.csv
10 columns · 99,524 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, standard email format (gmail.com, yahoo.co.in, yahoo.com)
2suffixhigh[2] header 'Mr', values are salutations/titles
3firstNamehigh[3] values are given names (Suresh, AMIIT, adf, Gopal singh, SUNIL, tarun)
4lastNamehigh[4] values are family names (Kumar, AMAR, basdas, Dasana, NAGORI, bhatnagar)
5address1high[5] sample shows 'TATA Consulting Engineers Limited, A-...' — company/street address format
7cityhigh[7] sample value 'Kolkata' is an Indian city
8statehigh[8] sample value 'West Bengal' is an Indian state
9countryhigh[9] sample value 'India'
10ziphigh[10] sample value '700091' is a 6-digit Indian PIN code
11phonehigh[11] values are 10-digit Indian mobile phone numbers (8378982262, 7042265840, 9999999999, etc.)

Notes: Yatra 2019 breach — Indian travel booking platform. Column [0] is numeric user_id (skip). Columns [6] and [12] are empty. No DOB, SSN, username, password, or other sensitive PII detected in sample.

8.csv
12 columns · 98,921 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] Values contain @ signs, match email format (rediffmail.com, gmail.com, hotmail.com)
2suffixhigh[2] Header 'Mr', values are salutations (Mr, Mr., etc.)
3firstNamehigh[3] Values are common given names (Abhishek, DILIP, Arun, Vipin, nikhil)
4lastNamehigh[4] Values are surnames (Jain, PATIL, Jayaraman, Katiyar, vaibhav)
5address1high[5] Street addresses (Goyal Nagar, patil hospital, No 24 2nd Cross St, Yatra Office)
6address2high[6] Supplementary address info (near lalbag road, weavers colony)
7cityhigh[7] Indian city names (Indore, karad, Venkata Nagar, Gurgaon, banglore, ranchi)
8statehigh[8] Indian states (Madhya Pradesh, Maharashtra, Pondicherry, Haryana, Karnataka, Jharkhand)
9countryhigh[9] Country codes/names (IND, India)
10ziphigh[10] Indian PIN codes (452001, 415110, 605011, 122003, 560027, 834003)
11phonehigh[11] 10-digit Indian mobile numbers (9926478790, 9822031498, 9443500362, 8446340001)
12phonehigh[12] Alternate phone numbers, 10-digit or shorter (9922955171, 22238788, 9431079981, 4442055381)

Notes: Yatra 2019 breach. 13 columns total, 12 contain PII (column 0 is user_id, skipped). All data is Indian: Indian addresses, Indian phone numbers, Indian email providers (rediffmail.com, yahoo.co.in), Indian cities/states, country code IND. Column 0 contains auto-incremented numeric IDs. Columns 11 and 12 both map to phone (primary and alternate contact numbers).

80.csv
5 columns · 99,594 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] Values contain @ signs and are clearly email addresses (gmail.com, yahoo.com domains)
2suffixhigh[2] Values are 'Mr', a salutation/title prefix
3firstNamehigh[3] Values are common given names (Sahil, Rehana, Nikhil, Amita, Hiten, harsh)
4lastNamehigh[4] Values are surnames (Jain, Islam, Patel, Kumari, Karani, shukla)
11phonehigh[11] Values are 10-digit Indian mobile numbers (9034567082, 9830674770, 8123895937, etc.)

Notes: 12 columns total. [0] is numeric user ID (skip). [5-10] are empty columns (skip). Columns [1,3,4,11] contain core PII. Column [2] contains salutation. Indian phone numbers and email providers confirm breach context.

81.csv
5 columns · 99,368 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbol and match email format (yahoo.com, rediffmail.com, gmail.com, hotmail.com)
2suffixhigh[2] values are 'Mr' and empty strings, consistent with salutation/title field
3firstNamehigh[3] values are common given names (Naga, jayadeep, Mayank, sri, Rachit, rabin)
4lastNamehigh[4] values are surnames (Vemula, uppalapati, Prakash, ch, Mehrotra, ray)
11phonehigh[11] values are 10-digit Indian mobile phone numbers (9491328661, 9930468956, etc.)

Notes: Yatra-2019 travel booking platform breach. Column [0] is numeric user ID (skip). Columns [5-10] and [12] are empty in sample rows. They are likely address/demographic fields with no data in these particular records, and cannot be reliably mapped from empty values alone. Breach context confirms Indian user data with phone numbers, emails, and names.

82.csv
5 columns · 99,503 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, match email format (gmail.com, rediffmail.com patterns)
2suffixhigh[2] values are 'Mr.' — salutation/title indicator
3firstNamehigh[3] values are common given names (Alex, Raju, Ravi, Shameer, arun, Manjeet)
4lastNamehigh[4] values are surnames (Thomas, chirra, Puri, Hakk, Saminathan, Dabas)
11phonehigh[11] values are 10-digit Indian mobile numbers (8893955294, 8754514873, etc.)

Notes: 12 columns total. Column [0] is numeric user_id (skip). Columns [5-10] are empty or contain no sample values (skip). Yatra-2019 Indian travel booking platform breach: email addresses, names, salutations, and Indian phone numbers confirmed.

83.csv
5 columns · 99,492 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and match email format (gmail.com, rediffmail.com, etc.)
2suffixhigh[2] values are 'Mr', 'Ms.' — standard salutation/title suffixes
3firstNamehigh[3] header context and values are common given names (Suman, Tapan, Harsha, Anuj, ashish, Divya)
4lastNamehigh[4] values are surnames (KUMAR, Gupta, GC, Ishu, k, Sruthi) following firstName
11phonehigh[11] values are 10-digit Indian mobile numbers (8147760998, 9344870960, etc.)

Notes: Column [0] is numeric user ID (skip). Columns [5], [6], [7], [8], [9], [10] are mostly empty in sample rows; [7] has one value 'Gurgoan' (city) but insufficient data to confirm address mapping. No address1, address2, city, state, zip, country, dob, ssn, or password columns present in header row sample.

84.csv
10 columns · 99,394 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, clearly email addresses (gmail.com, rediffmail domains typical of Indian users)
2suffixhigh[2] header pattern indicates salutation/title, values are 'Mr.' — standard suffix field
3firstNamehigh[3] values are individual given names (jay, AMOD, Anand, Promina)
4lastNamehigh[4] values are individual family names (patel, KESHRI, Singh, steels, Barve)
5address1medium[5] sparse but contains 'Yatra Office' — street/mailing address line
7cityhigh[7] values are Indian cities (Gurgaon)
8statehigh[8] values are Indian states (Haryana)
9countryhigh[9] values are 'India'
10ziphigh[10] values are 6-digit Indian PIN codes (122003)
11phonehigh[11] values are 10-digit Indian mobile numbers (9753038723, 7567739881, etc.)

Notes: 12 columns total. Column [0] is numeric user ID (skip). Column [6] is empty (skip). Remaining 10 columns contain PII. Breach context confirms Indian travel booking platform with Indian phone numbers, cities, states, PIN codes, and email providers.

85.csv
4 columns · 99,531 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, standard email addresses (gmail.com, yahoo.com, adani.com domains)
3firstNamehigh[3] values are common given names (Mehul, Kranthi, Abhijith, ehtesham, krishna)
4lastNamehigh[4] values are surnames (Rupera, Kumar, Balakrishnan, khan, miglani)
11phonehigh[11] values are 10-digit Indian mobile numbers (7676662525, 9099002463, 9666698666, 9999999999)

Notes: File has no header row. Column [0] contains numeric user IDs (skip). Columns [2], [5], [6], [7], [8], [9], [10] are predominantly empty in sample rows; [5] shows 'Yatra.Com' (company name, skip); [7] shows 'Gurgoan' (likely city but too sparse to confidently map). Total 12 columns; 4 contain identifiable PII.

86.csv
5 columns · 99,495 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbol, standard email format with common Indian email providers (gmail.com)
2suffixhigh[2] header empty but values are 'Mr', a salutation/title suffix
3firstNamehigh[3] header empty but values are common given names (Shalender, MANJIT, Prakash, Madhuri, Dipak)
4lastNamehigh[4] header empty but values are family names (Sharma, Wadhwani, Salgaonkar, Sarda)
11phonehigh[11] values are 10-digit numbers consistent with Indian mobile phone format (9822264981, 9867683858, etc.)

Notes: Yatra 2019 breach — Indian travel booking platform. Columns [0], [5], [6], [7], [8], [9], [10] are mostly empty in sample and mapped as skip. Column [0] appears to be user_id (numeric, auto-generated). Columns [5]-[10] likely contain address fields (company name, address components, state) but are predominantly empty in provided sample; if populated in full dataset, would map as address1, city, state, country, zip. File structure suggests 12 total columns.

87.csv
5 columns · 99,330 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header is email address, values contain @ symbol and are clearly email addresses from Indian providers (gmail.com, yahoo.com, yahoo.co.in, rediffmail.com)
2suffixhigh[2] values are 'Mrs.' and similar salutations/titles
3firstNamehigh[3] values are common given names (Ravi, Sesha, UDAY, avais)
4lastNamehigh[4] values are surnames/family names (Vadrevu, Kaushik, KAMAT, ahmmed)
11phonehigh[11] values are 10-digit Indian mobile phone numbers (8168014290, 8104192360, 9676142910, etc.)

Notes: Yatra 2019 travel booking breach. Column [0] is numeric user ID (skip). Columns [5-10] are empty in sample rows. Columns [3] and [4] sometimes contain full names or single names, but mapped as firstName/lastName based on context and breach description mentioning 'first and last names' fields.

88.csv
9 columns · 99,329 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbol and are clearly email addresses (gmail.com, yahoo.co.in domains)
3firstNamehigh[3] header pattern suggests given name, values are common Indian first names (Dhiarj, RUshi, Amit, ramakrishna)
4lastNamehigh[4] values are surnames (Patel, Kumar, reddy) paired with firstName column
5address1high[5] values include street-level addresses (Yatra Office)
7cityhigh[7] values are Indian cities (Gurgaon)
8statehigh[8] values are Indian states/provinces (Haryana)
9countryhigh[9] values are country names (India)
10ziphigh[10] values are Indian postal codes/PIN codes (122003)
11phonehigh[11] values are 10-digit Indian mobile numbers (9766032846, 9554629112, 9999999999)

Notes: Yatra travel booking platform breach from India. Column [0] is numeric user_id (skip). Columns [2] and [6] are empty (skip). Remaining columns map to core PII: email, name, address components, and phone.

89.csv
10 columns · 99,526 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is sample email; all values contain @ signs and are valid email addresses
2suffixhigh[2] values are 'Mr.' — salutation/title indicator
3firstNamehigh[3] values are single given names (Shaik, sambath, Ganesh)
4lastNamehigh[4] values are surnames (Chand, koshy, Venkatraman)
5address1high[5] values are street addresses (Yatra Office)
7cityhigh[7] values are Indian cities (Gurgaon)
8statehigh[8] values are Indian states (Haryana)
9countryhigh[9] values are country names (India)
10ziphigh[10] values are 6-digit Indian PIN codes (122003)
11phonehigh[11] values are 10-digit Indian mobile phone numbers (8805477500, 9394706583, 9629178770)

Notes: Yatra 2019 breach — Indian travel platform. 13 total columns. Column [0] is numeric user_id (skip). Columns [6] and [12] are empty (skip). Mapped 10 PII columns: email, suffix, firstName, lastName, address1, city, state, country, zip, phone.

9.csv
12 columns · 99,258 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbol and email domains (gmail.com, yahoo.co.in); clearly email addresses
2suffixhigh[2] header 'Mr', values are salutations/titles (Mr)
3firstNamehigh[3] values are common given names (Vinod, Viraf, Rahul, JAYANT, Manish, Nipom)
4lastNamehigh[4] values are family names (Gandhi, Chinoy, Raghavan, KUMAR, Jaiswal, Boruah)
5address1high[5] values contain street/building addresses (E2/506 Bharat Nagar, #218 Rich field appt., etc.)
6address2high[6] values contain secondary address components (342 Grant Road, ORR Marathahalli, etc.)
7cityhigh[7] values are Indian city names (Mumbai, BHARUCH, Bangalore, Guwahati)
8statehigh[8] values are Indian state names (Gujarat, karnataka, Assam)
9countryhigh[9] values are country codes (IND, IN) representing India
10ziphigh[10] values are 6-digit Indian postal codes (400007, 392001, 560037, 781003)
11phonehigh[11] values are 10-digit Indian mobile phone numbers (9322230917, 9820636090, etc.)
12phonehigh[12] values are 10-digit Indian alternate phone numbers (9820636090, 8040998026, etc.); duplicate phone column

Notes: 13 columns total, 12 contain PII. Column [0] is numeric user_id (skip). File is structured user account records from Yatra.com breach with Indian user data. Columns [11] and [12] are both phone numbers (primary and alternate mobile).

90.csv
4 columns · 99,450 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] All values contain @ symbol and match email format (gmail.com, yatra.com, yahoo.in domains)
3firstNamehigh[3] Values are single given names (mohd, Padma, krishna)
4lastNamehigh[4] Values are surnames (faisal, BHIMIREDDY, Ghadhi, kumar)
11phonehigh[11] All values are 10-digit numbers consistent with Indian mobile phone format

Notes: File contains 12 columns. Column [0] is numeric user ID (skip). Column [2] contains a salutation/title (Mr.); mapping it as suffix would be appropriate, but values are minimal and, per instructions, only PII fields are included, so it is excluded. Columns [5-10] are empty in all sample rows (skip). Breach context confirms Indian travel booking platform with Indian addresses, emails, and phone numbers.

91.csv
4 columns · 99,483 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and match email format (gmail.com, yahoo.co.in, rediffmail.com, sbi.co.in)
3firstNamehigh[3] values are common Indian given names (Chandrasekhar, Umesh, Rohit, Anju, Acharya)
4lastNamehigh[4] values are common Indian surnames (Reddy, Pati, Bhardwaj, Sawai, Ashwin)
11phonehigh[11] values are 10-digit numbers matching Indian mobile phone format (7066820264, 9989131113, 8468895380, etc.)

Notes: 13 columns total; 4 contain PII. Column [0] appears to be numeric user IDs (skip). Columns [2], [5], [6], [7], [8], [9], [10], [12] are empty or contain no readable data. Breach context confirms Indian travel platform with Indian addresses, phone numbers, and email providers.

92.csv
10 columns · 99,463 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs and email domains (gmail.com, yahoo.co.in); clear email addresses
2suffixhigh[2] values are 'Mr.' and empty strings; salutation/title indicator
3firstNamehigh[3] header position and values are common given names (Anita, Amrita, lindaj, Satish, ilangovan)
4lastNamehigh[4] header position after firstName; values are surnames (Behera, Bais, bos, Sasanapuri, ramakrishnan)
5address1high[5] street/mailing address field; contains 'Yatra Office' and mostly empty values
7cityhigh[7] values are Indian city names (Gurgaon); city field in standard address structure
8statehigh[8] values are Indian state names (Haryana); state field in standard address structure
9countryhigh[9] values are 'India'; country field
10ziphigh[10] values are Indian PIN codes (6 digits: 122003); postal code field
11phonehigh[11] values are 10-digit Indian mobile numbers (9925224870, 9952400470, etc.); mobile phone field

Notes: Yatra travel booking platform breach (2019). 13 columns total, 10 contain PII. Column [0] is numeric user ID (skip). Columns [6] and [12] are empty (skip). Indian addresses confirmed by PIN codes, Indian phone numbers, Indian cities/states, and Indian email providers.

93.csv
4 columns · 99,346 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] Values contain @ signs and are valid email addresses (gmail.com, rediffmail patterns consistent with Indian breach context)
3firstNamehigh[3] Header absent but values are common given names (Vamsi, SOWMYA, Gokul, sheetal, Debiprasanna)
4lastNamehigh[4] Header absent but values are common surnames (Krishna, KUMAR, Sampath, sharma, Mahanta)
11phonehigh[11] 10-digit numbers matching Indian mobile phone format (9790712356, 9840402744, etc.)

Notes: 13 columns total. [0] is internal user_id (skip). [2], [5], [6], [7], [8], [9], [10], [12] are empty or non-PII (skip). Yatra-2019 breach context confirmed: Indian phone numbers, Indian email providers, consistent with travel platform user records.

94.csv
12 columns99,143 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header is email address, all values contain @ symbol and are valid email addresses
2suffixhigh[2] header 'Mr'/'Ms', values are salutations (Mr, Ms)
3firstNamehigh[3] values are given names (Arumugaperumal, Raviprakash, Amin, mayank, Kanupriya, jayesh)
4lastNamehigh[4] values are family names (chelliah, Baruah, MERCHANT, shah, Bhardwaj, parmar)
5address1high[5] header contains street address, values are building/street addresses with street numbers and names
6address2high[6] values are neighborhoods/sub-localities (Ambur, SANTACRUZ, Janakpuri, Mirchandani Gardens)
7cityhigh[7] header 'city', values are Indian city names (Ambasumdaram, guwahati, MUMBAI, New Delhi)
8statehigh[8] header 'state', values are Indian state names (Tamilnadu, Assam, Maharashtra, Delhi)
9countryhigh[9] all values are 'IND' (India country code)
10ziphigh[10] header contains postal code, values are 6-digit Indian PIN codes
11phonehigh[11] values are 10-digit Indian mobile numbers (9787005915, 9864012243, etc.)
12phonehigh[12] alternate phone number column, values are phone numbers (some blank, some 10-digit)

Notes: 13 columns total. Column [0] is a numeric user ID (skip). All other columns map to PII fields. This is Indian travel booking platform data with complete address records and phone numbers.

95.csv
13 columns99,152 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] header '1993765', values are sequential numeric IDs (1993765-1993770), characteristic of auto-generated user_id/record_id
1emailhigh[1] header '[email protected]', all values contain @ symbol, valid email addresses from Indian providers (yahoo.in, gmail.com, rediffmail.com)
2suffixhigh[2] header 'Mr', values are salutation titles (Mr, Mrs), classic suffix/title field
3firstNamehigh[3] header 'VASUDEVAN', values are common given names (Biju, Dinesh, vivek, PARAMITA, Satpal)
4lastNamehigh[4] header 'KRISHNAN', values are surnames (vatavathi, kumar, malik, DAS, Singh)
5address1high[5] header 'AIKKARASERIL PALLARIMANGALAM...', values are street addresses (full mailing addresses with building/locality details, Indian format)
6address2high[6] header 'AIKKARASERIL PALLARIMANGALAM...', values are secondary address components (locality/area names: REYAMI, LODI ROAD)
7cityhigh[7] header 'MAVELIKARA', values are Indian city names (ERNAKULAM, chennai, new delhi, BHILAI, Mumbai)
8statehigh[8] header 'Kerala', values are Indian state names (Kerala, Tamilnadu, Delhi, Chhattisgarh, Maharashtra)
9skiphigh[9] header 'IND', all values are 'IND' (country code constant), non-variant flag field
10ziphigh[10] header '690107', values are 6-digit Indian postal codes (PIN codes: 682011, 600104, 110085, 491006, 400028)
11phonehigh[11] header '9328112572', values are 10-digit Indian mobile numbers (9746678670, 9840310039, 9891398331, 9163669229, 9827873816)
12phonehigh[12] header '' (empty), values are 10-digit Indian phone numbers (9746678670, 9962443677, 9830168804, 9819217173, 9796742972), alternate/secondary phone field with some empty cells

Notes: 13 columns total. Yatra-2019 travel booking platform breach containing Indian user account records. All phone numbers are Indian format (10 digits starting with 9). All addresses, cities, states, and postal codes are Indian. 11 PII columns mapped (email, suffix, firstName, lastName, address1, address2, city, state, zip, phone, phone). Column [9] is constant country code 'IND' (skip). Column [0] is sequential numeric user ID (skip).

96.csv
12 columns99,410 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs; email addresses from Gmail, IIT, corporate domains
2suffixhigh[2] header/values are 'Mr' — salutation/title prefix
3firstNamehigh[3] values are given names: ARUN, VSKmurthy, mithilesh, MURALI, virendra
4lastNamehigh[4] values are family names: KUMAR, Balijepalli, prasad, dharan, pandey
5address1high[5] values are street addresses: '1A kannan avenue', 'IIT Bombay', '46assamrifle', 'C-1/202, NILGIRI GARDEN'
6address2high[6] values are secondary address components: apt/suite/unit lines like 'c/o99a.p.o', 'SECTOR 24, AMRA ROAD, C.B.D.BELAPUR'
7cityhigh[7] values are Indian cities: chennai, Mumbai, tezpur, delhi, NAVI MUMBAI
8statehigh[8] values are Indian states: Tamil Nadu, Maharashtra, Assam, Delhi
9countryhigh[9] values are country codes: IND, IN (India)
10ziphigh[10] values are 6-digit Indian postal codes: 600063, 400076, 784110, 110096, 400614
11phonehigh[11] values are 10-digit Indian mobile numbers: 7299926890, 9967178363, 9864806036, 9970050045
12phonehigh[12] values are 10-digit Indian phone numbers (alternate); same format as [11]

Notes: 13 columns total. Column [0] is numeric user_id (internal identifier, skipped). Columns [11] and [12] both map to phone as they represent primary and alternate phone numbers. All data is Indian in origin (Yatra travel platform breach). No DOB, SSN, username, or password fields present in sample.

97.csv
12 columns51,771 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' contains @ symbol; all values are email addresses from Indian providers (yahoo.co.in, yahoo.com, gmail.com)
2suffixhigh[2] header 'Mr'; all values are salutations/titles (Mr, Mrs, Ms)
3firstNamehigh[3] header 'MOHIT'; values are given names (Mohit, Sonaal, shubhamoy, wrickban, vinoy, Satish)
4lastNamehigh[4] header 'GUPTA'; values are family names (Gupta, Goel, Mukherjee, mazumdar, vincent, Mishra)
5address1high[5] street addresses: 'E-165 GREATER KAILASH PART-2', 'Virat House', 'madhpur', 'flat no.-104.shahi palace'
6address2high[6] secondary address lines: '49/1 Kishanpur', 'near kulharia complex.ashok rajpath.' — continuation of address blocks
7cityhigh[7] Indian cities: NEW DELHI, Dehradun, hyderabad, patna
8statehigh[8] Indian states/territories: DELHI, Uttarakhand, 'Andaman and Nicobar Islands', Bihar
9countryhigh[9] country codes: 'IN', 'IND' (India)
10ziphigh[10] Indian PIN codes: 110048, 248001, 123456, 800001
11phonehigh[11] Indian mobile numbers (10 digits starting with 9): 9810779217, 9411570350, 8939900215, 9830411474
12phonehigh[12] alternate phone numbers (mix of mobile and landline): 9810779217, 1147020804, 2677097, 7923290121

Notes: Yatra 2019 travel booking platform breach. 13 columns total, 12 contain PII. Column [0] (user IDs) is skipped as an internal identifier. Data is entirely Indian: addresses, phone numbers, email providers (yahoo.co.in, gmail.com), cities (Delhi, Dehradun, Hyderabad, Patna), and states (Delhi, Uttarakhand, Bihar) confirm Indian user base. Columns [5] and [6] form complete street addresses. Columns [11] and [12] are both phone fields: [11] appears primary, [12] alternate contact.

98.csv
11 columns10,440 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' is an email address; all values contain @ and are valid email addresses
2suffixhigh[2] header 'Mr'; values are salutations (Mr, Mr., etc.)
3firstNamehigh[3] header 'shaikh'; values are common given names (ramasamy, jamie, sridher, sandeep, gurpreet)
4lastNamehigh[4] header 'mdrehan'; values are surnames (muthukrishnan, chithung, kumar, singh)
5address1high[5] values are street addresses (v.p.o- wadala veeram, Yatra.Com 1101-03, New BEL Road, 203-A sector)
7cityhigh[7] values are Indian cities (amritsar, gurgoan, bangalore, hsr, bhopal)
8statehigh[8] values are Indian states (Punjab, Karnataka, Haryana, Madhya Pradesh)
9countryhigh[9] values are country codes/names (India, IND)
10ziphigh[10] values are 6-digit Indian postal codes (143601, 560094, 125005, 462023)
11phonehigh[11] values are 10-digit Indian mobile numbers (9892667779, 9840013010, 8586054672, 9597165868)
12phonehigh[12] values are 10-digit Indian mobile numbers; alternate phone field

Notes: 13 columns total, 11 contain PII. Column [0] is a numeric user ID (skip). Column [6] is empty (skip). Yatra 2019 Indian travel booking breach with typical user profile data: name, email, address, and phone numbers.

99.csv
10 columns96,105 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] All values contain @ symbol and email domain patterns (gmail.com, yahoo.com, etc.)
2suffixhigh[2] Values are salutations/titles: 'Mr', 'Mr.', 'Mrs' — matches suffix field type
3firstNamehigh[3] Values are individual given names: 'SUBASH', 'prasad', 'kareem', 'Talat', 'jinesh'
4lastNamehigh[4] Values are family names: 'SUBRAMANIYAM T', 'jadhav', 'mullah', 'Taj', 'sheth'
5address1high[5] Values are street addresses: 'At Post:- Sakurede' — primary address line
6ziphigh[6] Values are 6-digit Indian PIN codes: '412303', '8600014818' appears to be phone mixed in row
7cityhigh[7] Values are Indian city names: 'chennai'
8statehigh[8] Values are Indian state names: 'Tamilnadu', 'Maharashtra'
9countryhigh[9] Values are 'India' — country field
11phonehigh[11] All values are 10-digit Indian mobile numbers: '8123740534', '9003054257', '9821842298', '9654990353', '9901665970'

Notes: 13 columns total, 10 contain PII. Column [0] contains numeric user IDs (skip). Columns [10] and [12] are empty/non-PII (skip). Column [6] appears to have mixed data (both 6-digit PIN and 10-digit phone in different rows); it is mapped as zip based on header position and primary values. Breach context confirms Indian travel booking platform with user account records.

tuser__1_.csv
12 columns198,934 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] All values contain @ symbol and match standard email format (gmail.com, yahoo.com, tatkal.in, etc.)
3suffixhigh[3] Header 'Mr.' and values are salutation/title markers (Mr, Mr.)
4firstNamehigh[4] Common given names (Deepak, Milind, Jaya, Upendra, Sharique) matching Indian naming patterns
5lastNamehigh[5] Surname values (Kulkarni, Krishnan, Singh, Ahmad) following first names in typical name order
6address1high[6] Street addresses with building/house numbers and locality names (484/75 Sawali, F-13 Ravindra Bhawan, HN 933 Sunita Nagar)
7address2high[7] Secondary address components (neighborhood/area names: Mitramandal Colony, Surabhi Nagar, Wadgaonsheri)
8cityhigh[8] Indian city names (Pune, Haridwar, Kochi)
9statehigh[9] Indian state abbreviations and names (Maharashtra, Uttarkhand)
10countryhigh[10] Country codes (IND, IN) indicating India
11ziphigh[11] Indian postal/PIN codes (411009, 247667, 411014) — 6-digit format standard for India
12phonehigh[12] Indian mobile phone numbers (10 digits, starting with 7-9 as per Indian telecom standards: 9428330969, 9765361020, 9811178409, 9820296136)
13phonehigh[13] Alternate/secondary Indian mobile phone numbers (10-digit format: 2024459307, 9446740168)

Notes: Yatra.com travel platform breach (2019). 14 columns total, 12 contain searchable PII. Column [0] is numeric user ID (skip). Column [2] appears to be internal username/platform identifier (skip). File has no header row. All addresses, phone numbers, and email providers confirm Indian origin as documented in breach context.

tuser__2_.csv
10 columns99,035 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header is email address, values contain @ signs and are clearly email addresses (yahoo.co.in, gmail.com)
2suffixhigh[2] header 'Mr', values are salutations/titles (Mr)
3firstNamehigh[3] header 'shyamal', values are common given names (Vijay, jitendra, Bhalchandra, ajay, Aditya)
4lastNamehigh[4] header 'bandyopadhyay', values are surnames (Vijay, jain, Badve, oberoi, Aditya)
5address1high[5] header 'P-12 nilachal complex...', values are street addresses (P-12 nilachal complex, 504 Shanti Kamal Bldg, House No-2328)
6address2high[6] header blank, values appear to be secondary address info (Dr. B. A. Road, Chinchpokli)
7cityhigh[7] header 'kolkata', values are Indian city names (Mumbai, tilak nagar, Mohali)
8statehigh[8] header 'West Bengal', values are Indian states (Maharashtra, Delhi, Punjab)
10ziphigh[10] header '700103', values are 6-digit Indian PIN codes (400012, 110018, 160065)
11phonehigh[11] header '9546274404', values are 10-digit Indian mobile numbers (9833222314, 9819739937, 8939136619)

Notes: 13 columns total, 10 contain PII. Column [0] is numeric user_id (skip). Column [9] is country code 'IND' (skip). Column [12] is empty (skip). All PII fields mapped including Indian addresses, phone numbers, and email addresses consistent with Yatra.com travel booking platform breach context.

tuser__3_.csv
12 columns99,150 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] Values contain @ symbols and match email format ([email protected], [email protected], etc.)
2suffixhigh[2] Header 'Mr', consistent salutation/title across all rows
3firstNamehigh[3] Values are given names (SWAMINATHAN, Venugopal, aswin, asif, TANMOY, Chelaram)
4lastNamehigh[4] Values are family names (LANKA, Thangamuthu, fernandis, masud, CHOWDHURY, Choudary)
5address1high[5] Street addresses (#28-7-6 ANNADANA SAMAJAM ROAD, 24 Bharathi Nagar, orthodox church center sector 10 a, BARRACKPORE)
6address2high[6] Secondary address component (Opp: AMBEDKAR BHAVAN ARUNDELPET, Kovaipudur)
7cityhigh[7] Indian city names (VIJAYAWADA, Coimbatore, navi mumbai, KOLKATA)
8statehigh[8] Indian states (Andhra Pradesh, Tamilnadu, Maharashtra, West Bengal)
9countryhigh[9] Country codes (IND, IN) representing India
10ziphigh[10] Indian PIN codes (520002, 641042, 400703, 700122)
11phonehigh[11] 10-digit Indian mobile numbers (9347406048, 9894019632, 9210920151, etc.)
12phonehigh[12] Alternate phone numbers, 10-digit format matching Indian numbering, some empty cells allowed

Notes: 13 columns total, 12 contain PII. Column [0] (numeric IDs like 2569869) is skipped as internal user_id. Data confirmed Indian travel booking platform (Yatra) with Indian addresses, phone numbers, email providers (yahoo.co.in, rediffmail.com), and city/state references.

tuser__4_.csv
12 columns99,200 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header '[email protected]' contains @ signs, all values are valid email addresses
2suffixhigh[2] values are 'Mr', 'Ms' — salutation/title indicators
3firstNamehigh[3] values are given names: 'Venugopal', 'Manoj', 'Molle', 'Anil', 'sateeshkumar' — first name column
4lastNamehigh[4] values are family names: 'Joshi', 'Sethi', 'kumar', 'Kulkarni' — last name column
5address1high[5] values are street/building details: 'sotla', 'Shop No 5', 'delhi', '201 Batavia Chambers' — primary address line
6address2high[6] values are secondary address components: 'Sector 14', 'Kumarakrupa Road' — secondary address line
7cityhigh[7] values are Indian city names: 'hoshiarpur', 'Gurgaon', 'Delhi', 'Bangalore'
8statehigh[8] values are Indian states: 'Punjab', 'Haryana', 'Delhi', 'Karnataka'
9countryhigh[9] values are country codes: 'IND', 'IN' — country identifier
10ziphigh[10] values are 6-digit postal codes: '144210', '122001', '110001', '560001' — Indian PIN codes
11phonehigh[11] values are 10-digit phone numbers: '9716203040', '7666519936' — mobile numbers
12phonehigh[12] values are 10-digit phone numbers: '9999410009' — alternate/secondary phone

Notes: Yatra.com 2019 breach. Column [0] is numeric user ID (skipped). All other columns contain valid PII. Indian addresses, email providers (gmail.com, yahoo.co.in), and phone format (10 digits starting with 7-9) confirm Indian travel booking records.

tuser__5_.csv
12 columns98,921 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ symbols and email domain patterns (gmail.com, yahoo.co.in, hotmail.com)
2suffixhigh[2] header/values are 'Mr', 'Mrs' — salutations/titles
3firstNamehigh[3] values are given names (Anjana, Pranay, viswanathan, sajad, Gopal)
4lastNamehigh[4] values are family names (Naskar, Sinha, vijaya, sofi, Raju)
5address1high[5] values are street addresses (J.P.Nagar, F 401 SBOQ)
6address2high[6] values are secondary address components (Sector 17, Vashi)
7cityhigh[7] values are Indian city names (Bangalore, Navi Mumbai)
8statehigh[8] values are Indian state abbreviations/names (Karnataka, Maharashtra)
9countryhigh[9] values are country codes (IN, IND) — all India, consistent with breach context
10ziphigh[10] values are Indian PIN codes (560078, 400705)
11phonehigh[11] values are 10-digit Indian mobile numbers (8600520494, 9433498635, etc.)
12phonehigh[12] values are 10-digit alternate phone numbers (4422243847)

Notes: 13 columns total. Column [0] is numeric user ID (skip). All addresses, phone numbers, and email domains are Indian in origin, consistent with Yatra 2019 breach context. Suffix field indicates formal salutation data.

tuser__6_.csv
12 columns98,887 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] values contain @ signs, clearly email addresses from Indian providers (rediffmail.com, gmail.com, hotmail.com)
2suffixhigh[2] header 'MR', values are titles (Mr, MR)
3firstNamehigh[3] values are common given names (RAHUL, Jagadish, Krishnamurthy, shaikh, bhanumurthy, Rajeev)
4lastNamehigh[4] values are surnames (NIGAM, Prabhala, Viswanathan, sayeed, govindu, Sharma)
5address1high[5] values contain street addresses and building details (Basement 25Vishwas Market, 267 jawahar nagar, 7/104 Prithvi Brahmand)
6address2high[6] values contain secondary address components (Ghodbunder Road, landmarks, street names)
7cityhigh[7] values are Indian city names (Lucknow, moulali, THANE, Bangalore, nagpur)
8statehigh[8] values are Indian states (Uttar Pradesh, Andhra Pradesh, MAHARASHTRA, Karnataka)
9countryhigh[9] values are country codes (IN, IND) indicating India
10ziphigh[10] values are 6-digit Indian PIN codes (226020, 500040, 400607, 560032)
11phonehigh[11] values are 10-digit Indian mobile numbers (9415049287, 9903138911, 9654426143)
12phonehigh[12] values are 10-digit Indian mobile numbers, alternate phone field (9905417398, 9945567054, 9441231084)

Notes: Yatra 2019 breach. Column [0] is numeric user_id (skip). All other columns contain searchable PII from Indian travel booking records.

tuser__7_.csv
12 columns98,763 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] Values contain @ signs and match standard email format (gmail.com, rediffmail.com, yahoo.com)
2suffixhigh[2] Header is 'Mr', all values are titles/salutations
3firstNamehigh[3] Header appears to be first name position, values are common Indian given names (Siddhartha, Vanshika, SATYENDRA, etc.)
4lastNamehigh[4] Header appears to be last name position, values are common Indian surnames (Bhargava, Saksena, JAIN, SUBRAMANIAM, etc.)
5address1high[5] Values match street address format (D-1, 11A 3rd Cross 1st Main, D-No 38-34-1 marripalem, Milan apartment thatipur)
6address2high[6] Values are locality/area names (Defence Colony, KHB Colony Basaveshwara Nagar, daman)
7cityhigh[7] Values are Indian city names (New Delhi, Bangalore, Visakhapatnam, Gwalior, daman)
8statehigh[8] Values are Indian states/territories (Delhi, Karnataka, Andhra Pradesh, Madhya Pradesh, Daman and Diu)
9countryhigh[9] Values are country codes/names (IND, India) consistent with breach context
10ziphigh[10] Values are 6-digit Indian PIN codes (110024, 560079, 530018, 474011, 396210)
11phonehigh[11] Values are 10-digit Indian mobile numbers (8126968932, 8600600843, 9879966325, etc.)
12phonehigh[12] Values are 10-digit Indian phone numbers (alternate/secondary phone, some 7-10 digits)

Notes: Yatra 2019 breach data. Column [0] is numeric user ID (skip). 12 PII columns identified mapping to complete user profiles with email, name, address, and phone contact information. All addresses and phone numbers confirmed as Indian per breach context.

tuser__8_.csv
11 columns98,470 rows

File structure

Format: CSV·Delimiter: comma·Has header: yes·Quote: "

Source columnMapped fieldConfidenceLLM assessment
1emailhigh[1] header 'email' (inferred), values contain @ signs and email domains (gmail.com, yahoo.co.in, whsmith.co.uk)
2suffixhigh[2] header 'salutation/title', values are honorifics (Mr, Mrs, Dr)
3firstNamehigh[3] header 'first_name' (inferred), values are common given names (Mathew, Shailendra, Ankur, SANTOSH)
4lastNamehigh[4] header 'last_name' (inferred), values are surnames (Edes, Singh, Bhatnagar, BHARTI, ARUL)
5address1high[5] header 'address1' (inferred), values are street addresses (65 Riverpark Drive, 229 G.Floor Mandakini Enclave, Flat 131)
6address2high[6] header 'address2' (inferred), values are secondary address components (C-58/21 Sector 62, BY PASS ROAD)
7cityhigh[7] header 'city' (inferred), values are Indian city names (New Delhi, Noida, MUMBAI, MADURAI, surat)
8statehigh[8] header 'state' (inferred), values are Indian states (Delhi, UP, Maharashtra, Tamilnadu, Gujarat)
9countryhigh[9] header 'country' (inferred), values are country codes (GB, IND, IN)
10ziphigh[10] header 'zip/PIN' (inferred), values are postal codes (SL7 1QT, 110019, 201301, 400005): mostly 6-digit Indian PIN codes, with an occasional UK postcode
11phonehigh[11] header 'mobile' (inferred), values are 10-digit Indian mobile phone numbers (7792166476, 9873338291, 8010609426)

Notes: Yatra 2019 breach (Indian travel platform). 13 columns total, 11 contain PII. Column [0] is numeric user_id (skip). Column [12] contains incomplete phone numbers/data (skip). Most addresses, emails, and phone numbers are consistent with Indian geography and telecom formats; a small number of records carry UK details (country code GB, postcode SL7 1QT, a .co.uk email domain).

tuser__9_.csv
13 columns98,929 rows

File structure

Format: CSV·Delimiter: comma·Has header: no·Quote: "

Source columnMapped fieldConfidenceLLM assessment
0skiphigh[0] Sequential numeric user IDs (2816970, 2816971, etc.) — internal identifier pattern
1emailhigh[1] Values contain @ symbols and email domain patterns (gmail.com, yahoo.co.in, verizon.net, hotmail.com)
2suffixhigh[2] Header 'Mr' and values are salutation/title prefixes (Mr, Mr, Mr) — common Indian address convention
3firstNamehigh[3] Common given names (SESHADEV, Anil, Luke, Ambarish, Sahil, mohit) — typical first name values
4lastNamehigh[4] Family names (RAY, Puri, Hay, Bhusari, Ravjit, wasson) — typical last name values
5address1high[5] Street-level addresses (332 khajor road, 155 Victoria Drive, Flat B-104 Prime Heights, house no. 323 sector-71 mohali, c-9/6 sector-8 rohini)
6address2medium[6] Suburb/locality names (Jimboomba, Sus Road Pashan) — secondary address component
7cityhigh[7] City/locality names (karol bagh, Logan City, PUNE, mohali, delhi) — Indian and international cities
8statehigh[8] State/province names (Delhi, Queensland, Maharashtra, Punjab) — Indian states and international provinces
9countryhigh[9] Country codes and names (India, IND, AU, IN) — country identifiers
10ziphigh[10] Indian postal codes and international postcodes (110005, 4280, 411045, 160071, 110085) — ZIP/PIN patterns
11phonehigh[11] Indian mobile numbers (9439559613, 9810013834, 9797186126, 9890923172, 9930022115) — 10-digit pattern starting with 9, typical Indian mobile format
12phonehigh[12] Alternate phone numbers (9814361138, 1127941766) — secondary contact numbers

Notes: Yatra.com 2019 breach. 13 columns total, 12 contain searchable PII (email, salutation, names, addresses, country, phone). Column [0] is numeric user ID (skip). Suffix column indicates formal Indian address format. Multiple phone columns indicate primary and alternate contact numbers common in Indian travel booking data.

Articles about this breach

yatra.com. Shadow Identity