topsy.com

Mar 15, 2014

100,000 records
1 file
Added May 30, 2026

A database dump from what appears to be Topsy, a social media analytics platform acquired by Apple in 2013. The data contains user account records, including names, email addresses, usernames, profile photo URLs, account metadata, and OAuth tokens for linked Google and Facebook accounts. Records are dated around March 2014, and the Google and Facebook access tokens were live at the time of the breach.

Data found in this dataset

Email, Username, Gender, fullName

Source files

Each file is listed with its column headers and the LLM's field-mapping reasoning, recorded during ingestion.

part2.json
66 columns, 100,000 rows

File structure

Format: NDJSON

Source column | Mapped field | Confidence | LLM assessment
date | skip | high | record creation date, not PII
google.data.ageRange | skip | high | age range metadata
facebook.url | skip | high | profile URL
google.data.profile_url | skip | high | profile URL
microsoft | skip | high | nested provider object (Microsoft OAuth)
twitter.id | skip | high | Twitter user ID
google.data.displayName | fullName | high | display name same as full name
google.id | skip | high | provider user ID
twitter | skip | high | nested provider object
facebook.data.token_expired | skip | high | token metadata
google.data.objectType | skip | high | metadata
google.data.emails | skip | high | nested array; extract individual email values
permissions | skip | high | metadata array
facebook.data.profile_url | skip | high | profile URL
id | skip | high | internal numeric ID with $numberLong type
google.data.screen_name | username | high | social media screen name from Google profile
is_searchable | skip | high | boolean flag
google.data.gender | gender | high | key is 'gender', values are 'male', 'female'
google.data.image | skip | high | image URL
created | skip | high | record creation timestamp
google | skip | high | nested provider object containing OAuth tokens and profile data; nested objects require separate analysis
email_confirmed | skip | high | boolean flag
google.data.profile_img_url | skip | high | image URL
facebook.email | email | high | email within provider object (often blank for Facebook)
google.data.occupation | skip | high | occupation/employment data
name | fullName | high | contains full personal names like 'Van Harold', 'angel sadika khan', 'Lupita Alvarado'
facebook.data.name | fullName | high | full name from Facebook profile
username_changed | skip | high | metadata boolean flag
google.data.language | skip | high | language preference
_id | skip | high | MongoDB internal ID
google.data.email | email | high | email from provider data
updated | skip | high | record update timestamp
google.data.cover | skip | high | profile cover photo metadata
google.email | email | high | email address within provider object
twitter.token | skip | high | OAuth token
twitter.data.screen_name | username | high | Twitter handle
google.data.circledByCount | skip | high | social metric
twitter.secret | skip | high | OAuth token secret
google.data.token_expired | skip | high | token metadata
google.data.url | skip | high | profile URL
top | skip | high | boolean flag, not PII
google.url | skip | high | profile URL, not PII
provider | skip | high | OAuth provider name (google, facebook, twitter, microsoft)
google.data.etag | skip | high | API metadata
google.data.kind | skip | high | schema metadata
origins | skip | high | metadata array
facebook.id | skip | high | provider user ID
categories | skip | high | metadata array
facebook.data.email | email | high | email from Facebook provider data
key | skip | high | internal UUID key
email | email | high | key is 'email', values contain email addresses with @ symbol
subscribe | skip | high | boolean flag
facebook | skip | high | nested provider object; contains OAuth tokens and nested data
verified | skip | high | boolean flag
photo | skip | high | profile image URL, not PII field
google.token | skip | high | OAuth access token, not a PII field type in scope
blacklist | skip | high | boolean flag
facebook.data.token | skip | high | OAuth token
google.data.token | skip | high | OAuth token
google.data.name | fullName | high | full name from OAuth provider data
twitter.data.name | fullName | high | full name from Twitter profile
comment | skip | high | user comment field
google.data.isPlusUser | skip | high | boolean flag
facebook.token | skip | high | OAuth access token
google.data.verified | skip | high | metadata flag
username | username | high | social media handle/account username

Notes: Topsy social media platform dump from March 2014. Data contains OAuth-linked user profiles from Google, Facebook, Twitter, and Microsoft. Primary PII fields are: name (fullName), email, and username (social handle). Nested provider objects (google, facebook, twitter, microsoft) contain OAuth tokens and linked profile data. Extract email and fullName/displayName from nested provider.data objects separately. Keys like 'id' with numeric values or UUIDs are internal IDs and must be skipped. Values in nested 'data' objects should be analyzed independently. Gender values observed: 'male', 'female'. No address, phone, DOB, SSN, or password fields detected in sample.