A Chinese social media management startup, SocialArks, leaked more than 400 GB of personally identifiable information (PII) belonging to social media users, including celebrities and influencers in the US and around the world. SocialArks obtained the information by scraping data from social media platforms, a controversial practice that the affected networks prohibit.
The firm describes itself as a “cross-border social media management company dedicated to solving the current problems of branding, marketing, marketing and social customer management in China’s foreign trade industry.”
The data leak affected 214 million social media users on Facebook, Instagram and LinkedIn. More worrying was the presence of private personal information that the victims had never published on their public social profiles.
Safety Detectives researchers discovered the exposed data as part of a cybersecurity project to find vulnerabilities that pose risks to the general public.
Sensitive information exposed from an insecure Elasticsearch database
Safety Detectives researchers discovered the information in a misconfigured Elasticsearch database, stored without password protection or encryption, during a routine IP address check for insecure databases. The researchers noted that anyone who knew the server's IP address could have accessed the information.
The head of Safety Detectives’ cybersecurity team, Anurag Sen, said the exposed Elasticsearch database contained 408 GB of data comprising 318 million records obtained from the social profiles of 214 million Facebook, Instagram and LinkedIn users.
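To make the misconfiguration concrete, the following is a minimal sketch of how an administrator might test whether a node they operate answers requests without credentials. It is not the method the researchers used. The function names and verdict strings are hypothetical, and the sketch assumes that a node with security enabled rejects anonymous requests with HTTP 401 or 403.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Map the HTTP status of an anonymous request to a rough verdict."""
    if status == 200:
        return "open"        # the node answered without any credentials
    if status in (401, 403):
        return "protected"   # authentication or authorization is enforced
    return "unknown"


def check_own_node(base_url: str, timeout: float = 5.0) -> str:
    """Send one unauthenticated GET to a node you administer (assumption)."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; reuse the classifier.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

A verdict of "open" on an internet-facing address would correspond to the situation described above, where knowing the IP address was enough to read the data.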
Tencent hosted the vulnerable server in Hong Kong. The server was segmented into indexes to efficiently store data obtained from different sources.
SocialArks suffered a similar breach in August 2020, exposing data from 150 million social profiles on LinkedIn, Facebook, and Instagram.
Leaked information obtained from data scraping that violates user terms of service
Researchers from Safety Detectives confirmed that the information was obtained by scraping data from the affected social media platforms. The researchers also noted that the practice is unethical and violates the policies of Facebook, Instagram, and LinkedIn.
Data scraping involves the use of automated bots capable of extracting information from web pages without human interaction. The practice is legal in most cases, but can be abused by various rogue actors to copy large amounts of information. Some websites have a policy that prohibits the practice. Others employ various countermeasures, such as the use of captchas, which could also be defeated by scraping robots.
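As an illustration of what "extracting information from web pages" means mechanically, here is a minimal sketch that parses a hard-coded HTML snippet using only the Python standard library. The markup, the `data-field` attribute, and the class name are hypothetical and chosen for this example. The sketch makes no network requests and does not reflect any specific tool used by SocialArks.

```python
from html.parser import HTMLParser


class ListingFieldParser(HTMLParser):
    """Collect text from elements carrying a data-field attribute (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # name of the field whose text we expect next

    def handle_starttag(self, tag, attrs):
        name = dict(attrs).get("data-field")
        if name:
            self._current = name

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


# Static sample standing in for a public job listing page.
SAMPLE_PAGE = """
<div class="listing">
  <span data-field="title">Data Analyst</span>
  <span data-field="location">Hong Kong</span>
  <span>unlabelled text</span>
</div>
"""

parser = ListingFieldParser()
parser.feed(SAMPLE_PAGE)
print(parser.fields)  # {'title': 'Data Analyst', 'location': 'Hong Kong'}
```

A real scraper repeats this step across many pages, which is why the scale of collection, rather than the parsing itself, is what raises policy and privacy concerns.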
Typical legal applications of data scraping include the gathering of information on booking sites and job portals for analytical purposes.
However, extracting personal information and combining it with data stored in other locations is unethical and a concern for businesses and social media users.
Possession of highly personalized information could enable social engineering attacks through messages tailored to individual victims. It also creates the possibility of identity theft to commit financial fraud in online banking systems.
Controversial practices, such as data mining, put professional network users in a dilemma over whether to provide the personal information necessary for business and employment or limit their social profiles to protect their privacy.
Private personally identifiable information leaked from public social profiles
The leaked information allowed anyone to determine the victims' full names, country of residence, workplace, job title, subscriber details, social profile link, and contact information. The information also contained profile pictures, Messenger IDs, usernames of other linked social media accounts, follower counts, frequently used hashtags, and comment counts, among other details.
Additionally, the leak revealed personal data of Instagram and LinkedIn users, including phone numbers and email addresses, even for users who never publicly provided such information on their social profiles.
It is unclear how SocialArks obtained private data that should not be accessible through regular scraping of public social profiles.
In total, 11,651,162 Instagram social profiles and 66,117,839 LinkedIn users were leaked, while 81,551,567 Facebook user profiles were exposed. Another batch containing 55,300,000 Facebook profiles was removed within hours of discovery.
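The per-platform figures can be checked against the headline number of 214 million with a short sum. This sketch simply adds the counts as reported. The dictionary labels are chosen here for readability, and it assumes the four batches do not overlap.

```python
# Profile counts as reported in the article.
reported = {
    "Instagram": 11_651_162,
    "LinkedIn": 66_117_839,
    "Facebook": 81_551_567,
    "Facebook (batch removed after discovery)": 55_300_000,
}

total = sum(reported.values())
print(f"{total:,}")  # 214,620,568
```

The total of roughly 214.6 million is consistent with the "more than 214 million" figure, provided the removed Facebook batch is counted.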
SocialArks never responded to messages from the researchers, but secured the database upon notification.