Data, Not Dollars: The Ongoing Threat of Data Breaches in Web3

Background

Hacks, exploits, ransomware, and all manner of cyber security threats continue to grow in scale and severity. Web3 ecosystems are unique in that they provide malicious actors with a variety of attack surfaces not found in other technologies, including bugs in smart contracts and novel phishing techniques.

However, the story of Web3 security incidents closely mirrors that of other industries. Centralized projects and companies are failing to address the same kinds of security vulnerabilities that non-Web3 companies also miss. We wanted to take a closer look at the history of cyber security incidents against Web3 targets and assess whether past incidents pose an ongoing risk to community members today. To do that, we need to look closely at what makes the security incidents in this report different from incidents that result from exploiting smart contract protocols.

We examined many incidents against Web3 companies going back to 2011 and can roughly classify them into two categories:

Protocol Exploits: Incidents that exploit smart contract code for financial gain
Breaches: Incidents where an attacker breaches the internal network of a target organization and uses the privileges acquired to exfiltrate company data or funds

There are several important distinctions to draw between these two categories in terms of their immediate and long-term risk. Protocol exploits occur within a defined time frame, beginning when an attacker executes the exploit and ending when they drain all available funds, they run out of gas, or the target project ends operations. Some of these incidents can extend for hours or days, with post-incident negotiations extending them further or projects immediately folding afterwards. However, the key point is that these exploits have definitive start and end points.

In contrast, breaches are often ongoing events in which attackers gain access to a network and maintain a presence there. Breaches are also usually defined by the loss of data, which is either used in further attacks or sold on the dark web or online forums.

Network breaches may also result in acute loss of funds; most Web3 organizations are financial entities moving very large sums of money. This makes them a natural target for hackers. Data breaches can be particularly damaging and remain a risk years after they occur, especially if personally identifiable information (PII) is lost during the breach.

With this in mind, we collected a sample of 74 past incidents that we would classify as breaches that pose an ongoing risk to members of the community. The sample only includes incidents where companies had their internal networks breached; it does not include data on protocol exploits. We felt it was important to distinguish between incidents where loss of sensitive data occurred and those where only loss of funds occurred. To assist in assessing the ongoing risk of these breaches, we will highlight breaches whose data is still available either for sale or for free on the dark web or other areas of the clearnet, along with commentary on the accessibility of these platforms.

Data Breaches vs. Loss of Funds

To assess the ongoing risk associated with these incidents we divided them into events defined by:

The loss of data, including PII and internal databases, etc., where the data is theoretically retrievable
Incidents where funds and/or data are lost and the data is no longer retrievable

Incidents considered irretrievable consist primarily of breaches that resulted only in the loss of funds or private keys. Funds lost in these breaches are generally not recoverable, and compromised private keys are of no further use once they are no longer private.

Outlier incidents include events where stolen data was never released, where it was returned, or where it was used for other purposes. For example, in June 2020 Japanese CEX Coincheck was breached, with the PII of more than 200 customers falling into the attacker's hands. The attacker breached Coincheck’s networks and then sent phishing emails from an internal company email address asking customers for PII. No specific database was lost, and the only data lost belonged to customers who responded to those emails.

In another June 2020 incident, Canadian CEX Coinsquare experienced a breach involving the loss of 5,000 email addresses, phone numbers, and home addresses. After some back and forth between the attacker and Coinsquare, the attacker stated they would use the data in SIM-swapping attacks rather than trying to sell it, as this would be more profitable. This type of incident was also categorized as irretrievable.

Of the 74 incidents we identified, we were able to classify 23 as retrievable, roughly 31%. The remaining 51 events are either outliers as described above or are incidents where only funds were lost.
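The classification rule described above can be sketched in a few lines of Python. This is only an illustration of the logic as we read it from the text, not the tooling used for the report: the `Incident` fields and the `classify` and `retrievable_share` helpers are hypothetical names, and the two example incidents are reduced to the two attributes that matter for the split.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    RETRIEVABLE = "retrievable"
    IRRETRIEVABLE = "irretrievable"


@dataclass(frozen=True)
class Incident:
    name: str
    data_lost: bool         # PII, internal databases, etc. left the organization
    data_circulating: bool  # False for outliers: never released, returned, or kept for the attacker's own use


def classify(incident: Incident) -> Outcome:
    """Retrievable only if data was lost AND could plausibly be obtained by third parties."""
    if incident.data_lost and incident.data_circulating:
        return Outcome.RETRIEVABLE
    return Outcome.IRRETRIEVABLE


def retrievable_share(retrievable: int, total: int) -> float:
    """Share of retrievable incidents as a percentage."""
    return 100 * retrievable / total


# Coinsquare (June 2020): data was lost, but the attacker said it would not be sold.
coinsquare = Incident("Coinsquare 2020", data_lost=True, data_circulating=False)
# A hypothetical funds-only breach: nothing to retrieve afterwards.
funds_only = Incident("hypothetical funds-only breach", data_lost=False, data_circulating=False)

print(classify(coinsquare).value)              # irretrievable
print(classify(funds_only).value)              # irretrievable
print(round(retrievable_share(23, 74), 1))     # 31.1
```

Using the counts from the sample, 23 of 74 works out to about 31.1%, which matches the rounded figure quoted above.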

Chart: Retrievable versus irretrievable data for incidents occurring between 2011 and 2023. Source: CertiK

There are a couple of observations worth pointing out here. First, potentially retrievable data loss incidents increased significantly after 2019. This generally aligns with the significant increase in hacks and breaches seen across all industries during the Covid-19 pandemic. Second, the increase in government assistance during this period, some of which made its way into Web3 ecosystems, paired with the 2021 bull run, may have provided attackers with increased ransomware and data sale opportunities.

Where Does Stolen Data Go?

Lost data frequently ends up being sold or dumped either on the dark web (.onion sites) or the clear net. Where the data likely has some financial value (PII and other fraud-enabling data), it is frequently sold on dark web markets but can also be found in Telegram channels. In events where the attacker does not have their demands met (ransomware), data is frequently dropped on paste sites or in hacker forums.

Where data ends up determines the long-term risk it poses to its original owners. If data is dumped on a hacker forum for little or no cost, the relative risk to individuals whose data is exposed is higher than if that data has to be purchased on the dark web. The ongoing accessibility of such websites also plays a role in the long-term risk calculation for victims of data breaches. The following sections take a deeper look at the Web3 data sales we found available in either of these venues.

Online Forums

Online hacking forums have come and gone over the years. Taking into consideration the growth of retrievable data events after 2019, only a handful are worth considering in this context. These include Raid Forums, Breach Forums, and Dread Forums.

Given that our data covers just over a decade of breaches, it is not surprising that multiple breaches cited Raid Forums as one of the go-to forums for dumping and selling breach data. Raid Forums was started in 2015 and operated on the clear net for years. However, in 2022 Raid Forums' domain was seized by US law enforcement in cooperation with Europol.

Image: US and European law enforcement takedown notice on the Raid Forums website

Dread Forums was founded in 2015 and appears to have been active through the end of 2022, though there are numerous indicators on social media that this forum may have also folded. We tried to access both the dark net (.onion) and I2P versions of the forum, but these also no longer appear to work.

Immediately following the shutdown of Raid Forums, Breach Forums was launched. It was the most logical place for users displaced by the Raid Forums seizure. It sported a similar interface, a member reputation scoring system, and a sizable amount of activity, reaching about 60% of the original user base of Raid Forums (approximately 550,000 users). Just one year later, in March 2023, the FBI arrested the person running Breach Forums, Conor Brian Fitzpatrick, and after some internal drama about redeploying the site, it folded.

Less than a week after Breach Forums went down, another replacement appeared, purportedly being run by a self-proclaimed ex-Anonymous hacker named Pirata (@_pirate18). The forum is live but has failed to attract the communities from defunct forums as it only houses 161 members.

Numerous other markets appeared in the last weeks of March to try to capitalize on this vacuum. Some of these appear to be non-functioning, while others are speculated to be run by law enforcement, given its recent success in taking down these types of forums.

Image: VX-Underground list of forums following the closing of Breach Forums. Source: Twitter

We were only able to confirm the presence of Web3 data on one of these forums. ARES Forum has reportedly absorbed some of the activity from the other closed forums, though it’s unclear exactly how much. This forum is alleged to associate with ransomware groups and other malicious actors, and it also runs a public-facing Telegram channel that advertises the data sales available in its locked VIP sales channel. The channel went live on 6 March and launched hundreds of advertisements, including two posts for centralized exchange-related databases.

Image: ARES Forum Telegram channel advertisement for centralized exchange data. Source: Telegram

Taken as a whole, the hacker and data dump forums community is currently dysfunctional. With no clear replacement for legacy forums, and an increased effort on the part of international law enforcement bodies to take these groups down, it is almost certain that forums will not be the avenue of choice for any major data leaks, let alone Web3 leaks, in the near term.

The Dark Web - Data Leaks on .onion Sites

Dark web markets and forums have a long history of being the place where people dump or sell data. These ecosystems face similar challenges to their clear net counterparts, including hostile takedowns by law enforcement, though these appear to be more frequently directed at markets that facilitate drug sales. That said, data leaks do appear to remain accessible, or at the very least advertised, more often on the dark web, even on less well-known markets. This discrepancy appears particularly stark now in the face of a total takedown of the online forums that also hosted this information.

Image: Ledger customer data for sale on a dark web market. Source: Digital Thrift Shop

Recall that in our sample, data was likely retrievable for 23 of the 74 breaches we examined. Of those 23, we were able to find ten active data sale advertisements (43%). This sample is highlighted in our previous chart in green:

Chart: Confirmed instances of breached data found for sale on dark web markets highlighted in green. Source: CertiK
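As a quick check on these proportions, the short Python sketch below recomputes them from the three counts given in the text. The constant names and the `pct` helper are ours, and the share of all 74 breaches is derived arithmetic rather than a figure stated elsewhere in this report.

```python
def pct(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    return 100 * part / whole


TOTAL_BREACHES = 74   # all breaches in the sample
RETRIEVABLE = 23      # breaches where data is theoretically retrievable
ACTIVE_ADS = 10       # retrievable breaches with a live sale advertisement

print(f"{pct(ACTIVE_ADS, RETRIEVABLE):.1f}% of retrievable breaches have a live advertisement")  # 43.5%
print(f"{pct(ACTIVE_ADS, TOTAL_BREACHES):.1f}% of all breaches in the sample")                   # 13.5%
```

Ten of 23 is roughly 43.5%, consistent with the 43% quoted above, and corresponds to about 13.5% of the full sample.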

The addition of paid data sales in this chart indicates a couple of things. First, we were unable to source the data for any breach that occurred after 2021. There is a reasonable possibility, based on the nature of the targets in 2022, that their data would have been on any one of the now defunct forums. However, it’s difficult to confirm this, especially when none of these data sets have appeared in any of the forums that were intended to replace Raid Forums and Breach Forums. Second, these data sets were also notably absent from the dark web markets, where we saw data only from 2019 and prior. This is likely due to the markets where we sourced this data being quite old and less well known. We are unable to assess if this data is actually still available through these vendors, but the advertisements are still live.

Do These Data Breaches Pose Long-Term Risks?

Trying to quantify long-term risk is difficult, but it helps to compare data loss risk to the non-data related incidents in this sample. Remember, we can classify breaches that only resulted in immediate financial loss as lower risk because:

The loss is immediate and we can measure the impact in terms of fiat or cryptocurrency lost
Any data lost in the pursuit of funds is replaceable, meaning that if a breach occurs, private keys, passwords, and privileged network access points must be changed to fix the problem

Breaches that do lose sensitive data, particularly customer data, do pose greater long-term risks:

Much of this data is sold or provided for free on the dark web or the clear net, extending its long-term availability
Individual data points on customers, such as phone numbers, first and last names, addresses, and transaction data, are difficult or impossible to change.
- In the case that someone does change their personal information in light of a breach, the data of every other individual involved in the breach still remains at risk
The impact of such breaches is difficult or downright impossible to measure. Depending on the data lost, a victim can be the target of multiple instances of fraud or none at all.

This is further highlighted by the fact that we found data for sale from a breach in 2014. However, this particular data point further demonstrates the difficult nature of measuring long-term risk. The 2014 hack targeted the now defunct cryptocurrency exchange BTC-E, which was seized by US law enforcement in 2017, effectively making the risk associated with this data loss much lower than others. However, to be clear, there is still the ongoing risk that this data could be matched with data from newer breaches, escalating long-term risk for individuals who have been involved in Web3 over this period of time.

Looking at this space as a whole, it is highly likely that data lost from 2019 onwards (particularly data sets whose sale listings are still easily located on dark web markets) poses the greatest ongoing long-term risk. Anyone impacted from 2022 onwards is almost certainly still at significant risk of their data being usable in any number of fraudulent activities, even if we could not find this data ourselves. Despite many online forums being taken down, one should assume that any data lost, especially from very recent breaches, is likely still available somewhere and can resurface at any time.

Conclusion

The unfortunate truth of the matter is that security breaches have almost become an inevitability. Most people impacted by data breaches have limited means of redress when data is stored and processed by a centralized entity.

You can reduce your risk of exposure by limiting the number of centralized services that you use, including centralized exchanges or entities that KYC their users. Individuals should also use two-factor authentication where possible to help prevent unwanted exchange wallet activity, or the use of PII to access or modify your account details. Depending on the nature of the breach, you may even consider trying to change some of the information exposed in a breach, such as email addresses or phone numbers. Finally, in Web3 data breaches there is the added threat of having your identity doxed if you intend to operate anonymously.

There are additional steps you can take to secure your data and investments. You can reduce the risk to your investments and finances by distributing your assets across self-custody wallets and hardware wallets. You can also secure your data in the following ways:

Limit the number of centralized Web3 investment organizations or exchanges that you share your personal data with
Do not re-use passwords across platforms
Enable two-factor authentication on all of your accounts
Monitor websites that report data breaches, which will tell you if your email address has been involved in a leak
Use credit monitoring services to monitor for attempted identity theft and bank related fraud
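Some breach-reporting services also let you check whether a password has appeared in a known leak without sending the password itself. The Python sketch below illustrates the idea using the k-anonymity scheme of the Have I Been Pwned Pwned Passwords range endpoint (https://api.pwnedpasswords.com/range/ followed by the first five hex characters of the password's SHA-1 hash), which we are using here purely as an example service; it is not referenced elsewhere in this report. The helper names are ours, the code deliberately makes no network request, and the response body in the demo is made-up illustrative data.

```python
import hashlib
from typing import Tuple


def split_sha1(password: str, prefix_len: int = 5) -> Tuple[str, str]:
    """Hash the password locally and split the hex digest into (prefix, suffix).

    Only the prefix would ever be sent to the range endpoint.
    """
    digest = hashlib.sha1(password.encode("utf-8")).hexdigest().upper()
    return digest[:prefix_len], digest[prefix_len:]


def breach_count(suffix: str, range_body: str) -> int:
    """Look up our suffix in a range response made of 'SUFFIX:COUNT' lines."""
    for line in range_body.splitlines():
        candidate, _, count = line.strip().partition(":")
        if candidate.upper() == suffix.upper():
            return int(count)
    return 0  # suffix not listed: password not found in this data set


if __name__ == "__main__":
    prefix, suffix = split_sha1("password")
    # Made-up response body for illustration; a real one comes from the endpoint.
    fake_body = "0000000000000000000000000000000000A:3\r\n" + suffix + ":42\r\n"
    print(prefix)                           # 5BAA6
    print(breach_count(suffix, fake_body))  # 42
```

The design point is that the full hash never leaves your machine: the service returns every known suffix for the five-character prefix, and the match is done locally. Checking whether an email address appears in a breach is normally done through the search page of the breach-reporting site itself.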