So What is the Cyber Threat from China, Exactly? - GDIL | Global Disinformation Lab at UT Austin

Part 1 of a 5 Part Series

Written by Kevin Lentz

Despite the abundance of cyber threat data freely available today, the exact nature of the cyber threat from China remains difficult to assess rigorously in academia. This is due to the general lack of a consistent and trustworthy source of nation-state-level cyber threat data. Entities with visibility on the topic, such as government agencies, threat intelligence companies, and occasionally news outlets, have overlapping restrictions and responsibilities shaping what data they can release and how. The result is a noisy open-source data pool composed of curated annual cyber threat reports, occasional press releases supporting indictment cases, and news reports of varying quality. With no Bureau of Cyber Statistics on the way, scholars hoping to assess the cyber threat from China, or from anywhere, must wade through this noisy data pool and cobble together a dataset from scratch.[1]

Luckily, a small team of scholars and scholar-practitioners has been doing precisely this, on a global scale, for nearly a decade. The Dyadic Cyber Incident and Campaign Data (DCID) dataset is one of two peer-reviewed, publicly available datasets tracking nation-state-level cyber conflict.[2] It tracks cyber conflicts between rival nation-state dyads from 2001 to 2020 across numerous variables.[3] It is, by its very nature, broad and comprehensive.

From this larger dataset we generated a smaller dataset limited to Chinese-initiated cyber conflicts targeting our countries of interest.[4] While still certainly incomplete,[5] the resulting dataset offers a clear basis on which to build a more concrete analysis of the cyber threat from China. This blog post is the high-level overview of that process, some general findings, and our objectives. It marks the start of a multi-part series covering some of the insights we gleaned from this DCID-derived dataset on a country-by-country basis.

At a high level, one clear takeaway from the analysis is that the Chinese Communist Party (CCP) has a distinct cyber strategy for each target country in the region. Though they share common features, the pattern of operations against the Philippines doesn’t look like the pattern of operations against Australia, which itself doesn’t look like the pattern of operations against Japan, and so on for each country. In the case of the Philippines, for example, maritime disputes in the South China Sea have been the dominant context in which cyber operations have occurred. Meanwhile, for Australia, cyber operations have consisted of long-term espionage targeting government and higher education institutions tied in part to influence operations outside of the cyber domain.

Another clear takeaway is that cyber conflict in the region remains low-intensity and mostly consists of espionage, in contrast to the ‘cyber apocalypse’ picture that some mainstream reporting occasionally paints. CCP cyber operations depart from this general espionage to engage in harassment only in response to acute crises, like the 2016 arbitral tribunal ruling in the South China Sea case brought by the Philippines under the UN Convention on the Law of the Sea, and the Senkaku dispute with Japan in 2012.

Additionally, CCP cyber operations tend to support the goals set out in high-level CCP strategic documents and speeches, and therefore interlock with other contemporaneous CCP state actions as part of a whole-of-state effort. Here, too, the general pattern breaks to facilitate short-term responses to acute crises, sometimes undertaken by ‘patriotic hackers’ and other loosely directed forces.

Lastly, while operations remain low-intensity and espionage-focused, the DCID-derived dataset shows a trendline of gradually increasing severity of cyber operations from China against the target countries over the past two decades.[6] Key in this regard are simultaneous cyber operations against multiple countries, the extent of infiltration and exfiltration, the sensitivity of selected targets, and a greater frequency of operations aimed more directly at producing coercive effects.
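As a rough illustration of how such a trendline can be computed, the sketch below sums severity per year and fits a straight line through the annual totals. It is a minimal example under stated assumptions, not the workflow used for this series (which relied on Excel): the `year` and `severity` field names and all sample values are hypothetical, and the ordinary least-squares fit is our assumption about what a linear trendline means here.

```python
from collections import defaultdict

# Hypothetical incidents: field names and values are invented for illustration.
incidents = [
    {"year": 2010, "severity": 2},
    {"year": 2010, "severity": 3},
    {"year": 2011, "severity": 2},
    {"year": 2012, "severity": 4},
    {"year": 2012, "severity": 3},
]


def annual_severity(incidents):
    """Sum the severity scores of all incidents in each year."""
    totals = defaultdict(int)
    for incident in incidents:
        totals[incident["year"]] += incident["severity"]
    return dict(sorted(totals.items()))


def ols_slope(totals):
    """Slope of the least-squares line through (year, total severity) points."""
    years = list(totals)
    values = list(totals.values())
    n = len(years)
    mean_year = sum(years) / n
    mean_value = sum(values) / n
    sxx = sum((y - mean_year) ** 2 for y in years)
    sxy = sum((y - mean_year) * (v - mean_value) for y, v in zip(years, values))
    return sxy / sxx


totals = annual_severity(incidents)
slope = ols_slope(totals)  # positive slope = rising annual severity
print(totals, slope)
```

Note that years with no recorded incidents are simply absent from the totals in this sketch; a fuller treatment would decide whether to count them as zero.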

Our goal with this analysis is to contribute to the ongoing conversation on the most effective form of cyber policy for securing cyber and geopolitical stability, globally and in the Indo-Pacific, both as a single country and together with our allies and partners. A subsidiary goal is to bring attention to the efforts of the DCID team, other similar projects, and the general state of reliable cyber conflict data in the academic domain. In this spirit, we gladly welcome any feedback or comments through the contact portal at www.gdil.org.

Note on Methodology

To generate the Chinese-initiated-only dataset from the DCID, we took the following steps:

Load the DCID dataset into the Excel Power Query Editor.
Filter ‘dyadpair’ to keep only our desired country dyads.
Filter ‘initiator’ to remove incidents not initiated by China (country code 710).
Eliminate numerous variables (e.g., ‘interactionenddate,’ ‘ransomware,’ ‘3rd party initiator’) to create a tidier dataset.
Load the resulting dataset.

Data visualization and exploration of this dataset were conducted using Excel pivot tables and charts.
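For readers who prefer a scripted equivalent, the filtering steps in this note can be sketched in plain Python. This is only an illustration of the logic, not the Power Query workflow we actually used. The column names ‘dyadpair’, ‘initiator’, ‘interactionenddate’, ‘ransomware’, and ‘3rd party initiator’ come from the steps above; the `severity` column, the dyad code strings, and every row value are hypothetical placeholders.

```python
CHINA = 710  # initiator country code given in the methodology

# Hypothetical rows standing in for the loaded DCID table.
rows = [
    {"dyadpair": "2710", "initiator": 710, "interactionenddate": "2012-09-01",
     "ransomware": 0, "3rd party initiator": 0, "severity": 3},
    {"dyadpair": "2710", "initiator": 2, "interactionenddate": "2013-01-10",
     "ransomware": 0, "3rd party initiator": 0, "severity": 2},
    {"dyadpair": "710740", "initiator": 710, "interactionenddate": "2012-10-05",
     "ransomware": 0, "3rd party initiator": 0, "severity": 2},
    {"dyadpair": "365710", "initiator": 710, "interactionenddate": "2014-06-20",
     "ransomware": 0, "3rd party initiator": 0, "severity": 1},
]

# Assumed encoding of the dyads of interest (illustrative only).
wanted_dyads = {"2710", "710740"}
drop_columns = {"interactionenddate", "ransomware", "3rd party initiator"}


def derive_subset(rows, wanted_dyads, initiator_code, drop_columns):
    """Keep rows in the wanted dyads that were initiated by initiator_code,
    then remove the columns listed in drop_columns."""
    subset = []
    for row in rows:
        if row["dyadpair"] not in wanted_dyads:
            continue  # dyad is not one of the countries of interest
        if row["initiator"] != initiator_code:
            continue  # incident was not initiated by China
        subset.append({k: v for k, v in row.items() if k not in drop_columns})
    return subset


subset = derive_subset(rows, wanted_dyads, CHINA, drop_columns)
for row in subset:
    print(row)
```

Keeping the dyad filter and the initiator filter as separate checks mirrors the two filtering steps above and makes it easy to see which condition excluded a given incident.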

Limitations and Distortions

It is worth noting a few limitations and distortions in the resulting dataset. Firstly, the U.S.-centric nature of the initial DCID research means that cyber conflict in other countries is not as comprehensively covered. This is due to resource limitations and results in a view of our countries of interest that is incomplete. The resulting analysis should therefore be taken only as an indication of the general character of Chinese-initiated cyber conflicts and not as a definitive assessment.

Secondly, we chose to eliminate the interaction end date of the resulting cyber conflicts to have a more easily visualized dataset. The tradeoff for legibility is that the duration of the resulting cyber incidents is lost. In reality, cyber incidents are not discrete one-day events and can be broken down into various phases that can span months if not years. Accordingly, it is worth stressing that the resulting dataset gives only a general indication of the temporal sequence of conflicts across the years and does not provide a definitive account of the exact time frame of cyber conflicts.
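To make the tradeoff concrete, the short sketch below shows the kind of duration figure that is no longer recoverable once the end date is dropped. The ‘interactionenddate’ name comes from the DCID variables mentioned above; the `interactionstartdate` name, the ISO date format, and the sample dates are assumptions made for illustration.

```python
from datetime import date


def duration_days(start_iso, end_iso):
    """Number of days between two ISO-formatted (YYYY-MM-DD) dates."""
    return (date.fromisoformat(end_iso) - date.fromisoformat(start_iso)).days


# Hypothetical incident record with both dates still present.
incident = {
    "interactionstartdate": "2015-01-01",
    "interactionenddate": "2015-03-02",
}

days = duration_days(incident["interactionstartdate"],
                     incident["interactionenddate"])
print(days)
```

Without the end date, each incident collapses to a single point in time, which is why the dataset indicates only the sequence of conflicts and not their length.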

Thirdly, it is worth noting that cyber conflict is still a relatively young phenomenon. As the frequency and intensity of cyber conflict have increased over the years, we expect that media attention to the issue has increased as well. It is therefore possible that trends in the data are caused in part by greater attention to the topic from the broader community.

[1] https://cyberscoop.com/data-gap-bureau-of-cyber-statistics/

[2] A European research consortium recently launched the ‘European Repository on Cyber Incidents’ which mirrors some of the work of the DCID but also departs in other ways.
https://www.lawfareblog.com/quantifying-cyber-conflict-introducing-european-repository-cyber-incidents

[3] A further update is forthcoming.

[4] The U.S., Japan, South Korea, the Philippines, Australia, Singapore, and Taiwan. See below for the methodology.

[5] The DCID team discusses the United States-centric bias of the dataset, for instance. See methodology for more on this.

[6] A general emergent feature of cyberspace noted independently and not strictly regarding China by Healey and Jervis (2023), ‘The Escalation Inversion and Other Oddities of Situational Cyber Stability’. Trendline produced by summing severity of annual cyber operations against all target countries over the time frame.