New Zealand Chief of Army Writing Competition: Winner of the WO/NCO Category, December 2021.
THE ‘DIGITAL LOGISTICS CHAIN’ – AN ESSENTIAL REQUIREMENT FOR INFORMATION WARFARE
By Ms D. Kendon
The world produced an estimated 64.2 zettabytes of data in 2020, forecast to reach 181 zettabytes in 2025.1 A zettabyte is one trillion gigabytes. At such an enormous scale, comprehending what that number represents can be difficult. Furthermore, our natural cognitive biases mean we favour the simple and concrete over the complex and abstract.2 When considering a future information warfare capability, there is a risk that we focus on the parts that are easier to understand, such as allocating paralines for S39 staff, and overlook the parts that are more abstract and nebulous, such as data management and analysis. However, this would be a mistake because data management and analysis capabilities will be essential for any future Army information warfare capability, as I will argue. I will use an analogy of the ‘digital logistics chain’ to show how we must establish foundational components to collect, store, analyse and disseminate data, before we can successfully employ information warfare capabilities. Just as we would not expect a manoeuvre element to function without a properly resourced logistics chain, so too we should not expect an information warfare capability to function without an effective ‘digital logistics chain’.
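To make the scale more concrete, the figures above can be worked through as simple arithmetic. The sketch below converts the 2020 estimate into gigabytes and derives the average yearly growth implied by the 2025 forecast. It assumes steady compound growth between the two years, which the source forecast does not claim, and the variable names are illustrative only.

```python
# Figures quoted in the text (zettabytes of data produced worldwide).
data_2020_zb = 64.2
data_2025_zb = 181.0

# One zettabyte is one trillion (10**12) gigabytes.
GB_PER_ZB = 10**12
data_2020_gb = data_2020_zb * GB_PER_ZB

# Overall growth factor, and the average annual rate it implies,
# ASSUMING steady compound growth over the five years 2020-2025.
growth_factor = data_2025_zb / data_2020_zb
annual_rate = growth_factor ** (1 / 5) - 1

print(f"2020 volume: {data_2020_gb:.3e} GB")
print(f"Growth 2020-2025: {growth_factor:.2f}x")
print(f"Implied average annual growth: {annual_rate:.1%}")
```

Under that assumption the volume of data nearly triples in five years, growing by roughly 23 per cent a year.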
Data is the foundation for any information warfare capability
Any future information warfare capability will require robust and effective data management and analysis. Although we do not yet know the exact nature of the Army’s future information warfare capability, it will likely seek to interfere with adversary situational awareness or C2, protect our own situational awareness or C2, or influence other stakeholders, with the intent to give friendly forces decision advantage.3 These all require the timely collection, management and analysis of large volumes of data. To illustrate the data requirements, let’s take the example of interfering with adversary situational awareness. Among other activities, this would involve collecting information about the adversary’s intelligence, surveillance and reconnaissance (ISR) assets. These may number in the thousands due to the use of swarms of autonomous devices and soldier-mounted systems. We would be collecting a vast range of data from our own ISR assets, in a multitude of formats – such as imagery from satellites, telemetry from electronic surveillance, reports from manoeuvre elements, full-motion video from drones, and feeds from open-source reporting. It is important to understand that only a minority of these data sources will look like PowerPoint slides or Word documents. The majority will be beyond human scale to deal with, such as databases with millions of entries, or streams of unstructured code that require processing just to be readable. This information must be collected, stored and analysed before a commander can even consider their potential courses of action. If we want to achieve decision advantage, we need the ability to collect more information from a wider range of sources, and to process it faster, than the adversary can.4 Shared drives and DDMS will not provide this.
Data science hierarchy of needs
What Army requires is a robust data management and analysis capability. Within the data science community there is a commonly accepted ‘hierarchy of needs’, which provides a suitable requirements framework.5 As with Maslow’s Hierarchy of Needs, growth must start at the bottom and progress upwards – we cannot skip any steps.6 The foundation is collection – for Army this would predominantly be collection from ISR assets. Next is a means to store and move this data around: physical pathways and communications infrastructure, such as servers and data centres. Once the data is stored and accessible, it then needs to be put in standardised formats (‘cleaned’) so that it can be analysed. These are the minimum requirements for human analysts to begin providing insight from the data to support effects. If we want to move to machine speed, we need to consider additional requirements. For example, machine learning and artificial intelligence need structured training data. If we try to employ a capability that is reliant on large volumes of reliable data being available at the point of need, it will fail unless we have established the necessary foundations first.
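The ‘cleaning’ step described above can be illustrated with a short sketch. It assumes two hypothetical sources, a drone feed and a patrol report, whose field names, formats and values are invented for this example. It shows each record being converted into one standard schema so that the two can be sorted and analysed together.

```python
from datetime import datetime, timezone

# Hypothetical raw records: every field name and value here is invented for illustration.
drone_record = {"ts": "2021-10-24T03:15:00Z", "lat": "-41.29", "lon": "174.78", "src": "UAS-1"}
patrol_record = {"time": "24/10/2021 02:50", "position": "-41.30,174.77", "callsign": "C/S 21"}


def clean_drone(rec):
    """Convert a drone-feed record into the common schema."""
    when = datetime.strptime(rec["ts"], "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
    return {"time_utc": when, "lat": float(rec["lat"]), "lon": float(rec["lon"]), "source": rec["src"]}


def clean_patrol(rec):
    """Convert a patrol report (assumed here to be recorded in UTC) into the common schema."""
    when = datetime.strptime(rec["time"], "%d/%m/%Y %H:%M").replace(tzinfo=timezone.utc)
    lat, lon = (float(part) for part in rec["position"].split(","))
    return {"time_utc": when, "lat": lat, "lon": lon, "source": rec["callsign"]}


# Once every record shares one schema, records from different sources
# can be combined and ordered on a single timeline.
cleaned = sorted(
    [clean_drone(drone_record), clean_patrol(patrol_record)],
    key=lambda row: row["time_utc"],
)

for row in cleaned:
    print(row["time_utc"].isoformat(), row["lat"], row["lon"], row["source"])
```

Each new source needs only its own small conversion function. Everything further up the hierarchy can then rely on a single, predictable format.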
Figure: The Data Science Hierarchy of Needs. Source: Hackernoon.com
The digital logistics chain
So how is this comparable to a logistics chain? Let’s consider a multi-role battalion group (MRBG) deployed overseas. The commander establishes their plan, and their S4 turns this into the appropriate stores demands (collection). The stores then need to be moved from New Zealand to the overseas location, and require a facility to store them on arrival (movement and storage). Once there, they need to be organised so that each item can be found easily (transformation). Once these steps are complete, operational elements can use these stores to support desired effects. Appropriately trained staff support the system at each step.7 For information warfare, data is like the ammunition, fuel and food that keep the capability going.
To appreciate what our information warfare capability would be like without effective data management, think about what that operation would be like without a good logistics chain. Imagine trying to get stores if there was no transport. Could you find a piece of equipment if everything in the Q-store was left in a pile? How much freedom of action would you have if the logistics chain could only cope with one pallet of stores at a time? What would happen if there were no loggies? How could we operate with partners if parts did not have NSNs? It would not matter how many exquisite weapon systems you had at the front, they would not be effective if they could not get ammunition. Similarly, information warfare capabilities will be ineffective without good data management and analysis.
However, our current data capability is akin to trying to supply a gun battery with dismounted soldiers carrying shells in their packs – it limits us to what we can achieve at human speed and scale. In most units, soldiers store information across multiple shared drives and DDMS. At best, they organise data in folder structures and analyse it using Excel. Data is not discoverable, accessible, or normalised for use in advanced analytics or machine learning. Although doctrine states the requirement for knowledge and information management, we do not give it sufficient staff or resources. Soldiers spend the bulk of their time trying to find and marry up individual pieces of information, rather than analysing it, and cannot work at machine speed. The reality on the ground shows that we need a dedicated data management and analysis capability development effort.
Let operations staff focus on effects, not files
To illustrate a possible future state, let’s consider how improved data capabilities could enhance intelligence preparation for an information operation. Intelligence and planning staff could select their area of operations and target audience on an interactive dashboard, and the system would pull relevant data from a range of automatically-updating feeds. This could include population statistics, social media data, locations of communications infrastructure, and ISR feeds. Automated first-line processing would produce a combined information environment effects overlay, taking care of collation of information so that staff could focus on more advanced analysis. Data on adversary activities – such as previously observed tactics, techniques and procedures, locations and capabilities of assets, indicators and warnings – could be processed using machine learning to produce predicted adversary courses of action. Friendly force planners could then use this information to inform their scheme of manoeuvre. Concurrently with conventional wargaming techniques, they could also wargame their intended plan against an automated system that considered hundreds of variables that affect the performance of an information operation, ranging from the capability of communications infrastructure to crop yields to weather effects. Using both human and machine testing of their plan, commanders could conduct a more effective operation. The key point to note is that robust foundational data collection and management would allow operations and intelligence staff to focus on analysis and planning, rather than searching for and compiling piecemeal information.
Data capabilities need a range of specialists
To support this, we require an appropriate workforce. Just as logistics has many specialist roles, so too do data management and analysis.8 The training requirement for some of these positions means that people will need to enter a training pipeline now to be ready by 2025 (Op PROTECT dependent). However, roles that require less initial training, such as data ‘wranglers’, would let soldiers get on the tools and provide value earlier. They could then work their way up to more specialised positions.
Collection, movement and storage require communications systems operators, information systems operators and systems engineers to build and maintain appropriate systems. The system also requires cybersecurity personnel to ensure that all processes and infrastructure are appropriately hardened. The current Signals trades could do this but would likely need additional personnel. Assuming the current soldier/officer division of roles persists, officers with degrees in data engineering could design the systems and provide leadership over these elements.
Cleaning and normalisation require data ‘wranglers’. Although operational support and information specialists are positioned to fill such a role, they would require additional training, possibly warranting a division of the trade into database managers and CP clerks. SNCOs qualified in database administration and officers with degrees in data architecture or database systems management could provide the necessary direction and leadership.
Once data is collected, stored, and normalised, we require data analysts to make sense of the data and turn it into formats that support command decision-making. We could initially meet this requirement with additional training for intelligence operators, or by creating a specialisation within the intelligence operator trade. However, we should expect data analysis to be of value beyond the intelligence realm, which will likely require data analysts under other capbadges.
Directing the entire analytic effort requires suitably qualified leaders. Data scientists, serving either as general list officers, specialist officers or civilians, could plan an analytic effort from end to end, and provide advanced data analysis. Such people usually hold advanced degrees, and therefore targeted recruiting or training effort will be needed.
Lastly, intelligence and operations staff will need training and education in the management and use of data so that they can integrate it appropriately into their planning processes, and fully exploit tools to work at machine speed. It is no use making these changes if commanders cannot use them.
Conclusion
Already, the scope and scale of data requirements for information operations warrant a comprehensive capability uplift, and this challenge will only continue to grow. Whatever the eventual shape of the future Army information warfare capability, it will not be effective unless it has properly resourced data management and analysis. We will need the infrastructure, workforce and ways of working that allow Army to collect, move and exploit large amounts of data. The success of the future information warfare capability will depend on the quality of our ‘digital logistics chain’.
Footnotes:
1. https://www.statista.com/statistics/871513/worldwide-data-created/, collected 24 Oct 2021. Notably a forecast from 2016 underestimated the 2020 figure by a third – see Bradley M. Knopp, Sina Beaghley, Aaron Frank, Rebeca Orrie, Michael Watson, Defining the Roles, Responsibilities, and Functions for Data Science Within DIA, RAND Corporation, Santa Monica, 2016, p. 1.
2. https://thedecisionlab.com/biases/bikeshedding/, collected 30 Oct 2021.
3. ATP 3-13.1, The Conduct of Information Operations, Headquarters, Department of the Army, Oct 2018.
4. Dr. Christopher Paul, ‘Understanding and Pursuing Information Advantage’, The Cyber Defense Review, vol 5, no. 2, Summer 2020, pp. 113-114.
5. https://hackernoon.com/the-ai-hierarchy-of-needs-18f111fcc007, collected 24 Oct 2021.
6. A. H. Maslow, ‘A Theory of Human Motivation’, Psychological Review, 50, 1943, pp. 370-396.
7. https://www.infogix.com/why-a-data-supply-chain-is-required-in-the-age-of-big-data/, collected 24 Oct 2021.
8. Knopp et al., chap 4.