Why You Need a Data Flow Diagram
- Written by: Sammi LaBello and Tony Schwarz
If you’ve never taken the time to create a data flow diagram (DFD) of your system, you should make that investment. The value doesn’t lie simply in having the diagram in your files. The actual process of creating the diagram will almost certainly reveal new things about how your data ecosystem works.
That’s why many IT auditors and frameworks (such as PCI DSS) consider network diagrams and DFDs so critical that they expect to see them as part of the audit process. Cybersecurity consultants frequently use DFDs to drive detailed conversations about the environment, knowing that the clarity they provide can be a deciding factor in your network’s security.
In this post, we’ll explain how a DFD strengthens your overall cybersecurity and how to get started creating one.
What a DFD Can Tell You
A DFD shows exactly where your data is going so that you can make sure every step is secure. Diagramming your process may show, for example, that your data is far more widespread and replicated than you previously thought. Replicating data in multiple places, or passing it through several resources while in transit, can represent a security risk you need to analyze.
In large or complex organizations, the leadership team may rely upon a DFD to help them understand the entirety of what’s going on across what could add up to hundreds of business processes. Some organizations require several DFDs to accurately portray all the operations.
Be careful of telling yourself, “We get a good view of our system through other tools.” Don’t think you’re covered just because someone on the team can verbally explain the system. They’re probably leaving things out, and that person may not be around forever. And don’t count on a typed explanation of the system. A written description almost certainly includes enough ambiguity to let problems hide in the fine print.
Diagramming Third Parties
A thorough diagram follows your data beyond the limits of your own system. Here’s an example of the kind of process red flags that a DFD can reveal:
While making the diagram, you start asking how your payroll processor sends and receives sensitive data. It turns out that they’re sending it over unencrypted channels. If that’s happening, even the best cybersecurity posture on your end won’t protect sensitive data such as employees’ personal information. If client information were compromised through a similar process weakness, your organization could face immediate financial losses and long-term damage to your reputation.
In some cases, HBS consultants create a separate DFD for each third-party system connecting to the client organization. Each diagram clearly shows how data flows into and out of the third party’s environment and where it may need additional protection.
The DFD Creation Process
HBS follows three core steps to build an organization’s DFD, with the details varying based on the organization’s overall cybersecurity maturity.
- Discussion – Leadership, possibly guided by cybersecurity consultants, reviews the potential risks facing the organization. They identify protective systems already in place, cybersecurity protocols, corporate governance, connected vendors, and key business processes. This review helps leadership identify risks, plan mitigating controls, and evaluate whether the remaining risk is acceptable. As you start building the DFD, be sure to look over the results of your most recent risk assessment for information on what you should map.
- Asset Inventory – Next, the organization documents the hardware, software, and data used throughout the business. That includes both the systems that store data at rest and the systems that carry it in transit.
  - Hardware – The asset list should include any physical asset that may store or come in contact with an organization’s data. That includes computers, mobile devices, network equipment, printers, scanners, and more.
  - Software – Every organization has a set of approved applications that typically includes financial applications, fixed contract applications and licensed applications. The software inventory should document each application’s end of life.
  - Data – List the types of data stored, how each type is stored, and who in the organization owns it.
- Develop Diagrams – With all of the information gathered, you can start mapping out the DFD.
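To make the asset inventory step concrete, here is a minimal sketch of how the three inventory lists could be kept as structured records. The class names, field names, and sample entries are hypothetical and chosen only for illustration. They are not a format HBS prescribes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class HardwareAsset:
    name: str
    kind: str  # e.g. "laptop", "printer", "network switch"


@dataclass
class SoftwareAsset:
    name: str
    end_of_life: Optional[date]  # None when no end-of-life date is known


@dataclass
class DataAsset:
    data_type: str
    storage: str  # how the data is stored
    owner: str    # who in the organization owns it


def software_past_end_of_life(software, today):
    """Return the names of applications whose end-of-life date has passed."""
    return [
        s.name
        for s in software
        if s.end_of_life is not None and s.end_of_life < today
    ]


# Hypothetical sample entries.
software = [
    SoftwareAsset("Payroll App", date(2023, 6, 30)),
    SoftwareAsset("Accounting Suite", date(2030, 1, 1)),
    SoftwareAsset("Legacy Scanner Driver", None),
]

print(software_past_end_of_life(software, today=date(2025, 1, 1)))
```

Keeping the end-of-life date next to each application makes it easy to spot unsupported software before it shows up as a node in the diagram.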
DFD Basics
A DFD uses a standard set of symbols to portray information flow within and between network segments as well as through the institution’s perimeter to external parties. A basic Level 0 diagram (often called a context diagram) shows the overall system, while Level 1 diagrams drill down into individual processes.
DFDs should identify:
- Data sets and subsets shared between systems.
- Applications sharing data.
- Classification of data (public, private, confidential, or other) being transmitted.
- How data is identified at rest and in transit.
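The items above can also be captured as data before anything is drawn. The sketch below is one possible way to record each flow and flag the ones that deserve a closer look. The `DataFlow` fields, the flagging rule, and the sample flows are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFlow:
    source: str                 # application or system sending the data
    destination: str            # application or system receiving the data
    data_set: str               # data set or subset being shared
    classification: str         # "public", "private", "confidential", ...
    encrypted_in_transit: bool


def unprotected_sensitive_flows(flows):
    """Return flows that carry non-public data over an unencrypted channel."""
    return [
        f
        for f in flows
        if f.classification != "public" and not f.encrypted_in_transit
    ]


# Hypothetical sample flows.
flows = [
    DataFlow("HR system", "Payroll processor", "employee records", "confidential", False),
    DataFlow("Website", "CDN", "marketing pages", "public", False),
    DataFlow("CRM", "Billing", "client contacts", "private", True),
]

for flow in unprotected_sensitive_flows(flows):
    print(f"{flow.source} -> {flow.destination}: {flow.data_set}")
```

With these sample flows, only the HR-to-payroll flow is flagged, which is the kind of third-party weakness described earlier.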
Who Is Using the Data?
While creating the DFD, you’ll look at how two types of users touch your data:
Employees - Look at the role of each employee involved at the different steps of data flow. All employees have some level of responsibility for information, communication, and reporting between levels. Evaluating employee roles and access helps you spot gaps in the process. For example, organizations frequently realize that they should reduce the network access employees have at several steps, which makes it harder for an attacker who breaches one account to pivot into the larger system.
Vendors - Working with vendors inevitably poses a threat by allowing outside access to internal processes. Consider these factors about vendor access:
- Identification – Create a profile on each vendor with name, address, key contacts, service provided, contract details, and expenses.
- Grouping – Group vendors to determine which ones may be considered “critical.”
- Level of Risk – Does the vendor have access to information whose exposure would seriously harm the business?
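As a rough sketch, the three vendor factors above could be recorded and grouped like this. The `Vendor` fields and the rule for what counts as “critical” are assumptions for illustration only. Your own criteria should come from your risk assessment.

```python
from dataclasses import dataclass


@dataclass
class Vendor:
    name: str
    service: str
    handles_sensitive_data: bool
    business_impact: str  # "low", "medium", or "high" if the vendor were compromised


def is_critical(vendor):
    """Assumed rule: access to sensitive data plus high business impact."""
    return vendor.handles_sensitive_data and vendor.business_impact == "high"


def group_vendors(vendors):
    """Split vendor names into critical and non-critical groups."""
    groups = {"critical": [], "non-critical": []}
    for vendor in vendors:
        key = "critical" if is_critical(vendor) else "non-critical"
        groups[key].append(vendor.name)
    return groups


# Hypothetical sample vendors.
vendors = [
    Vendor("Payroll Co", "payroll processing", True, "high"),
    Vendor("Office Supply Inc", "office supplies", False, "low"),
    Vendor("Print Shop", "document printing", True, "medium"),
]

print(group_vendors(vendors))
```

Vendors in the critical group are natural candidates for their own third-party DFD.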
What Is Happening to the Data?
Determine whether employees, vendors/partners or systems are writing, modifying, storing, or processing data. The answer is crucial to protecting data and mapping out the process with a DFD.
Where Is the Data Going?
Knowing where individuals, resources, and entities are located helps you judge the risk of an interaction. For example, data moving between two resources in the same building may be less risky than data traveling from an office in the United States to resources in a foreign country, or vice versa.
For more information on how to properly establish a DFD, reach out to an HBS consultant to find out how the process can look for your organization.