What is data loss prevention (DLP) and how can it help your business?


What is DLP and why do organizations need it?

For more than 100 years, the way data is stored and accessed has shifted gradually. Information was first printed and stored in physical files and filing cabinets, then digitized and stored in digital files and folders on hard drives, and more recently uploaded and stored in the cloud. This progression in data storage and accessibility tends to yield more agile communication and has the following advantages for end users:

  • Digital data occupies less physical space.
  • Publishing and distributing digital documents cost less than physical ones.
  • Digital documents tend to be more environmentally friendly than printed ones.
  • With the proper precautions in place, end users can determine what data or files they need from the cloud and then access them as needed from any device.

The benefits of new technology, such as cloud storage, come paired with the challenges of maintaining data security and integrity. These challenges include preventing data loss and disruption and avoiding cyber attacks, data leaks, and similar digital threats.

Egress Software Technologies reports that during 2020, “95% of organizations say that they’ve suffered [some form of] data loss”.1 The report argues that data is the most at risk in email and notes the following statistics about vulnerable data, emails, data loss, and data breaches:

  • 85% of employees are sending more emails.
  • 83% of organizations [are] experiencing email data breaches.
  • 59% of IT leaders report an increase in data loss linked to the pandemic.
  • 68% of IT leaders believe that a future remote and flexible workforce will make it harder to prevent email data breaches.1

Today’s organizations apply data loss prevention (DLP) practices, strategies, and techniques to prevent or otherwise mitigate these threats and to help ensure the protection of their data. Proofpoint notes that “organizations are adopting DLP because of insider threats and rigorous data privacy laws, many of which have stringent data protection or data access requirements”2 and for support with “monitoring and controlling endpoint activities”.2

Techopedia defines DLP as “the identification and monitoring of sensitive data to ensure that it’s only accessed by authorized users and that there are safeguards against data leaks”.3 This definition notes that “the adoption of DLP in 2006 [was] triggered by insider threats [and] more stringent state privacy laws”.3 A means of keeping important or sensitive data secure, DLP helps keep data from being accessed by unauthorized users or otherwise passing over a perimeter gateway device. DLP tools help monitor and manage activities and perform actions like filtering data streams and protecting dynamic data.

Proofpoint offers a simplified definition of DLP, noting that it “makes sure that users do not send sensitive or critical information outside the corporate network”.2 The definition continues, stating that DLP “describes software products that help a network administrator control the data that users can transfer”.2

What are the three areas where DLP provides support?

Digital Guardian notes that DLP helps organizations by providing support in the following three areas:

  1. Personal information protection [and] compliance
  2. Intellectual property (IP) protection
  3. Data visibility4

DLP also provides support in the following additional areas:

  • Insider threats
  • Office 365 data security
  • User and entity behavior analysis
  • Any other emerging threats

How can DLP help with personal information protection and compliance?

Today’s organizations frequently deal with their users’ personal information. This personal data can include everything from email addresses to personally identifiable information (PII), protected health information (PHI), and financial information, such as credit card or online payment account details. This information is sensitive and could cause real damage if it were leaked. To ensure that this data remains safe, these organizations must follow compliance regulations, such as the Health Insurance Portability and Accountability Act (HIPAA), the General Data Protection Regulation (GDPR), and the California Consumer Privacy Act (CCPA). Digital Guardian notes that DLP helps organizations “identify, classify, and tag sensitive data and monitor activities and events surrounding [their users’] data [and to] provide the details needed for compliance audits”.4

How can DLP help with IP protection?

As with PII, organizations that manage data that includes IP, trade secrets, and state secrets must follow certain policies and controls to ensure that this data remains protected against any unwanted access. CSO notes that IP DLP “aims to prevent that data from being pilfered via corporate espionage or inadvertently exposed online”.5 IP protection-based DLP tools can “use context-based classification [to] classify intellectual property in both structured and unstructured forms”5 and otherwise prevent any unwanted exfiltration of this data.

How can DLP help with data visibility?

Before you can develop a strategy to help prevent data loss and unauthorized access to your data, it’s best to first identify where your data is stored, how often it moves, and where it moves to. DLP tools for data visibility help provide a greater overview of your data infrastructure by providing data tracking and insight into how end users interact with your organization’s data.

How does DLP work and what are DLP solutions? 

DLP practices, strategies, and techniques try to answer the following questions: How can we identify sensitive data in need of protection, locate data vulnerabilities, quickly resolve data issues after they’ve been identified, and ultimately prevent data loss? There are several types of DLP solutions, and each one tries to answer these questions differently. The following are four examples of DLP solutions:

  • Network DLP
  • Storage DLP
  • Endpoint DLP
  • Enterprise DLP

These DLP solutions often use agent programs to scan or sort through data, and then locate data that are sensitive or otherwise at risk.

What is network DLP?

In its article, “DLP: What is it and how does it work?”, OSTEC notes that network DLP is “delivered on the software [and] hardware platforms [and is] integrated with data points on the corporate network”.6 After its installation, the network DLP tool works to give a vast overview of all the data that’s traveling through the network. The solution can produce this insight by monitoring, tracing, scanning, and reporting on the content that transitions through the network’s protocols and ports. These reports offer greater insight into your network’s data and help determine what data is being used, who is using it, how it’s being accessed, and where the data travels to and from. 

What is storage DLP?

Storage DLP helps control the flow of data, especially data stored in the cloud, and helps IT support get greater insight into facets of data use, such as what data is being stored and shared, how much data is considered confidential, and how much data is vulnerable. OSTEC states that storage DLP solutions offer visibility into the “confidential files stored and shared by those who have access to the corporate network”.6 This information can be analyzed to assess the vulnerable points of a network and to guide steps for preventing data leakage.

What is endpoint DLP?

Although an increasing amount of today’s data is accessed through the cloud, external storage is likely to remain in use, even if in a more limited capacity. Tools such as external hard drives, flash drives, or similar external storage usually offer a faster means of transferring data than an online-only transfer would. External storage tools like these pose some security risks, usually in the form of data loss or leakage. Endpoint DLP tools are designed to address these issues. OSTEC states that once they’re “installed on all workstations and devices used by company employees, [these tools are used] to monitor and prevent the output of sensitive data by removable devices, sharing applications, or clipboards”.6

What are enterprise DLP and integrated DLP?

Geekflare breaks down the types of data that DLP solutions work to protect, noting that data consists of the following three principal states:

  • Data in use. Includes the following data examples:
    • Active data
    • Data residing in RAM
    • Cache memories
    • CPU registers
  • Data in motion. Data traveling through one of the following network types:
    • Internal and secure network
    • Unsecured public network
  • Data at rest. The opposite of active data, this group refers to data in an inactive state and includes the following data examples:
    • Data stored in a database
    • Data stored in a file system
    • Data stored in a backup storage infrastructure7

CSO notes that enterprise DLP solutions “aim to protect data in all of these states”.5 Geekflare refers to enterprise DLP as a solution that covers “the entire leakage vector spectrum”.7 Conversely, integrated DLP solutions focus on a “single protocol”.7 CSO notes that integrated DLP can also “be integrated into a separate single-purpose tool”.5

What DLP solution aspects does your organization need?

Because there are many different types of DLP solutions, what works best for one organization may not be the best choice for another one. Each organization must consider the following factors when determining their DLP coverage: 

  • Size
  • Budget
  • Types of data, including the range of sensitivity
  • Network infrastructure
  • Various technical requirements

When you determine what coverage or similar option is best for your organization, your assessment should provide the optimal balance of the following DLP solution aspects:

  • Comprehensive coverage
  • Single management console
  • Incident management for compliance
  • Detection method accuracy7

What is comprehensive coverage?

This coverage type offers full DLP coverage. With this option, DLP components provide full network gateway coverage and monitoring of all outbound traffic. This DLP coverage helps stop email data leaks and web and File Transfer Protocol (FTP) traffic leaks. It also provides oversight and helps prevent loss of the organization’s data within its data storage, endpoints, and active data.

What is a single management console?

Geekflare notes that a DLP solution demands “time and effort spent in system configuration [and] maintenance, policy creation [and] management, reporting, incident management [and] triage, early risk detection [and] mitigation, and event correlation”.7 Single management console coverage works best at supporting these demands and reducing risk.

What is incident management for compliance?

Incident management for compliance is a DLP solution aspect that specifies what actions must be taken immediately after a data loss incident occurs. These steps not only help ensure that your organization avoids any fines or legal issues, but also help with data recovery, disaster recovery, and getting your organization back to where it was prior to the disruptive incident as quickly as possible.

What is detection method accuracy?

Detection method accuracy is the DLP solution aspect that helps separate the solutions that work best for your organization from the ones that don’t. DLP technologies frequently rely on a limited set of detection methods for identifying PII, PHI, or similarly sensitive data.

Pattern matching is the most common detection method. Techopedia defines pattern matching [for computer science] as “the checking and locating of specific sequences of data of some patterns among raw data or a sequence of tokens”.8 Despite how commonly used it is, pattern matching can still be inaccurate and result in lengthy incident queues because of false positives. Geekflare argues that the best DLP technologies “should add other detection methods to the traditional pattern matching [to] improve accuracy”.7
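The following Python sketch illustrates why pattern matching alone can produce false positives and how a second detection method can filter them out. It's a minimal example under stated assumptions, not the method of any product cited here: the pattern `CARD_PATTERN` and the functions `luhn_valid` and `find_card_numbers` are hypothetical names chosen for illustration, and the second method is the Luhn checksum that payment card numbers satisfy.

```python
import re

# Hypothetical pattern (an assumption for this sketch): 13 to 16 digits,
# optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:  # double every second digit, counting from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text: str, use_checksum: bool = True) -> list[str]:
    """Return candidate card numbers found in text as bare digit strings."""
    found = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        # Pattern matching alone accepts any digit run of the right length;
        # the checksum discards runs that cannot be real card numbers.
        if not use_checksum or luhn_valid(digits):
            found.append(digits)
    return found


sample = "Order 1234567890123456 was paid with card 4111 1111 1111 1111."
print(find_card_numbers(sample, use_checksum=False))
print(find_card_numbers(sample))
```

With the checksum disabled, the order number is reported alongside the well-known test card number. With the checksum enabled, only the test card number remains, which shortens the incident queue without changing the pattern itself.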

Why is data loss hard to prevent?

The simple answer is that data loss is rooted in human error. Accidents, neglect, or malicious actions that employees or similar end users take can lead to data loss. Consider how easily a keystroke error could attach the wrong file or send PII or trade secrets to the wrong person.

Because it’s based in human error, discovering a reliable and practical solution for preventing a data breach is no simple task for any organization. The COVID-19 pandemic and the global increase of email use have made the threat of human-activated email data breaches an even bigger risk. DLP, and specifically email DLP, has become a critical tool for eliminating or, at least, mitigating this risk.

What are common DLP practices?

In its article, “Email DLP: Everything you need to know”, Egress notes that “traditionally, email DLP software [uses] static rules to stop users from emailing sensitive or confidential data, [helping it protect] organizations from accidentally exposing sensitive data”.9 Applying native security tools in tandem with static DLP tools is another common practice for DLP. IT security or similar IT leaders are frequently responsible for establishing the static DLP rules for their organizations.

How do static DLP rules work and what are their advantages?

When implemented correctly, rigid static DLP rules prevent PII or similarly sensitive data from being emailed at all. Traditional static DLP technology prevents that same PII from being sent to an unauthorized recipient. Egress notes that when an end user attempts to send an email, “the content of messages and files is scanned according to the rules in place”.1 Following the scan, if an email “violates the selected criteria”, the following actions may be taken:

  • The email may be blocked or quarantined
  • The sender may be asked to modify its contents or verify the recipients
  • Encryption might be mandated1
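The scan-then-act flow described above can be sketched as a small rule engine in Python. This is an illustrative sketch only: the rule names, the regular expressions, the `Action` ordering (block is treated as stricter than asking the sender, which is stricter than encrypting), and the function `evaluate_email` are all assumptions made for this example, not the behavior of any specific DLP product.

```python
import re
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    # Values encode an assumed strictness order for this sketch.
    ALLOW = 0
    ENCRYPT = 1      # encryption is mandated
    ASK_SENDER = 2   # sender must modify contents or verify recipients
    BLOCK = 3        # email is blocked or quarantined


@dataclass(frozen=True)
class Rule:
    name: str
    pattern: re.Pattern
    action: Action


# Hypothetical static rules; a real deployment would define its own.
RULES = [
    Rule("us_ssn", re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), Action.BLOCK),
    Rule("account_keyword", re.compile(r"\baccount number\b", re.IGNORECASE), Action.ASK_SENDER),
    Rule("confidential_marker", re.compile(r"\bconfidential\b", re.IGNORECASE), Action.ENCRYPT),
]


def evaluate_email(body: str, rules=RULES) -> tuple[Action, list[str]]:
    """Scan the body against every rule and return the strictest action
    together with the names of the rules that matched."""
    matched = [rule for rule in rules if rule.pattern.search(body)]
    if not matched:
        return Action.ALLOW, []
    strictest = max(matched, key=lambda rule: rule.action.value).action
    return strictest, [rule.name for rule in matched]


print(evaluate_email("Lunch at noon?"))
print(evaluate_email("Confidential: the SSN on file is 123-45-6789."))
```

Every decision in this sketch depends only on the message text. Nothing in the rules considers who the recipient is or whether the attachment is the one the sender intended.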

What are the issues with using static DLP rules?

One of the big issues with static DLP technology and native security tools is that they’re “not capable of detecting context-driven incidents, such as an employee selecting the wrong recipient, attaching the wrong file, [and so on]”.1 Egress continues, stating that “79% of IT leader respondents have deployed static email DLP technology [to] mitigate risk, but they [state that it isn’t] a cure-all for breach prevention and 79% have experienced difficulties resulting from [its] use”.1

The most prevalent issue with static DLP rules is their inflexibility. They’re static and otherwise resistant to change. Egress notes that for IT leaders, there’s a lot of overhead that’s associated with “maintaining static DLP rules to ensure that [DLP technology is] adapted to manage emerging risks”.1 The most prominent example of this fact is displayed by how “37% of respondents said they had to alter rules to make [the technology] more usable, putting productivity ahead of security in a bid to up employee efficiency”.1

There’s an inherent balancing act that IT leaders must maintain. It involves creating a security-rich environment that’s also easy to use, while also maintaining productivity and keeping their internal and external end users happy.

Static email DLP rules are undermined by a general lack of confidence in them: “74% of respondents [believe] the static email DLP tools they use are less than 75% effective”.1 Perhaps more disturbing, the other IT leaders “accept that a minimum of 25% of data loss incidents will be undetected [and] 42% overall say that half of all incidents won’t be detected by the DLP tools they have in place”.1 This profound lack of confidence shows that IT leaders understand both the limitations of static DLP and how it struggles to keep up with erroneous human behavior. While training and continuing education can make users more aware of what actions to take or avoid, and so reduce the probability of data loss, human error can never be fully removed.

What is a solution for the issues associated with static DLP rules?

Intelligent DLP may offer a solution to the issues that affect static DLP rules. Whereas static DLP can’t detect context-driven incidents, intelligent DLP uses contextual machine learning, basing its processes on a thorough analysis of its users’ behavior patterns. These analytics are constantly updated and incorporate each user’s relationship with an email’s senders and recipients. Because of this thorough analysis, intelligent DLP is better able to locate abnormal or otherwise erroneous behaviors that could lead to data loss or security breaches. Because the risk is detected quickly, the user can be notified and given a chance to correct the mistake before the email is sent.
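To make the idea of context concrete, the following Python sketch flags a recipient that a sender has never emailed before and that closely resembles one of the sender's usual contacts. It's a deliberately simple stand-in, not machine learning and not how any intelligent DLP product is known to work: the class `RecipientHistory`, its methods, and the 0.85 similarity cutoff are all assumptions made for illustration, and the similarity check uses the standard library's `difflib.get_close_matches`.

```python
from collections import Counter
from difflib import get_close_matches


class RecipientHistory:
    """Tracks which recipients each sender has emailed (hypothetical helper)."""

    def __init__(self):
        self._counts = {}  # sender -> Counter of recipient addresses

    def record(self, sender: str, recipient: str) -> None:
        """Store one sent email in the sender's history."""
        self._counts.setdefault(sender, Counter())[recipient.lower()] += 1

    def check(self, sender: str, recipient: str):
        """Return a warning string for an unusual recipient, or None."""
        recipient = recipient.lower()
        known = self._counts.get(sender, Counter())
        if recipient in known:
            return None  # the sender has emailed this address before
        # Assumed cutoff: addresses at least 85% similar count as lookalikes.
        lookalikes = get_close_matches(recipient, list(known), n=1, cutoff=0.85)
        if lookalikes:
            return f"{recipient} is new; did you mean {lookalikes[0]}?"
        return f"{recipient} is a new recipient for {sender}"


history = RecipientHistory()
for _ in range(3):
    history.record("ana@example.com", "jo.smith@partner.example")

print(history.check("ana@example.com", "jo.smith@partner.example"))
print(history.check("ana@example.com", "jo.smith@partnr.example"))
```

The first check returns no warning because the address is already in the sender's history. The second address differs by one letter from a frequent contact, so the sketch returns a prompt that the user could see before the email is sent. A static content rule would treat both emails identically.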

“Intelligent DLP can automatically apply the appropriate level of encryption based on email and attachment content and the risk associated with the recipient’s domain”.1 By removing the need for IT leaders to make decisions on what they feel is an appropriate level of encryption, the automation also removes the potential for human error from the process.

Data protection with backup as a service

Data protection with backup as a service (BaaS) is a potential solution to an organization’s DLP wants and needs. It provides backup infrastructure, backup software, managed support, and several other components, and it’s deployable across any one or any combination of the following environments:

  • On-premises data centers
  • Private cloud
  • Public cloud
  • Hybrid cloud

An end-to-end, fully managed data protection solution, data protection with BaaS includes the following components and benefits:

  • Hybrid-cloud and multicloud. This component supports numerous deployment topologies and can consist of a mix of on-prem or private cloud and a variety of major public clouds.
  • Instant restore. This component helps end users bypass the average time needed to restore data. Instead, they’re able to access the backup copy and directly run their workloads while migrating them over to production.
  • Offsite second copy. An offsite second copy of your data helps ensure your cyber resilience. In the event of a total site loss, this offsite second copy greatly accelerates your recovery process and makes your recovery time objective (RTO) much easier to attain.
  • Broad workload support. This component supports several different workload types, including MongoDB and other modern generation databases, and SAP and other traditional enterprise-class applications.
  • Self-service web portal. This component provides end users with application programming interfaces (APIs) for integration with end user automation tools and a bespoke self-service web portal. The web portal streamlines the process of submitting provisioning requests, making service changes, and viewing reports.
  • Cloud object storage. This component provides highly resilient storage at a reduced cost for long-term retention backups. It also maintains shorter-retention operational copies that are stored on a faster disk, allowing for fast restores.