March 2, 2023

3 Questions to Ask When Evaluating a Data Discovery Solution


The modern world is fueled by data. It is constantly being created, collected, and stored; the average person generates 1.7 MB of data per second. But as data protection and compliance mandates such as GDPR and CCPA become mainstream and cybersecurity initiatives become part of political platforms, the general public is increasingly aware of where their personal and sensitive data lives and how it’s being used.

Data drives business. Data-driven organizations are more likely to disrupt and outperform those that are not. And yet organizations leave up to 73 percent of their data unused in analytics and decision making. Data is critical to successful business operations, whether that’s planning for the future, understanding the past, or avoiding the reputation and revenue loss that follows a data breach.

But in order to properly use and protect data, you have to be able to find it first.

Do You Really Know Where All Your Data Is?

Data discovery is what enables an organization to identify, catalog, and classify business-critical and sensitive data so it can be governed for meaningful purposes with increased transparency. Knowing what data you have and where it is stored allows you to not only use the data you have, but protect it as well.

Consider this: Sensitive data is organized into a report on a specific platform. While the data may be protected by passwords and policies within the original platform, what happens when that report is exported for analysis? It may be sent to team members through email. The report, with all its sensitive data, may be saved to a desktop by an employee who means no harm but wants to keep working through the data while offline and unable to reach the network. The work is completed and the final analysis is saved back to the network, but the original downloaded data still exists on the employee’s computer, unprotected. That data may then be acquired by threat actors through a cyber attack or even theft of the device. Or the company may be dinged during a compliance evaluation when the auditor discovers sensitive data sitting unprotected in a location that was considered out of scope.

There are multiple reasons to employ data discovery; at the end of the day, it’s simply good for business.

Finding the Right Discovery Solution

The market is replete with data discovery solutions. So how do you decide which one is right for your organization?

It starts with being able to answer these three questions:

  1. How accurately does a solution identify sensitive data?
  2. Can the discovery find data anywhere across your organization?
  3. How well does discovery integrate with the cybersecurity and data protection systems you already have in place?

How Accurately Does a Solution Identify Sensitive Data?

We believe you should have complete confidence that what your discovery solution identifies as sensitive is, in fact, sensitive, without adding manual hours to verify. It’s not just about finding a 16-digit number and calling it a credit card. It means verifying that the string of numbers actually is a credit card number.
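To make the difference between "a 16-digit number" and "a credit card number" concrete, here is a minimal sketch of one widely used check, the Luhn checksum. It is an illustration only, not a description of how any particular product works; the function names and the candidate pattern are assumptions chosen for this example.

```python
import re

# Illustrative pattern: 16 digits, optionally separated by spaces or hyphens.
CANDIDATE = re.compile(r"\b(?:\d[ -]?){15}\d\b")


def luhn_valid(candidate: str) -> bool:
    """Return True if the digits in `candidate` pass the Luhn checksum."""
    digits = [int(ch) for ch in candidate if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Keep only the 16-digit candidates that also pass the checksum."""
    return [m.group() for m in CANDIDATE.finditer(text) if luhn_valid(m.group())]


if __name__ == "__main__":
    sample = "Card 4111 1111 1111 1111, order 1234 5678 9012 3456"
    print(find_card_numbers(sample))
```

In the sample text both numbers match the 16-digit pattern, but only the first passes the checksum. A checksum alone is still a weak signal, which is why format and context checks matter as well.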

At PKWARE, we accomplish this with discovery that checks patterns in strings, in grammar, and in context to drive confidence in the value of the discovery. Patterns in strings means that our discovery can handle data elements expressed in different formats. Patterns in grammar allow our solution to differentiate when the word “April” is used as a name, street address, or calendar month. And patterns in context means that our data discovery can identify an element, such as a ZIP code, that is only meaningful when it appears in the vicinity of other elements. These built-in checkpoints produce a confidence score so users can easily decide what the next steps for that data should be. They eliminate false negatives and dramatically reduce false positives. Together, these checks work in concert to better find, and thus protect, every piece of data that exists in your organization.
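The idea of context-driven confidence can be sketched in a few lines. The example below is a simplified assumption, not PKWARE's scoring model: the cue words, the 40-character window, and the point values are all hypothetical and exist only to show how nearby text can raise or lower confidence in a five-digit match.

```python
import re

# A five-digit number, optionally followed by a four-digit extension.
ZIP_PATTERN = re.compile(r"\b\d{5}(?:-\d{4})?\b")

# Hypothetical context cues; a real solution would use far richer signals.
CONTEXT_CUES = re.compile(r"\b(?:zip|postal|address|street|avenue)\b", re.IGNORECASE)


def score_zip_candidates(text: str, window: int = 40) -> list[tuple[str, int]]:
    """Return (candidate, confidence percent) pairs for ZIP-like numbers."""
    results = []
    for match in ZIP_PATTERN.finditer(text):
        start = max(0, match.start() - window)
        end = min(len(text), match.end() + window)
        # Look only at the text surrounding the candidate, not the candidate itself.
        nearby = text[start:match.start()] + " " + text[match.end():end]
        cues = {cue.lower() for cue in CONTEXT_CUES.findall(nearby)}
        # Illustrative scoring: a low base score, raised by each distinct cue.
        confidence = min(100, 20 + 40 * len(cues))
        results.append((match.group(), confidence))
    return results


if __name__ == "__main__":
    print(score_zip_candidates("Ship to 123 Main Street, Milwaukee, WI 53202"))
    print(score_zip_candidates("Invoice total was 53202 units"))
```

The same five digits score higher in the shipping address than in the invoice line, because only the first has address-like text nearby.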

Can the Discovery Find Data Anywhere across Your Organization?

Data ends up everywhere in an organization. Purposefully, it can be stored on-premises, in the cloud, on file servers, databases, data warehouses, and data lakes. As noted above, it also ends up in places it’s not supposed to be, such as desktops and local drives. Not only does data in unexpected places create a security concern, it can also cause issues in adhering to data compliance mandates. For instance, PCI DSS version 4.0 includes a new requirement (12.10.7) for incident response procedures that are initiated “upon the detection of stored PAN anywhere it is not expected.” Which raises the question: How do you alert for something you’re not scanning for because it’s not in scope?

At PKWARE, real-time discovery runs automatically or in scheduled scans across a myriad of platforms, including all common user technologies, whether the organization considers the location in scope for compliance or not. When data is discovered, our solution can stream its events in syslog and other common formats, allowing organizations to properly alert on sensitive data discovery. Data discovery events can then be handled the way security teams manage existing incidents, without the need for new or additional applications or complexity.
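For readers unfamiliar with what a syslog-formatted discovery event might look like, here is a minimal sketch that builds a message following the RFC 5424 layout. The field names, the application name "discovery-demo", and the choice of facility and severity are assumptions made for illustration; they do not reflect the actual event schema of any product.

```python
from datetime import datetime, timezone
from typing import Optional


def format_discovery_event(
    host: str,
    path: str,
    data_type: str,
    confidence: int,
    facility: int = 16,  # local0
    severity: int = 4,   # warning
    when: Optional[datetime] = None,
) -> str:
    """Build an RFC 5424-style syslog line for a hypothetical discovery event."""
    when = (when or datetime.now(timezone.utc)).astimezone(timezone.utc)
    priority = facility * 8 + severity
    timestamp = when.strftime("%Y-%m-%dT%H:%M:%SZ")
    message = (
        f'sensitive data found type="{data_type}" '
        f'path="{path}" confidence={confidence}'
    )
    # Layout: <PRI>VERSION TIMESTAMP HOSTNAME APP-NAME PROCID MSGID STRUCTURED-DATA MSG
    return f"<{priority}>1 {timestamp} {host} discovery-demo - - - {message}"


if __name__ == "__main__":
    print(format_discovery_event("laptop-042", "C:/Users/demo/report.csv", "PAN", 95))
```

Because the event is an ordinary syslog line, it can be forwarded to whatever collector or SIEM the security team already uses and then routed and alerted on like any other incident.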

How Well Does Discovery Integrate with What You Already Have in Place?

Plenty of software solutions are more than happy to tell you how things need to be done when working within their confines. Organizations already have policies and processes in place for working with data; creating new or additional steps that lengthen the time it takes to accomplish a project could prompt employees to find workarounds that sidestep data security. This could lead to issues such as saving data exports to desktops and other places it doesn’t belong.

At PKWARE, we believe data should be the star of the show. Our discovery solution works with whatever you already have in place, whether that is process, policy, or even other cybersecurity software solutions. Existing processes don’t need to change: PKWARE integrates directly into what you’re already doing and helps you get more value from your existing tools. With hundreds of connections available, our discovery solution will help you find what’s unique and protect what’s important, all without changing existing workflows.

Data is the lifeblood of any organization, so it’s essential that it be protected. And that protection starts with knowing what data you have and where it lives. PKWARE can help. Request your personalized demo to find out how.
