The role of classification in DLP strategies


SC Magazine released its review of classification solutions and weighed in on the importance of classification in DLP strategies. The full review can be found here.

At the risk of stating the obvious, the core mission of information security is protecting sensitive data. But sensitive data means different things to different people, so the mission starts with accurately classifying data into what is sensitive and what does not pose a risk of harm. Hackers make headlines when they compromise valuable data, be that financial information, personal data, state secrets, or simply embarrassing emails whose publication draws unwanted and damaging public attention.
Is Your Data Classified?
Headlines are not made, nor are fines levied, when data theft by hackers leads to access to, say, an organization’s office supply purchasing data or the monthly condiment consumption in the company cafeteria. Such hacks, if they occur, are certainly unnerving to the IT security department because they reveal exploitable openings, but they don’t make the evening news, nor do they result in financially meaningful damages.

So, if the name of the game is protecting sensitive data (aka valuable data) from unwanted access or distribution, the first step needs to be locating that data. IT departments can’t protect what they can’t find, and in order to protect the data they find that is valuable, they need to also know how to classify what they found into data sensitivity categories that they can then focus their protection strategies on.

For this reason, any protection strategy needs to include not only knowing where the sensitive data is, but also relevant and precise classifications, so that sensitive data can then be monitored and protected. Several classification approaches can apply. Common are military-derived schemes in which data is separated into categories like public, private, confidential, or secret.

However, other classification schemes can address the needs of commercial enterprises, such as classifying data by business impact, that is, by how much post-breach damage its loss could do. For example, losing financial information or healthcare data that falls under compliance regulations can be as costly as losing corporate secrets.

Regulation- or compliance-based classifications are also possible, e.g. classifying data that falls under PCI, HIPAA, SOX or other data privacy regulations. Custom classification schemes are possible as well. The above-mentioned corporate secrets are one example; another is a law firm, where certain clients’ data is sensitive and needs to be found, classified as such, and safeguarded.
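To make the idea of combining regulation-based, impact-based and custom schemes concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the data type names, the `RULES` table, the `Impact` tiers and the `classify` function are hypothetical, and which regulation actually covers which data is a question for an organization's compliance team, not for this table.

```python
from dataclasses import dataclass
from enum import IntEnum


class Impact(IntEnum):
    """Hypothetical business-impact tiers, lowest to highest."""
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass(frozen=True)
class Label:
    regulations: frozenset
    impact: Impact


# Illustrative mapping only (an assumption, not legal guidance):
# data type -> (classification scheme, business impact if lost).
RULES = {
    "card_number": ("PCI", Impact.HIGH),
    "health_record": ("HIPAA", Impact.HIGH),
    "financial_report": ("SOX", Impact.MODERATE),
    "client_matter": ("CUSTOM", Impact.HIGH),  # e.g. a law firm's client files
}


def classify(detected_types):
    """Combine the rules for every data type found in one document."""
    regulations = set()
    impact = Impact.LOW
    for data_type in detected_types:
        if data_type in RULES:
            regulation, rule_impact = RULES[data_type]
            regulations.add(regulation)
            # A document is as sensitive as its most sensitive content.
            impact = max(impact, rule_impact)
    return Label(frozenset(regulations), impact)
```

A document found to contain both a card number and a financial report would come back tagged with both schemes and the higher of the two impact tiers, while a document with no recognized data types stays at the lowest tier.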

Thus, classification schemes are a function of what type of data needs to be protected, as well as what types and magnitudes of consequences are supposed to be prevented, all while allowing the data to be properly used for legitimate business purposes. Data then needs to be classified accordingly, and the more accurate the search and classification algorithms of a classification solution, the more types of sensitive data can be identified and classified accurately.

However, achieving compliance is not the same as achieving data security, and thus additional considerations come into play when thinking about data classification: When and how do you classify sensitive data? How often do you need to classify your data? Who will perform the task of classifying the data?

Though often thrown into the same bucket, monitoring and classifying data as it crosses the perimeter of a network (“data in motion” or DIM) is quite different from classifying data that is stored (“data at rest” or DAR). Data-in-motion-based classification strategies, as implemented in DLP solutions, are often seen as difficult to implement, and they can be ineffective if perpetrators have gained insider credentials.

The decision then needs to be made whether DAR-based classification technologies should complement or even replace DIM-based approaches to ensure that all sensitive data, whether structured or unstructured, is discovered and accurately classified. The recent, well-publicized breaches have highlighted this gap in security strategies that focus predominantly on exfiltration and/or infiltration. If the bad guys manage to penetrate a network perimeter, or sensitive data makes it out, which has increasingly been shown to happen, especially when attackers are credentialed, data-at-rest discovery and classification technologies are a needed additional line of defense to ensure sensitive data does not get into the wrong hands.

When discovery and classification take place in an automated fashion and in near real-time, the window of opportunity between the creation of new sensitive data and its discovery and classification closes. Manual classification approaches, besides being subjective and prone to human error, thus cannot alone be the solution for sensitive data classification; in some cases, the time lag between the creation of sensitive data and its manual discovery and classification can be months.

For automated discovery and classification solutions to be effective as complementary or even alternative security strategies, they have to be highly accurate, with low or zero rates of both false negatives and false positives. That accuracy depends on the ability to discover and classify all data locations and all data types and formats through highly precise search and classification algorithms.
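The two error rates mentioned above can be computed from a labeled test scan. The sketch below uses the standard definitions (false positives divided by all truly non-sensitive items, false negatives divided by all truly sensitive items); the function name and the sample counts are made up for illustration and are not measurements of any product.

```python
def error_rates(true_positives, false_positives, true_negatives, false_negatives):
    """Return (false_positive_rate, false_negative_rate) for one scan.

    false positive rate = FP / (FP + TN): clean items wrongly flagged
    false negative rate = FN / (FN + TP): sensitive items that were missed
    """
    actual_negatives = false_positives + true_negatives
    actual_positives = true_positives + false_negatives
    fpr = false_positives / actual_negatives if actual_negatives else 0.0
    fnr = false_negatives / actual_positives if actual_positives else 0.0
    return fpr, fnr


# Hypothetical scan of 1,000 files, 100 of which really contain sensitive data.
fpr, fnr = error_rates(true_positives=95, false_positives=10,
                       true_negatives=890, false_negatives=5)
print(f"false positive rate: {fpr:.3f}, false negative rate: {fnr:.3f}")
```

In this made-up scan, 5 of the 100 sensitive files are missed, so a breach could still expose them, while 10 of the 900 clean files are flagged and would cost reviewers time.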

Once implemented, such data-at-rest discovery and classification solutions have two principal benefits. First, by precisely locating and tagging sensitive data, post-breach losses can be minimized because the bad guys can’t steal what they can’t find. Second, they allow organizations to target their security spend more efficiently: encryption technologies, for example, can be focused on the locations where sensitive data actually resides, and egress controls can be tuned to monitor the types of sensitive data that are found to be the prevalent sources of risk for a security organization.

So, sensitive data classification, when done accurately and in real time, will reduce both the residual, non-zero risk of post-breach damages and the security investment required to protect against sensitive data loss.

Do you have regulatory data? You do now!


Recently I sat down with Renee Murphy and our own Chief Counsel, Neil Stelzer, to discuss regulatory topics, including the impact of the 7th Circuit effectively lowering the threshold needed to bring a class action lawsuit in a data breach. The ruling states that harm occurs the moment that a data breach occurs, not if or when the data is used to commit fraud. On the heels of that ruling, the 3rd Circuit has now upheld that the FTC can sue companies for being breached.

The ruling confirms that the FTC does have the authority to enforce its regulations against every company doing business in the United States. In this case the regulation is as simple and broad as ‘having to invest adequate resources in cybersecurity.’ The full statement from the ruling says: “A company does not act equitably when it publishes a privacy policy to attract customers who are concerned about data privacy, fails to make good on that promise by investing inadequate resources in cybersecurity, exposes its unsuspecting customers to substantial financial injury, and retains the profits of their business.”

New York Times Article and App Quantifies Consumer Sensitive Data Exposure


Today, Josh Keller, K.K. Rebecca Lai and Nicole Perlroth published a fantastic interactive piece in the New York Times that directly points to the very serious nature of identity theft. The article features an app that prompts readers to indicate whether or not they’ve provided sensitive data to 26 well-known organizations (retailers, healthcare organizations, social media sites, etc.) that experienced a recent data breach. Upon completing the survey, readers see a tally of the number of instances in which hackers have potentially seen their sensitive data.

Visualizing actual numbers that directly relate to individual sensitive data exposure is a perfect way to illustrate how often consumers are victimized. Perhaps more alarming is the fact that only 26 organizations are included. Another site that tracks similar information and allows its database to be searched is Troy Hunt’s ‘have i been pwned’ site, which lists several dozen more. It is safe to assume that this is the tip of the iceberg, given the almost daily incidents of data breaches.

The Census Bureau Was Hacked. Did Anyone Notice?


The U.S. Census Bureau confirmed that it suffered a data breach involving compromised “non-confidential” data, such as employee names, email addresses, phone numbers, etc. This, of course, is a big deal. However, against the backdrop of the Office of Personnel Management incident or last week’s breach of Ashley Madison, the news seems less significant by comparison (we would assert that it’s not).

But let’s stop and think about this for a moment: a person or a group made their way into a governmental agency’s network and started plucking data at will, and it barely registers a shrug of the shoulders?

There was a time when any breach of a governmental office would be front page news, but we now live in an era when only so-called “mega breaches” grab our collective attention, and even then we’re only paying attention until the next big one. And typically the wait isn’t long. So what does that say about the state of things?

It hammers home a point we’ve been voicing for some time now: Data breaches are a matter of time, not a matter of money. Traditional security initiatives will eventually be bypassed. It may be several years, but it will happen. No amount of preparation, no matter the size of an organization and no level of attentiveness will result in a 100% success rate.

The unfortunate acceptance that data breaches are simply a price of doing business will hopefully force industries to rethink their strategy. Should they abandon traditional security strategies? Of course not. However, the winning mindset requires doing everything you can to reduce the risk of a data breach and to reduce the associated post-breach damage (customer churn, reputation, regulatory fines, stock deflation, lawsuits, etc.).

Part of that risk reduction requires organizations to scrutinize how they manage sensitive data prior to a breach. Are they carrying only the minimum amount of data, is it all where it should be, is it properly classified, are they able to easily detect anomalies and quickly remediate the problem? These questions must be answered with a “yes” to ensure that your data risk is at its lowest possible level.

As breaches move from molehills to mountains and become a near-certainty, the old mindset seems more and more outdated. Sensitive data management must have an equal part in a company’s overall security strategy. You can’t remove the possibility, but you can reduce the risk.

37 Million at risk due to Ashley Madison hack

Late Sunday evening, security blogger Brian Krebs reported that sensitive data belonging to 37 million users of Ashley Madison had been stolen by a hacker or hackers identifying only as The Impact Team.  As is typical with incidents of this nature, names, addresses, credit/debit card information and other personal information may have been taken. However, the article asserts that an alleged lapse in sensitive data management practices may have been the motivation behind the attack:

“In a long manifesto posted alongside the stolen ALM data, The Impact Team said it decided to publish the information in response to alleged lies ALM told its customers about a service that allows members to completely erase their profile information for a $19 fee.

According to the hackers, although the ‘full delete’ feature that Ashley Madison advertises promises ‘removal of site usage history and personally identifiable information from the site,’ users’ purchase details — including real name and address — aren’t actually scrubbed.”

Why the data wasn’t removed from the service is up for debate, but let’s assume the company was doing its best to make good on the service offering and not making false claims. Perhaps pieces of data were retained for compliance purposes or data was copied and saved to an unsecure location.  There are any number of reasons why sensitive data is inadvertently compromised. However, the “why” isn’t as relevant as the fact that it was there in the first place.  Regardless of intention, the damage that can be suffered by organizations that do not appropriately align business strategy with sensitive data management practices is very real. Ashley Madison offered a premium service that promised enhanced data protection, and once that becomes a part of an organization’s business model, there really is no room for error.

Risk-reducing sensitive data management requires classifying every piece of data, monitoring, detecting out-of-place data and remediating the situation immediately. As this incident and many others like it prove, sensitive data management practices—or lack thereof—directly impact risks associated with data breaches. No, it’s not easy, but it’s clearly non-negotiable.

Removing the Complexity of Data Exposure

The process of securing an organization against a data breach can be dizzying, if not impossible. There are any number of ways that a cybercriminal can get into a network—never mind the thought of an internal hack or accidental sharing of sensitive data. Should you build up a stronger perimeter, double-down on encryption, invest in business-user training to instill proper data management? The questions are seemingly endless, and being able to answer them all correctly is highly stressful. However, much of that complexity can be vastly reduced. This isn’t to say it’s easy, but by taking a few very critical strategies into consideration, preparing your company against data exposure can be more straightforward and effective.

  • Accept the inevitable. You will be breached eventually. There, we said it. It’s the elephant in the room. If the last two years have taught us anything, it’s that you cannot build a wall high enough to keep intruders at the gate. Once that reality is accepted, you can realistically move forward to institute policies and solutions that minimize the risk of a breach and shrink the associated damage once it happens.
  • Know your data. Breached organizations are very often surprised by the data that’s uncovered. Documents containing SSNs, credit/debit card numbers, home addresses, etc., often live in unseen and unprotected areas. Documents that are no longer legally necessary to store (and lots of them) are taken. Have a clear understanding of all the data in your possession and create strict retention and deletion schedules to ensure the smallest possible data footprint at all times. The less there is, the less that can be exposed.
  • Customize your solution approach. Whatever tools you utilize to minimize your chances of sensitive data exposure, make certain you understand every aspect of how your data is created, classified, saved, stored, retrieved, etc. Armed with that information, the selection and areas of focus for data security solutions become a lot less complicated. Additionally, it is critical that the solutions you choose are compatible with other existing tools in your information security arsenal.
  • Have a response plan in place. A worst case scenario requires a planned reaction. Data breaches will put your organization in a tough spot with customers, regulatory agencies and, depending on the size of the breach and types of data leaked, the media. A key part of that response is knowing exactly what data has been taken. An “I don’t know” will exacerbate the reputational damage associated with the incident.
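
The retention and deletion schedules mentioned under “Know your data” can be checked mechanically once files carry a category. The following Python sketch shows one possible shape for such a check. The categories, the retention periods in `RETENTION_DAYS` and the `past_retention` function are all hypothetical; real retention periods come from legal and regulatory requirements.

```python
from datetime import date, timedelta

# Hypothetical retention periods in days (assumptions for illustration only).
RETENTION_DAYS = {
    "tax_record": 7 * 365,
    "job_application": 2 * 365,
}


def past_retention(files, today):
    """Return the paths of files kept longer than their category allows.

    `files` is an iterable of (path, category, last_needed) tuples, where
    last_needed is the date the document was last legitimately required.
    Files in unknown categories are skipped here; in practice they would
    need classification first.
    """
    expired = []
    for path, category, last_needed in files:
        limit = RETENTION_DAYS.get(category)
        if limit is not None and today - last_needed > timedelta(days=limit):
            expired.append(path)
    return expired


inventory = [
    ("returns_2004.pdf", "tax_record", date(2005, 1, 1)),
    ("resume_smith.doc", "job_application", date(2014, 6, 1)),
    ("notes.txt", "uncategorized", date(2000, 1, 1)),
]
print(past_retention(inventory, today=date(2015, 7, 1)))
```

Only the first file is reported: it is well past its seven-year window, the second is still inside its two-year window, and the third has no category to judge it by.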

There’s no easy way through a data breach. Sadly, such incidents are no longer avoidable throughout the totality of an organization’s existence. However, with the proper mindset, organizational insight and planning, an easier-to-implement plan to protect sensitive data can be achieved.

Hacked federal files couldn’t be encrypted because government computers are too old


A great deal of attention has been given to the recent government data breach, which put a reported 14 million current and former government workers’ sensitive data at risk. While the details continue to be sorted out, this incident, along with other highly publicized breaches, hammers home the fact that strategies that focus on “keeping the bad guys out” or on monitoring data crossing a network perimeter are no longer enough on their own to protect an organization’s sensitive data.

The telltale evidence supporting this assertion is that despite the growth in traditional security spending, breach sizes and frequencies are on the rise. Consider data from the recent IBM-sponsored 2015 Cost of Data Breach study by Ponemon:

  • 65 percent of organizations surveyed say the attack evaded existing preventive security controls
  • 95 percent of organizations surveyed did not even discover their breaches for at least three months

Despite best efforts to keep intruders at bay, organizations recognize that blockading their networks is only one part of a larger data protection strategy. The study also suggests that the average breach in the US costs $6.5M, with catastrophic breaches well exceeding the largest loss amount of $29M that the study had sampled.

What’s clear is that a holistic approach that addresses sensitive data management is just as important as traditional security concerns such as encryption and prevention. Focusing only on security strategies that prevent infiltration and/or exfiltration leaves a critical flank unguarded and can lead to a false sense of security. If the locations of sensitive data are precisely known and preventive measures to protect that data are taken, such as quarantining, destroying or redacting it, there is nothing for intruders to find or steal should they make it “past the gate.”

The end result is a significant reduction in the post-breach losses associated with sensitive data breaches. Further, sensitive data management strategies don’t require a complete redo of an organization’s security strategy. While not all breaches are “mega breaches,” every organization has sensitive data it wishes to protect. Making sure that this critical data is where it should be and eliminating all sensitive data that should no longer be present is an important key to overall data risk management.

How Jeb Bush Exposed 12,564 SSNs from a Decade Old Data Breach

In 2003, a PowerPoint presentation containing 12,564 names, Social Security numbers, and dates of birth was attached to an email sent by an employee of Florida’s Development Disabilities Program to then Governor Jeb Bush and 47 other recipients. In addition to Florida state government email recipients, the message was sent to email addresses at several other domains. Historically, it was extremely difficult to precisely discover sensitive data and accurately classify it, and therefore to find and control it. But today, solutions can automate these processes and help organizations prevent sensitive data leaks such as the incident with Mr. Bush and the recent Sony attack.

Identity Finder researchers used the company’s recently announced Sensitive Data Manager 8.0 software to automatically analyze the email messages and attachments posted to the Internet by Mr. Bush and quickly and precisely identify sensitive personal information. The results included the discovery of a single file that has exposed over 12,000 individuals to the long-term risks of identity fraud.


That file was a presentation with a single slide displaying a chart depicting the district-level trends for a waitlist. That chart was pasted from Excel into PowerPoint as a “Microsoft Office Excel Worksheet Object,” which Microsoft states “provides access to the entire worksheet in the presentation, including data that you may want to keep private.” As you can see from the screenshot below, it appears as if only a picture of a bar graph exists, but in reality there is a wealth of underlying data from a large Excel file embedded in the PowerPoint file. This functionality makes it very difficult for organizations to control information but extremely easy for hackers and identity thieves to gain access to unintentionally exposed, sensitive data.


This example spotlights an extremely common data problem in organizations today: employees forget, or never knew, that confidential information exists, so the organization’s sensitive data footprint grows unintentionally and creates additional targets for cyberattacks. This problem is extremely difficult to solve for enterprises with poor data discovery and classification tools.

As noted in the screenshot below, the chart is editable, not simply a picture.  The underlying Excel data becomes visible when the chart is double-clicked:


Notice that the top shows columns N, O, P, etc. By scrolling to the left, columns A, J, K, L and M appear. Columns B through I are hidden from view but are still there and contain a wealth of data:


By unhiding Columns B through I, multiple columns of sensitive data are exposed.  These include Social Security Numbers, Last Names, Full Names, Middle Initials, and Dates of Birth; all the information needed to commit identity theft – such as filing a fraudulent tax return to claim a tax refund.
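
Hidden columns like these can also be found programmatically. The sketch below is an illustration under stated assumptions, not a description of how the file above was analyzed: it assumes a workbook in the modern Office Open XML format, where each worksheet part stores column settings as `<col min=".." max=".." hidden="1"/>` elements, and it works on the XML text of one sheet. A 2003-era file would most likely use the older binary format, which this does not handle, and extracting the embedded workbook from a presentation is a separate step not shown. The function names and the sample XML are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace used by SpreadsheetML worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def column_letter(index):
    """Convert a 1-based column index to its letter name (1 -> A, 27 -> AA)."""
    letters = ""
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def hidden_columns(sheet_xml):
    """List the letters of all columns marked hidden in one worksheet's XML."""
    root = ET.fromstring(sheet_xml)
    hidden = []
    for col in root.findall("m:cols/m:col", NS):
        if col.get("hidden") in ("1", "true"):
            first, last = int(col.get("min")), int(col.get("max"))
            hidden.extend(column_letter(i) for i in range(first, last + 1))
    return hidden


# Made-up worksheet fragment with columns 2 through 9 hidden.
SAMPLE_SHEET = (
    '<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">'
    '<cols><col min="2" max="9" width="12" hidden="1"/></cols>'
    '<sheetData/></worksheet>'
)
print(hidden_columns(SAMPLE_SHEET))
```

A discovery tool would treat any hidden range as a reason to inspect the underlying cells rather than trusting what is visible on screen.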


There are 12,564 people listed in these columns. Their personally identifiable information has been exposed outside the State of Florida since 2003, and it was exposed to the world when Mr. Bush posted his Outlook PST files (containing over 300,000 e-mail messages and attachments) publicly online. Those individuals should check their credit reports immediately to see if they are already victims, start monitoring their credit, and potentially place a freeze on their credit reports.

Between this innocent mistake, the collateral damage from Sony, and the targeted attacks at JP Morgan Chase, Target, Home Depot, and the hundreds of other breaches in 2014, organizations must start to understand the critical importance of reducing their sensitive data footprint and shrinking the target!  Businesses can no longer believe that they can block cyberattacks and keep the bad guys out of their networks.

Sony Pictures’ breach is turning into a horror movie

As you may have heard, Sony Pictures was recently breached. Today’s story in the Wall Street Journal and other media outlets indicates how bad the Sony breach was. If you haven’t heard about this breach, it initially looked like Sony’s intellectual property was the only sensitive data stolen: a number of their unreleased Christmas “blockbusters” were posted online with millions of downloads/views on all the sites that people are using to share media. It turns out, however, that movies were only the beginning of Sony’s sensitive data breach nightmare: more than 33GB of sensitive data belonging to the firm was also posted by hackers.

The folks here at Identity Finder used our enterprise software, Sensitive Data Manager, to discover that more than 600 files containing Social Security numbers (including Acrobat PDFs, Excel spreadsheets, and Word docs) were publicly available as recently as Wednesday. Together, those files held more than 47,000 unique SSNs, which were referenced over 1.1 million times, making it quite easy for hackers hoping to steal SSNs to be successful.
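
The three figures quoted above (files with hits, unique SSNs, total references) correspond to three simple counters. The Python sketch below shows how such counts could be produced over plain text. It is an illustration only and is not how Sensitive Data Manager works: it assumes dash-separated SSNs in already-extracted text, and applies only the basic structural rules (area number not 000, 666 or 900 to 999; group number not 00; serial number not 0000). The function names and sample documents are made up.

```python
import re
from collections import Counter

# Dash-separated pattern only; numbers without dashes are not matched here.
SSN_PATTERN = re.compile(r"\b(\d{3})-(\d{2})-(\d{4})\b")


def plausible(area, group, serial):
    """Reject values that can never be issued as Social Security numbers."""
    return (area not in ("000", "666") and not area.startswith("9")
            and group != "00" and serial != "0000")


def scan(documents):
    """Return (files_with_ssns, unique_ssns, total_references).

    `documents` maps a file name to its extracted text.
    """
    counts = Counter()
    files_with_hits = 0
    for text in documents.values():
        hits = ["-".join(match.groups())
                for match in SSN_PATTERN.finditer(text)
                if plausible(*match.groups())]
        if hits:
            files_with_hits += 1
            counts.update(hits)
    return files_with_hits, len(counts), sum(counts.values())


# Made-up sample text; the numbers are placeholders, not real SSNs.
sample = {
    "hr.txt": "Jane 123-45-6789, listed again as 123-45-6789",
    "payroll.txt": "Bob 234-56-7890; placeholder 000-12-3456",
    "menu.txt": "call 555-1234 for catering",
}
print(scan(sample))
```

A pattern match alone produces false positives on things like part numbers, so a production tool needs additional validation and context checks beyond this sketch.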

Most files containing SSNs were accompanied by other personally identifiable information, such as full names, dates of birth, and home addresses, which creates a clear path for criminals intent on committing identity fraud. Much of this data belonged to more than 15,000 current or former employees of Sony.  Through no fault of their own, deeply personal information such as salary and termination dates and reasons (where applicable) is in the wild and there is very little that these victims can now do about it.

Unlike other forms of sensitive data, such as debit and credit card numbers, Social Security numbers cannot be easily replaced or reissued once compromised. Organizations that experience such a breach are exposing employees and customers to potential identity fraud, which can take many years for victims to remediate. This particular breach serves as yet another example of the importance of proactively discovering and classifying or remediating unprotected sensitive data to prevent theft by cyber criminals.

There’s a good chance that in addition to the employee data, you will also hear something about “celebrity data” that was stolen in this breach because that’s what gets the headlines. It’s certainly unfortunate that their data was also leaked. However, the takeaway from this is that in 2014, the so-called “Year of the Breach,” it has become even more apparent that breaches are inevitable, but what doesn’t have to be a foregone conclusion is that this very important, very sensitive information will be stolen. A comprehensive sensitive data management program that addresses data discovery, data classification, and data protection will minimize the sensitive data footprint and shrink the target. Managing sensitive information can easily keep the crown jewels of an organization right where they belong: safe and secure, no matter who breaks into the network.

Although this was terrible news to find, it doesn’t lessen Identity Finder’s commitment to furthering research that promotes the mission-critical nature of sensitive data management. In 2014 alone, Identity Finder utilized Sensitive Data Manager to uncover more than 630,000 Social Security numbers exposed on IRS Form 990 tax returns and commissioned a Javelin Research survey that examined post-breach customer attrition in three critical industries.

We certainly hope we don’t find any more data out there. If you are one of the victims whose SSN was made public during this data breach, you can freeze your credit so that it cannot be used, or sign up for credit alerts when your SSN is used.

Sensitive data: it’s what hackers want

Bloomberg News, USA Today, and other news sources are reporting that banking giant JP Morgan has suffered a major breach, reportedly involving gigabytes of sensitive data.

Although law enforcement agencies are still sorting out the cause, early reports indicate that the origin of the attack might involve gaining access to the JP Morgan network via a single employee’s personal computer. Unfortunately, this is a common cause of breaches because there is typically a weak link when trying to penetrate an organization’s perimeter—and that weak link could be as small as one employee’s password enabling remote access to his or her system and the network.

It is all too common to see hackers using a small amount of data to access a system, then farming for sensitive data that could give them access to the entire network. In addition, there are often many more passwords buried in the data stored on a computer, and these passwords can be used as a launching pad to attack other machines in the network. Those machines ultimately pave a path toward whatever data is the hacker’s ultimate goal.

Hackers are harvesting sensitive data such as passwords to gain further access to an organization, but once they have access, they’re then taking other sensitive data such as credit card numbers, account information, social security numbers, trading information, and intellectual property.  A large enterprise like JP Morgan typically has all of the above—and with the number of customers JPMC has, it is a huge target for stealing information that could lead to identity theft.

Stealing Social Security numbers provides a quick win for hackers, as opposed to a different piece of data that could be used for insider trading, simply because selling an SSN on underground websites is harder to trace than buying and selling stock using insider secrets on the market. As we saw with Target, one contractor’s password ultimately led to the theft of 40 million credit cards. While Target suffered not only great legal and recovery expenses but also a loss of revenue, the hackers have been profiting from selling that sensitive information ever since.

Attacks today are less about defacing a website, shutting down a system with denial of service, or sabotaging a company: they are more about data.  Stealing the data for illicit use such as breaking into other systems or selling the data to other criminals is the end game, and once you understand that, it is easy to see that sensitive data management must be a high security priority.