Olivia1
Level 8

Text Extractor combining numbers across columns - Excel and PDF

McAfee, PLEASE open a ticket with your vendor who provides text extraction functionality to fix this problem. We've opened tickets about this and it's the same answer every time, "Sorry, we cannot help you. It's just how the text extractor works. You will have to open an enhancement request."


This is a deal-breaking bug, not a "feature." We quarantine files in our NAS environment. Before we were aware of how bad this was, we ran a remediation scan that quarantined tens of thousands of files, and most of them were false positives caused by the text extractor combining columns. We have spent countless hours restoring files to the NAS and we're still not done. This is a bug your vendor needs to fix ASAP.

None of your competitors have had this problem.

3 Replies
jsubbura
McAfee Employee

Re: Text Extractor combining numbers across columns - Excel and PDF

Hi @Olivia1 ,

Greetings!

Can you help me understand this issue with an example, so that we can work on it internally and get it fixed?

 

Thank you.

Regards,
Jithendran S
McAfee Employee

Re: Text Extractor combining numbers across columns - Excel and PDF

I work with Olivia and can explain the issue.

Example 1:

A user has an Excel spreadsheet with 4 columns, and each cell contains a 4-digit number.  The text extractor for McAfee DLP (both Endpoint and Discover, which we use for our NAS) concatenates the numbers in all 4 columns to create a single 16-digit number.  If the resulting number passes a Luhn check, it is treated as a valid credit card number, even though it is not.

Example 2:

A user has a PDF document that includes a table.  Each cell in the table contains a 4-digit number.  The text extractor concatenates the cells in each row of a particular column until it creates a single 16-digit number.  If that number then passes a Luhn check, it is treated as a valid credit card number, even though it is not.

 

We are seeing this in both our Endpoint Security (data-in-motion) and our Discover products (data-at-rest).  As a result, thousands of files were quarantined from our NAS, and the majority of those were false positives due to the text extractor issue that I described.
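
To make the two examples concrete, here is a minimal Python sketch of the effect. It is not McAfee's code: `find_card_candidates` is a hypothetical detector that looks for 16 contiguous digits and applies a Luhn check, and the cell values are made up (they happen to spell a well-known test card number). It only illustrates how dropping the cell separators during extraction can turn four unrelated 4-digit cells into a "valid" card number.

```python
import re
from typing import List


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_candidates(text: str) -> List[str]:
    """Hypothetical detector: 16 contiguous digits that pass Luhn."""
    runs = re.findall(r"(?<!\d)\d{16}(?!\d)", text)
    return [run for run in runs if luhn_valid(run)]


# One spreadsheet row: four separate cells, each a 4-digit number.
row = ["4111", "1111", "1111", "1111"]

# Extraction that drops cell boundaries (the behavior described above).
merged = "".join(row)

# Extraction that keeps a separator between cells.
separated = "\t".join(row)

print(find_card_candidates(merged))     # the four cells match as one number
print(find_card_candidates(separated))  # no 16-digit run, so no match
```

Because the Luhn algorithm uses a single check digit, roughly one in ten arbitrary 16-digit strings will pass it, so a large file share full of numeric tables can produce many such matches.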

Re: Text Extractor combining numbers across columns - Excel and PDF

@jsubbura 

Is there a status update on this issue?  We are still seeing the problem, and we are running the latest versions of both DLP for Endpoint and DLP Discover.  We submitted this more than 2 months ago with no response.
