1966 Views 20 Replies Latest reply: Oct 4, 2013 8:17 AM by kalees
edfapack Newcomer 7 posts since
Feb 8, 2013

Feb 8, 2013 4:04 PM

Fine tuning a regex in NDLP

I have defined most of the regexes that I need, but I'm having difficulty with one. I need an expression that will catch a string of digits of a specific length when there are no characters or spaces before or after that string. I am set as long as the string has a space or any non-digit character before and after it, but I run into trouble if the string is the only thing on a line. Any recommendations are appreciated!

  • elisowash Newcomer 9 posts since
    Sep 6, 2013
    1. Sep 6, 2013 11:22 AM (in response to edfapack)
    Re: Fine tuning a regex in NDLP

    I'm running into the same sort of thing. Did you ever get any resolution or success? It's almost like nDLP is ignoring some basic regex tenets.

  • rtrezza Newcomer 9 posts since
    Aug 30, 2010
    3. Sep 6, 2013 1:54 PM (in response to edfapack)
    Re: Fine tuning a regex in NDLP

    Can you post an example of what you are trying to match together with the expression you are using? Maybe I can help you

  • elisowash Newcomer 9 posts since
    Sep 6, 2013
    4. Sep 6, 2013 3:28 PM (in response to edfapack)
    Re: Fine tuning a regex in NDLP

    I've had some success today, actually. I'm working with the SSN concept, and I found that using these expressions instead of the defaults solved my issue.

     

    \D\d\d\d\d\d\d\d\d\d\D

     

    \D\d\d\d[\D]\d\d[\D]\d\d\d\d\D

     

    So, if I were to offer a suggestion, it'd be to work with digits and non-digits exclusively.

     

    I have NOT yet made this change in production, so I don't have data on False Positives yet.
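
A minimal Python sketch of why the `\D...\D` form misses a number that sits alone on a line: `\D` has to consume an actual character, so there is nothing for it to match at the very start or end of the text. The `anchored` pattern is a hypothetical alternative, not something from this thread, and whether nDLP's regex engine accepts `^`, `$`, or non-capturing groups the way Python's `re` module does is an assumption you would need to check in the validate window.

```python
import re

# The first expression from the post above, written out exactly as posted.
strict = re.compile(r"\D\d\d\d\d\d\d\d\d\d\D")

# Hypothetical alternative (an assumption, not a confirmed nDLP pattern):
# allow start/end of line to stand in for the surrounding non-digit.
anchored = re.compile(r"(?:^|\D)\d{9}(?:\D|$)", re.MULTILINE)

samples = [
    "id 123456789 end",  # non-digits on both sides
    "123456789",         # the number is the only thing on the line
    "x1234567890",       # ten digits, so not a nine-digit string
]

strict_hits = [bool(strict.search(s)) for s in samples]
anchored_hits = [bool(anchored.search(s)) for s in samples]

print(strict_hits)    # the lone number is missed
print(anchored_hits)  # the lone number is caught, the ten-digit run is not
```

Note that both patterns also consume the neighbouring non-digit characters, so the matched text is wider than the nine digits themselves.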

  • rtrezza Newcomer 9 posts since
    Aug 30, 2010
    6. Sep 6, 2013 4:07 PM (in response to edfapack)
    Re: Fine tuning a regex in NDLP

    Assuming you are using the default concept for "SOCIAL-SECURITY-NUMBER", that expression requires the string to begin with a whitespace character (\s) and end with a non-digit character (\D). If you remove those items from the default expression, the pattern will validate without the spaces. Here is the default concept:

     

     

    concept-post.jpg

     

    I removed the leading \s and trailing \D, and now when I enter the pattern into the validate window, it no longer requires the spaces, as shown below:

     

    concept-post2.jpg

     

    The consideration for production is how likely it is that the number pattern you seek will appear without a space or other boundary. For example, if there is a product serial number with the string

     

    SESD12345678934333, then the expression modified above will flag the 9 digits inside (123456789) as a match, which is clearly a false positive.
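
The false positive described here can be reproduced with a short Python sketch. The default concept's actual expression is only shown in the screenshots, so `\d{9}` below is a simplified stand-in for "nine digits with the boundaries removed", not the real nDLP pattern. The `bounded` variant uses lookarounds, which Python supports; whether nDLP's engine supports them is an assumption that would need testing.

```python
import re

serial = "SESD12345678934333"

# Simplified stand-in for the expression with its leading \s and
# trailing \D removed: any nine consecutive digits.
unbounded = re.compile(r"\d{9}")

# Hypothetical alternative: nine digits that are not part of a longer
# digit run. The lookarounds check the neighbours without consuming them,
# so they also succeed at the start and end of a line.
bounded = re.compile(r"(?<!\d)\d{9}(?!\d)")

unbounded_match = unbounded.search(serial)
bounded_match = bounded.search(serial)
standalone_match = bounded.search("123456789")

print(unbounded_match.group())   # nine digits pulled out of the serial number
print(bounded_match)             # no match inside the 14-digit run
print(standalone_match.group())  # still matches a number alone on a line
```

This only rules out longer digit runs. A nine-digit part number surrounded by letters would still match, so it narrows the false positives rather than removing them.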

  • SafeBoot Group Leader 8,586 posts since
    Oct 28, 2008
    8. Sep 9, 2013 8:02 AM (in response to edfapack)
    Re: Fine tuning a regex in NDLP

    Unfortunately you're coming up against the problem of machine learning - how do you tell that 1234567890 is a social security number, a telephone number, or a part number? Even you don't know if this is my social security number or not.

     

    If I wrote it as 123-45-6789 you (and DLP) might make an inference that it's an SSN, just because of the tradition of putting the "-" in certain places, but what if I wrote it like 9876-54-321 - it's more vague.

     

    You're going to find that DLP as a product category, regardless of which vendor you choose, has this same limitation: unless there's a way of definitively describing a concept in a mathematically defined way, you're always going to be balancing accuracy vs. false positives.

     

    I wish there was a good answer for you, but it's not a problem technology can solve.


    Heisenberg is pulled over for speeding: “Do you know how fast you were going?” the police officer asks, incredulously. “No,” replies Heisenberg, “but I know exactly where I am!”
    Personal Blog : http://mcaf.ee/simon | Corporate Blog : http://SIBlog.mcafee.com | Create your own safe, short URL's - http://mcaf.ee
