4 Replies Latest reply on Sep 15, 2010 2:08 PM by ijahnke

    RegEx for http://*.scr URL in e-mail?

    DBO

      Anybody got a test RegEx to detect a URL ending in *.SCR embedded in an e-mail with IM 6.7.1?

        • 1. Re: RegEx for http://*.scr URL in e-mail?
          ijahnke

          Our Knowledge Base article KB69857 describes this issue.

           

           

          A simple RegEx that would fit this query:

           

          https?://.+\.scr$

           

           

          **Please note that generally our technical support staff does not support setting up custom RegEx dictionaries**

           

           

          Message was edited by: Ivan Jahnke on 9/15/10 12:58:40 PM CDT
          • 2. Re: RegEx for http://*.scr URL in e-mail?
            runcmd

            I'm not sure whether IronMail's RegEx engine makes certain assumptions, but the only downside I see to this regular expression is that it may catch anything with ".scr" anywhere in the URL, not just at the end.  Therefore, both of these could be hits...

             

            ht_p://www.bad.xyz/hosting/somethingbad.scr

            ht_p://www.screening.xyz/

             

            (One "t" in "http" was intentionally omitted because they are bogus URLs and I didn't want them to automatically hyperlink.)
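A quick way to check this outside the appliance is a short Python sketch. It uses Python's `re` module as a stand-in, which is an assumption: IronMail's engine may behave differently. The pattern is the original one without the trailing "$", and the URLs are the two bogus ones above with the "t" restored.

```python
import re

# Original pattern before "$" was appended.
# Evaluated with Python's re, not IronMail's engine (assumption).
UNANCHORED = re.compile(r"https?://.+\.scr")

urls = [
    "http://www.bad.xyz/hosting/somethingbad.scr",
    "http://www.screening.xyz/",
]

# Both URLs contain the literal text ".scr" somewhere after the scheme,
# so both are reported as hits.
hits = [u for u in urls if UNANCHORED.search(u)]
for u in hits:
    print("hit:", u)
```

The second URL hits because "www.screening" contains ".scr" even though nothing in it is a file extension.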

             

            You'd almost want something like...

             

            https?://.+\.scr($|/|>|"| |\f|\n|\t|\r)

             

            The problem is that there are a lot of possibilities for characters at the end of the "scr" file extension for a URL that is embedded in an email.
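As a rough check of the idea, the sketch below runs the pattern above against a few sample strings. It assumes Python's `re` module behaves closely enough to IronMail's engine for this purpose, which may not hold; the sample strings are made up for illustration.

```python
import re

# The pattern from the post, unchanged.
# Evaluated with Python's re as a stand-in for IronMail's engine (assumption).
TERMINATED = re.compile(r'https?://.+\.scr($|/|>|"| |\f|\n|\t|\r)')

# Hypothetical sample text mapped to whether a hit is expected.
samples = {
    "see http://www.bad.xyz/hosting/somethingbad.scr for details": True,
    "http://www.bad.xyz/hosting/somethingbad.scr": True,
    "http://www.screening.xyz/": False,
}

results = {text: bool(TERMINATED.search(text)) for text in samples}
for text, matched in results.items():
    print(matched, text)
```

Requiring a terminator after "scr" is what drops the "screening" false positive, at the cost of having to enumerate every character that might follow the extension.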

             

             

            Message was edited by: runcmd on 9/15/10 1:24:54 PM EDT
            • 3. Re: RegEx for http://*.scr URL in e-mail?
              ijahnke

              If you select "Word Boundary" then it should only match words that begin and end with http[s]://<stuff>.scr

               

              However, I have edited the original RegEx and added "$" to the end.

              • 4. Re: RegEx for http://*.scr URL in e-mail?
                runcmd

                Thanks for the clarification, Ivan.  The reason I added the range ($|/|>|"| |\f|\n|\t|\r) at the end of my RegEx example is because URLs can potentially be embedded in a lot of different ways...

                 

                <ht_p://www.bad.xyz/hosting/somethingbad.scr>

                <!a href="ht_p://www.bad.xyz/hosting/somethingbad.scr">

                "ht_p://www.bad.xyz/hosting/somethingbad.scr"

                ht_p://www.bad.xyz/hosting/somethingbad.scr/

                etc.

                 

                Even with just a "$" on the end and "word boundary" configured, would IronMail's RegEx engine catch the above examples?  If so, that's good to know.  If not, word boundary would probably allow you to reduce the range to something more reasonable like ($|/|>|").  I'd imagine that the more complex your RegEx is, the more processing power it consumes on the appliance as well.
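One way to explore the question without the appliance is the sketch below. It makes two assumptions that may not match IronMail: that the "Word Boundary" option behaves like wrapping the pattern in `\b`, and that Python's `re` module is a fair stand-in for the engine. It compares the "$"-only pattern with a `\b`-wrapped one on the four embedded forms listed above.

```python
import re

# Assumption: "Word Boundary" is approximated by \b on both sides.
# IronMail's actual behaviour may differ; this shows Python's re semantics only.
DOLLAR_ONLY = re.compile(r"https?://.+\.scr$")
WORD_BOUNDARY = re.compile(r"\bhttps?://.+\.scr\b")

embedded = [
    "<http://www.bad.xyz/hosting/somethingbad.scr>",
    '<!a href="http://www.bad.xyz/hosting/somethingbad.scr">',
    '"http://www.bad.xyz/hosting/somethingbad.scr"',
    "http://www.bad.xyz/hosting/somethingbad.scr/",
]

dollar_hits = [bool(DOLLAR_ONLY.search(s)) for s in embedded]
boundary_hits = [bool(WORD_BOUNDARY.search(s)) for s in embedded]

for s, d, b in zip(embedded, dollar_hits, boundary_hits):
    print(f"$ only: {d!s:5}  \\b: {b!s:5}  {s}")
```

In Python, the "$"-only pattern misses all four forms because each one has a character after "scr", while the `\b` version catches them and still skips a URL such as http://www.screening.xyz/. Whether the appliance agrees would need to be tested on IronMail itself.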