Channel: Symantec Connect - Products - Discussions
Help with Regular Expressions

I need a solution

Hi Everyone,

We have a requirement to implement a DLP policy that monitors for, and triggers an incident on, the combination "First/Last name + Phone Number + Email address + Postal code".

We have successfully created and tested regular expressions to detect this confidential information. However, during testing we noticed a performance hit when messages are analyzed with the regex policy turned on: CPU usage jumps to about 70% or more when ingesting our test messages in batches of ~200.

As I am not an expert in writing regular expressions, can someone please review the patterns below and let me know whether any changes would help with the performance issue? Can they be optimized? If so, how?

1) Email address: 
(?i)\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}\b 

2) Phone number: 
\(?\b[0-9]{3}\)?[-. ]?[0-9]{3}[-. ]?[0-9]{4}\b 

3) Postal code (Canadian):
\b[ABCEGHJKLMNPRSTVXY][0-9][A-Z] [0-9][A-Z][0-9]\b 

4) First and Last name: 
\b[A-Z][-'a-zA-Z]+,?([\s\|]|\s{2,})[A-Z][-'a-zA-Z]{0,19}\b 
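
To narrow down which of the four expressions costs the most, they can be timed one by one outside of DLP. Below is a rough sketch that does this with Python's `re` module. This is only an assumption-laden stand-in: the DLP detection engine is a different regex implementation, so at best the relative differences between patterns are indicative. The names `PATTERNS`, `find_matches`, `time_patterns` and the sample message are made up for this illustration.

```python
import re
import time

# The four patterns exactly as posted above (only trailing spaces dropped).
PATTERNS = {
    "email": r"(?i)\b[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}\b",
    "phone": r"\(?\b[0-9]{3}\)?[-. ]?[0-9]{3}[-. ]?[0-9]{4}\b",
    "postal_code": r"\b[ABCEGHJKLMNPRSTVXY][0-9][A-Z] [0-9][A-Z][0-9]\b",
    "name": r"\b[A-Z][-'a-zA-Z]+,?([\s\|]|\s{2,})[A-Z][-'a-zA-Z]{0,19}\b",
}

# Compile once up front so compilation cost is not part of the timing.
COMPILED = {label: re.compile(pattern) for label, pattern in PATTERNS.items()}


def find_matches(text):
    """Return the full text of every match, grouped by pattern label."""
    return {
        label: [m.group(0) for m in rx.finditer(text)]
        for label, rx in COMPILED.items()
    }


def time_patterns(messages, repeats=5):
    """Return the best wall-clock time (seconds) per pattern over all messages."""
    timings = {}
    for label, rx in COMPILED.items():
        best = float("inf")
        for _ in range(repeats):
            start = time.perf_counter()
            for message in messages:
                # Exhaust the iterator so every match is actually searched for.
                for _match in rx.finditer(message):
                    pass
            best = min(best, time.perf_counter() - start)
        timings[label] = best
    return timings


# Hypothetical test message; replace with real (sanitized) test traffic.
SAMPLE = "contact: John Smith, john.smith@example.com, 416-555-0123, M5V 3L9"

if __name__ == "__main__":
    print(find_matches(SAMPLE))
    # ~200 messages per batch, mirroring the batch size in our test.
    for label, seconds in time_patterns([SAMPLE] * 200).items():
        print(f"{label}: {seconds:.6f} s")
```

Running it against a batch of realistic messages, rather than the one-line sample, should show whether a single pattern dominates the cost. I would expect the name pattern to be a candidate, since it can start a match attempt at every capitalized word, but that is a guess until measured.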

Also, if we create a Custom Data Identifier using the exact same patterns, would it perform any better than using the regular expressions directly?

Thanks for your help in advance. 

