Monday, February 10, 2014

regexp

Regular expressions (regex) are a powerful tool for pattern matching and text manipulation in Python. Some of the most commonly used regex patterns in Python include:

1. **Matching a Literal String**:
   - `pattern = 'hello'`: Matches the literal string 'hello'.

2. **Matching Any Character**:
   - `pattern = '.'`: Matches any single character except newline.

3. **Matching Digits**:
   - `pattern = r'\d'`: Matches any digit (equivalent to `[0-9]` for ASCII text; on `str` patterns it also matches other Unicode digits unless `re.ASCII` is set).
   - `pattern = r'\D'`: Matches any non-digit character.

4. **Matching Word Characters**:
   - `pattern = r'\w'`: Matches any word character: a letter, digit, or underscore (equivalent to `[a-zA-Z0-9_]` when `re.ASCII` is set).
   - `pattern = r'\W'`: Matches any non-word character.

5. **Matching Whitespace Characters**:
   - `pattern = r'\s'`: Matches any whitespace character (space, tab, newline, carriage return, form feed, vertical tab).
   - `pattern = r'\S'`: Matches any non-whitespace character.

6. **Anchors**:
   - `pattern = '^start'`: Matches 'start' only at the start of the string (or at the start of each line with `re.MULTILINE`).
   - `pattern = 'end$'`: Matches 'end' only at the end of the string, or just before a trailing newline.

7. **Quantifiers**:
   - `pattern = 'a+'`: Matches one or more occurrences of 'a'.
   - `pattern = 'a*'`: Matches zero or more occurrences of 'a'.
   - `pattern = 'a?'`: Matches zero or one occurrence of 'a'.
   - `pattern = 'a{2,4}'`: Matches 2 to 4 occurrences of 'a'.

8. **Character Classes**:
   - `pattern = '[aeiou]'`: Matches any single lowercase vowel.
   - `pattern = '[A-Z]'`: Matches any single uppercase ASCII letter.
   - `pattern = '[0-9]'`: Matches any digit.

9. **Grouping and Capturing**:
   - `pattern = '(abc)+'`: Matches one or more occurrences of 'abc'.
   - `pattern = r'(\d+)-(\w+)'`: Matches one or more digits, a hyphen, and then one or more word characters, capturing the digits in group 1 and the word characters in group 2.

10. **Alternation**:
    - `pattern = 'cat|dog'`: Matches either 'cat' or 'dog'.

These are just a few examples of commonly used regex patterns in Python. Regular expressions offer a wide range of functionality for more advanced text processing tasks, such as searching, replacing, and extracting information from strings.
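
As a rough illustration of how a few of the patterns listed above behave with Python's `re` module, here is a minimal sketch. The sample text and variable names are made up for the example.

```python
import re

text = "Order 42-widgets shipped; cat and dog toys follow."

# Raw strings (r"...") pass backslashes through to the regex engine unchanged.
digits = re.findall(r"\d+", text)           # every run of digits
pets = re.findall(r"cat|dog", text)         # alternation
pair = re.search(r"(\d+)-(\w+)", text)      # two capture groups
starts = re.match(r"^Order", text) is not None
cleaned = re.sub(r"\s+", " ", "a  b\t\nc")  # collapse runs of whitespace

print(digits)         # ['42']
print(pets)           # ['cat', 'dog']
print(pair.groups())  # ('42', 'widgets')
print(starts)         # True
print(cleaned)        # a b c
```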


Postal Address 
[a-zA-Z\d\s\-\,\#\.\+]+

set address "
   Mr S Tan
   #200, Broadway Av
   WEST BEACH SA 5024  
   AUSTRALIA"

regexp {[a-zA-Z\d\s\-\,\#]+} $address new
puts $new
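
For comparison, the same character-class check can be sketched in Python. The helper name `is_plausible_address` is hypothetical, and `fullmatch` is used on the assumption that the whole input should consist of allowed characters, which is stricter than searching for a matching run.

```python
import re

# Same character class as the postal address pattern above.
ADDRESS_RE = re.compile(r"[a-zA-Z\d\s\-\,\#\.\+]+")

address = """
   Mr S Tan
   #200, Broadway Av
   WEST BEACH SA 5024
   AUSTRALIA"""

def is_plausible_address(value):
    # True only if every character belongs to the allowed set.
    return ADDRESS_RE.fullmatch(value) is not None

print(is_plausible_address(address))             # True
print(is_plausible_address("12 Main St; DROP"))  # False (';' is not allowed)
```

This only checks the character set; it says nothing about whether the address exists.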


ZIP Code  
^\d{5,6}(?:[-\s]\d{4})?$
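
A small Python sketch of this pattern in use; `is_zip` is a hypothetical helper name. `fullmatch` is used so that a trailing newline is not silently accepted, which `$` alone would allow.

```python
import re

ZIP_RE = re.compile(r"^\d{5,6}(?:[-\s]\d{4})?$")

def is_zip(value):
    # 5 or 6 digits, optionally followed by '-' or whitespace and 4 more digits.
    return ZIP_RE.fullmatch(value) is not None

for sample in ["12345", "12345-6789", "123456", "1234", "12345-678"]:
    print(sample, is_zip(sample))
```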


Date – accept date input in the mm/dd/yyyy or mm-dd-yyyy formats.
((0[1-9])|(1[0-2]))[\/-]((0[1-9])|(1[0-9])|(2[0-9])|(3[0-1]))[\/-](\d{4})

# Day-first (dd/mm/yyyy) variant of the pattern above
set date "31/01/1000"
regexp {(0[1-9]|1[0-9]|2[0-9]|3[0-1])/(0[1-9]|1[0-2])/(\d{4})} $date match
puts $match
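
The regex only checks the shape of the input, so a string such as 02/31/2014 still passes. One possible way to handle that in Python, sketched here with a hypothetical `parse_us_date` helper, is to let `datetime.date` reject impossible calendar dates after the pattern has matched. The pattern is a condensed spelling of the mm/dd/yyyy one above.

```python
import re
from datetime import date

# Month, separator, day, separator, four-digit year.
DATE_RE = re.compile(r"(0[1-9]|1[0-2])[/-](0[1-9]|[12][0-9]|3[01])[/-](\d{4})")

def parse_us_date(text):
    """Return a date for mm/dd/yyyy or mm-dd-yyyy input, or None if invalid."""
    m = DATE_RE.fullmatch(text)
    if m is None:
        return None
    month, day, year = (int(g) for g in m.groups())
    try:
        return date(year, month, day)  # raises ValueError for e.g. Feb 31
    except ValueError:
        return None

print(parse_us_date("02/10/2014"))  # 2014-02-10
print(parse_us_date("02/31/2014"))  # None
```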


Email Address  
[a-zA-Z0-9_\.\+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-\.]+

set email "Naw_raj.lekhak01@spirent.com"
regexp {[a-zA-Z0-9_]+\.[0-9a-zA-Z_]+@[a-z0-9A-Z_]+\.[a-zA-Z_]+} $email match
puts $match
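
In Python the general email pattern above could be applied roughly as follows. `is_email` is a hypothetical helper, and the character classes are written in an equivalent form with the hyphen last, so it needs no escaping. This is a loose syntactic check, not full address validation.

```python
import re

EMAIL_RE = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+")

def is_email(value):
    # The whole string must look like local-part@domain.tld
    return EMAIL_RE.fullmatch(value) is not None

print(is_email("Naw_raj.lekhak01@spirent.com"))  # True
print(is_email("user@localhost"))                # False (no dot in the domain)
```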


URL (Web domain)
https?\:\/\/[a-zA-Z0-9\-\.]+\.[a-zA-Z]{2,}
https?\:\/\/(www\.)?youtu(\.)?be(\.com)?\/.*(\?v=|\/v\/)?[a-zA-Z0-9_\-]+

set site "https://www.facebook.com/"
regexp {(http|https)://www\.[a-z]+\.[a-z]+} $site match
puts $match
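
A Python sketch of the first URL pattern, used here to pull the scheme and host off the front of a string. `site_root` is a hypothetical name, and the redundant escapes on `:` and `/` are dropped, which does not change what the pattern matches.

```python
import re

URL_RE = re.compile(r"https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")

def site_root(url):
    # re.match anchors at the start of the string only.
    m = URL_RE.match(url)
    return m.group(0) if m else None

print(site_root("https://www.facebook.com/"))  # https://www.facebook.com
print(site_root("ftp://example.com"))          # None
```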

Character Limit 
[\w]{1,140}
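
Without anchors this pattern matches the first 140 word characters of a longer input, so on its own it does not enforce a limit. A minimal Python sketch, with `within_limit` as a hypothetical helper, uses `fullmatch` to require that the whole string fit:

```python
import re

LIMIT_RE = re.compile(r"\w{1,140}")

def within_limit(text):
    # The entire string must be 1 to 140 word characters.
    return LIMIT_RE.fullmatch(text) is not None

too_long = "a" * 141
print(within_limit("a" * 140))                # True
print(within_limit(too_long))                 # False
print(LIMIT_RE.search(too_long) is not None)  # True: unanchored search still finds a match
```

Note that `\w` excludes spaces and punctuation, so text with several words fails this check.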

Phone Numbers  
\+?\(?\d{2,4}\)?[\d\s-]{3,}
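
A short Python sketch of how this pattern behaves; `is_phone` and the sample numbers are made up for illustration. The pattern is permissive by design, so it accepts many digit strings that are not real phone numbers.

```python
import re

PHONE_RE = re.compile(r"\+?\(?\d{2,4}\)?[\d\s-]{3,}")

def is_phone(value):
    # Optional '+', optional parenthesised prefix of 2-4 digits,
    # then at least 3 more digits, spaces, or hyphens.
    return PHONE_RE.fullmatch(value) is not None

for sample in ["+61 8 8356 1234", "(02) 9876-5432", "12", "call me"]:
    print(sample, is_phone(sample))
```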

Price (with decimal)  
\$?\d{1,3}(,?\d{3})*(\.\d{1,2})?
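
One way this pattern might be used in Python is to validate a price string and then convert it to a `Decimal`. `parse_price` is a hypothetical helper, and `re.ASCII` is an added assumption so that only the digits 0-9 are accepted.

```python
import re
from decimal import Decimal

PRICE_RE = re.compile(r"\$?\d{1,3}(,?\d{3})*(\.\d{1,2})?", re.ASCII)

def parse_price(text):
    """Return the amount as a Decimal, or None if the text is not a price."""
    if PRICE_RE.fullmatch(text) is None:
        return None
    # Drop the currency sign and thousands separators before converting.
    return Decimal(text.replace("$", "").replace(",", ""))

print(parse_price("$1,234.56"))  # 1234.56
print(parse_price("12.345"))     # None (more than two decimal places)
```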

Complex Password – accept only a string that has at least 1 uppercase letter, 1 lowercase letter, 2 digits, and 1 special character, with a minimum length of 8 characters.
(?=.*[A-Z])(?=.*[a-z])(?=.*[0-9].*[0-9])(?=.*[^a-zA-Z0-9]).{8,}
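
Each `(?=...)` is a lookahead that checks one requirement without consuming any input, and the final `.{8,}` enforces the length. A minimal Python sketch follows; `is_strong` and the sample passwords are hypothetical.

```python
import re

PASSWORD_RE = re.compile(
    r"(?=.*[A-Z])"         # at least one uppercase letter
    r"(?=.*[a-z])"         # at least one lowercase letter
    r"(?=.*[0-9].*[0-9])"  # at least two digits
    r"(?=.*[^a-zA-Z0-9])"  # at least one special character
    r".{8,}"               # at least 8 characters overall
)

def is_strong(password):
    return PASSWORD_RE.fullmatch(password) is not None

print(is_strong("Passw0rd1!"))  # True
print(is_strong("Password1!"))  # False (only one digit)
```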