SSWebTechIO


Regular Expressions

Regex Basics (Regular Expressions)

A regular expression (regex) is a pattern language for searching, matching, and replacing text.
Common uses include validation (email, phone), text processing, log analysis, and scraping.
Python provides regex support through the built-in re module.

import re

text = "I love Python"
m = re.search("Python", text)
print(m.group())   # Python

re Module (Common Functions)

re.search() → finds the first match anywhere in the string
re.match() → matches only at the start of the string
re.findall() → returns a list of all non-overlapping matches
re.finditer() → returns an iterator of match objects
re.sub() → replaces matches with a given string
re.split() → splits the string wherever the pattern matches

import re

s = "ab12 cd34 ef56"
print(re.findall(r"\d+", s))          # ['12', '34', '56']
print(re.sub(r"\d+", "#", s))         # ab# cd# ef#
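
The two functions from the list that are not used above, re.finditer() and re.split(), can be sketched on the same sample string. This is a minimal illustration; the variable names spans and parts are chosen here and are not part of the re module.

```python
import re

s = "ab12 cd34 ef56"

# finditer yields match objects, so positions are available as well as the text
spans = [(m.group(), m.start(), m.end()) for m in re.finditer(r"\d+", s)]
print(spans)   # [('12', 2, 4), ('34', 7, 9), ('56', 12, 14)]

# split cuts the string wherever the pattern matches (here: runs of whitespace)
parts = re.split(r"\s+", s)
print(parts)   # ['ab12', 'cd34', 'ef56']
```

finditer is usually the better choice when the position of each match matters; findall is enough when only the matched text is needed.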

Pattern Matching (search, match, fullmatch)

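search: finds the first match anywhere in the string
match: matches only at the beginning of the string
fullmatch: the whole string must match the pattern
All three return a match object on success and None on failure.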

import re

text = "Python is easy"
print(re.search(r"easy", text).group())                      # easy
print(re.match(r"Python", text).group())                     # Python
print(re.fullmatch(r"Python is easy", text) is not None)     # True
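
The difference between the three functions shows most clearly when a pattern does not match. The sketch below reuses the same text; calling .group() directly on a failed match would raise AttributeError, so the result is checked first. The variable names are illustrative only.

```python
import re

text = "Python is easy"

found_anywhere = re.search(r"easy", text) is not None     # "easy" occurs in the text
found_at_start = re.match(r"easy", text) is not None      # text does not start with "easy"
whole_string = re.fullmatch(r"Python", text) is not None  # pattern covers only part of the text
print(found_anywhere, found_at_start, whole_string)       # True False False

# A failed match returns None, so guard before calling .group()
m = re.match(r"easy", text)
result = m.group() if m else "no match"
print(result)   # no match
```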

Metacharacters (Core Symbols)

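. any character except newline
^ start of string (start of each line with re.MULTILINE)
$ end of string (end of each line with re.MULTILINE)
* 0 or more repetitions of the preceding item
+ 1 or more repetitions of the preceding item
? 0 or 1 repetition of the preceding item
{m,n} between m and n repetitions of the preceding item
[] character set
() group
| OR (alternation)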

import re

print(re.findall(r"a.", "a1 a2 ab ac"))   # ['a1', 'a2', 'ab', 'ac']
print(re.findall(r"^Hi", "Hi there"))     # ['Hi']
print(re.findall(r"end$", "the end"))     # ['end']
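
The quantifiers and the OR symbol from the list are not covered by the example above. A short sketch with made-up sample strings:

```python
import re

star = re.findall(r"ab*", "a ab abb")             # b repeated 0 or more times
plus = re.findall(r"ab+", "a ab abb")             # b repeated 1 or more times
optional = re.findall(r"colou?r", "color colour") # u is optional
counted = re.findall(r"\d{2,3}", "1 22 333 4444") # 2 to 3 digits, as many as possible
either = re.findall(r"cat|dog", "cat dog bird")   # cat OR dog

print(star)      # ['a', 'ab', 'abb']
print(plus)      # ['ab', 'abb']
print(optional)  # ['color', 'colour']
print(counted)   # ['22', '333', '444']
print(either)    # ['cat', 'dog']
```

Quantifiers are greedy by default, which is why \d{2,3} takes three digits out of "4444" and leaves a single digit that cannot match on its own.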

Character Sets and Ranges

[abc] any one of a, b, or c
[a-z] any lowercase letter
[0-9] any digit
[^0-9] any character that is NOT a digit

import re

s = "A1 b2 C3"
print(re.findall(r"[A-Z]", s))        # ['A', 'C']
print(re.findall(r"[0-9]", s))        # ['1', '2', '3']
print(re.findall(r"[^0-9\s]+", s))   # ['A', 'b', 'C']
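
Several ranges can be combined inside one set, and a hyphen placed at the end of a set is treated as a literal character. A minimal sketch, using made-up sample strings:

```python
import re

s = "colors: #FF8800 #00ffaa #12345G"

# Three ranges in one set: digits plus upper- and lowercase hex letters
hex_colors = re.findall(r"#[0-9A-Fa-f]{6}", s)
print(hex_colors)   # ['#FF8800', '#00ffaa']

# A hyphen at the end of the set matches a literal "-"
words = re.findall(r"[a-z-]+", "well-known fact")
print(words)        # ['well-known', 'fact']
```

"#12345G" is skipped because G is outside the hex ranges, so only five valid characters follow the #.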

Special Sequences

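\d digit (0-9; for str patterns also other Unicode digits unless re.ASCII is set), \D non-digit
\w word char (a-zA-Z0-9_; for str patterns also other Unicode letters and digits unless re.ASCII is set), \W non-word
\s whitespace, \S non-whitespace
\b word boundary (the position between a word char and a non-word char)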

import re

t = "Email: test123@gmail.com"
print(re.findall(r"\w+", t))          # ['Email', 'test123', 'gmail', 'com']
print(re.findall(r"\d+", t))          # ['123']
print(re.findall(r"\btest\w+", t))    # ['test123']
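
Two details of these sequences are easy to miss: \b matches a position rather than a character, and \d on a Python 3 str is not limited to 0-9. The sketch below illustrates both; the sample strings are made up, and the second one ends with two Arabic-Indic digits written as escapes.

```python
import re

# \b on both sides matches "cat" only as a whole word
whole_words = re.findall(r"\bcat\b", "cat concat cat's category")
print(whole_words)            # ['cat', 'cat']

s = "42 \u0664\u0662"         # ASCII digits, then two Arabic-Indic digits
unicode_digits = re.findall(r"\d+", s)
ascii_digits = re.findall(r"\d+", s, flags=re.ASCII)
print(len(unicode_digits))    # 2 (both digit runs match)
print(ascii_digits)           # ['42']
```

When only 0-9 should be accepted, for example in validation, either pass re.ASCII or write [0-9] explicitly.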

Groups and Capturing

Use () to capture parts of a match.
Use group(1), group(2), ... to access captured values; group() or group(0) returns the whole match.

import re

date = "2026-01-16"
m = re.search(r"(\d{4})-(\d{2})-(\d{2})", date)

print(m.group(1))   # year: 2026
print(m.group(2))   # month: 01
print(m.group(3))   # day: 16
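
Groups can also be named with (?P<name>...), read all at once, and reused in a replacement string. A minimal sketch on the same date; the group names year, month, and day are chosen here for illustration.

```python
import re

date = "2026-01-16"
m = re.search(r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})", date)

all_groups = m.groups()       # every captured group as a tuple
by_name = m.groupdict()       # named groups as a dictionary
print(all_groups)             # ('2026', '01', '16')
print(m.group("year"))        # 2026
print(by_name)                # {'year': '2026', 'month': '01', 'day': '16'}

# \1, \2, \3 in the replacement refer back to the captured groups
reordered = re.sub(r"(\d{4})-(\d{2})-(\d{2})", r"\3/\2/\1", date)
print(reordered)              # 16/01/2026
```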

Practical Regex Example: Validate Mobile Number (India)

Validates a 10-digit mobile number whose first digit is 6-9.
Uses fullmatch so that the complete string must match the pattern.

import re

mobile = input("Enter mobile: ").strip()

if re.fullmatch(r"[6-9]\d{9}", mobile):
    print("Valid mobile number")
else:
    print("Invalid mobile number")
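
For reuse and testing, the same check could be wrapped in a function. This is one possible sketch, not the only way to do it: the names MOBILE_RE and is_valid_mobile are hypothetical, and [0-9] is used in place of \d on the assumption that only ASCII digits should be accepted.

```python
import re

# Compiled once and reused; [0-9] restricts the check to ASCII digits (assumption)
MOBILE_RE = re.compile(r"[6-9][0-9]{9}")

def is_valid_mobile(number: str) -> bool:
    """Return True if number is exactly 10 digits and starts with 6-9."""
    return MOBILE_RE.fullmatch(number.strip()) is not None

print(is_valid_mobile("9876543210"))     # True
print(is_valid_mobile("5876543210"))     # False: first digit is not 6-9
print(is_valid_mobile("98765"))          # False: too short
print(is_valid_mobile("98765432101"))    # False: too long
```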

Practical Regex Example: Validate Email

Simple email validation using regex.
Note: Real-world email rules are more complex; this simplified pattern is suitable for exams and basic checks.

import re

email = input("Enter email: ").strip()
pattern = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

if re.fullmatch(pattern, email):
    print("Valid email")
else:
    print("Invalid email")
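
To see where the simplified pattern draws the line, it can be tried on a few sample addresses. The sketch below wraps it in a hypothetical helper, is_valid_email; the addresses are invented for illustration.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def is_valid_email(address: str) -> bool:
    """Return True if address matches the simplified pattern as a whole."""
    return EMAIL_RE.fullmatch(address.strip()) is not None

print(is_valid_email("user@example.com"))    # True
print(is_valid_email("user@example"))        # False: no dot and top-level part
print(is_valid_email("user@@example.com"))   # False: two @ signs
print(is_valid_email("user@a..com"))         # True: consecutive dots are not rejected
```

The last case shows one of the gaps in the pattern: it accepts a domain with consecutive dots, which is not a valid domain name.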

Mini Project: Extract All Emails and Phones from Text

Finds all email IDs and mobile numbers in a paragraph.
Useful in data cleaning and text mining.

import re

text = """
Contact: souravshu562@gmail.com, admin@site.in
Phones: 8144305808, 9876543210
"""

emails = re.findall(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}", text)
phones = re.findall(r"\b[6-9]\d{9}\b", text)

print("Emails:", emails)
print("Phones:", phones)
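
In data cleaning the same value often appears more than once. One possible extension, sketched below, compiles both patterns once and removes duplicates while keeping first-seen order. The function name extract_contacts and the sample text are hypothetical.

```python
import re

EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\b[6-9]\d{9}\b")

def extract_contacts(text: str) -> dict:
    """Return unique emails and phones found in text, in first-seen order."""
    # dict.fromkeys drops duplicates and preserves insertion order
    emails = list(dict.fromkeys(EMAIL_RE.findall(text)))
    phones = list(dict.fromkeys(PHONE_RE.findall(text)))
    return {"emails": emails, "phones": phones}

sample = ("Mail a@example.com or b@example.org, again a@example.com; "
          "call 9876543210 or 9876543210, not 1234567890")
contacts = extract_contacts(sample)
print(contacts["emails"])   # ['a@example.com', 'b@example.org']
print(contacts["phones"])   # ['9876543210']
```

The \b on each side of the phone pattern keeps it from matching 10 digits inside a longer number.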
