What Is a RegEx?
- RegEx is the short form of Regular Expression.
- You can think of it as a smart tool to search, check, or edit text using patterns instead of checking each character manually.
- The RegEx allows us to find specific strings from a larger text or validate input formats like email addresses, phone numbers or dates.
- In simple words, you tell Python what kind of text you are looking for, and Python finds it for you.
Why Do We Use RegEx in Python?
In a typical project, we often need to do things like:
- Check whether an email is valid or not
- Find all phone numbers in a paragraph
- Replace extra spaces
- Extract dates from text
Without RegEx, each of these tasks would need long code logic with loops and conditions. RegEx reduces that logic to short and readable patterns.
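To make the comparison concrete, here is a minimal sketch of the "replace extra spaces" task done both ways. The sample text and variable names are made up for illustration.

```python
import re

text = "Python   makes    text  cleanup   easy"

# Manual approach: walk the characters and skip repeated spaces.
cleaned_chars = []
previous_was_space = False
for ch in text:
    if ch == " ":
        if not previous_was_space:
            cleaned_chars.append(ch)
        previous_was_space = True
    else:
        cleaned_chars.append(ch)
        previous_was_space = False
manual_result = "".join(cleaned_chars)

# RegEx approach: replace every run of two or more spaces with one space.
regex_result = re.sub(r" {2,}", " ", text)

print(manual_result)
print(regex_result)
```

Both approaches give the same cleaned sentence, but the RegEx version needs only one line.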
How To Work With RegEx In Python?
If you want to work with RegEx, you need Python’s built-in module called “re”. Import this module with the following statement:
import re
Simple Example of RegEx
The code below checks whether a username is valid or not.
import re
username = "user123"
pattern = r"^[a-zA-Z0-9]{5,}$"
if re.match(pattern, username):
    print("Valid username")
else:
    print("Invalid username")
Output of the code:
Valid username
Explanation of the code:
- ^ → start of text
- [a-zA-Z0-9] → only letters and digits are allowed
- {5,} → at least 5 of those characters
- $ → end of text
- One short pattern replaces the loops and conditions we would otherwise write.
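To see how the same pattern treats different inputs, here is a small sketch that runs it against a few sample usernames. The helper name is_valid_username and the sample values are assumptions made for this illustration.

```python
import re

USERNAME_PATTERN = r"^[a-zA-Z0-9]{5,}$"

def is_valid_username(username):
    # re.match() returns a match object on success and None on failure.
    return re.match(USERNAME_PATTERN, username) is not None

candidates = ["user123", "abc", "user_123", "Admin2024"]
results = {name: is_valid_username(name) for name in candidates}

for name, ok in results.items():
    print(name, "->", "Valid" if ok else "Invalid")
```

Here "abc" fails because it is shorter than 5 characters, and "user_123" fails because the underscore is not in the allowed set.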
Example 2:
Extract prices from a shopping message
import re
message = "Apple costs ₹120, Banana costs ₹40 and Mango costs ₹90"
prices = re.findall(r"₹\d+", message)
print(prices)
Final Output:
['₹120', '₹40', '₹90']
- Here, RegEx extracts only the prices and ignores everything else.
Where Can We Use RegEx In Our Projects?
We can use RegEx in many parts of a real-life project. Below are the most common uses:
1) Search Features (Website/App)
- You can use this module in search bars, filtering data, keyword matching, and more.
- For example, “Search products that start with ‘lap’, like laptop or laptop bag.”
- This may look like a word-suggestion feature, but RegEx does not suggest words by itself.
Regular Expression only checks patterns. It means:
- User types lap
- RegEx checks product names
- It matches names that start with lap
- RegEx does not suggest anything; it only matches existing data
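The steps above can be sketched as follows. The product list is made up, and re.escape() is added as a precaution so that anything the user types is treated as plain text rather than pattern syntax.

```python
import re

# Hypothetical product names; a real app would read these from its own data.
products = ["Laptop", "Laptop Bag", "Lamp", "Tablet", "Lapel Mic"]
typed = "lap"

# Build a "starts with" pattern from the typed text, ignoring case.
pattern = re.compile(r"^" + re.escape(typed), re.IGNORECASE)

# Keep only the existing names that match; nothing new is suggested.
matches = [name for name in products if pattern.match(name)]
print(matches)
```

"Lamp" is left out because it starts with "lam", not "lap".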
2) Form Validation
- Regular Expression (RegEx) checks the fixed pattern of the user’s input (such as email or phone format) before the form is submitted.
- Regular expression (RegEx) verifies that an email contains @, a domain name, and valid characters; otherwise, it rejects the input.
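A minimal sketch of such a format check is shown below. The pattern is deliberately simplified and is an assumption for illustration: it only tests the general shape of an email address, not whether the address really exists.

```python
import re

# Simplified shape: name characters, then @, then a domain with a dot and a suffix.
EMAIL_PATTERN = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

def is_plausible_email(value):
    # fullmatch() requires the whole input to fit the pattern.
    return re.fullmatch(EMAIL_PATTERN, value) is not None

print(is_plausible_email("user@example.com"))       # has @ and a domain
print(is_plausible_email("user.example.com"))       # missing @
print(is_plausible_email("user name@example.com"))  # contains a space
```

Only the first input passes; the other two are rejected before the form would be submitted.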
3) Data Cleaning & Processing
- RegEx finds and removes unwanted characters, extra spaces, or incorrect symbols from raw data.
- For example, it can automatically delete unnecessary spaces or special characters.
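As a rough sketch of this kind of cleanup, the code below strips symbols and then collapses extra spaces. The raw string and the choice of which characters count as "unwanted" are assumptions for this example.

```python
import re

raw = "  Order#  4521 ** shipped   to  Pune!!  "

# Step 1: remove everything that is not a letter, digit, or whitespace.
no_symbols = re.sub(r"[^A-Za-z0-9\s]", "", raw)

# Step 2: collapse runs of whitespace into one space and trim the ends.
cleaned = re.sub(r"\s+", " ", no_symbols).strip()

print(cleaned)
```

The result is the plain sentence "Order 4521 shipped to Pune".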
4) Log File Analysis
- RegEx extracts specific patterns (like errors or IP addresses) from large log files.
- It can filter only the error messages from thousands of log lines.
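Here is a small sketch of both ideas on a made-up log. The log format is hypothetical, and the IP pattern only checks the "four groups of digits" shape, not that each number is between 0 and 255.

```python
import re

# Hypothetical log text; real logs would usually be read from a file.
log_text = """2024-05-01 10:00:01 INFO Server started
2024-05-01 10:02:17 ERROR Database timeout from 192.168.1.20
2024-05-01 10:03:44 WARNING Disk usage at 85%
2024-05-01 10:05:09 ERROR Login failed from 10.0.0.7"""

# re.MULTILINE lets ^ and $ match at the start and end of every line.
error_lines = re.findall(r"^.*\bERROR\b.*$", log_text, re.MULTILINE)

# Four groups of 1 to 3 digits separated by dots.
ip_addresses = re.findall(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", log_text)

for line in error_lines:
    print(line)
print(ip_addresses)
```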
5) Security & Input Sanitization
- It can block inputs that do not match the allowed patterns.
- For example, RegEx allows only letters and numbers in usernames and rejects special characters.
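A minimal sketch of such an allow-list check is below; the helper name is_safe_username is an assumption. It uses re.fullmatch() because with re.match() and $, a string that ends in a newline can still pass: $ also matches just before a trailing newline.

```python
import re

ALLOWED = r"[A-Za-z0-9]+"

def is_safe_username(value):
    # fullmatch() requires the entire string to fit the pattern.
    return re.fullmatch(ALLOWED, value) is not None

print(is_safe_username("user123"))        # letters and digits only
print(is_safe_username("user<script>"))   # contains special characters
print(is_safe_username("user123\n"))      # trailing newline is rejected

# For comparison: match() with ^...$ accepts the trailing newline.
print(re.match(r"^[A-Za-z0-9]+$", "user123\n") is not None)
```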
Basic RegEx Functions in Python
The re module provides multiple functions for working with patterns. Let’s understand them with examples:
- re.search()
- re.match()
- re.findall()
- re.finditer()
- re.sub()
1) re.search()
- re.search() is a useful function when you want to find something inside a text.
- Python scans the text from left to right and stops when it finds the first match.
- Imagine you are reading a paragraph, looking for one specific pattern, and you stop reading as soon as you find it.
Let’s write an example: You are checking a delivery message and want to detect if it contains a tracking number that starts with TRK.
import re
message = "Your order is packed. Tracking ID: TRK83921 will be active soon."
result = re.search(r"TRK\d+", message)
if result:
    print("Tracking found:", result.group())
else:
    print("No tracking number found")
Output:
Tracking found: TRK83921
In this code:
- TRK → fixed prefix
- \d+ → one or more digits
- group() → gives the exact matched text
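The match object returned by re.search() carries more than the matched text. The sketch below reuses the same message and adds a capture group (the parentheses around \d+) to pull out only the digits; that group is an addition for this example.

```python
import re

message = "Your order is packed. Tracking ID: TRK83921 will be active soon."
result = re.search(r"TRK(\d+)", message)

if result:
    print("Full match:", result.group())    # the whole matched text
    print("Digits only:", result.group(1))  # text captured by the first (...)
    print("Position:", result.start(), "to", result.end())
    # Slicing the original string with start() and end() gives the match back.
    print(message[result.start():result.end()])
```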
2) re.match()
- This function is useful when you want to check whether a specific pattern appears at the beginning of a string.
- It does not scan the full text like re.search().
- If the pattern is not found starting at position 0, it returns None.
- For example, if a search bar expects the text to begin with a certain pattern and the typed term does not, the result is None.
Example of re.match(): Suppose you are building an app where every command must start with run:
import re
user_input = "run:backup"
result = re.match(r"run:", user_input)
if result:
    print("Valid command")
else:
    print("Invalid command")
Output:
Valid command
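To see the difference from re.search(), here is a short sketch where the same prefix appears later in the text instead of at position 0. The input string is made up for this comparison.

```python
import re

user_input = "please run:backup"

# match() only looks at the beginning of the string.
match_result = re.match(r"run:", user_input)

# search() scans the whole string for the first occurrence.
search_result = re.search(r"run:", user_input)

print("match:", match_result)
print("search:", search_result.group() if search_result else None)
```

re.match() returns None here, while re.search() still finds run: in the middle of the text.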
3) re.findall()
- re.findall() finds all matches of a pattern in a text at once.
- Instead of stopping at the first match, it scans the whole string and returns every match as a list.
- If nothing matches, it returns an empty list.
Example: From a message, extract all order numbers that start with ORD.
import re
message = "ORD102 was shipped, ORD205 is pending, ORD309 is cancelled"
orders = re.findall(r"ORD\d+", message)
print(orders)
Output:
['ORD102', 'ORD205', 'ORD309']
- Here, ORD is the fixed text
- \d+ means one or more digits
- re.findall() collects all order IDs into a list.
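re.finditer(), which appears in the list of functions above, works like re.findall() but yields match objects one at a time instead of a list of strings. That is handy when you also need the position of each match. A minimal sketch with the same message:

```python
import re

message = "ORD102 was shipped, ORD205 is pending, ORD309 is cancelled"

found = []
for match in re.finditer(r"ORD\d+", message):
    # Each match object knows its text and where it starts in the string.
    found.append((match.group(), match.start()))
    print(match.group(), "starts at index", match.start())
```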
4) re.sub()
- re.sub() replaces something in a text automatically using a pattern.
- You don’t need to change words or numbers manually; re.sub() finds matching patterns and substitutes them with new text.
Example code: We want to hide student marks and show CONFIDENTIAL.
import re
report = "Rahul scored 78, Neha scored 85, Amit scored 91"
secured_report = re.sub(r"\d+", "CONFIDENTIAL", report)
print(secured_report)
Output:
Rahul scored CONFIDENTIAL, Neha scored CONFIDENTIAL, Amit scored CONFIDENTIAL
5) re.sub() for Masking Sensitive Data
- re.sub() finds a pattern in text and replaces it with something else.
- It is like “search and replace”, but it understands patterns instead of working only with exact words.
- We can use this function to hide sensitive data such as phone numbers, emails, and more.
Simple Example: Masking Phone Digits
import re
message = "Call me at 9876543210 for details."
# Replace all digits with X
safe_message = re.sub(r"\d", "X", message)
print(safe_message)
Output:
Call me at XXXXXXXXXX for details.
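A common variation is to keep the last few digits visible. The sketch below is one possible way to do that, using a lookahead (?=\d{4}) so that a digit is replaced only when at least four more digits follow it. Keeping four digits is an assumption for this example.

```python
import re

message = "Call me at 9876543210 for details."

# Replace a digit only if four more digits come right after it.
partly_masked = re.sub(r"\d(?=\d{4})", "X", message)
print(partly_masked)
```

The first six digits become X and the last four (3210) stay readable.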
Flags in RegEx
Flags are like special switches that change how a search pattern behaves.
By default, regular expressions follow strict rules, such as case-sensitive matching. Flags relax or modify these rules so that matching becomes more flexible in real programs.
Without flags:
- Hello and hello are treated as different
- . does not match a new line
- ^ and $ work only for the full text, not line by line
These are the most useful RegEx Flags:
- re.IGNORECASE (re.I): Ignore the uppercase and lowercase differences
- re.MULTILINE (re.M): Make ^ and $ match at the start and end of each line, not only the whole text
- re.DOTALL (re.S): Allow . to match even newline characters
Example: Case-Insensitive Search in User Feedback
import re
feedback = "This Product is AWESOME!"
pattern = r"awesome"
result = re.search(pattern, feedback, re.IGNORECASE)
if result:
    print("Word found:", result.group())
Output of this code:
Word found: AWESOME
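The other two flags can be seen in the same way. This is a small sketch with made-up text that runs each pattern once without the flag and once with it.

```python
import re

notes = "todo: buy milk\ndone: pay rent\ntodo: call bank"

# Without re.MULTILINE, ^ matches only at the very start of the string.
without_multiline = re.findall(r"^todo: (.+)", notes)
# With re.MULTILINE, ^ also matches at the start of every line.
with_multiline = re.findall(r"^todo: (.+)", notes, re.MULTILINE)
print(without_multiline)
print(with_multiline)

block = "<p>first line\nsecond line</p>"

# Without re.DOTALL, . stops at the newline, so the pattern cannot match.
without_dotall = re.search(r"<p>(.+)</p>", block)
# With re.DOTALL, . also matches the newline character.
with_dotall = re.search(r"<p>(.+)</p>", block, re.DOTALL)
print(without_dotall)
print(with_dotall.group(1))
```

Without the flags, only the first todo item is found and the two-line paragraph is not matched at all.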

M.Sc. (Information Technology). I explain AI, AGI, Programming and future technologies in simple language. Founder of BoxOfLearn.com.