re | erbiaoger

Published: 2023-08-22

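A regular expression (RegExp or RegEx for short) is a pattern used to match and manipulate text. By defining patterns, you can find, extract, replace, and validate content in text. Python's re module provides regular expression support for advanced pattern matching and manipulation on strings.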

Some commonly used regular expression metacharacters and patterns:

Literal characters:
- a: matches the character "a".
- hello: matches the string "hello".
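Metacharacters:
- .: matches any single character except a newline.
- \d: matches any single digit.
- \w: matches any single letter, digit, or underscore.
- \s: matches any single whitespace character (space, tab, newline, and so on).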
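
The metacharacters above can be tried out with re.findall(). This is only a minimal sketch: the sample strings and variable names are made up for illustration and do not come from the original post.

```python
import re

sample = "Room 42\tis open"  # hypothetical sample text

digits = re.findall(r"\d", sample)      # every single digit
word_chars = re.findall(r"\w", "a_1!")  # letters, digits, underscore; "!" is skipped
spaces = re.findall(r"\s", sample)      # spaces and the tab both count as whitespace
dot = re.findall(r".", "a\nb")          # "." does not match the newline by default

print(digits)      # ['4', '2']
print(word_chars)  # ['a', '_', '1']
print(spaces)      # [' ', '\t', ' ']
print(dot)         # ['a', 'b']
```
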
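Quantifiers:
- *: matches the preceding element zero or more times.
- +: matches the preceding element one or more times.
- ?: matches the preceding element zero or one time.
- {n}: matches the preceding element exactly n times.
- {n,}: matches the preceding element at least n times.
- {n,m}: matches the preceding element at least n and at most m times.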
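
To see how the quantifiers differ, the sketch below checks whole strings with re.fullmatch(), which succeeds only when the entire string fits the pattern. The helper name fits and the test strings are assumptions chosen for illustration.

```python
import re

def fits(pattern, text):
    """Hypothetical helper: True if the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None

print(fits(r"ab*", "a"))          # True: "*" allows zero "b"
print(fits(r"ab+", "a"))          # False: "+" needs at least one "b"
print(fits(r"colou?r", "color"))  # True: "?" makes the "u" optional
print(fits(r"\d{4}", "2023"))     # True: exactly four digits
print(fits(r"\d{4}", "202"))      # False: only three digits
print(fits(r"\d{2,}", "123456"))  # True: at least two digits
print(fits(r"\d{2,3}", "1234"))   # False: more than three digits
```
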
Character classes:
- [abc]: matches one of the characters "a", "b", or "c".
- [a-z]: matches any single lowercase letter.
- [A-Z]: matches any single uppercase letter.
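Anchors and boundaries:
- ^: matches the start of the string.
- $: matches the end of the string (or just before a trailing newline).
- \b: matches a word boundary.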
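
A short sketch of how the anchors narrow down where a match may occur. The sample text is an assumption picked so that "cat" appears at the start, inside another word, and at the end.

```python
import re

text = "cat concat cat"  # hypothetical sample text

anywhere = re.findall(r"cat", text)   # no anchors: also matches inside "concat"
starts = re.findall(r"^cat", text)    # only at the start of the string
ends = re.findall(r"cat$", text)      # only at the end of the string
whole = re.findall(r"\bcat\b", text)  # only "cat" as a whole word

print(len(anywhere))  # 3
print(starts)         # ['cat']
print(ends)           # ['cat']
print(whole)          # ['cat', 'cat']
```
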
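Escaping:
- \: the escape character, used to match a special character literally (for example, \. matches "."; \\ matches a literal backslash).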
Grouping and capturing:
- (pattern): creates a capturing group so that the matched text can be extracted.
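
Capturing groups are read back from the match object. The sketch below pulls the parts of a date out of a string; the sample text and variable names are illustrative assumptions.

```python
import re

text = "Published: 2023-08-22"  # hypothetical sample text
match = re.search(r"(\d{4})-(\d{2})-(\d{2})", text)

whole = match.group(0)  # group 0 is always the entire match
year = match.group(1)   # groups are numbered from 1, left to right
parts = match.groups()  # all captured groups as a tuple

print(whole)  # 2023-08-22
print(year)   # 2023
print(parts)  # ('2023', '08', '22')
```

In real code, check that match is not None before calling group(), since re.search() returns None when nothing matches.
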
Negation:
- [^abc]: matches any single character other than "a", "b", or "c".
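
Character classes and their negated form can be compared side by side. This is a minimal sketch with made-up sample strings.

```python
import re

text = "Cab 7, Zed!"  # hypothetical sample text

abc = re.findall(r"[abc]", text)         # "C" is uppercase, so it is not matched
lower = re.findall(r"[a-z]", text)       # lowercase letters only
upper = re.findall(r"[A-Z]", text)       # uppercase letters only
not_abc = re.findall(r"[^abc]", "abcde") # everything except a, b, c

print(abc)      # ['a', 'b']
print(lower)    # ['a', 'b', 'e', 'd']
print(upper)    # ['C', 'Z']
print(not_abc)  # ['d', 'e']
```
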

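Regular expression syntax is rich, and the items above are only basic examples. Build the pattern to fit the task at hand. In Python, functions in the re module such as re.match(), re.search(), re.findall(), and re.sub() perform the matching and manipulation.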
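
The four functions named above differ mainly in where they look and what they return. The sketch below runs the same pattern through each of them; the sample text is an assumption for illustration.

```python
import re

text = "order 66, then order 77"  # hypothetical sample text
pattern = r"\d+"

at_start = re.match(pattern, text)       # matches only at the start: None here
first = re.search(pattern, text)         # first match anywhere in the string
all_numbers = re.findall(pattern, text)  # every match, as a list of strings
masked = re.sub(pattern, "#", text)      # every match replaced

print(at_start)       # None
print(first.group())  # 66
print(all_numbers)    # ['66', '77']
print(masked)         # order #, then order #
```
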

## Using re
import re

# Find matching strings
text = "Hello, my email is example@example.com"
pattern = r"\w+@\w+\.\w+"  # simplified email pattern, not a full validator
matches = re.findall(pattern, text)
print(matches)  # Output: ['example@example.com']

# Replace text in a string
text = "Hello, my name is John. Hello, my name is Alice."
pattern = r"Hello"  # pattern to replace
replacement = "Hi"  # replacement text
new_text = re.sub(pattern, replacement, text)
print(new_text)  # Output: Hi, my name is John. Hi, my name is Alice.

# Split a string
text = "apple,banana,grape,orange"
pattern = r","  # split on commas
items = re.split(pattern, text)
print(items)  # Output: ['apple', 'banana', 'grape', 'orange']

# Match at the start of the text
text = "Hello, world!"
pattern = r"^Hello"  # starts with "Hello"; re.match() already anchors at the start, so "^" is redundant
if re.match(pattern, text):
    print("Match found!")  # Output: Match found!

# Find multiple matches
text = "apple banana apple apple orange"
pattern = r"apple"  # pattern to look for
for match in re.finditer(pattern, text):
    print("Match found at:", match.start())
# Output:
# Match found at: 0
# Match found at: 13
# Match found at: 19