7.33. Regex RE Substitute

  • re.sub()

  • Replace every substring matched by a pattern with a replacement string

7.33.1. SetUp

>>> import re

7.33.2. Problem

>>> email = 'alice@example.com'
>>>
>>> email.replace('@example.com', '@example.edu')
'alice@example.edu'

What if there are multiple top-level domains (TLDs)?

>>> email = 'alice@example.com'
>>> email = 'alice@example.net'
>>> email = 'alice@example.edu'
>>> email = 'alice@example.org'

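String method str.replace() matches only a literal substring, so a single call cannot handle every TLD. Each variant would need its own call.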
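
A minimal sketch of that limitation, using the addresses above collected in a list (the names `emails` and `replaced` are introduced here only for illustration):

```python
emails = [
    'alice@example.com',
    'alice@example.net',
    'alice@example.org',
]

# str.replace() substitutes only the literal text '@example.com',
# so addresses with any other TLD are returned unchanged.
replaced = [email.replace('@example.com', '@example.edu') for email in emails]

print(replaced)
# ['alice@example.edu', 'alice@example.net', 'alice@example.org']
```

Only the first address changes. The other two do not contain the literal text being searched for.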

7.33.3. Solution

>>> email = 'alice@example.com'
>>>
>>> pattern = r'^(?P<username>[a-z]+)@example\.[a-z]+$'
>>> replace = r'\g<username>@example.edu'
>>>
>>> re.sub(pattern, replace, email)
'alice@example.edu'
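
To check that one pattern covers every TLD from the Problem section, the same substitution can be run over all four addresses. This is a sketch: the list `emails` and the target domain `example.edu` are arbitrary choices for illustration, and the dot is escaped so that it matches only a literal `.`.

```python
import re

emails = [
    'alice@example.com',
    'alice@example.net',
    'alice@example.edu',
    'alice@example.org',
]

# \. matches a literal dot; an unescaped . would match any character
pattern = r'^(?P<username>[a-z]+)@example\.[a-z]+$'

# \g<username> inserts the text captured by the named group
replace = r'\g<username>@example.edu'

result = [re.sub(pattern, replace, email) for email in emails]

print(result)
# ['alice@example.edu', 'alice@example.edu', 'alice@example.edu', 'alice@example.edu']
```

Each address is rewritten by a single call, regardless of its original TLD.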

7.33.4. Use Case - 1

Replace a three-letter word surrounded by whitespace, ignoring case:

>>> import re
>>>
>>>
>>> string = 'Baked Beans And Spam'
>>> pattern = r'\s[a-z]{3}\s'
>>> replace = ' & '
>>>
>>> re.sub(pattern, replace, string, flags=re.IGNORECASE)
'Baked Beans & Spam'
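
By default re.sub() replaces every match. The `count` parameter limits the number of replacements. The sketch below uses a longer, made-up input string to show the difference. `count` and `flags` are passed as keywords here, because a flag given as the fourth positional argument would be interpreted as `count`.

```python
import re

string = 'Spam And Eggs And Ham'
pattern = r'\s[a-z]{3}\s'
replace = ' & '

# Default: every match is replaced
all_replaced = re.sub(pattern, replace, string, flags=re.IGNORECASE)

# count=1: only the first match is replaced
first_replaced = re.sub(pattern, replace, string, count=1, flags=re.IGNORECASE)

print(all_replaced)
# Spam & Eggs & Ham

print(first_replaced)
# Spam & Eggs And Ham
```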