Now you have two problems
Years ago, Jamie Zawinski wrote the seminal thesis on regular expressions:
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
Regular expressions are kind of like recursion for most programmers.
Mind-boggling at first. Then a period of, "I sort of get it, but when would I ever actually use this?" Followed by the first time you stumble upon an otherwise intractable problem where it offers an elegant solution. After which, you hoist your shiny new hammer above your head, bellow to the heavens, "THERE CAN BE ONLY ONE!" and stride off in search of more nails (or anything that even vaguely resembles a nail). Until finally, with a cascade of snapped screws and bent bolts in your wake, you place the now-rusting hammer in your toolbox where it belongs.
Well, no matter where you are on your allegorical regular expression journey, I have a bit of help for you.
First, the absolute best resource I have found for writing, testing, and understanding regular expressions is the website regex101.com. It uses color-coding to break down each step of any regular expression pattern you feed it. You can test your regex against sample text. You can even create multiple unit tests to help you tweak your regex to meet your exact needs.
Next, I offer up a few of my own VBA regex functions. These are nothing more than wrappers around the VBScript regular expression library.
NOTE: The '>>>
lines in the code below are Python-inspired doc tests.
'---------------------------------------------------------------------------------------
' Procedure : RegEx
' Author : Mike Wolfe <mike@nolongerset.com>
' Date : 9/1/2010
' Purpose : Perform a regular expression search on a string and return the first match
' or the null string if no matches are found.
' Usage : If Len(RegEx("\d{1,2}[/-]\d{1,2}[/-]\d{2,4}", txt)) = 0 Then MsgBox "No date in " & txt
' : TheDate = RegEx("\d{1,2}[/-]\d{1,2}[/-]\d{2,4}", txt)
' : CUSIP = Regex("[A-Za-z0-9]{8}[0-9]",txt)
'---------------------------------------------------------------------------------------
'
Function RegEx(Pattern As String, TextToSearch As String, Optional IgnoreCase As Boolean = False) As String
Dim RE As Object, REMatches As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.MultiLine = False
.Global = False
.IgnoreCase = IgnoreCase
.Pattern = Pattern
End With
Set REMatches = RE.Execute(TextToSearch)
If REMatches.Count > 0 Then
RegEx = REMatches(0)
Else
RegEx = vbNullString
End If
End Function
'---------------------------------------------------------------------------------------
' Procedure : RegExReplace
' Author : Mike Wolfe <mike@nolongerset.com>
' Date : 11/4/2010
' Purpose : Attempts to replace text in the TextToSearch with text and back references
' from the ReplacePattern for any matches found using SearchPattern.
' Notes - If no matches are found, TextToSearch is returned unaltered. To get
' specific info from a string, use RegExExtract instead.
'>>> RegExReplace("(.*)(\d{3})[\)\s.-](\d{3})[\s.-](\d{4})(.*)", "My phone # is 570.555.1234.", "$1($2)$3-$4$5")
'My phone # is (570)555-1234.
'---------------------------------------------------------------------------------------
'
Function RegExReplace(SearchPattern As String, TextToSearch As String, ReplacePattern As String, _
Optional GlobalReplace As Boolean = True, _
Optional IgnoreCase As Boolean = False, _
Optional MultiLine As Boolean = False) As String
Dim RE As Object
Set RE = CreateObject("vbscript.regexp")
With RE
.MultiLine = MultiLine
.Global = GlobalReplace
.IgnoreCase = IgnoreCase
.Pattern = SearchPattern
End With
RegExReplace = RE.Replace(TextToSearch, ReplacePattern)
End Function
'---------------------------------------------------------------------------------------
' Procedure : RegExExtract
' Author : Mike Wolfe <mike@nolongerset.com>
' Date : 11/4/2010
' Purpose : Extracts specific information from a string. Returns empty string if not found.
'>>> RegExExtract("(.*)(\d{3})[\)\s.-](\d{3})[\s.-](\d{4})(.*)", "My phone # is 570.555.1234.", "$2$3$4")
' 5705551234
'>>> RegExExtract("(.*)(\d{3})[\)\s.-](\d{3})[\s.-](\d{4})(.*)", "My name is Mike.", "$2$3$4")
'
'>>> RegExReplace("(.*)(\d{3})[\)\s.-](\d{3})[\s.-](\d{4})(.*)", "My name is Mike.", "$2$3$4")
' My name is Mike.
'---------------------------------------------------------------------------------------
'
Function RegExExtract(SearchPattern As String, TextToSearch As String, PatternToExtract As String, _
Optional GlobalReplace As Boolean = True, _
Optional IgnoreCase As Boolean = False, _
Optional MultiLine As Boolean = False) As String
Dim MatchFound As Boolean
MatchFound = Len(RegEx(SearchPattern, TextToSearch, IgnoreCase)) > 0
If MatchFound Then
RegExExtract = RegExReplace(SearchPattern, TextToSearch, PatternToExtract, _
GlobalReplace, IgnoreCase, MultiLine)
Else
RegExExtract = vbNullString
End If
End Function
Image by Andy Gries from Pixabay