A regular expressions library for Swift
This library uses NSRegularExpression to perform the actual regular expression pattern matching. However, it presents a much cleaner interface and was explicitly designed to take full advantage of Swift. The supported regular expression syntax is the syntax documented for NSRegularExpression. You should also see NSRegularExpression.Options and NSRegularExpression.MatchingOptions. I recommend using https://regex101.com/ for testing your regular expression patterns.
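For comparison, here is a minimal sketch of finding a first match and one capture group with NSRegularExpression alone, using only Foundation. It is meant to illustrate the NSRange conversions that a String-based wrapper can hide, not to describe how this library is implemented internally. The variable names are illustrative.

```swift
import Foundation

let text = "name: Chris Lattner"
let nsRegex = try NSRegularExpression(
    pattern: #"name:\s+([a-z]+)\s+([a-z]+)"#,
    options: [.caseInsensitive]
)

// NSRegularExpression works with NSRange (UTF-16 offsets), so the
// String range has to be converted in both directions.
let searchRange = NSRange(text.startIndex..<text.endIndex, in: text)

var firstName: String? = nil
if let result = nsRegex.firstMatch(in: text, options: [], range: searchRange),
   let groupRange = Range(result.range(at: 1), in: text) {
    // Convert the first capture group back into a Swift String.
    firstName = String(text[groupRange])
}
print(firstName ?? "no match")
// prints "Chris"
```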
- In Xcode, open the project that you want to add this package to.
- From the menu bar, select File > Swift Packages > Add Package Dependency...
- Paste the url for this repository into the search field.
- Follow the prompts for adding the package.
Add this to your Podfile:
pod 'RegularExpressions', :git => 'https://github.com/Peter-Schorn/RegularExpressions.git'
String.regexMatch, String.regexFindAll, String.regexSub, and String.regexSplit all accept an object that conforms to RegexProtocol. This object holds information about a regular expression, including:
- var pattern: String { get } - The regular expression pattern.
- var regexOptions: NSRegularExpression.Options { get } - The regular expression options (see NSRegularExpression.Options).
- var matchingOptions: NSRegularExpression.MatchingOptions { get } - See NSRegularExpression.MatchingOptions.
- var groupNames: [String]? { get } - The names of the capture groups.
RegexProtocol also defines a number of convenience methods:
func asNSRegex() throws -> NSRegularExpression
Converts self to an NSRegularExpression.
func numberOfCaptureGroups() throws -> Int
Returns the number of capture groups in the regular expression.
func patternIsValid() -> Bool
Returns true if the regular expression pattern is valid; otherwise, returns false.
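A short sketch of the convenience methods listed above, called on a Regex value. It assumes the package is installed and imported as RegularExpressions; the pattern is only an illustration.

```swift
import Foundation
import RegularExpressions

let regex = try Regex(#"(\w+)@(\w+)\.com"#)

// Count the capture groups, for example before deciding how many
// group names to supply.
let groupCount = try regex.numberOfCaptureGroups()
print(groupCount) // 2

// Convert to NSRegularExpression when another API requires one.
let nsRegex = try regex.asNSRegex()
print(nsRegex.pattern)

// Check validity without having to catch an error.
print(regex.patternIsValid())
```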
NSRegularExpression has been extended to conform to RegexProtocol, but it always returns [] and nil for the matchingOptions and groupNames properties, respectively. Use Regex or another type that conforms to RegexProtocol to customize these options.
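As a sketch of what this means in practice, the snippet below passes an NSRegularExpression directly to regexFindAll, then wraps the same object in a Regex to attach group names. It assumes the package is imported as RegularExpressions; the input strings and group names are made up for illustration.

```swift
import Foundation
import RegularExpressions

let nsRegex = try NSRegularExpression(
    pattern: #"(\d+)x(\d+)"#, options: [.caseInsensitive]
)

// An NSRegularExpression can be passed wherever a RegexProtocol is expected.
let plainMatches = try "1920x1080 and 640X480".regexFindAll(nsRegex)
print(plainMatches.map { $0.fullMatch })

// Wrapping it in a Regex makes it possible to attach group names.
let regex = Regex(
    nsRegularExpression: nsRegex,
    groupNames: ["width", "height"]
)
if let match = try "1920x1080".regexMatch(regex) {
    print(match.group(named: "width")?.match ?? "not matched")
}
```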
The Regex struct provided by this library conforms to RegexProtocol and is the simplest way to create a regular expression object that can be used with the methods in this library.
init(
    pattern: String,
    regexOptions: NSRegularExpression.Options = [],
    matchingOptions: NSRegularExpression.MatchingOptions = [],
    groupNames: [String]? = nil
) throws
Throws if the pattern is invalid.
init(
    _ pattern: String,
    _ regexOptions: NSRegularExpression.Options = []
) throws
Throws if the pattern is invalid.
init(
    nsRegularExpression: NSRegularExpression,
    matchingOptions: NSRegularExpression.MatchingOptions = [],
    groupNames: [String]? = nil
)
Creates a Regex object from an NSRegularExpression.
String.regexMatch and String.regexFindAll both use the RegexMatch struct to hold the information about a regular expression match. It contains the following properties:
- let sourceString: Substring - The string that was matched against. A substring is used to reduce memory usage. Note that Substring presents the same interface as String.
- let fullMatch: String - The full match of the pattern in the source string.
- let range: Range<String.Index> - The range of the full match in the source string.
- let groups: [RegexGroup?] - The capture groups.
RegexMatch also has a method to retrieve a group by name:
func group(named name: String) -> RegexGroup?
This function returns nil if the name was not found, or if the group was not matched because it was specified as optional in the regular expression pattern.
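The sketch below illustrates both nil cases described above, using optional chaining instead of force-unwrapping. It assumes the package is imported as RegularExpressions; the pattern and the group names ("number", "unit", "colour") are hypothetical.

```swift
import Foundation
import RegularExpressions

// The second capture group is optional, so it may not take part in a match.
let regex = try Regex(
    pattern: #"(\d+)(px)?"#,
    groupNames: ["number", "unit"]
)

if let match = try "width: 42".regexMatch(regex) {
    // A group that matched: optional chaining yields its text.
    print(match.group(named: "number")?.match ?? "nil")
    // An optional group that did not match: nil.
    print(match.group(named: "unit")?.match ?? "nil")
    // A name that was never declared: also nil.
    print(match.group(named: "colour")?.match ?? "nil")
}
```

Because both cases return nil, check the name against groupNames first if you need to tell a misspelled name apart from an unmatched optional group.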
The RegexGroup struct, which holds information about the capture groups, has the following properties:
- let name: String? - The name of the capture group.
- let match: String - The matched capture group.
- let range: Range<String.Index> - The range of the capture group in the source string.
String.regexMatch returns the first match for a regular expression in a string, or nil if no match was found. It has two overloads:
func regexMatch<RegularExpression: RegexProtocol>(
    _ regex: RegularExpression,
    range: Range<String.Index>? = nil
) throws -> RegexMatch?
func regexMatch(
    _ pattern: String,
    regexOptions: NSRegularExpression.Options = [],
    matchingOptions: NSRegularExpression.MatchingOptions = [],
    groupNames: [String]? = nil,
    range: Range<String.Index>? = nil
) throws -> RegexMatch?
The pattern, regexOptions, matchingOptions, and groupNames parameters correspond to the instance properties of RegexProtocol. range represents the range of the string in which to search for the pattern.
These methods will throw if the pattern is invalid, or if the number of group names does not match the number of capture groups (see RegexError). They will never throw an error if no matches are found.
See Extracting the match and capture groups for information about the RegexMatch returned by these functions.
Warning: The ranges of the matches and capture groups may be invalidated if you mutate the source string. Use String.regexSub to perform multiple replacements.
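To see why saved ranges go stale, here is a Foundation-only sketch. It uses integer-based NSRange values as a stand-in for the Range<String.Index> values this library returns, because the offsets are easy to inspect; the underlying problem is the same. All names are illustrative.

```swift
import Foundation

let original = "season 8, season 5"
let digits = try NSRegularExpression(pattern: #"\d+"#)
let fullRange = NSRange(original.startIndex..<original.endIndex, in: original)

// Record where both numbers sit in the original string.
let ranges = digits
    .matches(in: original, options: [], range: fullRange)
    .map { $0.range }

// Replace the first number with a longer string.
let mutated = NSMutableString(string: original)
mutated.replaceCharacters(in: ranges[0], with: "100")

// The second saved range no longer covers the "5".
let staleText = mutated.substring(with: ranges[1])
print(staleText) // "n"

// Replacing every match in a single pass avoids stale ranges.
let replaced = digits.stringByReplacingMatches(
    in: original, options: [], range: fullRange, withTemplate: "100"
)
print(replaced) // "season 100, season 100"
```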
Examples:
var inputText = "name: Chris Lattner"
// If you use comments in the pattern,
// you MUST use `.allowCommentsAndWhitespace` for the `regexOptions`
let pattern = #"""
name:     # the literal string 'name:'
\s+       # one or more whitespace characters
([a-z]+)  # one or more lowercase letters
\s+       # one or more whitespace characters
([a-z]+)  # one or more lowercase letters
"""#
// create the regular expression object
let regex = try! Regex(
    pattern: pattern,
    regexOptions: [.caseInsensitive, .allowCommentsAndWhitespace],
    groupNames: ["first name", "last name"]
    // the names of the capture groups
)
if let match = try inputText.regexMatch(regex) {
    print("full match: '\(match.fullMatch)'")
    print("first capture group: '\(match.groups[0]!.match)'")
    print("second capture group: '\(match.groups[1]!.match)'")
    // perform a replacement on the first capture group
    inputText.replaceSubrange(
        match.groups[0]!.range, with: "Steven"
    )
    print("after replacing text: '\(inputText)'")
}
// full match: 'name: Chris Lattner'
// first capture group: 'Chris'
// second capture group: 'Lattner'
// after replacing text: 'name: Steven Lattner'

let inputText = """
Man selects only for his own good: \
Nature only for that of the being which she tends.
"""
let pattern = #"Man selects ONLY FOR HIS OWN (\w+)"#
let searchRange =
    (inputText.startIndex)
    ..<
    (inputText.index(inputText.startIndex, offsetBy: 40))
let match = try inputText.regexMatch(
    pattern,
    regexOptions: [.caseInsensitive],
    matchingOptions: [.anchored],  // anchor matches to the beginning of the search range
    groupNames: ["word"],  // the names of the capture groups
    range: searchRange  // the range of the string in which to search for the pattern
)
if let match = match {
    print("full match:", match.fullMatch)
    print("capture group:", match.group(named: "word")!.match)
}
// full match: Man selects only for his own good
// capture group: good
String.regexFindAll returns all matches for a regular expression in a string, or an empty array if no matches were found. It has the exact same overloads as String.regexMatch:
func regexFindAll<RegularExpression: RegexProtocol>(
    _ regex: RegularExpression,
    range: Range<String.Index>? = nil
) throws -> [RegexMatch]
func regexFindAll(
    _ pattern: String,
    regexOptions: NSRegularExpression.Options = [],
    matchingOptions: NSRegularExpression.MatchingOptions = [],
    groupNames: [String]? = nil,
    range: Range<String.Index>? = nil
) throws -> [RegexMatch]
Warning: The ranges of the matches and capture groups may be invalidated if you mutate the source string. Use String.regexSub to perform multiple replacements.
The pattern, regexOptions, matchingOptions, and groupNames parameters correspond to the instance properties of RegexProtocol. As with String.regexMatch, range represents the range of the string in which to search for the pattern.
These methods will throw if the pattern is invalid, or if the number of group names does not match the number of capture groups (see RegexError). They will never throw an error if no matches are found.
See Extracting the match and capture groups for information about the RegexMatch returned by these functions.
Examples:
var inputText = "season 8, EPISODE 5; season 5, episode 20"
// create the regular expression object
let regex = try Regex(
    pattern: #"season (\d+), Episode (\d+)"#,
    regexOptions: [.caseInsensitive],
    groupNames: ["season number", "episode number"]
    // the names of the capture groups
)
let results = try inputText.regexFindAll(regex)
for result in results {
    print("fullMatch: '\(result.fullMatch)'")
    print("capture groups:")
    for captureGroup in result.groups {
        print("    '\(captureGroup!.name!)': '\(captureGroup!.match)'")
    }
    print()
}
let firstResult = results[0]
// perform a replacement on the first full match
inputText.replaceSubrange(
    firstResult.range, with: "new value"
)
print("after replacing text: '\(inputText)'")
// fullMatch: 'season 8, EPISODE 5'
// capture groups:
//     'season number': '8'
//     'episode number': '5'
//
// fullMatch: 'season 5, episode 20'
// capture groups:
//     'season number': '5'
//     'episode number': '20'
//
// after replacing text: 'new value; season 5, episode 20'
String.regexSplit splits a string by occurrences of a pattern.
func regexSplit(
    _ pattern: String,
    regexOptions: NSRegularExpression.Options = [],
    matchingOptions: NSRegularExpression.MatchingOptions = [],
    ignoreIfEmpty: Bool = false,
    maxLength: Int? = nil,
    range: Range<String.Index>? = nil
) throws -> [String]
func regexSplit<RegularExpression: RegexProtocol>(
    _ regex: RegularExpression,
    ignoreIfEmpty: Bool = false,
    maxLength: Int? = nil,
    range: Range<String.Index>? = nil
) throws -> [String]
The pattern, regexOptions, and matchingOptions parameters correspond to the instance properties of RegexProtocol.
- ignoreIfEmpty - If true, all empty strings will be removed from the array. If false (default), they will be included.
- maxLength - The maximum length of the returned array. If nil (default), then the string is split on every occurrence of the pattern.
- Returns an array of strings split on each occurrence of the pattern. If no occurrences of the pattern are found, then a single-element array containing the entire string is returned.
Examples:
let colors = "red,orange,yellow,blue"
let array = try colors.regexSplit(",")
print(array)
// array = ["red", "orange", "yellow", "blue"]

let colors = "red and orange ANDyellow and blue"
// create the regular expression object
let regex = try Regex(#"\s*and\s*"#, [.caseInsensitive])
let array = try colors.regexSplit(regex, maxLength: 3)
print(array)
// array = ["red", "orange", "yellow"]
// note that "blue" is not returned because the length of the
// array was limited to 3 items.
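The ignoreIfEmpty parameter of String.regexSplit is not covered by the examples above, so here is a small sketch of it. It assumes the package is imported as RegularExpressions; the input string is made up.

```swift
import Foundation
import RegularExpressions

let line = ",red,,orange,"

// With ignoreIfEmpty set to true, any empty strings that the
// delimiters would otherwise produce are removed from the result.
let nonEmptyColors = try line.regexSplit(",", ignoreIfEmpty: true)
print(nonEmptyColors)
// ["red", "orange"]
```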
String.regexSub and String.regexSubInPlace perform regular expression replacements. They have the exact same arguments and overloads.
func regexSub(
    _ pattern: String,
    with template: String = "",
    regexOptions: NSRegularExpression.Options = [],
    matchingOptions: NSRegularExpression.MatchingOptions = [],
    range: Range<String.Index>? = nil
) throws -> String
func regexSub<RegularExpression: RegexProtocol>(
    _ regex: RegularExpression,
    with template: String = "",
    range: Range<String.Index>? = nil
) throws -> String
The pattern, regexOptions, and matchingOptions parameters correspond to the instance properties of RegexProtocol.
- with - The template string to replace matching patterns with. See Template Matching Format for how to format the template. Defaults to an empty string.
- Returns the new string after the substitutions have been made. If no matches are found, the string is returned unchanged.
Examples:
let name = "Peter Schorn"
// The .anchored matching option only looks for matches
// at the beginning of the string.
// Consequently, only the first word will be matched.
let regexObject = try Regex(
    pattern: #"\w+"#,
    regexOptions: [.caseInsensitive],
    matchingOptions: [.anchored]
)
let replacedText = try name.regexSub(regexObject, with: "word")
print(replacedText)
// replacedText = "word Schorn"

let name = "Charles Darwin"
let reversedName = try name.regexSub(
    #"(\w+) (\w+)"#,
    with: "$2 $1"
    // $1 and $2 represent the
    // first and second capture group, respectively.
    // $0 represents the entire match.
)
print(reversedName)
// reversedName = "Darwin Charles"
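The template syntax can also be tried with Foundation alone. The sketch below uses String.replacingOccurrences with the .regularExpression option, which accepts the same $0 and $1 placeholders, and NSRegularExpression.escapedTemplate(for:) to insert a literal dollar sign. Whether regexSub forwards its template to NSRegularExpression unchanged is an assumption here, based on the library using NSRegularExpression for the actual matching.

```swift
import Foundation

let input = "a1 b22"

// $0 refers to the entire match, so each number is wrapped in brackets.
let bracketed = input.replacingOccurrences(
    of: #"\d+"#, with: "[$0]", options: .regularExpression
)
print(bracketed) // "a[1] b[22]"

// A literal "$" in the replacement must be escaped, otherwise it is
// read as the start of a capture group reference.
let template = NSRegularExpression.escapedTemplate(for: "$5")
let priced = "cost: X".replacingOccurrences(
    of: "X", with: template, options: .regularExpression
)
print(priced) // "cost: $5"
```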
If you need to further customize regular expression replacements, you can use the following methods:
func regexSub<RegularExpression: RegexProtocol>(
    _ regex: RegularExpression,
    range: Range<String.Index>? = nil,
    replacer: (_ matchIndex: Int, _ match: RegexMatch) -> String?
) throws -> String
func regexSub(
    _ pattern: String,
    regexOptions: NSRegularExpression.Options = [],
    matchingOptions: NSRegularExpression.MatchingOptions = [],
    groupNames: [String]? = nil,
    range: Range<String.Index>? = nil,
    replacer: (_ matchIndex: Int, _ match: RegexMatch) -> String?
) throws -> String
The pattern, regexOptions, matchingOptions, and groupNames parameters correspond to the instance properties of RegexProtocol.
- replacer - A closure that accepts the index of a regular expression match and the match itself, and returns a new string to replace it with. Return nil from within the closure to indicate that the match should not be changed.
Examples:
let inputString = """
Darwin's theory of evolution is the \
unifying theory of the life sciences.
"""
let pattern = #"\w+"#  // match each word in the input string
let replacedString = try inputString.regexSub(pattern) { indx, match in
    // only replace the first 6 matches (indices 0 through 5);
    // note that "Darwin" and "s" are separate matches
    if indx > 5 { return nil }
    return match.fullMatch.uppercased()  // uppercase the full match
}
print(replacedString)
// replacedString = """
// DARWIN'S THEORY OF EVOLUTION IS the \
// unifying theory of the life sciences.
// """
If you need to perform replacements for each individual capture group, you can use the replaceGroups method of the RegexMatch struct:
func replaceGroups(
    _ replacer: (
        _ groupIndex: Int, _ group: RegexGroup
    ) -> String?
) -> String
- replacer - A closure that accepts the index of a capture group and the capture group itself, and returns a new string to replace it with. Return nil from within the closure to indicate that the capture group should not be changed.
Examples:
let inputText = "name: Peter, id: 35, job: programmer"
let pattern = #"name: (\w+), id: (\d+)"#
let groupNames = ["name", "id"]
let match = try inputText.regexMatch(
    pattern, groupNames: groupNames
)!
let replacedMatch = match.replaceGroups { indx, group in
    if group.name == "name" { return "Steven" }
    if group.name == "id" { return "55" }
    return nil
}
print(replacedMatch)
// match.fullMatch = "name: Peter, id: 35"
// replacedMatch = "name: Steven, id: 55"
You can compose the above methods as follows:
let inputString = """
name: sally, id: 26
name: alexander, id: 54
"""
let regexObject = try Regex(
    pattern: #"name: (\w+), id: (\d+)"#,
    groupNames: ["name", "id"]
)
let replacedText = try inputString.regexSub(regexObject) { indx, match in
    if indx == 0 { return nil }
    return match.replaceGroups { indx, group in
        if group.name == "name" {
            return group.match.uppercased()
        }
        if group.name == "id" {
            return "redacted"
        }
        return nil
    }
}
print(replacedText)
// replacedText = """
// name: sally, id: 26
// name: ALEXANDER, id: redacted
// """
The pattern matching operator ~= has been overloaded to support checking for matches to a regular expression in a switch statement. For example:
let inputString = #"user_id: "asjhjcb""#
switch inputString {
    case try Regex(#"USER_ID: "[a-z]+""#, [.caseInsensitive]):
        print("valid user id")
    case try? Regex(#"[!@#$%^&]+"#):
        print("invalid character in user id")
    case try! Regex(#"\d+"#):
        print("user id cannot contain numbers")
    default:
        print("no match")
}
// prints "valid user id"
try, try?, and try! can all be used depending on how you want to handle an error arising from an invalid regular expression pattern.
Unfortunately, there is no way to bind the match of the regular expression pattern to a variable.
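When the matched text is needed, one possible workaround is to call String.regexMatch in an if let instead of using a switch. This is a sketch that assumes the package is imported as RegularExpressions; the capture group added to the pattern and the userID name are illustrative.

```swift
import Foundation
import RegularExpressions

let inputString = #"user_id: "asjhjcb""#

// regexMatch returns the match itself, so its capture groups can be bound.
let regex = try Regex(#"user_id: "([a-z]+)""#, [.caseInsensitive])
if let match = try inputString.regexMatch(regex),
   let userID = match.groups[0]?.match {
    print("valid user id: \(userID)")
} else {
    print("no match")
}
```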