Compound Character Classes
Character classes become far more versatile when combined with other pieces of the regex syntax, such as quantifiers, grouping, and escaped characters. Compound character classes can help define sophisticated searches, test for certain conditions in a program, and separate wanted e-mail from spam. This section uses compound character classes to build meaningful expressions with the regex syntax.
Using compound character classes with the regex syntax.
Example 1: Find a partial e-mail address. Use a character class to match any digit from 0 to 9. Use a quantifier to fix the number of digits that must match (here, exactly two).
- Regex:
smith[0-9]{2}@
- Matches:
smith44@ smith42@
- Doesn't Match:
Smith34 smith6 Smith0a
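As a rough illustration, Example 1 can be tried out in Python with the standard re module. This is only a sketch: the helper name has_partial_address is hypothetical, and using search (which finds the pattern anywhere in the text) is an assumption about how the expression would be applied.

```python
import re

# Pattern from Example 1: "smith", exactly two digits, then "@".
PARTIAL_ADDRESS = re.compile(r"smith[0-9]{2}@")

def has_partial_address(text: str) -> bool:
    """Return True if the Example 1 pattern occurs anywhere in text.

    Hypothetical helper, not part of any library.
    """
    return PARTIAL_ADDRESS.search(text) is not None

for sample in ["smith44@", "smith42@", "Smith34", "smith6", "Smith0a"]:
    print(sample, has_partial_address(sample))
```

The first two samples report True; the rest report False, because the match is case-sensitive and requires both digits followed by "@".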
Example 2: Search an HTML file to find each instance of an opening header tag (H1 through H6, in either case). Allow any number of spaces after the tag name but before the ">".
- Regex:
(<[Hh][1-6] *>)
- Matches:
<H1> <h6> <H3 > <h2 >
- Doesn't Match:
<H1 < h2> <a1>
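A short Python sketch of Example 2, assuming the standard re module. The function name find_header_tags and the sample HTML string are made up for illustration.

```python
import re

# Pattern from Example 2: "<", H or h, a digit 1-6, optional spaces, ">".
HEADER_TAG = re.compile(r"(<[Hh][1-6] *>)")

def find_header_tags(html: str) -> list[str]:
    """Return every opening header tag found in html, in order.

    Hypothetical helper; findall returns the text captured by the
    single group in the pattern.
    """
    return HEADER_TAG.findall(html)

sample = "<H1>Title</H1> <h2 >Sub</h2> < h2> <a1> <h7>"
print(find_header_tags(sample))  # ['<H1>', '<h2 >']
```

Closing tags such as </H1> are skipped because the character after "<" must be H or h.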
Example 3: Match a regular 7-digit phone number. Prevent the digit "0" from leading the string.
- Regex:
([1-9][0-9]{2}-[0-9]{4})
- Matches:
555-5555 123-4567
- Doesn't Match:
555.5555 1234-567 023-1234
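The following Python sketch checks the Example 3 pattern against whole strings. Using fullmatch is an assumption made here so that the entire input must be a phone number; the expression as written has no anchors, so a plain search could also find a number inside a longer string. The name is_seven_digit_phone is hypothetical.

```python
import re

# Pattern from Example 3: a non-zero digit, two more digits,
# a hyphen, then four digits.
PHONE = re.compile(r"([1-9][0-9]{2}-[0-9]{4})")

def is_seven_digit_phone(text: str) -> bool:
    """Return True only if the whole string is a 7-digit phone number.

    Hypothetical helper; fullmatch requires the pattern to cover
    the entire string.
    """
    return PHONE.fullmatch(text) is not None

for sample in ["555-5555", "123-4567", "555.5555", "1234-567", "023-1234"]:
    print(sample, is_seven_digit_phone(sample))
```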
Example 4: Match the protocol prefix of a URL: one or more lowercase letters followed by "://". Escape the two forward slashes.
- Regex:
[a-z]+:\/\/
- Matches:
http:// ftp:// tcl:// https://
- Doesn't Match:
http http: 1a3://
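Example 4 might be tested in Python as sketched below. Python's re module does not require the slashes to be escaped, but the escaped form from the example is also accepted and is kept here unchanged. The helper name starts_with_protocol is hypothetical, and using match (which only tries the start of the string) is an assumption.

```python
import re

# Pattern from Example 4: one or more lowercase letters, then "://".
PROTOCOL = re.compile(r"[a-z]+:\/\/")

def starts_with_protocol(text: str) -> bool:
    """Return True if text begins with a lowercase protocol prefix.

    Hypothetical helper; re.match only attempts a match at the
    start of the string.
    """
    return PROTOCOL.match(text) is not None

for sample in ["http://", "ftp://", "tcl://", "https://", "http", "http:", "1a3://"]:
    print(sample, starts_with_protocol(sample))
```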
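Example 5: Match a simple e-mail address: dot-separated parts made of lowercase letters, digits, underscores, and hyphens on each side of the "@", with at least one dot in the domain.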
- Regex:
[a-z0-9_-]+(\.[a-z0-9_-]+)*@[a-z0-9_-]+(\.[a-z0-9_-]+)+
- Matches:
j_smith@foo.com j.smith@bc.canada.ca smith99@foo.co.uk 1234@mydomain.net
- Doesn't Match:
@foo.com .smith@foo.net smith.@foo.org www.myemail.com
Note: This regular expression will actually also match the smith@foo.net part of the .smith@foo.net example.
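The difference between a partial match and a whole-string match can be seen in a small Python sketch. It assumes the standard re module; the helper names find_address and is_address are hypothetical. Here search reports the first place the pattern fits, while fullmatch insists that the pattern cover the entire string.

```python
import re
from typing import Optional

# Pattern from Example 5, unchanged.
EMAIL = re.compile(r"[a-z0-9_-]+(\.[a-z0-9_-]+)*@[a-z0-9_-]+(\.[a-z0-9_-]+)+")

def find_address(text: str) -> Optional[str]:
    """Return the first substring that fits the pattern, or None."""
    found = EMAIL.search(text)
    return found.group(0) if found else None

def is_address(text: str) -> bool:
    """Return True only if the whole string fits the pattern."""
    return EMAIL.fullmatch(text) is not None

print(find_address(".smith@foo.net"))  # smith@foo.net
print(is_address(".smith@foo.net"))    # False
print(is_address("j.smith@bc.canada.ca"))  # True
```

Requiring a whole-string match is one way to reject the .smith@foo.net case outright instead of accepting part of it.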