About Regular Expressions

Regular expressions describe patterns of characters that are matched against text strings. They can be used to search for and replace text, manipulate data, or test whether a string of characters meets a certain condition. Many everyday tasks can be accomplished with regular expressions, such as checking whether a specific word or phrase occurs in the body of an e-mail message, or finding files of a specific type, such as .txt files, in a folder or directory. Regular expressions are also called “regex”, “regexes”, “regexps”, or “RE”. This primer uses the terms “regular expressions”, “regex”, and “regexes” interchangeably.
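
The two everyday tasks mentioned above can be sketched in a few lines. This is only an illustration, written in Python with its built-in re module. The function names mentions_word and text_files are hypothetical, and the \b (word boundary) and $ (end of string) elements are shown as Python's flavor supports them; other tools may differ.

```python
import re

def mentions_word(body, word):
    """Return True if `word` occurs as a whole word in `body` (case-insensitive)."""
    # re.escape keeps any special characters in `word` literal;
    # \b on each side requires a word boundary, so "invoice" does not match "invoices".
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, body, re.IGNORECASE) is not None

def text_files(names):
    """Return the names that end in .txt."""
    # \. is a literal dot; $ anchors the match to the end of the name.
    return [name for name in names if re.search(r"\.txt$", name)]

print(mentions_word("Please send the Invoice today.", "invoice"))
print(text_files(["notes.txt", "photo.png", "a.txt.bak"]))
```

The file list here is a hard-coded sample; in practice the names would come from a directory listing.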

About Regex Syntax

Regular expressions use syntax elements composed of alphanumeric characters and symbols. For example, the regex (2) matches the digit 2 wherever it appears, while the regex ([1-9][0-9]{2}-[0-9]{4}) matches a 7-digit phone number written as three digits, a hyphen, and four digits, where the first digit is not 0.
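
As a quick check, the two example patterns can be tried out in any regex-capable tool. The sketch below uses Python's re module as one possible choice. The helper name find_phone_numbers and the sample text are made up for illustration.

```python
import re

# The phone-number pattern from the text: one digit 1-9, two digits 0-9,
# a hyphen, then four digits 0-9. The parentheses form a capturing group.
PHONE = re.compile(r"([1-9][0-9]{2}-[0-9]{4})")

def find_phone_numbers(text):
    """Return every substring of `text` that matches the phone pattern."""
    return PHONE.findall(text)

sample = "Call 555-1234 or 867-5309; 055-1234 is not valid."
print(find_phone_numbers(sample))  # → ['555-1234', '867-5309']

# The simpler pattern (2) matches the first "2" found in the string.
match = re.search(r"(2)", "Route 27")
print(match.group(1))  # → 2
```

Because the pattern is not anchored, it also matches seven-digit sequences embedded in longer strings; the leading 0 in 055-1234 is what keeps that number from matching.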

There are many flavors of regular expression syntax, found in various tools, languages, and operating systems. For example, Perl, Python, Tcl, grep, sed, vi, and other Unix utilities all use variations on standard regex syntax. This primer focuses on standard regex patterns that are not tied to a specific language or tool. This standard syntax can later be applied to the specific language, tool, or application of your choice.