The special characters recognized by regular expressions are:
.*[]^${}\+?|()
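To match one of these characters literally, put a backslash in front of it. A minimal sketch with sed; the sample strings are made up for illustration and are not from the original notes:

```shell
# \$ matches a literal dollar sign instead of anchoring to the end of the line.
# Single quotes keep the shell from expanding $4 before echo sees it.
echo 'The cost is $4.00' | sed -n '/\$/p'

# The backslash is itself special, so a literal backslash is written as \\.
# printf is used here because some echo implementations interpret backslashes.
printf '%s\n' 'a\b' | sed -n '/\\/p'

# The forward slash is not a regex special character, but it is the sed
# pattern delimiter in these commands, so it needs escaping as well.
echo '3 / 2' | sed -n '/\//p'
```

Each command prints its input line, because the escaped character is found in it.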
Anchor characters
caret character (^) Anchors the pattern to the beginning of the line
dollar sign ($) Anchors the pattern to the end of the line
dot character (.) Match any single character except a newline character.
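The anchors and the dot can be tried out with sed. This is only a sketch; the sample sentences are assumptions chosen for illustration:

```shell
# ^ ties the pattern to the start of the line.
echo 'Books are great' | sed -n '/^Book/p'        # prints the line
echo 'The book store' | sed -n '/^book/p'         # no output: "book" is not first

# $ ties the pattern to the end of the line.
echo 'This is a good book' | sed -n '/book$/p'    # prints the line
echo 'This book is good' | sed -n '/book$/p'      # no output: "book" is not last

# . must match exactly one character, so something has to precede "at".
echo 'The cat is sleeping' | sed -n '/.at/p'      # prints the line ("cat")
echo 'at the top' | sed -n '/.at/p'               # no output: nothing before "at"
```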
Character classes [abc] [0123456789] Matches any one character contained in
the class
Negating character classes [^ch]at Matches any one character that is not in
the class
Using ranges [a-c] [0-9]; noncontinuous ranges can be combined: [a-ch-k]
Special character classes [[:digit:]] [[:alpha:]]
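The class forms listed above can be compared side by side. A small sketch with sed, using made-up sample text:

```shell
# [ch]at accepts "cat" or "hat" but rejects "bat".
echo 'The cat is asleep' | sed -n '/[ch]at/p'     # prints the line
echo 'The bat is asleep' | sed -n '/[ch]at/p'     # no output

# [^ch]at is the reverse: any character except c or h, followed by "at".
echo 'The bat is asleep' | sed -n '/[^ch]at/p'    # prints the line
echo 'The cat is asleep' | sed -n '/[^ch]at/p'    # no output

# [a-ch-k] joins two ranges (a to c, h to k) in one class.
echo 'hat' | sed -n '/[a-ch-k]at/p'               # prints hat
echo 'fat' | sed -n '/[a-ch-k]at/p'               # no output: f is in neither range

# [[:digit:]] is the POSIX class for the digits 0 through 9.
echo 'room 101' | sed -n '/[[:digit:]]/p'         # prints the line
echo 'no numbers' | sed -n '/[[:digit:]]/p'       # no output
```

A negated class still has to match a character, so [^ch]at does not match a line that begins with "at".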
The asterisk (*) Placing an asterisk after a character signifies that the
character must appear zero or more times in the text to match the pattern.
[test]$ echo "I ate a potatoe with my lunch." | sed -n '/potatoe*/p'
I ate a potatoe with my lunch.
[test]$ echo "I ate a potato with my lunch." | sed -n '/potatoe*/p'
I ate a potato with my lunch.
Combining the dot special character with the asterisk special character (.*)
matches any number of any characters.
[test]$ echo "this is a regular pattern expression" | sed -n '/regular.*expression/p'
this is a regular pattern expression
The question mark (?) indicates that the preceding character can appear zero
times or once, but no more than once.
[test]$ echo "bt" | gawk '/be?t/{print $0}'
bt
[test]$ echo "bet" | gawk '/be?t/{print $0}'
bet
[test]$ echo "beet" | gawk '/be?t/{print $0}'
no match
The plus sign (+) indicates that the preceding character can appear one or
more times, but must be present at least once.
[test]$ echo "beet" | gawk '/be+t/{print $0}'
beet
[test]$ echo "bet" | gawk '/be+t/{print $0}'
bet
[test]$ echo "bt" | gawk '/be+t/{print $0}'
no match
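Curly braces ({}) allow you to specify a limit on a repeatable regular
expression. This is often referred to as an interval. Older versions of gawk
recognize intervals only when run with the --re-interval option; gawk 4.0 and
later enable them by default.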
[test]$ echo "bt" | gawk --re-interval '/be{1}t/{print $0}'
no match
[test]$ echo "bet" | gawk --re-interval '/be{1}t/{print $0}'
bet
[test]$ echo "beet" | gawk --re-interval '/be{1}t/{print $0}'
no match
[test]$ echo "beet" | gawk --re-interval '/be{1,2}t/{print $0}'
beet
[test]$ echo "beeet" | gawk --re-interval '/be{1,2}t/{print $0}'
no match
[test]$ echo "bet" | gawk --re-interval '/be{1,2}t/{print $0}'
bet
[test]$ echo "bt" | gawk --re-interval '/be{1,2}t/{print $0}'
no match
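An interval can follow a character class as well as a single character. The sketch below uses sed -E rather than gawk; that is a swap made here for convenience, on the assumption that the installed sed accepts -E for extended syntax (GNU and BSD sed both do). The test words are made up:

```shell
# b[ae]{1,2}t allows one or two letters, each either a or e, between b and t.
echo 'bat'   | sed -nE '/b[ae]{1,2}t/p'    # prints bat
echo 'beat'  | sed -nE '/b[ae]{1,2}t/p'    # prints beat
echo 'baeet' | sed -nE '/b[ae]{1,2}t/p'    # no output: three letters exceed the maximum
echo 'bt'    | sed -nE '/b[ae]{1,2}t/p'    # no output: at least one letter is required
```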
The pipe symbol (|) allows you to specify two or more patterns that the
regular expression engine uses in a logical OR formula when examining the data
stream. The line matches if any one of the patterns is found anywhere in it.
[test]$ echo "The cat is asleep" | gawk '/cat|dog/{print $0}'
The cat is asleep
[test]$ echo "The dog is asleep" | gawk '/cat|dog/{print $0}'
The dog is asleep
[test]$ echo "The dogcat is asleep" | gawk '/cat|dog/{print $0}'
The dogcat is asleep
[test]$ echo "The dog cat is asleep" | gawk '/cat|dog/{print $0}'
The dog cat is asleep
[test]$ echo "The cdogat is asleep" | gawk '/cat|dog/{print $0}'
The cdogat is asleep
Grouping expressions: when you group a regular expression pattern with
parentheses, the group is treated like a standard character, so a special
character placed after it applies to the whole group.
[test]$ echo "Sat" | gawk '/Sat(urday)?/{print $0}'
Sat
[test]$ echo "Satur" | gawk '/Sat(urday)?/{print $0}'
Satur
[test]$ echo "Saturdd" | gawk '/Sat(urday)?/{print $0}'
Saturdd
[test]$ echo "Saturday" | gawk '/Sat(urday)?/{print $0}'
Saturday
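"Satur" and "Saturdd" are printed because the pattern is not anchored: it only has to find "Sat" somewhere in the line, and the optional group then matches zero times. A sketch of how the anchors change that, using sed -E in place of gawk (an assumption that the installed sed accepts -E, as GNU and BSD sed do):

```shell
# With ^ and $ the whole line must be exactly "Sat" or "Saturday".
for word in Sat Satur Saturdd Saturday; do
    echo "$word" | sed -nE '/^Sat(urday)?$/p'
done
# prints only:
# Sat
# Saturday
```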