
The GNU Awk User's Guide


3.7 How Much Text Matches?

Consider the following:

echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'

This example uses the sub function (which we haven't discussed yet; see section String Manipulation Functions) to make a change to the input record. Here, the regexp /a+/ indicates "one or more `a' characters," and the replacement text is `<A>'.

The input contains four `a' characters. awk (and POSIX) regular expressions always match the leftmost, longest sequence of input characters that can match. Thus, all four `a' characters are replaced with `<A>' in this example:

$ echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
-| <A>bcd
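Both halves of the "leftmost, longest" rule can be checked directly. The following is a minimal sketch, not part of the original example: the shell variable names and the second input string are chosen here purely for illustration. It uses match, which sets RSTART and RLENGTH to the position and length of the matched text, to confirm that the match covers all four `a' characters, and then shows that an earlier, shorter match is preferred over a later, longer one.

```shell
# Longest: the match starts at position 1 and spans all four a's.
longest=$(echo aaaabcd | awk '{ if (match($0, /a+/)) print RSTART, RLENGTH }')
echo "$longest"      # prints: 1 4

# Leftmost takes priority over longest: the single leading a is
# replaced, not the longer run of a's further to the right.
leftmost=$(echo abbbaaaa | awk '{ sub(/a+/, "<A>"); print }')
echo "$leftmost"     # prints: <A>bbbaaaa
```

In other words, awk first finds the earliest position where a match can begin, and only then extends the match as far as possible from that position.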

For simple match/no-match tests, this is not so important. But when doing text matching and substitutions with the match, sub, gsub, and gensub functions, it is very important, because the rule determines exactly which text is matched and replaced. See section String Manipulation Functions for more information on these functions. Understanding this principle is also important for regexp-based record and field splitting (see section How Input Is Split into Records and section Specifying How Fields Are Separated).
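The same rule can be seen at work in global substitution and in field splitting. This is an illustrative sketch only: the input strings, the `:+' separator, and the shell variable names are assumptions made for the example and do not come from the text above.

```shell
# gsub: each run of a's is matched as a whole, so every run is
# replaced exactly once rather than once per character.
replaced=$(echo aa-aaa | awk '{ gsub(/a+/, "<A>"); print }')
echo "$replaced"     # prints: <A>-<A>

# Field splitting: the separator regexp :+ takes the whole run of
# colons at once, so the record has two fields rather than four.
fields=$(echo 'a:::b' | awk -F':+' '{ print NF ": " $1 " " $2 }')
echo "$fields"       # prints: 2: a b
```

Had the regexp matched only a single character at a time, the first command would have produced five replacements and the second would have reported empty fields between the colons.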

Copyright 2003 by The Free Software Foundation. Updated Jun 2003.