Regex Interface

DESCRIPTION

The SILC regular expression interface provides Unix- and POSIX-compliant regular expression compilation and matching.

The interface also provides convenience functions that make regular expressions easier to use. In particular, silc_regex offers a very simple way to match a string against a regular expression and get the exact match or matches in return. silc_subst offers a simple and familiar way to match and substitute strings using Sed syntax.

The regex syntax follows POSIX regex syntax:

Expressions:

   ^        Match start of line/string
              '^a' matches 'ab' but not 'ba'
   $        Match end of line/string
              'a$' matches 'ba' but not 'ab'
   .        Match any single character (except new line (\n))
              '.a' matches 'ba' but not 'a'
   +        Preceding item is matched one or more times
              'a+b' matches 'aaab' but not 'b'
   *        Preceding item is matched zero or more times
              'a*b' matches 'ab', 'aab' and 'b'
   ?        Preceding item is matched zero or one time
              'ca?b' matches 'cb' and 'cab' but not 'caab'
   |        Joins two expressions and matches either of them (OR)
              'foo|bar' matches 'foo' or 'bar'
   {n}      Preceding item is matched exactly n times (n can be 0-255)
              '^a{2}$' matches 'aa' but not 'aaa'
   {n,}     Preceding item is matched n or more times
              'a{2,}' matches 'aa' and 'aaaa' but not 'a'
   {n,m}    Preceding item is matched at least n times and at most m times
              '^a{2,4}$' matches 'aa', 'aaa' and 'aaaa' but not 'aaaaa'
   [ ]      Match any single character in the character list inside [ ]
              '[0123]' matches only '0', '1', '2' or '3'
   [ - ]    Match any single character in the specified range
              '[0-5]' matches digits 0-5.
   [^ ]     Match any character not in the character list or range
              '[^09]' matches any character except '0' and '9'
   ( )      Subexpression, grouping
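
The example patterns in the table above can be checked with a small helper. The sketch below deliberately uses the standard POSIX regcomp() and regexec() calls from <regex.h> with REG_EXTENDED instead of the SILC functions, on the assumption that both accept the same syntax for these operators. The helper name posix_matches is hypothetical and is not part of SILC.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of SILC: returns true if the POSIX
   extended regular expression `pattern` matches anywhere in `text`. */
static bool posix_matches(const char *pattern, const char *text)
{
  regex_t re;
  bool found;

  /* REG_EXTENDED enables +, ?, | and unescaped { } and ( ).
     REG_NOSUB is set because only a yes/no answer is needed. */
  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;

  found = (regexec(&re, text, 0, NULL, 0) == 0);
  regfree(&re);
  return found;
}
```

For instance, posix_matches("ca?b", "cab") returns true and posix_matches("ca?b", "caab") returns false. Because regexec() searches for a match anywhere in the string, a bounded repetition such as a{2,4} has to be anchored ("^a{2,4}$") to reject 'aaaaa'.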

Escaping (C-language style, '\' is written as '\\'):

   \\       Treats the following character as a literal ('\\{' matches '{')
   \\\\     Matches literal \
   \a       Matches bell (BEL)
   \t       Matches horizontal tab (HT)
   \n       Matches new line (LF)
   \v       Matches vertical tab (VT)
   \f       Matches form feed (FF)
   \r       Matches carriage return (CR)
   \\<      Match null string at the start of a word
   \\>      Match null string at the end of a word
   \\b      Match null string at the edge of a word
   \\B      Match null string when not at the edge of a word
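
The doubled backslashes in the table come from C string literal rules: the compiler consumes one level of escaping before the regex engine sees the pattern. The sketch below illustrates this with POSIX regcomp() and regexec() rather than the SILC functions, assuming that literal escaping behaves the same way in both. The names PAT_BRACE, PAT_BACKSLASH, PAT_TAB and escaped_pattern_matches are hypothetical. The word-boundary escapes are left out because regcomp() support for them varies between C libraries.

```c
#include <regex.h>
#include <stdbool.h>
#include <stddef.h>

/* Patterns as they are typed in C source. Each comment shows what the
   regex engine receives after the compiler has processed the literal. */
static const char PAT_BRACE[]     = "a\\{";   /* a\{ : 'a' then a literal '{' */
static const char PAT_BACKSLASH[] = "\\\\";   /* \\  : one literal backslash  */
static const char PAT_TAB[]       = "a\tb";   /* 'a', a TAB character, 'b'    */

/* Hypothetical helper, not part of SILC: returns true if `pattern`
   (POSIX extended syntax) matches anywhere in `text`. */
static bool escaped_pattern_matches(const char *pattern, const char *text)
{
  regex_t re;
  bool found;

  if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;

  found = (regexec(&re, text, 0, NULL, 0) == 0);
  regfree(&re);
  return found;
}
```

With these definitions, PAT_BRACE matches the text a{2} and PAT_BACKSLASH matches any text that contains a backslash, such as the C literal "C:\\dir".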

EXAMPLE

 SilcRegexStruct reg;

 // Compile regular expression
 if (!silc_regex_compile(&reg, "foo[0-9]*", 0))
   error;

 // Match string against the compiled regex
 if (!silc_regex_match(&reg, "foo20", 0, NULL, 0))
   no_match;

 // Free the compiled regular expression
 silc_regex_free(&reg);

 // Simple match
 if (!silc_regex("foobar", "foo.", NULL))
   no_match;

 // Replace all foos with bar on all lines in the buffer
 silc_subst(buffer, "s/foo/bar/g");
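
To make the effect of a Sed-style "s/foo/bar/g" expression concrete, the sketch below performs a global replacement on a plain C string with POSIX regcomp() and regexec(). It is only an illustration of the semantics and is not how silc_subst is implemented; it does not use SilcBuffer, and the function name replace_all is hypothetical. Patterns that can match the empty string are handled only well enough to guarantee that the loop terminates.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of SILC: returns a newly allocated copy
   of `text` in which every match of `pattern` is replaced by `repl`, or
   NULL on error. The caller frees the result. */
static char *replace_all(const char *text, const char *pattern,
                         const char *repl)
{
  regex_t re;
  regmatch_t m;
  size_t repl_len = strlen(repl), out_len = 0, cap = strlen(text) + 1;
  const char *p = text;
  char *out, *tmp;

  if (regcomp(&re, pattern, REG_EXTENDED) != 0)
    return NULL;
  out = malloc(cap);
  if (!out) {
    regfree(&re);
    return NULL;
  }

  /* REG_NOTBOL keeps '^' from matching in the middle of the text */
  while (*p && regexec(&re, p, 1, &m, p == text ? 0 : REG_NOTBOL) == 0) {
    size_t before = (size_t)m.rm_so;            /* text before the match */
    size_t skip = (m.rm_eo > m.rm_so) ? 0 : 1;  /* empty match: copy one
                                                   char to make progress */
    size_t need = out_len + before + repl_len + skip +
                  strlen(p + m.rm_eo) + 1;

    if (need > cap) {
      tmp = realloc(out, need);
      if (!tmp) {
        free(out);
        regfree(&re);
        return NULL;
      }
      out = tmp;
      cap = need;
    }
    memcpy(out + out_len, p, before);
    out_len += before;
    memcpy(out + out_len, repl, repl_len);
    out_len += repl_len;
    p += m.rm_eo;
    if (skip && *p)
      out[out_len++] = *p++;
  }

  strcpy(out + out_len, p);  /* copy the unmatched tail and terminator */
  regfree(&re);
  return out;
}
```

Calling replace_all("foo and foo", "foo", "bar") yields the string "bar and bar", which corresponds to what "s/foo/bar/g" expresses in Sed syntax.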

TABLE OF CONTENTS