Thread: Regexps

    #1
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2003
    Location
    Bogotá, Colombia
    Posts
    43
    Rep Power
    12

    Regexps


    Hello, I was wondering if there's a standard way of running regular expressions in C++, like in STL or some kind of standard library?
  2. #2
  3. unix hermit
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2003
    Location
    http://www.rfc791.org
    Posts
    18
    Rep Power
    0
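    There is on FreeBSD.

    Check the man page for regex in section 3 on your system. Failing that, try looking for the man pages for regcomp, regexec, regerror, and regfree. These are the POSIX functions declared in <regex.h>; they are C, but you can call them from C++.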
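For anyone who wants to see the four functions named above working together, here is a minimal sketch, assuming a POSIX system. The helper name first_group and its return codes are made up for illustration; only regcomp, regexec, regerror, and regfree come from <regex.h>. It compiles an extended regular expression, reports compile errors through regerror, and copies out the first capture group.

```c
#include <regex.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of any library): copies the text matched by
   the first capture group of `pattern` in `text` into `out`.
   Returns 0 on a match, 1 if nothing matched, -1 on error. */
static int first_group(const char *pattern, const char *text,
                       char *out, size_t outsz)
{
    regex_t re;
    regmatch_t m[2];        /* m[0] = whole match, m[1] = first group */
    char errbuf[128];
    size_t len;
    int rc;

    if (outsz == 0)
        return -1;

    rc = regcomp(&re, pattern, REG_EXTENDED);
    if (rc != 0) {
        /* regerror turns the numeric code into a readable message */
        regerror(rc, &re, errbuf, sizeof errbuf);
        fprintf(stderr, "regcomp: %s\n", errbuf);
        return -1;
    }

    rc = regexec(&re, text, 2, m, 0);
    regfree(&re);           /* m[] holds plain offsets, so this is safe here */

    if (rc == REG_NOMATCH || (rc == 0 && m[1].rm_so == -1))
        return 1;
    if (rc != 0)
        return -1;

    /* copy the group, truncating if the caller's buffer is too small */
    len = (size_t)(m[1].rm_eo - m[1].rm_so);
    if (len >= outsz)
        len = outsz - 1;
    memcpy(out, text + m[1].rm_so, len);
    out[len] = '\0';
    return 0;
}
```

Calling first_group("^([a-z]+)://", url, buf, sizeof buf) on a URL string should leave the scheme in buf. Leaving out REG_NOSUB at compile time is what makes the offsets in m[] available.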

    If all that fails, learn Perl. Perl is designed to do regular expressions from the ground up, and you can do quite a bit of what you can do in C in Perl (I've written TCP/IP socket programs in it).

    Here's a poorly designed part of a CGI forum system I wrote, which was nowhere near as advanced as devshed's :)

    This is part of a function which checked whether part of an article was an HTML link or just a stray greater-than or less-than sign, and changed it into appropriate HTML (heavily pruned so as not to give away any trade secrets ... or at least to be more readable):





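    ...

    #include <regex.h>
    #include <stdio.h>     /* sprintf */
    #include <stdlib.h>    /* malloc, free */
    #include <string.h>    /* strlen */

    ...

    regex_t re;
    char *rebuf;
    char goodlinktypes[] = "http|ftp|mailto";
    int rc;

    /* the fixed part of the pattern is under 40 bytes, NUL included */
    rebuf = (char*)malloc(strlen(goodlinktypes)+40);
    if (rebuf == NULL) {
        /* out of memory */
    } else {
        /* builds: <a href="(http|ftp|mailto):[^"<>]*">[^"<>]*</a> */
        sprintf(rebuf, "<a href=\"(%s):[^\"<>]*\">[^\"<>]*</a>",
                goodlinktypes);
        rc = regcomp(&re, rebuf,
                     REG_EXTENDED | REG_ICASE | REG_NOSUB);
        free(rebuf);
        if (rc != 0) {
            /* pattern did not compile; re must not be used */
        } else {
            if (!regexec(&re, str, 0, NULL, 0)) {
                /* valid HTML link */
            } else {
                /* not a valid HTML link */
            }
            regfree(&re);
        }
    }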


    Now I remember why I've been thinking of rewriting it in Perl :)
    Last edited by rfc791; April 29th, 2003 at 12:48 AM.
  4. #3
    .
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2002
    Posts
    296
    Rep Power
    13
    Here's a C library you can get; it looks like it has a C++ wrapper too:

    http://www.pcre.org/

    PCRE - Perl Compatible Regular Expressions

    The PCRE library is a set of functions that implement regular expression pattern matching using the same syntax and semantics as Perl 5. PCRE has its own native API, as well as a set of wrapper functions that correspond to the POSIX regular expression API. The PCRE library is free, even for building commercial software.
  6. #4
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2003
    Posts
    1
    Rep Power
    0

    Multi-Pattern Matching?


    I can't answer the question, not because I don't know any regexp library for C/C++ but for the opposite reason: I know too many. Two are included in most standard libcs, one ships as an include file in most UNIX/Linux distributions, and some are free (such as PCRE, mentioned above).

    But the thing I really lack is an efficient library that can search for multiple patterns simultaneously. It is not hard to code, and there are even some programs (such as flex) that do it quite well.

    Most programs that deal with pattern matching inherit this gap. For example, grep/fgrep supports VERY fast multiple-pattern matching (it can search for 300,000 patterns in a megabyte file in less than 1 second!), and it supports regular expressions, but not the combination of the two. You CAN search for multiple regular expressions, but it is done in the most primitive way: brute force (!).

    Does anybody know of any library/function/tool that can look for many regular expressions simultaneously, yet efficiently?
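One common workaround, sketched here with the POSIX API only, is to join the patterns into a single alternation "(p0)|(p1)|..." and compile it once, so the text is scanned by one compiled expression instead of one pass per pattern. The helper names compile_any and match_any are made up for this sketch. It assumes the individual patterns contain no capture groups or backreferences of their own, so that group i+1 lines up with pattern i. Whether this is actually faster than brute force depends on how the underlying regex implementation handles large alternations; a DFA-based tool such as flex gives stronger guarantees.

```c
#include <regex.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: compiles pats[0..n-1] as "(p0)|(p1)|...".
   Returns 0 on success, nonzero on failure. */
static int compile_any(regex_t *re, const char *const *pats, size_t n)
{
    size_t len = 1, i;              /* 1 for the terminating NUL */
    char *buf, *p;
    int rc;

    for (i = 0; i < n; i++)
        len += strlen(pats[i]) + 3; /* room for "(", ")" and "|" */
    buf = malloc(len);
    if (buf == NULL)
        return -1;
    buf[0] = '\0';
    p = buf;
    for (i = 0; i < n; i++)
        p += sprintf(p, "%s(%s)", i ? "|" : "", pats[i]);

    rc = regcomp(re, buf, REG_EXTENDED);
    free(buf);
    return rc;
}

/* Hypothetical helper: runs the combined expression over `text` and returns
   the index of the pattern whose group took part in the match, or -1. */
static int match_any(const regex_t *re, const char *text, size_t n)
{
    regmatch_t *m = malloc((n + 1) * sizeof *m);
    int which = -1;
    size_t i;

    if (m == NULL)
        return -1;
    if (regexec(re, text, n + 1, m, 0) == 0) {
        for (i = 1; i <= n; i++) {
            if (m[i].rm_so != -1) {     /* group i matched: pattern i-1 */
                which = (int)(i - 1);
                break;
            }
        }
    }
    free(m);
    return which;
}
```

The caller compiles once with compile_any, calls match_any for each input, and calls regfree on the regex_t when done. With hundreds of thousands of patterns the combined expression gets very large, so this is more a starting point than an answer to the fgrep-scale case.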
  8. #5
    .
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2002
    Posts
    296
    Rep Power
    13
    The thing I lack in regex (in any implementation, C or anything else) is good, comprehensive Unicode support.
