#1
  1. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2003
    Location
    UK
    Posts
    7
    Rep Power
    0

    Unicode PROPERTY NAME


    I am trying to take advantage of the regex functionality \p{UNICODE PROPERTY NAME}.

    However, I am struggling to find a map of those property names.

    I went directly to the Unicode website and downloaded the file 'UnicodeData.txt', which has the category listed, but it only shows 27,268 character values.
    But there are 65k characters in UTF-8 or UCS-2...

    Am I missing a point here somewhere?
  2. #2
  3. Transforming Moderator
    Devshed Supreme Being (6500+ posts)

    Join Date
    Mar 2007
    Location
    Washington, USA
    Posts
    14,249
    Rep Power
    9400
    The regular expression documentation for whatever language you're using should cover it. What language, by the way?
  4. #3
  5. Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2003
    Location
    UK
    Posts
    7
    Rep Power
    0
    I'm working in SAP Information Steward.

    According to the 'User Guide':
    based on the POSIX standard..
    ...POSIX refers to the POSIX.1 standard (IEEE Std 1003.1)....
    ....The XPG3, XPG4, Single Unix Specification (SUS) and other standards include POSIX.1 as a subset.....

    They also give a link to the Unicode website, but tbh I cannot find the info I am looking for there.
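
On the line-count question above: UnicodeData.txt does not give every code point its own line. Large blocks such as the Hangul syllables appear only as a First/Last pair, and unassigned code points are not listed at all, so the file has far fewer lines than there are code points. The third field on each line is the General_Category abbreviation (Lu, Ll, Lo, ...), which is the kind of name many regex engines accept inside \p{...}. Whether SAP Information Steward accepts those same names is an assumption not checked here. A minimal Python sketch, using a four-line excerpt in the file's format rather than the full download (`SAMPLE` and `expand_categories` are names made up for this illustration):

```python
import unicodedata

# Illustrative excerpt in UnicodeData.txt format (not the full file).
# Fields are separated by ';': code point, name, General_Category, ...
SAMPLE = """\
0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;
0061;LATIN SMALL LETTER A;Ll;0;L;;;;;N;;;0041;;0041
AC00;<Hangul Syllable, First>;Lo;0;L;;;;;N;;;;;
D7A3;<Hangul Syllable, Last>;Lo;0;L;;;;;N;;;;;
"""


def expand_categories(text):
    """Map each code point to its category, expanding First/Last ranges."""
    categories = {}
    range_start = None
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split(";")
        code_point = int(fields[0], 16)
        name, category = fields[1], fields[2]
        if name.endswith(", First>"):
            # Remember where the range begins; the next line closes it.
            range_start = code_point
        elif name.endswith(", Last>") and range_start is not None:
            for cp in range(range_start, code_point + 1):
                categories[cp] = category
            range_start = None
        else:
            categories[code_point] = category
    return categories


categories = expand_categories(SAMPLE)
print(len(SAMPLE.splitlines()), "lines cover", len(categories), "code points")

# Python's own Unicode database agrees on the category abbreviations.
print(unicodedata.category("A"), unicodedata.category("\uac00"))
```

Here two of the four lines stand in for the whole Hangul syllable block, so the same approach on the real file should yield many more code points than lines. The category abbreviations it collects (Lu, Ll, Lo and so on) are the values to try in \p{...}.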
