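XSD regular expressions support the following syntax elements, among others:

- quantifiers (`+`, `*`, ...)
- capture groups (`(...)`)
- character class escapes (`\d` for digits, `\s` for whitespace)
- Unicode properties (`\p` and its negation `\P`; while the general categories, e.g. `Lu` for upper case letters, are the same as in PCRE, the script names differ)
- alternation (`|`)
- character classes (`[...]`)

DATA(regex1) = cl_abap_regex=>create_xsd( pattern = `\d{5} \w+` ).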
" --> matches 5 digits, followed by a space,
" followed by one or more 'word' characters
DATA(regex2) = cl_abap_regex=>create_xsd( pattern = `(?:Hello)` ).
" --> ERROR: '(?:' is not supported

XPath regular expressions additionally support:

- lazy quantifiers (`+?`, `*?`, ...)
- non-capturing groups (`(?:...)`)
- backreferences (`\n`, where `n` is a number identifying a capture group)
- anchors (`^` and `$`; XSD regular expressions do not give these characters any special meaning and will match them literally)

DATA(regex1) = cl_abap_regex=>create_xpath2( pattern = `\w+ is (?:happy|sad)` ).
" --> matches one or more 'word' characters, followed by ' is ' literally,
" followed by either 'happy' or 'sad'
DATA(regex2) = cl_abap_regex=>create_xpath2( pattern = `(?<my_group>Hello) World` ).
" --> ERROR: '(?<' is not supported

XSD and XPath regular expressions can be used with the classes CL_ABAP_REGEX and CL_ABAP_MATCHER:

" 1. create an XSD regular expression:
DATA(xsd_regex) = cl_abap_regex=>create_xsd( pattern = `[0-9]+` ).
DATA(xsd_matcher) = xsd_regex->create_matcher( text = `123456 HelloWorld` ).
" ...
" 2. create an XPath regular expression:
DATA(xpath_regex) = cl_abap_regex=>create_xpath2( pattern = `\w+` ).
DATA(xpath_matcher) = xpath_regex->create_matcher( text = `123456 HelloWorld` ).
" ...

The operations offered by CL_ABAP_REGEX and CL_ABAP_MATCHER can also be performed on XSD and XPath regular expression based instances, including FIND and REPLACE:

DATA(xsd_result) = xsd_matcher->find_next( ).
" --> finds '123456'
DATA(xpath_result) = xpath_matcher->replace_next( newtext = `789` ).
" --> replaces '123456' with '789', yielding '789 HelloWorld'

XPath regular expressions are also supported by built-in functions such as matches, match and count:

DATA(result) = xsdbool( matches( val = `lower and UPPER case` xpath = `[a-z ]+` ) ).
" --> false

- `\i` matches any character that may be the first character of an XML name
- `\c` matches any character that may occur after the first character in an XML name

" Match only valid XML tags
DATA(regex) = cl_abap_regex=>create_xsd( pattern = `<\i\c*>` ).
DATA(matcher1) = regex->create_matcher( text = `<Hellö>` ).
DATA(result1) = matcher1->match( ).
" --> true
DATA(matcher2) = regex->create_matcher( text = `<.INVALID.>` ).
DATA(result2) = matcher2->match( ).
" --> false, '.' is not a valid first character in an XML tag

The negated shorthands `\I` and `\C` match all characters that do not fulfill the criteria described above:

" Match only tags with invalid XML name
DATA(regex) = cl_abap_regex=>create_xpath2( pattern = `<(\I|\i\C)` ).
DATA(matcher1) = regex->create_matcher( text = `<Hello>` ).
DATA(result1) = matcher1->find_next( ).
" --> nothing found, name is valid
DATA(matcher2) = regex->create_matcher( text = `<...>` ).
DATA(result2) = matcher2->find_next( ).
" --> found, name is invalid!
" 1. character class containing a single character range '0-9':
DATA(result1) = xsdbool( matches( val = `06227` pcre = `[0-9]+` ) ).
" --> true
" 2. character class containing two character ranges 'a-z' and 'A-Z',
" as well as the characters ' ' and '!':
DATA(result2) = xsdbool( matches( val = `Hello World!` pcre = `[a-zA-Z !]+` ) ).
" --> true

Subtracting a character class B from character class A results in a character class that matches everything that A matches, unless it is also matched by B. Take the following two character classes as an example:

- `[abcde]`, which we will refer to as A; it matches the characters from `a` to `e`; we could also use the character range notation to write it as `[a-e]`
- `[defg]`, which we will refer to as B; it matches the characters from `d` to `g`; we could also use the character range notation to write it as `[d-g]`

To subtract B from A, we write `[abcde-[defg]]` (note that the subtraction seems to take place inside the first character class; this is not a typo). The resulting character class is equivalent to `[abc]`, as outlined in green in the following diagram:

[Diagram: two overlapping character classes A and B]

To subtract A from B, we write `[defg-[abcde]]`. The resulting character class is equivalent to `[fg]`, as outlined in purple in the diagram above.

Character class subtraction is written with the `-` character, similar to the character range syntax. In combination with character ranges, this can get a bit confusing: `[a-e-[d-g]]`, for example, is equivalent to `[abcde-[defg]]`.
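
The A and B classes described above can be checked directly. The following is a minimal sketch that only relies on the `matches` function with the `xpath` parameter, as used elsewhere in this post; the variable names are purely illustrative, and the expected results follow from the equivalences stated above (`[abcde-[defg]]` behaves like `[abc]`, `[defg-[abcde]]` behaves like `[fg]`).

```abap
" A minus B: behaves like [abc]
DATA(a_minus_b_1) = xsdbool( matches( val = `abcabc` xpath = `[abcde-[defg]]+` ) ).
" --> true
DATA(a_minus_b_2) = xsdbool( matches( val = `abcd` xpath = `[abcde-[defg]]+` ) ).
" --> false ('d' is removed by the subtraction)

" B minus A: behaves like [fg]
DATA(b_minus_a_1) = xsdbool( matches( val = `fgf` xpath = `[defg-[abcde]]+` ) ).
" --> true
DATA(b_minus_a_2) = xsdbool( matches( val = `efg` xpath = `[defg-[abcde]]+` ) ).
" --> false ('e' is removed by the subtraction)

" the same subtraction as A minus B, written with character ranges
DATA(with_ranges) = xsdbool( matches( val = `cab` xpath = `[a-e-[d-g]]+` ) ).
" --> true
```

Since `matches` only succeeds if the whole value is matched, a single character outside the resulting class is enough to make the result false.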
" 1. match all Greek characters
DATA(result1_1) = xsdbool( matches( xpath = `\p{IsGreek}+` val = `ΑβΓδΕ` ) ).
" --> true
DATA(result1_2) = xsdbool( matches( xpath = `\p{IsGreek}+` val = `안녕` ) ).
" --> false
" 2. match all uppercase letters
DATA(result2_1) = xsdbool( matches( xpath = `\p{Lu}+` val = `ABΓ` ) ).
" --> true
DATA(result2_2) = xsdbool( matches( xpath = `\p{Lu}+` val = `ABγ` ) ).
" --> false
" 3. match all Greek characters that are NOT uppercase letters
DATA(result3_1) = xsdbool( matches( xpath = `[\p{IsGreek}-[\p{Lu}]]+` val = `αβγδε` ) ).
" --> true
DATA(result3_2) = xsdbool( matches( xpath = `[\p{IsGreek}-[\p{Lu}]]+` val = `αβγδεfgh` ) ).
" --> false (not all Greek)
DATA(result3_3) = xsdbool( matches( xpath = `[\p{IsGreek}-[\p{Lu}]]+` val = `ΑβΓδΕ` ) ).
" --> false (contains uppercase letters)
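
Character class subtraction can also be used to express the intersection of two character classes (everything matched by both A and B, expressed by the white center in the diagram above), using a simple trick:

" NOTE: the 'Nd' in '\p{Nd}' stands for the 'number, decimal' property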
" as specified by the Unicode standard
" method 1: match all Thai numerals by subtracting everything
" that is not Thai ('\P{IsThai}') from the set of numerals:
DATA(result1) = xsdbool( matches( xpath = `[\p{Nd}-[\P{IsThai}]]+` val = `๐๖๒๒๗` ) ).
" --> true
" method 2: match all Thai numerals; same principle as above,
" but using character class negation (indicated by '^' at the start
" of a character class) instead of '\P'
DATA(result2) = xsdbool( matches( xpath = `[\p{Nd}-[^\p{IsThai}]]+` val = `๐๖๒๒๗` ) ).
" --> true
" NOTE: PCRE uses slightly different names when referring to scripts
" inside '\p' and '\P'; instead of 'IsGreek' as used in XSD / XPath,
" we simply write 'Greek'
" 1. character class subtraction (using negative lookbehind):
DATA(result1) = xsdbool( matches( pcre = `(\p{Greek}(?<!\p{Lu}))+` val = `αβγδε` ) ).
" --> true
" 2. character class intersection (using positive lookbehind):
DATA(result2) = xsdbool( matches( pcre = `(\p{Nd}(?<=\p{Thai}))+` val = `๐๖๒๒๗` ) ).
" --> true

`\i` and `\c` shorthands

Tool | Description |
---|---|
reports DEMO_REGEX / DEMO_REGEX_TOY | supports POSIX, PCRE, XSD and XPath regular expressions |
regex101.com | supports PCRE and others (but not POSIX / XSD / XPath); has basic debugging capabilities for PCRE |

Source | Description |
---|---|
www.regular-expressions.info | great source covering a lot of different regular expression implementations, including lots of examples |
official PCRE documentation | the one-stop shop for everything PCRE related. NOTE: not all operations and settings described there can be performed or influenced from within ABAP; if in doubt, consult the official ABAP documentation |
XSD standard | regular expressions as specified by the XSD standard; very technical |
XPath 2.0 standard | regular expressions as specified by the XPath standard; also very technical |