Search Strings REGEXP_SUBSTR

Your rating: None Average: 3 (3 votes)

As of Oracle 10g, the SUBSTR has been enhanced to support searches for regular expressions. The new function is named REGEXP_SUBSTR. You can use the new function to perform advanced searches against a string.

Syntax:

REGEXP_SUBSTR(source_string, pattern
                            [,position
                               [, occurrence
                                   [, match_parameter]
                               ]
                             ]
                            )
Where:
source_string: is the text to search within
pattern: is the regular expression
position: the position in the string to start searching from (default = 1)
occurrence: the occurrence to search for (default = 1)
match_parameter: can include one or more of the following modifiers to change the
default matching algorithm:
'c' - case-sensitive matching (default);
' ' – case-insensitive matching;
'n' – allow match-any-character operator;
'm' - treat source string as multiple line.

REGEXP_SUBSTR returns the string as VARCHAR2 or CLOB data in the same character set as the source_string.

Lets look at an example:

SELECT regexp_substr( '655-236-4567', '-[^-]+' ) "REGEXP_SUBSTR"
FROM dual;
/* '- tells oracle to start the string at the '-' character.  [^-] tells oracle to continue until it finds another '-' character. */

REGEXP_SUBSTR
-236

Note that if you add an extra '-' at the end of the regular expression you will get the trailing '-' as part of the returned string:

SELECT regexp_substr( '655-236-4567', '-[^-]+-' ) "REGEXP_SUBSTR"
FROM dual;

REGEXP_SUBSTR
-236-

A good understanding for Character Class Syntax is critical for effective use of the regular expression search capabilities.

Character Class   Syntax Meaning
[:alnum:]             All alphanumeric characters
[:alpha:]              All alphabetic characters
[:blank:]              All blank space characters
[:cntrl:]                All control characters (nonprinting)
[:digit:]                All numeric digits
[:graph:]             All [:punct:], [:upper:], [:lower:], and [:digit:] characters
[:lower:]              All lowercase alphabetic characters
[:print:]               All printable characters
[:punct:]              All punctuation characters
[:space:]             All space characters (nonprinting)
[:upper:]             All uppercase alphabetic characters
[:xdigit:]              All valid hexadecimal characters

SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[:punct:]' ) "REGEXP_SUBSTR"
FROM dual;

REGEXP_SUBSTR
:

What if you want to include the punctuation and some more characters?

SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[:punct:][^,]+,' ) "REGEXP_SUBSTR"
FROM dual;
/* This is telling Oracle to start at the punct and include all characters until u reach a comma (include the comma as well). */

REGEXP_SUBSTR
: Adam,

What if you want to find the number in the string?

SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[[:digit:]]+' ) "REGEXP_SUBSTR"
FROM dual;

REGEXP_SUBSTR
28

Please try to help out with unanswered topics on the forum. Chances are you have had the same issue/question some time in your IT career!

Comments

Have a question? Please ask it on the forum instead.
Guest's picture
Guest (not verified)
Re: Search Strings REGEXP_SUBSTR

Hi Lepa,
Here are some errors in the two query,please tell me the reason,thanks.
SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[:punct:]' ) "REGEXP_SUBSTR"
FROM dual;
Result:p ----why not Employee Name and Age: Adam

SELECT regexp_substr( 'Employee Name and Age: Adam, Dana 28', '[:punct:][^,]+,' ) "REGEXP_SUBSTR"
FROM dual;
Result:ployee Name and Age: Adam, ----why not Employee Name and Age: Adam,

Pavony Alejandro's picture
Pavony Alejandro (not verified)
Re: Search Strings REGEXP_SUBSTR

Hi, I have the next case.
I need your colaboration. Thanks:
I have the next select:
select REGEXP_SUBSTR('a|b||d', '[^|]+',1,1) param from dual; -- work good, extract value: 'a'
select REGEXP_SUBSTR('a|b||d', '[^|]+',1,2) param from dual; -- work good, extract value: 'b'
select REGEXP_SUBSTR('a|b||d', '[^|]+',1,3) param from dual; -- work bad, extract value: 'd' ** why no extract "null" ?
select REGEXP_SUBSTR('a|b||d', '[^|]+',1,4) param from dual; -- work bad, extract value: 'null' ** no found the 4th parameter.

Post new comment

The content of this field is kept private and will not be shown publicly.
CAPTCHA
The question below is to prevent automated spam submissions.
2 + 0 =
Solve this simple math problem and enter the result. E.g. for 1+3, enter 4.