There are times where you need to query data from the database that matches a particular regular expression. Depending on your requirements, you might need to use a positive look ahead in your regex.
For example, let's say that you need to get all the data where the string value meets the following requirement:
- Must be between 5 and 30 characters long.
- Must have at least one of the following:
- List item
- Alphabet (a-z, A-Z)
- Numbers (0-9)
- Latin extended-A set ([0100] - [017F])
- It can only include the following:
- Alphabet (a-z, A-Z)
- Number (0-9)
- Latin extended-A set ([0100] - [017F])
- Spaces
- Special Characters ( , - . ? @ !)
We can translate the above requirements into the following regex:
(?=.*[a-z|A-Z|0-9|-]+)(?=^.{4,50}$)(^[a-z|A-Z|0-9|-|\s|\,|''|\-|\.|\?|@|!]*$)
This expression works fine in most frameworks (such as Javascript, Python ,PHP, etc). However, unfortunately the above regex won't work with Oracle.
The reason is that Oracle does not support positive look ahead in their regex matching functions. So to overcome this issue, there are two approaches.
One approach is to compose a single expression which handles all possible alternatives. However, it might be hard and possibly a nightmare to combine all these conditions in a single regex.
The other simpler approach I found is to breakdown the look aheads into individual expressions and validate the targeted data against each one of them.
Here is a simple example using the same above requirements, but without the lookahead functionality:
SET SERVEROUTPUT ON;
DECLARE
P_VAL varchar(200);
BEGIN
P_VAL :='I''m `k: /m H, and it''s the year 3015 @ neverland.';
IF(REGEXP_LIKE (P_VAL, '(^.{5,30}$)')
AND REGEXP_LIKE (P_VAL, '(.*[a-z|A-Z|0-9|-]+)')
AND REGEXP_LIKE (P_VAL, '(^[a-z|A-Z|0-9|-| |\,|''|[.-.]|\.|\?|@|!]*$)')
) THEN
DBMS_OUTPUT.PUT_LINE('MATCH');
ELSE
DBMS_OUTPUT.PUT_LINE('NO MATCH');
END IF;
END;
Please leave a comment if you think there's a better approach.
No comments:
Post a Comment