Delete specific word using regexp_replace
Hi All,
I'm using below command to delete .co but it is not removing.
SELECT REGEXP_REPLACE('110 Companies .co' ,('^.co$'), '',1,0,'i')
Instead I tried SELECT REGEXP_REPLACE('110 Companies .co' ,'.co', '',1,0,'i') but ended up in wrong result.
I could not find much info about regexp_replace and it is really hard time for me.
Tagged:
0
Answers
Try using a backslash in your search string, as in '\.co'
SELECT REGEXP_REPLACE('[110 Companies .co]','\.co'); REGEXP_REPLACE ------------------ [110 Companies ] (1 row)And if you want to delete only the last occurrence of '.co' do:
SELECT REGEXP_REPLACE('1st.co 2nd.co 3rd.co','\.co$'); REGEXP_REPLACE ------------------- 1st.co 2nd.co 3rd (1 row)If you want to be - at least slightly - more performant, try:
$ vsql -c "SELECT REPLACE('110 Companies .co' ,'.co','')" SET REPLACE ---------------- 110 CompaniesYou are not loading all the REGEX environment, and "just" perform a string operation .....
What might be more useful here instead, is to just return the integer portion of that string - the 110. This function will do that.
CREATE OR REPLACE FUNCTION fn_GetOnlyNumeric (Source varchar(255))
return varchar(50)
AS BEGIN
-- use regex to take all the character fields in a string, leaving only numeric fields, and return that. \D says find all digits, and just return those. super simple.
RETURN REGEXP_REPLACE(Source, '\D') ;
END ;
ok, it seems for one particular pattern string I can use REGEXP_REPLACE, but here I need to operate on dynamic pattern
something like this; REGEXP_REPLACE(column1, '\'||column2 , '') but it throws error. Is there a limitation only to use pattern as string?
Where as for I'm able to do above operation using REPLACE(column1, column2 , '') but again REPLACE has limitations like case sensitivity, replace only for exact match etc. Obviously I can use LOWER() for comparing the strings but final output case is getting changed.
Eg.
Get Well Soon --> Get
SELECT REPLACE('Get Well Soon', 'Well.' , '') -- This is not replacing and I need to trim 'Well Soon' by using 'Well'
REGEXP_REPLACE('Get Well Soon', 'Well.' , '') --> Get (But I want to use a column for second argument)
What is the best way to perform this operation where argument 1 and argument 2 are columns (dynamic values).
Try this:
CREATE TABLE A (f1 varchar(50)); COPY A (f1) FROM STDIN DELIMITER ',' ABORT ON ERROR; Get Well Soon, gET wELL sOON, GET WELL SOON, \. SELECT f1, REPLACE(f1,' Well Soon','') FROM A; f1 | REPLACE ---------------+--------------- gET wELL sOON | gET wELL sOON Get Well Soon | Get GET WELL SOON | GET WELL SOON (3 rows) SELECT f1, REGEXP_REPLACE(f1,' Well Soon','',1,1,'c') -- c force the match to be case sensitive (the default) FROM A; f1 | REGEXP_REPLACE ---------------+---------------- gET wELL sOON | gET wELL sOON Get Well Soon | Get GET WELL SOON | GET WELL SOON (3 rows) SELECT f1, REGEXP_REPLACE(f1,' Well Soon','',1,1,'i') -- i force the match to be case insensitive FROM A; f1 | REGEXP_REPLACE ---------------+---------------- gET wELL sOON | gET Get Well Soon | Get GET WELL SOON | GET (3 rows)You can also try this:
You have the same string, and different parts to remove (or replace by an empty string):
Play with different values in the initial WITH clause to find out if you get the right behaviour ....
WITH input(s,to_remove) AS ( SELECT 'get well soon','soon' UNION ALL SELECT 'get well soon','get' UNION ALL SELECT 'get well soon','well' UNION ALL SELECT 'GET WELL SOON','soon' UNION ALL SELECT 'GET WELL SOON','get' UNION ALL SELECT 'GET WELL SOON','well' ) SELECT * , REGEXP_REPLACE(s,to_remove||'\W*','',1,1,'i') FROM input; -- out s | to_remove | REGEXP_REPLACE -- out ---------------+-----------+---------------- -- out get well soon | soon | get well -- out get well soon | get | well soon -- out get well soon | well | get soon -- out GET WELL SOON | soon | GET WELL -- out GET WELL SOON | get | WELL SOON -- out GET WELL SOON | well | GET SOONThe solution is feasible for single pattern, but as I mentioned Requirement is for entire column which consists thousands of pattern which I need to match with one more column and remove the sub string.
The following example on one million rows takes 32 ms.
Does it address your need?
CREATE TABLE A (f1 varchar(50), f2 varchar(50)) ORDER BY f1; -- The following generates a 10 char random string CREATE OR REPLACE FUNCTION randomstring() RETURN VARCHAR AS BEGIN RETURN CHR(RANDOMINT(26) + 65) || CHR(RANDOMINT(26) + 97) || CHR(RANDOMINT(26) + 97) || CHR(RANDOMINT(26) + 97) || CHR(RANDOMINT(26) + 97) || CHR(RANDOMINT(26) + 97) || CHR(RANDOMINT(26) + 97) || CHR(RANDOMINT(26) + 97) || CHR(RANDOMINT(26) + 97) || CHR(RANDOMINT(26) + 97); END; -- The following generates one million rows in table A \set DEMO_ROWS 1000000 INSERT INTO A with myrows as (select randomstring() as f1 from ( select 1 from ( select now() as se union all select now() + :DEMO_ROWS - 1 as se) a timeseries ts as '1 day' over (order by se)) b) select f1, right(f1,2) as f2 from myrows; OUTPUT --------- 1000000 (1 row) \timing on SELECT f1, f2, REGEXP_REPLACE(f1,f2,'',1,1,'i') FROM A limit 10; f1 | f2 | REGEXP_REPLACE ------------+----+---------------- Aaaavpejhr | hr | Aaaavpej Aaaerkrmds | ds | Aaaerkrm Aaamuokksi | si | Aaamuokk Aaaobononi | ni | Aaaobono Aaaqczihnt | nt | Aaaqczih Aaaqfryryb | yb | Aaaqfryr Aaaqvscxuf | uf | Aaaqvscx Aaargzuonb | nb | Aaargzuo Aaatjuyhjg | jg | Aaatjuyh Aaauczaiah | ah | Aaauczai (10 rows) Time: First fetch (10 rows): 32.226 ms. All rows formatted: 32.372 msI think this should help, let me try and confirm you!