Need sql query to string split and normalize data

05-11-2017

Hello gurus,
I have data in one of the Oracle tables, as shown below:

Code:

Column 1    Column 2
1               NY,NJ,CA
2               US,UK,
3               AS,EU,NA

FYI, Column 2 above holds comma-delimited data, as shown.

I need a SQL query to produce the output below, in two columns:

Code:

Column 1        Column 2
1                   NY
1                   NJ
1                   CA
2                   US
2                   UK
3                   AS
3                   EU
3                   NA

Basically, I need to split the data in one field on a delimiter (a comma) and then normalize it to get the required output. I have been trying SQL with regular expressions, but without success. Any input is appreciated.
Thanks,
Carl

Moderator's Comments:

edit by bakunin: please use CODE-tags for data and file content too. Thank you.

Last edited by bakunin; 05-11-2017 at 10:05 AM..

calredd

05-11-2017

As a workaround, you can post-process the extracted data:

Code:

awk 'FNR==1{print;next}{n=split($2,a,",");for(i=1;i<=n;i++) if(a[i]) print $1,a[i]}' mySQLextractedFile
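
To try the one-liner without a database, it can be run against a few hand-made rows. The sketch below assumes the extract is whitespace-separated with a header row; the variable `sample`, the header names, and the function name `normalize` are made up for illustration. The `if(a[i])` guard is what drops the empty field left by the trailing comma in `US,UK,`.

```shell
# Hypothetical sample extract: a header row plus the three rows from the question.
sample='ID CODES
1 NY,NJ,CA
2 US,UK,
3 AS,EU,NA'

# Print the header unchanged, then one "id token" line per non-empty token.
normalize() {
  awk 'FNR==1{print;next}{n=split($2,a,",");for(i=1;i<=n;i++) if(a[i]) print $1,a[i]}'
}

printf '%s\n' "$sample" | normalize
```

Note that `if(a[i])` would also skip a token that looks like the number zero; `if(a[i] != "")` is the stricter test if such values can occur.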

vgersh99

05-11-2017

Thanks, but I need the solution as a SQL query. It's easy to implement using sed/awk.

calredd

05-11-2017

Hi,
Example (the WITH clause only builds sample data):

Code:

with T As
  (select 123 as c1, 'NY,NJ,CA' as c2 from dual
  union
  select 124 as c1, 'NY,PA,' as c2 from dual)
  SELECT DISTINCT C1, regexp_substr(C2,'[^,]+', 1, LEVEL)
  FROM T
  CONNECT BY regexp_substr(C2, '[^,]+', 1, LEVEL) IS NOT NULL
  ORDER BY C1;

Result:

Code:

        C1 REGEXP_SUBSTR(C2,'[^,]+'
---------- ------------------------
       123 CA
       123 NJ
       123 NY
       124 NY
       124 PA

Regards.

disedorgue

05-11-2017

Works great, although it runs a little slow. Thanks a lot.

calredd

05-12-2017

Code:

SQL>
SQL> --
SQL> select * from t;
  
         X Y
---------- ----------------------------------------
         1 NY,NJ,CA
         2 US,UK
         3 AS,EU,NA
         4 AAA,BBBB,C,DDDDD,EE,F,GGGGGG
         5
         6 XYZ
  
 6 rows selected.
  
SQL>
SQL> -- Using SUBSTR, INSTR functions.
SQL> select x,
  2         case when iter.pos = 1 and length(y)-length(replace(y,','))+1 = 1 then y
  3              when iter.pos = 1 then substr(y,1,instr(y,',',1,iter.pos)-1)
  4              when iter.pos = length(y)-length(replace(y,','))+1 then substr(y,instr(y,',',1,iter.pos-1)+1)
  5              else substr(y, instr(y,',',1,iter.pos-1)+1, instr(y,',',1,iter.pos) - instr(y,',',1,iter.pos-1) - 1)
  6         end as token
  7    from t,
  8         (  select level as pos
  9              from dual
 10           connect by level <= (select max(length(y)-length(replace(y,','))+1) from t)
 11         ) iter
 12   where iter.pos <= nvl(length(y)-length(replace(y,','))+1,1)
 13   order by x, pos
 14  ;
  
         X TOKEN
---------- ----------------------------------------
         1 NY
         1 NJ
         1 CA
         2 US
         2 UK
         3 AS
         3 EU
         3 NA
         4 AAA
         4 BBBB
         4 C
         4 DDDDD
         4 EE
         4 F
         4 GGGGGG
         5
         6 XYZ
  
 17 rows selected.
  
SQL>
SQL> -- Same query in stages.
SQL> with iter(pos) as (
  2      select level as pos
  3        from dual
  4     connect by level <= (select max(length(y) - length(replace(y,',')) + 1) from t)
  5  ),
  6  data(x, y, token_count) as (
  7      select x, y, length(y) - length(replace(y, ',')) + 1 as token_count
  8        from t
  9  ),
 10  combined as (
 11      select d.x, d.y, iter.pos, d.token_count,
 12             case when iter.pos > 1 then instr(y, ',', 1, iter.pos-1)
 13             end as prev_indx,
 14             instr(y, ',', 1, iter.pos) as indx
 15        from data d, iter
 16       where iter.pos <= nvl(d.token_count, 1)
 17  )
 18  select x,
 19         case when pos = 1 and token_count = 1 then y
 20              when pos = 1 then substr(y, 1, indx - 1)
 21              when pos = token_count then substr(y, prev_indx + 1)
 22              else substr(y, prev_indx + 1, indx - prev_indx - 1)
 23         end as token
 24    from combined
 25   order by x, pos
 26  ;
  
         X TOKEN
---------- ----------------------------------------
         1 NY
         1 NJ
         1 CA
         2 US
         2 UK
         3 AS
         3 EU
         3 NA
         4 AAA
         4 BBBB
         4 C
         4 DDDDD
         4 EE
         4 F
         4 GGGGGG
         5
         6 XYZ
  
 17 rows selected.
  
SQL>
SQL>
SQL> -- Another one using regular expressions.
SQL> -- regexp_count is in version 11g Release 1 and higher
SQL> select x,
  2         regexp_substr(y,'[^,]+',1,iter.pos) as token
  3    from t,
  4         (  select level as pos
  5              from dual
  6           connect by level <= (select max(regexp_count(y,',')+1) from t)
  7         ) iter
  8   where iter.pos <= nvl(regexp_count(y,',')+1,1)
  9   order by x, pos
 10  ;
  
         X TOKEN
---------- ----------------------------------------
         1 NY
         1 NJ
         1 CA
         2 US
         2 UK
         3 AS
         3 EU
         3 NA
         4 AAA
         4 BBBB
         4 C
         4 DDDDD
         4 EE
         4 F
         4 GGGGGG
         5
         6 XYZ
  
 17 rows selected.
  
SQL>
SQL>

Last edited by durden_tyler; 05-12-2017 at 12:52 PM..
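
One difference from the original question may be worth checking: the sample table above stores `US,UK`, while the data in the question has a trailing comma (`US,UK,`). The queries derive the token count as the number of commas plus one, so a trailing comma would likely produce one extra row with an empty token. The awk sketch below is only an illustration outside the database, not Oracle code: it applies the same commas-plus-one arithmetic and also counts the non-empty tokens. The function name `count_tokens` is made up.

```shell
# For each input line, print: the line, the commas-plus-one count
# (same arithmetic as length(y) - length(replace(y,',')) + 1),
# and the number of non-empty tokens.
count_tokens() {
  awk '{
    s = $0
    commas = gsub(/,/, ",", s)        # gsub returns how many commas it matched
    n = split($0, parts, ",")
    nonempty = 0
    for (i = 1; i <= n; i++) if (parts[i] != "") nonempty++
    print $0, commas + 1, nonempty
  }'
}

printf '%s\n' 'NY,NJ,CA' 'US,UK' 'US,UK,' 'XYZ' | count_tokens
```

For `US,UK,` the two counts differ (3 versus 2). If the real data has trailing commas, trimming them first (for example with `rtrim(y, ',')`) or filtering out NULL tokens would be one way to handle it; neither was tested in this thread.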

durden_tyler
