Strings between 2 characters


 
# 1  
Old 06-13-2017

Hi,
I have a weird string pattern (mongo output) which I need to convert to values only.

Code:
"_id" : ObjectId("59280d9b95385c78b73252e4"), "categorySetId" : NumberLong(1100000041), "categorySetName" : "PROD GROUP", "serviceableProductFlag" : "N", "categoryId" : NumberLong(1053), "pid" : "800-319-03", "productFamily" : "PP", "productType" : "SEATS", "subGroup" : "PP SER", "description" : "^AY,EECH, NG-C", "inventoryItemId" : NumberLong(200699), "itemStatusMfg" : "S-INTV", "organizationIdMfg" : NumberLong(90000), "src" : "orcl", "syncedOn" : NumberLong("1495797136138"), "CreationDate" : ISODate("2017-05-26T11:12:16.138Z"), "CreatedBy" : "tool", "LastUpdatedDate" : ISODate("2017-05-26T11:12:16.138Z"), "LastUpdatedBy" : "tool", "itemFamilyDesc" : "PP FAMILY", "itemFamilyGroupId" : 750, "itemFamilyGroupName" : "PP SERIES PRODUCTS"

I want output like this:

Code:
59280d9b95385c78b73252e4,1100000041,PROD GROUP,N,1053,800-319-03,PP, SEATS,PP SER ,'^AY,EECH, NG-C', 200699,S-INTV,90000,orcl,1495797136138,2017-05-26T11:12:16.138Z,tool,2017-05-26T11:12:16.138Z,tool,PP FAMILY,750,PP SERIES PRODUCTS

With some help, I have tried to achieve this using the awk command below:


Code:
awk -F':|, *"' '{ r=""; for(i=2;i<=NF;i+=2) {gsub(/^ *([^(]+\()?|"|\)$/,"",$i); if(index($i,",")!=0){ $i="\047"$i"\047" } r=(r!="")? r","$i : $i } print r }' text.txt

But the date is getting split. I want it as 2017-05-26T11:12:16.138Z, but it comes out as 2017-05-26T11,16.138Z. Can someone help me with this?


Regards

Moderator's Comments:
Start using code tags, thanks.

Last edited by zaxxon; 06-13-2017 at 06:16 AM.. Reason: code tags
# 2  
Old 06-13-2017
Code:
awk -F: '
{
# Reset the per-record state so input with more than one line works.
c=d=lo=0; t="";
ll=split($0, b, ",");
for (i=1; i<=ll; i++) {
   if ((gsub("\"", "\"", b[i]) % 2) || ! match(b[i], "\"")) {
      if (match(b[i], "\" *$")) { t=t b[i]; b[i]=t; t=""; }
      else {t=t b[i] ","; continue; }
   }
   a[c++]=b[i];
}
NF=0;
for (i=0; i<c; i++) {
   f=split(a[i],e, "[()\"]");
   if (f==3) {sub("[^:]*: *", "", e[3]); o[d++]=e[3];}
   if (f==5) {o[d++]=e[f-1];}
   if (f==7) {o[d++]=e[f-2];}
   if (o[d-1] ~ /,/) o[d-1]="\047" o[d-1] "\047";
   $(++lo)=o[(d-1)];
}
print $0;
}
' OFS=, infile

# 3  
Old 06-14-2017
Thank you so much
# 4  
Old 06-14-2017
The tricky part is that the field separator (,) also occurs inside some of the fields, so it needs special handling. Try:
Code:
awk -F"\"" -vOFS="" '
        {for (i=2; i<=NF; i+=2) {if (gsub (/,/, "\001", $i)) $i = "\047" $i "\047"
                                 gsub (/:/, "\002", $i)
                                }
        }
1
' file |  awk -F, '
        {for (i=1; i<=NF; i++)  {sub (/^.*: /, _, $i)
                                 gsub (/^[^(]*\( ?| ?\)[^)]*$/, _, $i)
                                }
         gsub ("\001", ",")
         gsub ("\002", ":")
        }
1' OFS=","
59280d9b95385c78b73252e4,1100000041,PROD GROUP,N,1053,800-319-03,PP,SEATS,PP SER,'^AY,EECH, NG-C',200699,S-INTV,90000,orcl,1495797136138,2017-05-26T11:12:16.138Z,tool,2017-05-26T11:12:16.138Z,tool,PP FAMILY,750,PP SERIES PRODUCTS
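
To look at the masking step on its own, here is a minimal sketch. It assumes, as the pipeline above does, that the bytes \001 and \002 never occur in the data. The function name protect_quoted and the shortened sample line are made up for illustration, and tr is only there to make the placeholders visible.

```shell
# Hypothetical helper: mask "," and ":" inside double-quoted strings so
# that a later split on "," only sees the real separators.
protect_quoted() {
    awk -F'"' -v OFS='"' '
    {
        # With FS set to a double quote, the even-numbered fields are
        # the pieces of text that sat between a pair of quotes.
        for (i = 2; i <= NF; i += 2) {
            gsub(/,/, "\001", $i)
            gsub(/:/, "\002", $i)
        }
        print
    }'
}

# Show the placeholders as "#" and "@" to make them readable.
printf '%s\n' '"a" : "x,y", "b" : ISODate("2017-05-26T11:12:16.138Z")' |
    protect_quoted | tr '\001\002' '#@'
```

Once the quoted commas and colons are masked, the line can be split on "," and the keys stripped with ": " as the anchor, after which the placeholders are turned back into "," and ":".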

# 5  
Old 06-14-2017
Or, to condense it into one single awk script:

Code:
awk -vSQ="'" -vCA=$'\001' -vCB=$'\002' '
        {FS  = "\""
         OFS = ""
         $0  = $0
         for (i=2; i<=NF; i+=2) {if (gsub (/,/, CA, $i)) $i = SQ $i SQ
                                 gsub (/:/, CB, $i)
                                }
         FS  = ","
         OFS = ","
         $0  = $0
         for (i=1; i<=NF; i++)  {sub (/^.*: /, _, $i)
                                 gsub (/^[^(]*\( ?| ?\)[^)]*$/, _, $i)
                                }
         gsub (CA, ",")
         gsub (CB, ":")
        }
1
' file

# 6  
Old 06-14-2017
And here's another approach in case you're comfortable with Perl and regular expressions:

Code:
$
$ perl -plne 's/"\w+"\s+:(\s+\w+\("|\s+\w+\(|\s+"|\s+)//g;
              s/"\s*$//;
              s/"\),\s+|\),\s+|",\s+/~/g;
              s/(\d+),\s+/$1~/g;
              s/([^~]*,[^~]*)/chr(39).$1.chr(39)/eg;
              s/~/,/g
             ' text.txt
59280d9b95385c78b73252e4,1100000041,PROD GROUP,N,1053,800-319-03,PP,SEATS,PP SER,'^AY,EECH, NG-C',200699,S-INTV,90000,orcl,1495797136138,2017-05-26T11:12:16.138Z,tool,2017-05-26T11:12:16.138Z,tool,PP FAMILY,750,PP SERIES PRODUCTS
$
$

The idea is to transform the input string via a series of substitutions using regexes.
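
For anyone who wants to follow the chain one step at a time, here is a rough sketch of the same sequence of substitutions written for sed -E instead of Perl. It is not the Perl one-liner itself: \w and \s are spelled out as bracket expressions and plain spaces, the function name mongo_to_csv and the shortened sample are invented for the example, and "~" is assumed not to occur in the data.

```shell
# Hypothetical sed -E rendering of the substitution chain:
#   1. drop each key plus the opening quote or wrapper, e.g. NumberLong(
#   2. drop the closing quote at the end of the line
#   3. turn the real separators ("), or ), or ",) into "~"
#   4. do the same after bare numbers
#   5. wrap every field that still contains a comma in single quotes
#   6. turn "~" back into ","
mongo_to_csv() {
    sed -E \
        -e 's/"[A-Za-z0-9_]+" +:( +[A-Za-z0-9_]+\("| +[A-Za-z0-9_]+\(| +"| +)//g' \
        -e 's/" *$//' \
        -e 's/"\), +|\), +|", +/~/g' \
        -e 's/([0-9]+), +/\1~/g' \
        -e "s/([^~]*,[^~]*)/'\\1'/g" \
        -e 's/~/,/g'
}

printf '%s\n' '"pid" : "800-319-03", "description" : "^AY,EECH, NG-C", "categoryId" : NumberLong(1053), "itemFamilyGroupId" : 750, "src" : "orcl"' |
    mongo_to_csv
```

Dropping the trailing -e expressions one at a time shows the intermediate result of each step. Note that step 4 would also fire on a quoted value that happens to contain digits followed by a comma and a space, so this is only safe for data shaped like the sample.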