Strings between 2 characters

    #1  
06-13-2017
Abhayman, Registered User

Hi
I have a weird string pattern (mongo output) which I need to convert to values only.


Code:
"_id" : ObjectId("59280d9b95385c78b73252e4"), "categorySetId" : NumberLong(1100000041), "categorySetName" : "PROD GROUP", "serviceableProductFlag" : "N", "categoryId" : NumberLong(1053), "pid" : "800-319-03", "productFamily" : "PP", "productType" : "SEATS", "subGroup" : "PP SER", "description" : "^AY,EECH, NG-C", "inventoryItemId" : NumberLong(200699), "itemStatusMfg" : "S-INTV", "organizationIdMfg" : NumberLong(90000), "src" : "orcl", "syncedOn" : NumberLong("1495797136138"), "CreationDate" : ISODate("2017-05-26T11:12:16.138Z"), "CreatedBy" : "tool", "LastUpdatedDate" : ISODate("2017-05-26T11:12:16.138Z"), "LastUpdatedBy" : "tool", "itemFamilyDesc" : "PP FAMILY", "itemFamilyGroupId" : 750, "itemFamilyGroupName" : "PP SERIES PRODUCTS"

I want output like this:


Code:
59280d9b95385c78b73252e4,1100000041,PROD GROUP,N,1053,800-319-03,PP, SEATS,PP SER ,'^AY,EECH, NG-C', 200699,S-INTV,90000,orcl,1495797136138,2017-05-26T11:12:16.138Z,tool,2017-05-26T11:12:16.138Z,tool,PP FAMILY,750,PP SERIES PRODUCTS

With some help, I have tried to achieve this using the awk command below:



Code:
awk -F':|, *"' '{ r=""; for(i=2;i<=NF;i+=2) {gsub(/^ *([^(]+\()?|"|\)$/,"",$i); if(index($i,",")!=0){ $i="\047"$i"\047" } r=(r!="")? r","$i : $i } print r }' text.txt

But the date is getting split. I want 2017-05-26T11:12:16.138Z, but it comes out as 2017-05-26T11,16.138Z. Can someone help me with this?


Regards

Moderator's Comments:
Start using code tags, thanks.

Last edited by zaxxon; 06-13-2017 at 05:16 AM. Reason: code tags
    #2  
06-13-2017
rdrtx1, Registered User

Code:
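awk -F: '
{
# Reset per-record state so that input with more than one line works.
c=d=lo=0; t="";
# Split on commas, then glue back the pieces that belong to one quoted value.
ll=split($0, b, ",");
for (i=1; i<=ll; i++) {
   # gsub() replaces each double quote with itself, so it only counts them.
   if ((gsub("\"", "\"", b[i]) % 2) || ! match(b[i], "\"")) {
      if (match(b[i], "\" *$")) { t=t b[i]; b[i]=t; t=""; }
      else {t=t b[i] ","; continue; }
   }
   a[c++]=b[i];
}
NF=0;
# Pick the value out of each pair by splitting on parentheses and double quotes.
for (i=0; i<c; i++) {
   f=split(a[i],e, "[()\"]");
   if (f==3) {sub("[^:]*: *", "", e[3]); o[d++]=e[3];}   # bare number
   if (f==5) {o[d++]=e[f-1];}                            # "text" or Wrapper(number)
   if (f==7) {o[d++]=e[f-2];}                            # Wrapper("text")
   # Single-quote values containing a comma (\047 is the octal code of a single quote).
   if (o[d-1] ~ /,/) o[d-1]="\047" o[d-1] "\047";
   $(++lo)=o[(d-1)];
}
print $0;
}
' OFS=, infile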

    #3  
06-14-2017
Abhayman, Registered User
Thank you so much
    #4  
06-14-2017
RudiC, Forum Staff, Moderator
The tricky part is that the field separator (,) also occurs inside some field values, so those commas have to be protected before the line is split. Try:

Code:
awk -F"\"" -vOFS="" '
        {for (i=2; i<=NF; i+=2) {if (gsub (/,/, "\001", $i)) $i = "\047" $i "\047"
                                 gsub (/:/, "\002", $i)
                                }
        }
1
' file |  awk -F, '
        {for (i=1; i<=NF; i++)  {sub (/^.*: /, _, $i)
                                 gsub (/^[^(]*\( ?| ?\)[^)]*$/, _, $i)
                                }
         gsub ("\001", ",")
         gsub ("\002", ":")
        }
1' OFS=","
59280d9b95385c78b73252e4,1100000041,PROD GROUP,N,1053,800-319-03,PP,SEATS,PP SER,'^AY,EECH, NG-C',200699,S-INTV,90000,orcl,1495797136138,2017-05-26T11:12:16.138Z,tool,2017-05-26T11:12:16.138Z,tool,PP FAMILY,750,PP SERIES PRODUCTS
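
The masking done by the first awk stage can be looked at in isolation. The following is only a minimal sketch on a shortened record: the helper name mask_quoted and the sample line are made up for illustration, and tr is used purely to make the two control characters visible as @ and #.

```shell
# mask_quoted: hypothetical helper name, not part of the thread.
# Reads records on stdin and hides commas and colons that sit inside
# double-quoted strings, so a later split on "," or ":" cannot break them.
mask_quoted() {
    awk -F'"' -v OFS='' '
    {
        for (i = 2; i <= NF; i += 2) {        # even fields were inside double quotes
            if (gsub(/,/, "\001", $i))        # hide embedded commas ...
                $i = "\047" $i "\047"         # ... and single-quote that value
            gsub(/:/, "\002", $i)             # hide embedded colons
        }
        print
    }'
}

# Shortened sample record (an assumption, modelled on the input in post #1).
sample='"description" : "^AY,EECH, NG-C", "CreationDate" : ISODate("2017-05-26T11:12:16.138Z"), "itemFamilyGroupId" : 750'

# Show the masked record with \001 and \002 made visible as @ and #.
printf '%s\n' "$sample" | mask_quoted | tr '\001\002' '@#'
```

For this sample the command prints description : '^AY@EECH@ NG-C', CreationDate : ISODate(2017-05-26T11#12#16.138Z), itemFamilyGroupId : 750. The only commas left are the real separators, and the colons of the timestamp are out of the way, which is what the attempt in post #1 was missing because it split on every colon.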

    #5  
06-14-2017
RudiC, Forum Staff, Moderator
Or, to condense it into one single awk script:


Code:
awk -vSQ="'" -vCA=$'\001' -vCB=$'\002' '
        {FS  = "\""
         OFS = ""
         $0  = $0
         for (i=2; i<=NF; i+=2) {if (gsub (/,/, CA, $i)) $i = SQ $i SQ
                                 gsub (/:/, CB, $i)
                                }
         FS  = ","
         OFS = ","
         $0  = $0
         for (i=1; i<=NF; i++)  {sub (/^.*: /, _, $i)
                                 gsub (/^[^(]*\( ?| ?\)[^)]*$/, _, $i)
                                }
         gsub (CA, ",")
         gsub (CB, ":")
        }
1
' file

    #6  
06-14-2017
durden_tyler, Forum Advisor, Registered User
And here's another approach in case you're comfortable with Perl and regular expressions:


Code:
$
$ perl -plne 's/"\w+"\s+:(\s+\w+\("|\s+\w+\(|\s+"|\s+)//g;
              s/"\s*$//;
              s/"\),\s+|\),\s+|",\s+/~/g;
              s/(\d+),\s+/$1~/g;
              s/([^~]*,[^~]*)/chr(39).$1.chr(39)/eg;
              s/~/,/g
             ' text.txt
59280d9b95385c78b73252e4,1100000041,PROD GROUP,N,1053,800-319-03,PP,SEATS,PP SER,'^AY,EECH, NG-C',200699,S-INTV,90000,orcl,1495797136138,2017-05-26T11:12:16.138Z,tool,2017-05-26T11:12:16.138Z,tool,PP FAMILY,750,PP SERIES PRODUCTS
$
$

The idea is to transform the input string via a series of substitutions using regexes.