The UNIX and Linux Forums  

  #1  
Old 08-08-2007
Registered User
 

Join Date: Aug 2005
Posts: 14
csv files (with quoted commas) and awk

I have a file as follows:

Code:
1,"This is field 2",3,4,5
2,"This is field 2 it can contain one , comma",3,4,5
3,"This is field 2 it also, can, contain, more",3,4,5
4,"This is field 2 without extra commas",3,4,5

and I pass this through awk:

Code:
awk -F, ' {
    if ( $3 == "3" ) {
        print $0
    }
} ' file
The output I currently get is:

Code:
1,"This is field 2",3,4,5
4,"This is field 2 without extra commas",3,4,5
but in this instance all rows should be displayed. So my question is:

How can I get awk to split the file on commas but ignore any commas that are within double quotes? (There could be more description fields than just the one; this is the first data extract we have done from Discovery.)

Thanks
  #2  
Old 08-08-2007
robotronic
Can I play with madness?
 

Join Date: Apr 2002
Location: Italy
Posts: 370
Try this...

Code:
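awk -F'"' '{
   # Splitting on the double quote makes $2 the text inside the first pair of quotes.
   # Rebuild the line without it so its commas cannot shift the later fields.
   line="";
   for (i=1; i<=NF; i++) {
      if (i != 2) line=line $i;
   }

   # Split what is left on commas and test the third field.
   split(line, v, ",")
   if (v[3] == "3") {
      print;
   }
} ' file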
...but I have the feeling that it can be done in a better and more elegant way.
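The original question mentions that there could be more than one quoted description field. The snippet above drops only the first quoted piece, so a possible generalization is sketched here, assuming plain POSIX awk and records that each fit on one line. The function name filter_third and the sample rows are made up for illustration. When a line is split on the double quote, the unquoted text lands in the odd-numbered pieces and the quoted text in the even-numbered ones, so keeping only the odd pieces removes every quoted field at once.

```shell
# Hypothetical helper: print rows whose third CSV field is "3",
# ignoring commas inside any double-quoted field.
filter_third() {
    awk -F'"' '{
        line = ""
        # Odd-numbered pieces are the text outside quotes; skip the even ones.
        for (i = 1; i <= NF; i += 2) line = line $i
        # Quoted fields are now empty, so a plain comma split is safe.
        split(line, v, ",")
        if (v[3] == "3") print
    }' "$@"
}

# Illustrative input: the second row has two quoted fields before field 3.
printf '%s\n' \
    '1,"This is field 2",3,4,5' \
    '"x, y","a, b",3,4' \
    '7,"no, match",0,4,5' | filter_third
```

With no file argument the function reads standard input; otherwise it can be called as filter_third file. The first two sample rows are printed and the third is dropped.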
  #3  
Old 08-08-2007
manas_ranjan
Registered User
 

Join Date: Jul 2007
Location: PUNE
Posts: 157
Hey Robo,

Do you want the bold lines displayed in the output as well,
1,"This is field 2",3,4,5
2,"This is field 2 it can contain one , comma",3,4,5
3,"This is field 2 it also, can, contain, more",3,4,5
4,"This is field 2 without extra commas",3,4,5

if field 3 has the value 3?
  #4  
Old 08-08-2007
Registered User
 

Join Date: Aug 2005
Posts: 14
Robotronic's way works, many thanks.

Basically we have an export from an application where description fields can contain any characters. The third column is currently defined as the enabled flag: if the value is 3 we keep the record, and if it is anything else we disregard it. As in my initial example, if I just test the third field in awk (using -F, as the delimiter), then in row 2 the third field would be the fragment ` comma"` and in row 3 it would be ` can`. Those rows would be excluded when they should be included.
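To make the mis-split visible, here is a small check that prints what awk puts in the third field when every comma is treated as a delimiter. It is only a diagnostic sketch; the function name show_third is made up for illustration.

```shell
# Hypothetical helper: show the record number and the raw third field
# that awk sees when it splits on every comma.
show_third() {
    awk -F, '{ printf "%d: [%s]\n", NR, $3 }' "$@"
}

printf '%s\n' \
    '1,"This is field 2",3,4,5' \
    '2,"This is field 2 it can contain one , comma",3,4,5' \
    '3,"This is field 2 it also, can, contain, more",3,4,5' | show_third
# 1: [3]
# 2: [ comma"]
# 3: [ can]
```

Only the first row has the real value in the third field; in the other two the third field is a fragment of the description, which is why the comparison with "3" fails.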