awk, comma as field separator and text inside double quotes as a field.

# 1  
Old 11-15-2010

Hi all,
I need to extract the fields of a line that are separated by commas. Some of the fields are enclosed in double quotes, and those should be treated as a single field even if there are commas inside the quotes.
Sample input:
Quote:
aaa,"hell world, test text",bbb,ccc," test text"
For this line, 5 fields should be extracted:
Quote:
1. aaa
2. "hell world, test text"
3. bbb
4. ccc
5. " test text"
Is there an easy way to achieve this using awk?
# 2  
Old 11-15-2010
If Perl is acceptable:

Code:
perl -MText::ParseWords -nle'
  print ++$c, ". ", $_ 
    for parse_line(",", 1, $_);
  ' infile

Do you need to reset the counter on every row?

---------- Post updated at 05:11 PM ---------- Previous update was at 05:04 PM ----------

As far as CSV parsing with awk is concerned, see lorance.freeshell.org/csv/
# 3  
Old 11-15-2010
Quote:
Originally Posted by radoulov
If Perl is acceptable:

Code:
perl -MText::ParseWords -nle'
  print ++$c, ". ", $_ 
    for parse_line(",", 1, $_);
  ' infile

Do you need to reset the counter on every row?

---------- Post updated at 05:11 PM ---------- Previous update was at 05:04 PM ----------

As far as CSV parsing with awk is concerned see lorance.freeshell.org/csv/
Hi, radoulov
Thank you so much. The Perl code is neat, but I'll stick with awk for the moment because I don't know much Perl. I just want to analyze a simple access log file produced by an HTTP server.

Thank you for the link, I'll take a look at it.
# 4  
Old 11-15-2010
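You can give this awk script a try:
Code:
awk -F, '{
  # s collects the pieces of a quoted field that -F, split at inner commas
  for (i=1; i<=NF; i++) {
    if (s) {
      # inside a quoted field: keep appending until a piece ends with a quote
      if ($i ~ "\"$") {print s","$i; s=""}
      else s = s","$i
    }
    else {
      if ($i ~ "^\".*\"$") print $i    # complete quoted field
      else if ($i ~ "^\"") s = $i      # quoted field starts here
      else print $i                    # plain field
    }
  }
}' file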

# 5  
Old 11-15-2010
I'd suggest sticking with the CSV parser linked above. It deals with a lot of things that come up in CSV files, such as fields with embedded line breaks or quotes:

Code:
Test,csv,file,"Multi
line field", rest
Also,some,imbedded,"Quoted ""strings""",can exist
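
To illustrate what handling those two cases involves, here is a minimal sketch in plain awk that scans each line one character at a time. It is not the linked parser. The function name csv_fields and the output format are assumptions made for this example: the surrounding quotes are stripped, a doubled "" becomes a single ", an embedded line break is printed as \n, and numbering restarts on every record.

```shell
# csv_fields: hypothetical helper, prints every CSV field as "N. value".
csv_fields() {
  awk '
  {
    # If the previous line ended inside quotes, this line continues that field.
    if (inq) field = field "\\n"
    else { field = ""; n = 0 }
    len = length($0)
    for (i = 1; i <= len; i++) {
      c = substr($0, i, 1)
      if (inq) {
        if (c != "\"") field = field c
        # a doubled quote inside a quoted field stands for one literal quote
        else if (substr($0, i + 1, 1) == "\"") { field = field "\""; i++ }
        else inq = 0
      }
      else if (c == "\"") inq = 1
      else if (c == ",") { n++; print n ". " field; field = "" }
      else field = field c
    }
    # Only emit the last field once its quotes (if any) have been closed.
    if (!inq) { n++; print n ". " field }
  }'
}
```

Fed the three sample lines above, csv_fields prints "4. Multi\nline field" for the multi-line field and "4. Quoted "strings"" for the one with doubled quotes.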

# 6  
Old 11-15-2010
Try:
Code:
sed 's/,\("[^"]*"\)*/\n\1/g'

Code:
$ echo 'aaa,"hell world, test text",bbb,ccc," test text"' | sed 's/,\("[^"]*"\)*/\n\1/g'
aaa
"hell world, test text"
bbb
ccc
" test text"

# 7  
Old 11-15-2010
Quote:
Originally Posted by Scrutinizer
Try:
Code:
sed 's/,\("[^"]*"\)*/\n\1/g'

Nope:
Code:
$ echo '"hello world, test text", aaa, bbb, ccc' | sed 's/,\("[^"]*"\)*/\n\1/g'

"hello world
 test text"
 aaa
 bbb
 ccc
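
The sed expression only protects a quoted field when a comma comes right before it, so a quoted first field is still split at its inner comma. One way around that, sketched here in plain awk, is to peel one field at a time off the front of the line. The function name split_csv_line is made up for this example, and the sketch assumes there are no doubled quotes ("") or line breaks inside a field, and that a quoted field starts with the quote itself (no leading blank).

```shell
# split_csv_line: hypothetical helper, prints each field on its own line.
split_csv_line() {
  awk '
  {
    line = $0
    while (1) {
      if (match(line, /^"[^"]*"[^,]*/)) {
        # quoted field: everything up to the closing quote, inner commas included
        field = substr(line, 1, RLENGTH)
      } else {
        # plain field: everything up to the next comma, or the rest of the line
        pos = index(line, ",")
        field = (pos ? substr(line, 1, pos - 1) : line)
      }
      print field
      line = substr(line, length(field) + 1)
      if (line == "") break
      line = substr(line, 2)    # drop the separating comma
    }
  }'
}
```

Code:
$ echo '"hello world, test text", aaa, bbb, ccc' | split_csv_line
"hello world, test text"
 aaa
 bbb
 ccc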

The Following User Says Thank You to Chubler_XL For This Useful Post:
Scrutinizer (11-15-2010)