Sponsored Content
Top Forums Shell Programming and Scripting awk, comma as field separator and text inside double quotes as a field. Post 302471867 by kevintse on Monday 15th of November 2010 11:32:01 AM
Old 11-15-2010
Quote:
Originally Posted by radoulov
If Perl is acceptable:

Code:
perl -MText::ParseWords -nle'
  print ++$c, ". ", $_ 
    for parse_line(",", 1, $_);
  ' infile

Do you need to reset the counter on every row?

---------- Post updated at 05:11 PM ---------- Previous update was at 05:04 PM ----------

As far as CSV parsing with awk is concerned see lorance.freeshell.org/csv/
Hi, radoulov
Thank you so much, the perl code is neat, but I have to choose to stick with awk for the moment, cause I don't know much about perl, I just want to analyze a simple accesslog file produced by HTTP server.

Thank you for the link, I'll take a look at it.
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

sed removing comma inside double quotes

I have a csv file with lines like the followings 123456,"ABC CO., LTD","XXX" 789012,"DEF LIMITED", "XXX" before I bcp this file to database, the comma in "CO.," need to be removed first. My script is cat <filename> | sed 's/"CO.,"/"CO."/g' but it doesn't work. Can anyone here able to... (2 Replies)
Discussion started by: joanneho
2 Replies

2. Shell Programming and Scripting

To Replace comma with Pipe inside double quotes

Hi, I have a requirement to replace the comma's inside the double quotes. The comma's inside the double quotes will get changed dynamically. Input Record: "Washington, DC,Prabhu,aju",New York Output Record: "Washington| DC|Prabhu|aju",New York I tried with the below command but it... (3 Replies)
Discussion started by: prabhutkl
3 Replies

3. Shell Programming and Scripting

awk - double quotes as record separator

How do I use double quotes as a record seperator in awk? (4 Replies)
Discussion started by: locoroco
4 Replies

4. Shell Programming and Scripting

awk - single quotes as field separator

How can I use single quotes as field separator in awk? (1 Reply)
Discussion started by: locoroco
1 Replies

5. Shell Programming and Scripting

Awk Search text string in field, not all in field.

Hello, I am using awk to match text in a tab separated field and am able to do so when matching the exact word. My problem is that I would like to match any sequence of text in the tab-separated field without having to match it all. Any help will be appreciated. Please see the code below. awk... (3 Replies)
Discussion started by: rocket_dog
3 Replies

6. UNIX for Dummies Questions & Answers

Add a field separator (comma) inside a line of a CSV file

Hi... I can't find my little red AWK book and it's been a long while since I've awk'd. But I need to take a CSV file and convert the first word of the fifth field to its own field by replacing a space with a comma. This is for importing a spreadsheet of issues into JIRA... Example: a line... (9 Replies)
Discussion started by: Tawpie
9 Replies

7. Shell Programming and Scripting

awk to parse field and include the text of 1 pipe in field 4

I am trying to parse the input in awk to include the |gc= in $4 but am not able to. The below is close: awk so far: awk '{sub(/\|]+]++/, ""); print }' input.txt Input chr1 955543 955763 AGRN-6|pr=2|gc=75 0 + chr1 957571 957852 AGRN-7|pr=3|gc=61.2 0 + chr1 970621 ... (7 Replies)
Discussion started by: cmccabe
7 Replies

8. Shell Programming and Scripting

Inserting a field without disturbing field separator on other fields

Hi All, I have the input as below: cat input 032016002 2.891 97.109 16.605 27.172 24.017 32.207 0.233 0.021 39.810 0.077 0.026 19.644 13.882 0.131 11.646 0.102 11.449 76.265 23.735 16.991 83.009 8.840 91.160 0.020 99.980 52.102 47.898 44.004 55.996 39.963 18.625 0.121 1.126 40.189... (15 Replies)
Discussion started by: am24
15 Replies

9. Shell Programming and Scripting

How can awk ignore the field delimiter like comma inside a field?

We have a csv file as mentioned below and the requirement is to change the date format in file as mentioned below. Current file (file.csv) ---------------------- empname,date_of_join,dept,date_of_resignation ram,08/09/2015,sales,21/06/2016 "akash,sahu",08/10/2015,IT,21/07/2016 ... (6 Replies)
Discussion started by: gopal.biswal
6 Replies

10. Shell Programming and Scripting

awk to parse comma separated field and removing comma in between number and double quotes

Hi Experts, Please support I have below data in file in comma seperated, but 4th column is containing comma in between numbers, bcz of which when i tried to parse the file the column 6th value(5049641141) is being removed from the file and value(222.82) in column 5 becoming value of column6. ... (3 Replies)
Discussion started by: as7951
3 Replies
ParseWords(3)						User Contributed Perl Documentation					     ParseWords(3)

NAME
Text::ParseWords - parse text into an array of tokens or array of arrays SYNOPSIS
use Text::ParseWords; @lists = nested_quotewords($delim, $keep, @lines); @words = quotewords($delim, $keep, @lines); @words = shellwords(@lines); @words = parse_line($delim, $keep, $line); @words = old_shellwords(@lines); # DEPRECATED! DESCRIPTION
The &nested_quotewords() and &quotewords() functions accept a delimiter (which can be a regular expression) and a list of lines and then breaks those lines up into a list of words ignoring delimiters that appear inside quotes. &quotewords() returns all of the tokens in a single long list, while &nested_quotewords() returns a list of token lists corresponding to the elements of @lines. &parse_line() does tokenizing on a single string. The &*quotewords() functions simply call &parse_line(), so if you're only splitting one line you can call &parse_line() directly and save a function call. The $keep argument is a boolean flag. If true, then the tokens are split on the specified delimiter, but all other characters (quotes, backslashes, etc.) are kept in the tokens. If $keep is false then the &*quotewords() functions remove all quotes and backslashes that are not themselves backslash-escaped or inside of single quotes (i.e., &quotewords() tries to interpret these characters just like the Bourne shell). NB: these semantics are significantly different from the original version of this module shipped with Perl 5.000 through 5.004. As an additional feature, $keep may be the keyword "delimiters" which causes the functions to preserve the delimiters in each string as tokens in the token lists, in addition to preserving quote and backslash characters. &shellwords() is written as a special case of &quotewords(), and it does token parsing with whitespace as a delimiter-- similar to most Unix shells. EXAMPLES
The sample program: use Text::ParseWords; @words = quotewords('s+', 0, q{this is "a test" of quotewords "for you}); $i = 0; foreach (@words) { print "$i: <$_> "; $i++; } produces: 0: <this> 1: <is> 2: <a test> 3: <of quotewords> 4: <"for> 5: <you> demonstrating: 0 a simple word 1 multiple spaces are skipped because of our $delim 2 use of quotes to include a space in a word 3 use of a backslash to include a space in a word 4 use of a backslash to remove the special meaning of a double-quote 5 another simple word (note the lack of effect of the backslashed double-quote) Replacing "quotewords('s+', 0, q{this is...})" with "shellwords(q{this is...})" is a simpler way to accomplish the same thing. SEE ALSO
Text::CSV - for parsing CSV files AUTHORS
Maintainer: Alexandr Ciornii <alexchornyATgmail.com>. Previous maintainer: Hal Pomeranz <pomeranz@netcom.com>, 1994-1997 (Original author unknown). Much of the code for &parse_line() (including the primary regexp) from Joerk Behrends <jbehrends@multimediaproduzenten.de>. Examples section another documentation provided by John Heidemann <johnh@ISI.EDU> Bug reports, patches, and nagging provided by lots of folks-- thanks everybody! Special thanks to Michael Schwern <schwern@envirolink.org> for assuring me that a &nested_quotewords() would be useful, and to Jeff Friedl <jfriedl@yahoo-inc.com> for telling me not to worry about error-checking (sort of-- you had to be there). POD ERRORS
Hey! The above document had some coding errors, which are explained below: Around line 250: Expected text after =item, not a number Around line 254: Expected text after =item, not a number Around line 258: Expected text after =item, not a number Around line 262: Expected text after =item, not a number Around line 266: Expected text after =item, not a number perl v5.16.3 2013-03-17 ParseWords(3)
All times are GMT -4. The time now is 12:27 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy