11-15-2010
awk, comma as field separator and text inside double quotes as a field.
Hi, all
I need to extract the fields from a line in which fields are separated by commas. Some fields are enclosed in double quotes, and each of those should be treated as a single field even if it contains commas.
Sample input:
aaa,"hell world, test text",bbb,ccc," test text"
For this line, 5 fields should be extracted:
1. aaa
2. "hell world, test text"
3. bbb
4. ccc
5. " test text"
Is there an easy way to achieve this using awk?
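One possible approach, sketched in portable awk wrapped in a shell function: instead of relying on FS, peel fields off the front of the line one at a time, taking a whole quoted string when the line starts with a double quote and otherwise taking everything up to the next comma. The function name `split_csv_line` is made up for illustration. The sketch assumes that quoted fields contain no embedded double quotes, that a quoted field is followed directly by a comma or the end of the line, and that a trailing empty field does not need to be reported.

```shell
#!/bin/sh
# split_csv_line is a hypothetical helper name, not a standard tool.
# Prints each field of one comma-separated line, numbered from 1.
split_csv_line() {
  printf '%s\n' "$1" | awk '
  {
    line = $0
    n = 0
    while (line != "") {
      if (match(line, /^"[^"]*"/)) {
        # Quoted field: take everything through the closing quote,
        # including any commas inside it.
        len = RLENGTH
      } else {
        # Unquoted field: take everything before the next comma,
        # or the rest of the line if no comma is left.
        i = index(line, ",")
        len = (i ? i - 1 : length(line))
      }
      printf "%d. %s\n", ++n, substr(line, 1, len)
      # Drop the field and the comma that follows it.
      line = substr(line, len + 2)
    }
  }'
}

split_csv_line 'aaa,"hell world, test text",bbb,ccc," test text"'
```

On the sample line this prints the five fields listed above, with the quotes kept. If GNU awk 4.0 or later is available, its FPAT variable, which describes what a field looks like rather than what separates fields, may be a shorter route; the loop above avoids that dependency.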