11-15-2010
awk, comma as field separator and text inside double quotes as a field.
Hi, all
I need to extract comma-separated fields from a line. Some of the fields are enclosed in double quotes, and each quoted field should be treated as a single field even if it contains commas.
Sample input:
aaa,"hell world, test text",bbb,ccc," test text"
For this line, five fields should be extracted:
1. aaa
2. "hell world, test text"
3. bbb
4. ccc
5. " test text"
Is there an easy way to achieve this using awk?
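One possible approach, sketched in portable awk: consume the line from the left, taking a quoted run when the remainder starts with a double quote and a run of non-comma characters otherwise. The function name `split_csv` is hypothetical, and the sketch assumes quoted fields contain no escaped or doubled quotes and no embedded newlines. A trailing empty field (a line ending in a comma) is not reported.

```shell
# split_csv: hypothetical helper that prints each field of every input line, numbered.
split_csv() {
  awk '
  {
    line = $0
    n = 0
    while (line != "") {
      # A field is either a complete "quoted run" or a run of non-comma characters.
      if (!match(line, /^"[^"]*"/))
        match(line, /^[^,]*/)              # unquoted field, possibly empty
      field = substr(line, 1, RLENGTH)
      printf "%d. %s\n", ++n, field
      line = substr(line, RLENGTH + 1)     # drop the field just printed
      sub(/^,/, "", line)                  # drop the separator that followed it
    }
  }'
}

printf '%s\n' 'aaa,"hell world, test text",bbb,ccc," test text"' | split_csv
```

With GNU awk 4.0 or later, setting `FPAT` to a pattern that describes the fields rather than the separators is another option; the loop above avoids that dependency.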
plan9-join
JOIN(1) General Commands Manual JOIN(1)
NAME
join - relational database operator
SYNOPSIS
join [ options ] file1 file2
DESCRIPTION
Join forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If one of the file names is -,
the standard input is used.
File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the first in
each line.
There is one line in the output for each pair of lines in file1 and file2 that have identical join fields. The output line normally con-
sists of the common field, then the rest of the line from file1, then the rest of the line from file2.
Input fields are normally separated by spaces or tabs; output fields by space. In this case, multiple separators count as one, and leading
separators are discarded.
The following options are recognized, with POSIX syntax.
-a n In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2.
-v n Like -a, omitting output for paired lines.
-e s Replace empty output fields by string s.
-1 m
-2 m Join on the mth field of file1 or file2.
-jn m Archaic equivalent for -n m.
-ofields
Each output line comprises the designated fields. The comma-separated field designators are either 0, meaning the join field, or
have the form n.m, where n is a file number and m is a field number. Archaic usage allows separate arguments for field designators.
-tc Use character c as the only separator (tab character) on input and output. Every appearance of c in a line is significant.
EXAMPLES
sort /etc/passwd | join -t: -1 1 -a 1 -e "" - bdays
Add birthdays to the /etc/passwd file, leaving unknown birthdays empty. The layout of /adm/users is given in passwd(5); bdays con-
tains sorted lines like
tr : ' ' </etc/passwd | sort -k 3 3 >temp
join -1 3 -2 3 -o 1.1,2.1 temp temp | awk '$1 < $2'
Print all pairs of users with identical userids.
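As a further illustration of -t and -a, here is a small sketch using two made-up comma-separated files. The file names and contents are hypothetical, and the commands assume a POSIX-style join such as the one in GNU coreutils; the Plan 9 version may differ in details. Both inputs are already sorted on the first field.

```shell
# Scratch directory and sample data are invented for this illustration.
tmp=$(mktemp -d)
printf '%s\n' '1,alice' '2,bob' '3,carol' > "$tmp/names"
printf '%s\n' '1,sales' '3,it' > "$tmp/depts"

# Default behavior: one output line per pair of lines with equal join fields.
inner=$(join -t, "$tmp/names" "$tmp/depts")

# -a 1 additionally emits the unpairable lines of the first file.
outer=$(join -t, -a 1 "$tmp/names" "$tmp/depts")

printf '%s\n' "$inner"
printf '%s\n' "$outer"
rm -rf "$tmp"
```

Because -t, treats every comma as significant, this only suits data without commas inside quoted fields.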
SOURCE
/src/cmd/join.c
SEE ALSO
sort(1), comm(1), awk(1)
BUGS
With default field separation, the collating sequence is that of sort -b -ky,y; with -t, the sequence is that of sort -tx -ky,y.
One of the files must be randomly accessible.
JOIN(1)