Remove quotes and commas from field


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Remove quotes and commas from field
# 1  
Old 12-03-2014
Remove quotes and commas from field

Code:
 In the attached file I am trying to remove all the "" and , (quotes and commas) from $2 and $3 and the "" (quotes) from $4.

I tried the below as a start:

Code:
 awk -F"|" '{gsub(/\,/,X,$2)} 1' OFS="\t" enhancer.txt > comma.txt

Thank you Smilie.
# 2  
Old 12-03-2014
1. The file is not pipe delimited, thus use TAB as field separator -F"\t"
2. You can combine regexes in the gsub function like so gsub(/\,|"/,X,$2) It will delete commas and/or quotes in $2
3. You can use several gsub functions consecutively (ideally separated by a semicolon)

Last edited by junior-helper; 12-03-2014 at 07:12 PM.. Reason: s/input file separator/field separator/
# 3  
Old 12-03-2014
Thank you for the very helpful post:

Code:
  awk -F"\t" '{gsub/\,|"/,X,$2):gsub/\,|"/,X,$3:gsub/\"/,X,$4} 1' OFS="\t" enhancer.txt > comma.txt

would output (4 coulmns)
A B C D
Code:
 chr1 3190581 3191428 chr1:3,190,581-3,191,428

Thank you Smilie.
# 4  
Old 12-03-2014
Quote:
Originally Posted by cmccabe
Thank you for the very helpful post:

Code:
  awk -F"\t" '{gsub(/\,|"/,X,$2);gsub(/\,|"/,X,$3);gsub(/\"/,X,$4)} 1' OFS="\t" enhancer.txt > comma.txt

would output (4 coulmns)
A B C D
Code:
 chr1 3190581 3191428 chr1:3,190,581-3,191,428

Thank you Smilie.
Yes, but there are parenthesis missing + the colons should be semicolons.
I've corrected your command.
This User Gave Thanks to junior-helper For This Post:
# 5  
Old 12-03-2014
Another variation:
Code:
awk '{gsub(/,/,x,$2); gsub(/,/,x,$4)}1' FS=\" OFS= file

This User Gave Thanks to Scrutinizer For This Post:
# 6  
Old 12-04-2014
Thank you both Smilie

Scrutinizer how does the script work (i.e. how does it know to remove the "" and , from $2 and $3 and only the "" from $4? Just trying to understand better. Thanks.
# 7  
Old 12-04-2014
Hi cmccabe.

The script replaces the commas in $2 and $4 with a double quote as field separator. In doing so, it modifies these two fields and in awk that means that the record gets recalculated and the field separator FS is replaced by the output field separator OFS, which is the empty string. So in this ways the double quotes are removed from the record..
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Shell script that should remove unnecessary commas between double quotes in CSV file

i have data as below 123,"paul phiri",paul@yahoo.com,"po.box 23, BT","Eco Bank,Blantyre,Malawi" i need an output to be 123,"paul phiri",paul@yahoo.com,"po.box 23 BT","Eco Bank Blantyre Malawi" (5 Replies)
Discussion started by: mathias23
5 Replies

2. Shell Programming and Scripting

How to match fields surrounded by double quotes with commas?

Hello to all, I'm trying to match only fields surrounded by double quotes that have one or more commas inside. The text is like this "one, t2o",334,"tst,982-0",881,"kmk 9-l","kkd, 115-001, jj-3",5 The matches should be "one, t2o" "tst,982-0" "kkd, 115-001, jj-3" I'm trying with... (11 Replies)
Discussion started by: Ophiuchus
11 Replies

3. Shell Programming and Scripting

How to delete the commas in a .CSV file that are enclosed in a string with double quotes?

Okay, I would like to delete all the commas in a .CSV file (TEST.CSV) or at least substitute them with empty space, that are enclosed in double quote. Please see the sample file as below: column 1,column 2,column 3,column 4,column 5,column 6,column 7,column 8,column 9,column 10... (8 Replies)
Discussion started by: dhruuv369
8 Replies

4. Linux

How do I format a Date field of a .CSV file with multiple commas in a string field?

I have a .CSV file (file.csv) whose data are all enclosed in double quotes. Sample format of the file is as below: column1,column2,column3,column4,column5,column6, column7, Column8, Column9, Column10 "12","B000QRIGJ4","4432","string with quotes, and with a comma, and colon: in... (3 Replies)
Discussion started by: dhruuv369
3 Replies

5. Shell Programming and Scripting

Preserve commas inside double quotes (perl)

Hi, I have an input file like this $ cat infile hi,i,"am , sam", y hello ,good, morning abcd, " ef, gh " ,ij no, "good,morning", yes, "good , afternoon" from this file I have to split the fields on basis of comma"," however, I the data present inside double qoutes should be treated as... (3 Replies)
Discussion started by: sam05121988
3 Replies

6. Shell Programming and Scripting

HELP with AWK or SED. Need to replace the commas between double quotes in CSV file

Hello experts, I need to validate a csv file which contains data like this: Sample.csv "ABCD","I",23,0,9,,"23/12/2012","OK","Street,State, 91135",0 "ABCD","I",23,0,9,,"23/12/2012","OK","Street,State, 91135",0 I just need to check if all the records contain exactly the number of... (5 Replies)
Discussion started by: shell_boy23
5 Replies

7. Shell Programming and Scripting

Get string between quotes separate by commas

I'm a beginner with shell and tried to do this per hours and everytinhg gives different want i do. So I have a lot of file in *.csv ( a.csv, b.csv ...) in each file csv , it has some fields separeted by commas. ----- "joseph";"21","m";"groups";"j.j@gmail.com,j.j2@hotmail.com"... (6 Replies)
Discussion started by: flaviof
6 Replies

8. Shell Programming and Scripting

CSV with commas in field values, remove duplicates, cut columns

Hi Description of input file I have: ------------------------- 1) CSV with double quotes for string fields. 2) Some string fields have Comma as part of field value. 3) Have Duplicate lines 4) Have 200 columns/fields 5) File size is more than 10GB Description of output file I need:... (4 Replies)
Discussion started by: krishnix
4 Replies

9. Shell Programming and Scripting

Replace field with commas with field without commas

Hey guys, I have the following text: 1,2,3,4,5,6,'NULL','when',NULL,1,2,0,'NULL' 1,2,3,4,5,6,'NULL','what','NULL',1,2,0,1 I need the same text with the word NULL without commas u know something like this: 1,2,3,4,5,6,NULL,'when',NULL,1,2,0,NULL 1,2,3,4,5,6,NULL,'what','NULL',1,2,0,1 ... (1 Reply)
Discussion started by: lmyk72
1 Replies

10. Shell Programming and Scripting

how to find the count of commas in a string excluding the ones in double quotes

Hi, my requirement is to find the count of commas in a string excluding the ones in double quotes. For example: If the input string is abc,xyz.com,lmhgdf,"abc, 401 street","tty,stt",45,23,45 The output should be 7 (7 Replies)
Discussion started by: amitshete
7 Replies
Login or Register to Ask a Question