Removing commas within semicolon in a flat file


 
# 15  
Old 06-19-2009
Quote:
Originally Posted by r_t_1601
Can anybody help me with the relevant Unix code for removing multiple instances of a comma inside double quotes in a single row of a comma-delimited file?
input
====
"ro,han","ra,hul","ku,nal","vishe,sh"
output
===
"rohan","rahul","kunal","vishesh"
r_t_1601, kindly use this code:
Code:
sed 's/\("[^,]*\),\([^,]*"\)/\1\2/g' input.txt

BR
# 16  
Old 06-19-2009
Quote:
Originally Posted by ahmad.diab
r_t_1601 kindly use this code
Code:
sed 's/\("[^,]*\),\([^,]*"\)/\1\2/g' input.txt

BR
This won't work when a quoted field contains more than one comma.
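
For anyone following along, here is a small sketch of the problem, assuming a field with two commas inside one pair of quotes. The function name `one_pass` is made up for illustration; the sed command inside it is the one-liner quoted above.

```shell
# The one-liner from the quoted post, wrapped in a function
# (the name "one_pass" is hypothetical).
one_pass() {
    sed 's/\("[^,]*\),\([^,]*"\)/\1\2/g'
}

# One comma per quoted field: the commas inside the quotes are removed.
printf '%s\n' '"ro,han","ra,hul"' | one_pass
# prints: "rohan","rahul"

# Two commas inside one quoted field: the pattern allows only one comma
# between the quotes, so nothing matches and the line comes back unchanged.
printf '%s\n' 'rohan,rahul,"k,un,al",swati' | one_pass
# prints: rohan,rahul,"k,un,al",swati
```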
# 17  
Old 06-20-2009
I have a working example using perl. Here is a sample execution.

Code:
$ cat input.txt
rohan,rahul,kunal,"sw,ati"
rohan,rahul,"k,un,al",swati
rohan,"rah,ul",kunal,swati
"ro,han",rahul,kunal,swati
rohan,rahul,kunal,swati

$ ./script.pl
rohan,rahul,kunal,swati
rohan,rahul,kunal,swati
rohan,rahul,kunal,swati
rohan,rahul,kunal,swati
rohan,rahul,kunal,swati

I did make sure that I had a case with two commas between the quotes. The regex should handle cases with one or more commas.

I matched the quoted string, then fixed it and substituted it back in.

Code:
#!/usr/bin/perl
use warnings;
use strict;

open( my $fh, '<', "input.txt" ) || die "Can't open the input file: $!";

while (<$fh>) {
    chomp;
    # Handle each quoted field separately: drop its commas and its quotes.
    # Only fields made of word characters and commas are matched.
    s{"(\w[\w,]+)"}{ (my $field = $1) =~ tr/,//d; $field }ge;
    print $_ . "\n";
}

close($fh);
exit 0;

Hope this helps.

- B
# 18  
Old 06-20-2009
@bwhited, I think the OP wants to retain the double quotes.
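
A minimal sketch that keeps the quotes, assuming every quoted field opens and closes on the same line. It uses awk instead of the sed and perl approaches posted so far, and `strip_quoted_commas` is a made-up name. With `"` as the field separator, the even-numbered awk fields are exactly the text between a pair of quotes, so only those are touched.

```shell
# Hypothetical helper: delete commas inside double-quoted fields, keep the quotes.
# Reads the files named as arguments, or stdin when none are given.
strip_quoted_commas() {
    awk 'BEGIN { FS = OFS = "\"" }
         { for (i = 2; i <= NF; i += 2) gsub(/,/, "", $i); print }' "$@"
}

printf '%s\n' 'rohan,rahul,"k,un,al",swati' '"ro,han","ra,hul"' | strip_quoted_commas
# prints:
# rohan,rahul,"kunal",swati
# "rohan","rahul"
```

On a real file it would be called as `strip_quoted_commas input.txt > output.txt`.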
# 19  
Old 06-20-2009
Here is a sample record:
50119785,IRFE Asia,IRFE:3123146,18-Jun-09,29-Jun-09,MODFOLLOWING,29-Jun-12,MODFOLLOWING,IDR,USD,30900000000,3000000,1329758,"THE BANK OF TOKYO-MITSUBISHI UFJ, LTD.",N,,,,,GREEN,B,FLOAT,FIXED,--,10.25,,,,,,,,,6M,6M,,6M,A/360,A/360,MODFOLLOWING,MODFOLLOWING,USD,GBP,IDR,USD,GBP,IDR,MODFOLLOWING,MODFOLLOWING,USD,GBP,IDR,USD,GBP,IDR,,,,GBP,,,--,6M-USD-LIBOR-BBA,,0,,,,,0,0,,2,NO,1329758,0,0,0,0,23100,0,0,23100,IDOSW,JKT,665,Cross-Currency Swap,NEW,D121360,N, x


There is a comma in "THE BANK OF TOKYO-MITSUBISHI UFJ, LTD." which I want to delete. Sadly, none of the scripts provided so far has worked for me, and it's kind of urgent. GURUS!! SOS

---------- Post updated at 02:22 AM ---------- Previous update was at 02:20 AM ----------

Just to update: it's a comma-delimited file with around 91 columns.
# 20  
Old 06-20-2009
Well, I checked this code against your example, and it seems to work properly.

Code:
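#!/bin/sed -f
# Save the current line in the hold space.
h
# Remove at most one comma from each quoted field.
s/\("[^",]*\),\?\([^"]*"\)/\1\2/g
# Build "old line\nnew line" and reset the substitution flag.
x;G
:c
tc
# If nothing changed, print the line and move on.
s/^\(.*\)\n\1$/\1/
t
# Otherwise keep only the new line and run the script on it again.
D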

# 21  
Old 06-20-2009
Quote:
Originally Posted by ghostdog74
This won't work when a quoted field contains more than one comma.

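It worked perfectly on my PC. OK, run the substitution in a loop instead and type it exactly as shown. Each pass removes one comma that sits inside a quoted field, and `ta` repeats the substitution until no such comma is left:

Code:
sed '
:a
s/^\(\([^"]*"[^",]*"\)*[^"]*"[^",]*\),/\1/
ta
' input.txt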
