Removing commas within semicolon in a flat file


 
# 29  
Old 06-20-2009
Quote:
Originally Posted by Franklin52
Wow, do I earn that?
Well... at least it gave me a "why didn't I think of that" reaction.
Quote:
Click on the award icon above the relating post.
I can't click on it (also, it's not very prominent). Anyway, it works fine after I open a new tab in my browser.
# 30  
Old 06-20-2009
Quote:
Originally Posted by ghostdog74
@bwhited , i think OP wants to retain the double quotes
Sorry, this wasn't indicated in the original post. The sample output showed all of the quotes gone.

I'll update my script and repost. I also noted in the real sample input that there are spaces in the strings that need to be accounted for.

---------- Post updated at 12:06 ---------- Previous update was at 11:44 ----------

Quote:
Originally Posted by r_t_1601
here is a sample record
50119785,IRFE Asia,IRFE:3123146,18-Jun-09,29-Jun-09,MODFOLLOWING,29-Jun-12,MODFOLLOWING,IDR,USD,30900000000,3000000,1329758,"THE BANK OF TOKYO-MITSUBISHI UFJ, LTD.",N,,,,,GREEN,B,FLOAT,FIXED,--,10.25,,,,,,,,,6M,6M,,6M,A/360,A/360,MODFOLLOWING,MODFOLLOWING,USD,GBP,IDR,USD,GBP,IDR,MODFOLLOWING,MODFOLLOWING,USD,GBP,IDR,USD,GBP,IDR,,,,GBP,,,--,6M-USD-LIBOR-BBA,,0,,,,,0,0,,2,NO,1329758,0,0,0,0,23100,0,0,23100,IDOSW,JKT,665,Cross-Currency Swap,NEW,D121360,N, x


There is a comma in "THE BANK OF TOKYO-MITSUBISHI UFJ, LTD." which I want to delete. Sadly, none of the scripts provided has worked for me so far, and it's kind of urgent. GURUS!! SOS

---------- Post updated at 02:22 AM ---------- Previous update was at 02:20 AM ----------

Just to update: it's a comma-delimited file with around 91 columns.
Okay, here you go. Next time, please post the real input and full details. It will save a lot of time and effort.


The following code takes this input:
Code:
rohan,rahul,kunal,"sw,ati"
rohan,rahul,"k,un,al",swati
rohan,"rah,ul",kunal,swati
"ro,han",rahul,kunal,swati
rohan,rahul,kunal,swati
50119785,IRFE Asia,IRFE:3123146,18-Jun-09,29-Jun-09,MODFOLLOWING,29-Jun-12,MODFOLLOWING,IDR,USD,30900000000,3000000,1329758,"THE BANK OF TOKYO-MITSUBISHI UFJ, LTD.",N,,,,,GREEN,B,FLOAT,FIXED,--,10.25,,,,,,,,,6M,6M,,6M,A/360,A/360,MODFOLLOWING,MODFOLLOWING,USD,GBP,IDR,USD,GBP,IDR,MODFOLLOWING,MODFOLLOWING,USD,GBP,IDR,USD,GBP,IDR,,,,GBP,,,--,6M-USD-LIBOR-BBA,,0,,,,,0,0,,2,NO,1329758,0,0,0,0,23100,0,0,23100,IDOSW,JKT,665,Cross-Currency Swap,NEW,D121360,N, x

and produces this output (keeping the double quotes, spaces, dashes and periods):
Code:
rohan,rahul,kunal,"swati"
rohan,rahul,"kunal",swati
rohan,"rahul",kunal,swati
"rohan",rahul,kunal,swati
rohan,rahul,kunal,swati
50119785,IRFE Asia,IRFE:3123146,18-Jun-09,29-Jun-09,MODFOLLOWING,29-Jun-12,MODFOLLOWING,IDR,USD,30900000000,3000000,1329758,"THE BANK OF TOKYO-MITSUBISHI UFJ LTD.",N,,,,,GREEN,B,FLOAT,FIXED,--,10.25,,,,,,,,,6M,6M,,6M,A/360,A/360,MODFOLLOWING,MODFOLLOWING,USD,GBP,IDR,USD,GBP,IDR,MODFOLLOWING,MODFOLLOWING,USD,GBP,IDR,USD,GBP,IDR,,,,GBP,,,--,6M-USD-LIBOR-BBA,,0,,,,,0,0,,2,NO,1329758,0,0,0,0,23100,0,0,23100,IDOSW,JKT,665,Cross-Currency Swap,NEW,D121360,N, x

Script:
Code:
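#!/usr/bin/perl
use warnings;
use strict;

# Three-argument open with a lexical filehandle.
open( my $fh, '<', 'input.txt' ) or die "Can't open the input file: $!";

while ( my $line = <$fh> ) {
    chomp $line;

    # For every double-quoted field on the line, delete the commas inside it.
    # The quotes and everything else (spaces, dashes, periods) are kept.
    # Lines without a quoted field pass through unchanged.
    $line =~ s{("[^"]*")}{ ( my $field = $1 ) =~ tr/,//d; $field }ge;

    print "$line\n";
}

close($fh);
exit 0;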

I hope this provides the full solution you're looking for.

- B
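
Since the real file is said to be comma-delimited with around 91 columns, one rough way to sanity-check a cleaned file is to count the comma-separated fields per record: once the commas inside the quotes are gone, every record of the real file should report the same count. This is only a sketch. The helper name field_count_summary is made up for illustration, and it assumes the quoted commas have already been removed, so a plain split on "," is valid.

```shell
#!/bin/sh
# Sketch: summarize how many comma-separated fields each record has.
# field_count_summary is a hypothetical helper name, not from the thread.
# Assumes commas inside quotes were already removed.
field_count_summary() {
    # Tally records by field count, then print "fields records" pairs.
    awk -F, '{ count[NF]++ }
             END { for (n in count) print n, count[n] }' "$@" | sort -n
}

# Two records with 4 fields and one with 3.
printf '%s\n' 'a,b,"cd",e' 'a,b,c,d' 'a,b,c' | field_count_summary
# prints:
# 3 1
# 4 2
```

If more than one line comes back for the real file, some records still contain stray commas (or are otherwise malformed) and are worth a look.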
# 31  
Old 06-24-2009
Hi, try this. It's a slightly roundabout method, but it takes care of everything.

Code:
#!/bin/bash
# Input file; adjust the path as needed.
infile=/etc/dev/test_recs

# One pass with sed instead of running head/tail once per record.
#   :a ... ta   removes one comma that sits inside double quotes per pass
#               and repeats until no such comma is left
#   s/"//g      then drops the double quotes themselves
sed -e ':a' \
    -e 's/^\(\([^"]*"[^"]*"\)*[^"]*"[^",]*\),/\1/' \
    -e 'ta' \
    -e 's/"//g' "$infile"

Input:

rohan,rahul,"kun,al",swati
rohan,"rah,ul",kunal,swati
"ro,han",rahul,kunal,swati
rohan,rahul,kunal,"sw,ati"

Output:
rohan,rahul,kunal,swati
rohan,rahul,kunal,swati
rohan,rahul,kunal,swati
rohan,rahul,kunal,swati

Thanks,
Jeevs81
# 32  
Old 06-24-2009
The original poster hasn't said whether any of the replies solved the problem, but as was pointed out earlier, the double quotes are not supposed to be removed, only the commas between them. Also, the real sample data showed other characters inside the quoted field that have to be accounted for (dashes, periods, spaces, etc.).
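
A minimal sketch of that requirement, for what it's worth: splitting each line on the double quote puts the quoted text into the even-numbered fields, so commas can be deleted there and nowhere else, and the content of the field (dashes, periods, spaces) does not matter. The helper name strip_quoted_commas is made up for illustration, and the sketch assumes there are no escaped quotes ("") and no quoted field that spans several lines.

```shell
#!/bin/sh
# Sketch: delete commas inside double-quoted fields and keep the quotes.
# strip_quoted_commas is a hypothetical helper name, not from the thread.
# Assumes no escaped quotes ("") and no quoted field spanning lines.
strip_quoted_commas() {
    # With '"' as the separator, even-numbered fields are the quoted parts.
    awk 'BEGIN { FS = OFS = "\"" }
         { for (i = 2; i <= NF; i += 2) gsub(/,/, "", $i); print }' "$@"
}

printf '%s\n' 'rohan,"rah,ul",kunal,"sw,ati"' | strip_quoted_commas
# prints: rohan,"rahul",kunal,"swati"
```

It reads standard input or a file name, e.g. strip_quoted_commas input.txt > output.txt.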