

Substituting dot "." for comma "," in a specific column when comma "," is a delimiter


 
# 1  
Old 06-05-2012

Hi,

I'm dealing with an issue and have lost a lot of hours trying to figure out how to solve it.
I have an input file which looks like this:

Code:
('BLABLA +200-GRS','Serviço ','TarifaçãoServiço','wap.bla.us.0000000121',2985,0,55,' de conversão em escada','Dia','Domingos')
('BLABLA +200-GRR','Serviço ','TarifaçãoServiço','wap.bla.us.0000000121',2985,0,55,' de conversão em escada','Reduzido','Sábados')
('BLABLA 45-GPR','Serviço ','TarifaçãoServiço','canais.bla.us.0000000222',3073,0,' de conversão de diferença','Noite','Sábados')
('BLABLA 45-GPR','Serviço ','TarifaçãoServiço','canais.bla.us.0000000222',3073,0,' de conversão de diferença','Noite','Segunda - Sexta')
('BLABLA 35-GMS','Serviço','TarifaçãoServiço','VC3 XXXX YYY-Fixo TIM(on)',3121,1,21,' de conversão em escada','Noite','Segunda - Sexta')
('BLABLA-GRN','Serviço','TarifaçãoServiço','Chamadas',1362,1,961,' de conversão em escada','Noite','Segunda - Sexta')
('BLABLA-GPE','Serviço','TarifaçãoServiço','Chamadas',1362,1,9897,' de conversão em escada','Dia Todo','Domingos')

So, using comma as the delimiter, sometimes I have 9 columns (lines 3 and 4) and sometimes I have 10 columns (lines 1, 2, 5, 6 and 7).

I'd like to change only this one column, which holds a number (decimal or not), so that its comma becomes a dot. The column would then look like this:

0.55
0.55
0
0
1.21
1.961
1.9897

So, my final result should be like this:

Code:
('BLABLA +200-GRS','Serviço ','TarifaçãoServiço','wap.bla.us.0000000121',2985,0.55,' de conversão em escada','Dia','Domingos')
('BLABLA +200-GRR','Serviço ','TarifaçãoServiço','wap.bla.us.0000000121',2985,0.55,' de conversão em escada','Reduzido','Sábados')
('BLABLA 45-GPR','Serviço ','TarifaçãoServiço','canais.bla.us.0000000222',3073,0,' de conversão de diferença','Noite','Sábados')
('BLABLA 45-GPR','Serviço ','TarifaçãoServiço','canais.bla.us.0000000222',3073,0,' de conversão de diferença','Noite','Segunda - Sexta')
('BLABLA 35-GMS','Serviço','TarifaçãoServiço','VC3 XXXX YYY-Fixo TIM(on)',3121,1.21,' de conversão em escada','Noite','Segunda - Sexta')
('BLABLA-GRN','Serviço','TarifaçãoServiço','Chamadas',1362,1.961,' de conversão em escada','Noite','Segunda - Sexta')
('BLABLA-GPE','Serviço','TarifaçãoServiço','Chamadas',1362,1.9897,' de conversão em escada','Dia Todo','Domingos')

Any ideas on how I should do this?
I have one, but it will not be very efficient, and this file contains more than 16 million lines.

Thanks

Last edited by joeyg; 06-05-2012 at 12:07 PM.. Reason: Please wrap data and commands with CodeTags
# 2  
Old 06-05-2012
Try this:

Code:
#!/usr/bin/perl
use strict;
use warnings;

# Where three numeric fields sit between two quoted fields, join the
# last two with a dot: ',2985,0,55,' becomes ',2985,0.55,'
while (<DATA>) {
    s/',(\d+),(\d+),(\d+),'/',$1,$2.$3,'/g;
    print;
}
__DATA__
('BLABLA +200-GRS','Serviço ','TarifaçãoServiço','wap.bla.us.0000000121',2985,0,55,' de conversão em escada','Dia','Domingos')
('BLABLA +200-GRR','Serviço ','TarifaçãoServiço','wap.bla.us.0000000121',2985,0,55,' de conversão em escada','Reduzido','Sábados')
('BLABLA 45-GPR','Serviço ','TarifaçãoServiço','canais.bla.us.0000000222',3073,0,' de conversão de diferença','Noite','Sábados')
('BLABLA 45-GPR','Serviço ','TarifaçãoServiço','canais.bla.us.0000000222',3073,0,' de conversão de diferença','Noite','Segunda - Sexta')
('BLABLA 35-GMS','Serviço','TarifaçãoServiço','VC3 XXXX YYY-Fixo TIM(on)',3121,1,21,' de conversão em escada','Noite','Segunda - Sexta')
('BLABLA-GRN','Serviço','TarifaçãoServiço','Chamadas',1362,1,961,' de conversão em escada','Noite','Segunda - Sexta')
('BLABLA-GPE','Serviço','TarifaçãoServiço','Chamadas',1362,1,9897,' de conversão em escada','Dia Todo','Domingos')

# 3  
Old 06-05-2012
Just a thought...

Could you consider the ' (single quote) character to be the delimiter?
I can get to the data in question, like:
Code:
$ awk -F"'" '{print $9}' <sample12.txt | cut -d"," -f3-
0,55,
0,55,
0,
0,
1,21,
1,961,
1,9897,

But then I got stuck on the 3rd and 4th lines, where that field is just zero with no decimal part. That got me thinking about counting the number of fields using , (comma) as the delimiter.
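
The field-counting idea can be sketched roughly like this. It is only a sketch, and it assumes that no quoted text in the file contains a comma (true for the sample shown, not checked against the full 16 million lines). With comma as the delimiter, a line with 10 fields has the number split across fields 6 and 7, so those two get glued back together with a dot; 9-field lines pass through untouched. The function name join_split_decimal is made up for this example.

```shell
# Hypothetical helper: on lines with 10 comma-separated fields,
# rejoin fields 6 and 7 with a dot. Other lines are printed as-is.
# Assumption: no quoted field contains a comma.
join_split_decimal() {
  awk -F, '
    NF == 10 {
      # Rebuild the line, putting a dot (not a comma) in front of field 7.
      out = $1
      for (i = 2; i <= NF; i++)
        out = out (i == 7 ? "." : ",") $i
      $0 = out
    }
    { print }
  ' "$@"
}
```

It would be called as join_split_decimal sample12.txt, or fed from a pipe when no file name is given.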
# 4  
Old 06-05-2012
Something like the following should do it, but Pravin's solution will run faster and is a clever exploitation of the data format involved.

Code:
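#!/usr/bin/perl

use strict;
use warnings;

# The input file is passed as the first argument.
open(my $fh, '<', $ARGV[0]) or die "Cannot open input file: $!";
while (<$fh>) {
  my @record = split(/,/, $_);
  # A 10-field line has the number split across fields 5 and 6 (0-based);
  # replace those two fields with a single "int.frac" field.
  splice(@record, 5, 2, "$record[5].$record[6]") if @record > 9;
  print join(',', @record);
}
close($fh);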


Last edited by Skrynesaver; 06-05-2012 at 12:46 PM.. Reason: To advise using a previous better solution
# 5  
Old 06-05-2012
Try:
Code:
sed "s/,\([^']\)/.\1/3" infile

# 6  
Old 06-05-2012
@Scrutinizer:
Very nice!

I had come up with a similar sed command but not nearly as elegant.

Code:
$ sed -r 's/([0-9]+),([0-9]+),([0-9]+),/\1,\2.\3,/g' ./text.txt 
('BLABLA +200-GRS','Serviço ','TarifaçãoServiço','wap.bla.us.0000000121',2985,0.55,' de conversão em escada','Dia','Domingos')
('BLABLA +200-GRR','Serviço ','TarifaçãoServiço','wap.bla.us.0000000121',2985,0.55,' de conversão em escada','Reduzido','Sábados')
('BLABLA 45-GPR','Serviço ','TarifaçãoServiço','canais.bla.us.0000000222',3073,0,' de conversão de diferença','Noite','Sábados')
('BLABLA 45-GPR','Serviço ','TarifaçãoServiço','canais.bla.us.0000000222',3073,0,' de conversão de diferença','Noite','Segunda - Sexta')
('BLABLA 35-GMS','Serviço','TarifaçãoServiço','VC3 XXXX YYY-Fixo TIM(on)',3121,1.21,' de conversão em escada','Noite','Segunda - Sexta')
('BLABLA-GRN','Serviço','TarifaçãoServiço','Chamadas',1362,1.961,' de conversão em escada','Noite','Segunda - Sexta')
('BLABLA-GPE','Serviço','TarifaçãoServiço','Chamadas',1362,1.9897,' de conversão em escada','Dia Todo','Domingos')

Mine matches (and keeps) 3 consecutive number columns and replaces the comma between the second and third with a period.

Your approach is very similar but far more compact and elegant.

Correct me if I am wrong, but I will try to explain that sed command for those who do not understand it.

The match side of the command matches a comma followed by any one character that is *not* a single quote. Since the numbers are not wrapped in single quotes, and none of the quoted text in this data contains a comma, it only ever matches the commas in front of the number fields.

The replace side then swaps the comma for a period and puts back the character that followed it, but the trailing 3 applies this only to the 3rd match on each line.

So in plain text:

,11,22,33 would become ,11,22.33 as the first and second commas are left as is!
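
To see the third-match behaviour in isolation, here is a small sketch that wraps the sed command from post #5 in a shell function and feeds it two short test strings. The function name dot_third_comma is made up for this example; the sed expression itself is unchanged.

```shell
# Hypothetical wrapper around the sed command from post #5.
dot_third_comma() {
  # Turn the 3rd comma that is followed by a non-quote character into a dot.
  sed "s/,\([^']\)/.\1/3" "$@"
}

# Three qualifying commas: the third one becomes a dot.
echo ",11,22,33" | dot_third_comma        # prints ,11,22.33

# Only two qualifying commas (the last is followed by a quote): unchanged.
echo ",11,22,'text'" | dot_third_comma    # prints ,11,22,'text'
```

One thing to keep in mind: a comma inside quoted text that is followed by anything other than a single quote would also count as a match and shift the count, so this relies on the quoted fields being comma-free, as they are in the sample.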
# 7  
Old 06-11-2012
Thanks for all the replies... I used Scrutinizer's version and it worked great, exactly as I was expecting.
