Unix Script to parse a CSV


 
# 1  
Old 09-09-2008

I am writing a Unix script that will parse a CSV and edit the values. My CSV looks like this:
0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0
10,11,7,0,4,12,2,3,7,0,11,3,12,4,0,5,5,4,5,0,8,6,12,0,9,3,3,0,2,7,8
19,11,7,0,4,14,16,10,8,2,13,7,15,6,0,76,6,4,10,0,18,10,17,1,11,3,3,0,9,9,8
22,11,13,1,5,14,16,10,9,10,13,7,16,6,0,59,6,4,10,0,18,13,17,1,11,3,3,0,12,9,10
22,11,13,1,5,14,16,10,9,10,13,7,16,6,22,90,6,4,10,0,18,13,17,1,11,3,4,0,12,9,10
41,18,27,9,27,41,59,20,27,54,63,34,28,43,40,131,7,8,19,0,62,16,30,23,25,3,4,9,24,12,19
42,18,27,9,27,41,59,20,27,55,68,36,28,46,41,132,7,8,19,13,64,16,31,25,25,3,4,9,24,12,19
125,124,78,62,97,87,145,70,87,119,150,124,99,95,41,175,85,58,57,88,142,83,92,102,107,80,45,64,64,94,89
125,126,78,62,99,87,145,70,87,119,161,124,99,95,41,175,85,58,58,88,142,84,112,103,108,80,68,64,65,98,89
189,254,164,153,192,153,230,132,188,163,210,210,167,198,93,235,146,110,97,130,211,107,181,140,151,119,105,105,178,126,165
189,324,168,192,194,159,233,132,192,169,244,210,167,201,103,235,147,152,180,181,213,107,192,190,212,119,119,126,195,126,166
189,324,168,255,194,225,233,141,192,230,244,260,167,201,172,283,181,206,217,216,261,107,192,235,212,119,169,197,264,189,229
366,438,315,319,382,287,398,320,416,382,407,397,342,448,276,392,297,368,237,347,336,332,384,405,412,284,329,350,396,326,356


This script would run every hour on the hour, deleting the first value on a specific line and adding a new record to the end of that same line. So for instance at 8:00am I might delete the 0 from the first line and add a 10 at the end. I also have a version of the CSV that has the time as the first value of each row. I have tried both awk and sed and I can't figure out how to replace a value at a specific location. I think I can delete the first value in row 1 using the code snippet below, but I don't know how to add the value to the end of the same row. Is there a way to parse the CSV into a two-dimensional array and then output it back into a CSV at the end? It probably isn't the most efficient way, but it would work.
I hope this isn't too confusing. Please let me know if you need more information. Thanks

Code:
 
sed '1s/^[^,]*,//' file.csv

# 2  
Old 09-09-2008
In perl, you'd 'slurp' in the file, break it into an array.
You'd request the line of the array that you wanted, split it into a new array.
You'd shift the array to drop its first value, then push your new value onto the end of the new array.
Then, you'd turn the array back into a string, and put it back into the same place in the original array.
Finally, you'd rewrite the original file with the new data.

It's not too hard, but expect it to take a couple of hours to write.
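
The same split / edit / join idea can also be sketched without Perl, by letting awk do the splitting from a shell script. This is only a rough sketch and not something posted in the thread: the function name rotate_row_awk and its argument order are made up for illustration, and it assumes plain comma-separated numbers with no quoted fields.

```shell
#!/bin/sh
# rotate_row_awk FILE ROW VALUE   (hypothetical helper, not from the thread)
# awk splits each row on commas into the fields $1..$NF, which plays the
# role of the temporary array. ROW is 1-based.
rotate_row_awk() {
    awk -F, -v OFS=, -v row="$2" -v val="$3" '
        NR == row {
            # shift every field one slot to the left (drops the old first value)
            for (i = 1; i < NF; i++) $i = $(i + 1)
            # the last slot still holds a stale copy; overwrite it with the new value
            $NF = val
        }
        { print }   # every row, edited or not, goes back out
    ' "$1"
}

# Demo on a throwaway two-row file.
demo=$(mktemp)
printf '%s\n' '0,0,0,1' '10,11,7,0' > "$demo"
rotate_row_awk "$demo" 1 10
rm -f "$demo"
```

The row keeps the same number of fields, and the result goes to standard output, so the original file is left untouched.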
# 3  
Old 09-09-2008
Quote:
Originally Posted by RJ17
So for instance at 8:00am I might delete the 0 from the first line and add a 10 at the end.
Code:
sed -e '1s/^[^,]*,//' -e '1s/$/,10/' file
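
For reference, the first expression deletes everything up to and including the first comma on line 1, and the second appends ",10" at the end of that line. A rough sketch of how the line number and value could be passed in rather than hard-coded follows; rotate_row is a made-up name, and it assumes the value is a plain number (a value containing "/" or "&" would need escaping for sed).

```shell
#!/bin/sh
# rotate_row FILE LINE VALUE   (hypothetical wrapper around the sed command above)
# Drops the first field of row LINE and appends VALUE as a new last field.
# The edited file is printed to stdout; FILE itself is not changed.
rotate_row() {
    file=$1 line=$2 value=$3
    # Double quotes let the shell fill in the line number and value;
    # \$ keeps a literal $ (end of line) for sed.
    sed -e "${line}s/^[^,]*,//" -e "${line}s/\$/,${value}/" "$file"
}

# Demo on a throwaway three-row file: only row 2 is edited.
demo=$(mktemp)
printf '%s\n' '0,0,1' '10,11,7' '19,11,7' > "$demo"
rotate_row "$demo" 2 42
rm -f "$demo"
```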

# 4  
Old 09-10-2008
danmero,
Thanks for the help. Is it possible to add the 10 to the end of the line instead of the beginning? This line of code converts
0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0
to
,10,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0

Ideally the new line would look like this
0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,10
# 5  
Old 09-10-2008
Here is the output:
Code:
#  echo "0,0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0" | sed -e '1s/^[^,]*,//' -e '1s/$/,10/'
0,0,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,10
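
Note that sed used this way only prints the edited text; the file on disk stays as it was. For an hourly job the result has to be written back. A minimal sketch, assuming a temporary file is acceptable (update_row is a hypothetical name, not something from the thread):

```shell
#!/bin/sh
# update_row FILE LINE VALUE   (hypothetical helper)
# Same edit as the sed command above, but the result is written back to FILE.
update_row() {
    file=$1 line=$2 value=$3
    tmp=$(mktemp) || return 1
    if sed -e "${line}s/^[^,]*,//" -e "${line}s/\$/,${value}/" "$file" > "$tmp"; then
        # Copy the contents back instead of moving the temp file,
        # so FILE keeps its own permissions.
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Demo on a throwaway file.
demo=$(mktemp)
printf '%s\n' '0,0,0,1' '10,11,7,0' > "$demo"
update_row "$demo" 1 10
cat "$demo"
rm -f "$demo"
```

Saved as a script, this could be started from cron with a schedule of `0 * * * *`, which fires at the top of every hour.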

# 6  
Old 09-10-2008
It does work when I use the echo command on just the one line, but when I run it on my entire CSV file it shows
,10,0,1,0,1,0,1,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0
10,11,7,0,4,12,2,3,7,0,11,3,12,4,0,5,5,4,5,0,8,6,12,0,9,3,3,0,2,7,8
19,11,7,0,4,14,16,10,8,2,13,7,15,6,0,76,6,4,10,0,18,10,17,1,11,3,3,0,9,9,8
22,11,13,1,5,14,16,10,9,10,13,7,16,6,0,59,6,4,10,0,18,13,17,1,11,3,3,0,12,9,10
22,11,13,1,5,14,16,10,9,10,13,7,16,6,22,90,6,4,10,0,18,13,17,1,11,3,4,0,12,9,10
41,18,27,9,27,41,59,20,27,54,63,34,28,43,40,131,7,8,19,0,62,16,30,23,25,3,4,9,24,12,19
42,18,27,9,27,41,59,20,27,55,68,36,28,46,41,132,7,8,19,13,64,16,31,25,25,3,4,9,24,12,19
125,124,78,62,97,87,145,70,87,119,150,124,99,95,41,175,85,58,57,88,142,83,92,102,107,80,45,64,64,94,89
125,126,78,62,99,87,145,70,87,119,161,124,99,95,41,175,85,58,58,88,142,84,112,103,108,80,68,64,65,98,89
189,254,164,153,192,153,230,132,188,163,210,210,167,198,93,235,146,110,97,130,211,107,181,140,151,119,105,105,178,126,165
189,324,168,192,194,159,233,132,192,169,244,210,167,201,103,235,147,152,180,181,213,107,192,190,212,119,119,126,195,126,166
189,324,168,255,194,225,233,141,192,230,244,260,167,201,172,283,181,206,217,216,261,107,192,235,212,119,169,197,264,189,229

I don't know why this is; it doesn't make sense to me.
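
One possible explanation, which nothing in the thread confirms: if the file was saved with DOS-style line endings (CR LF), each line ends in a carriage return. sed then appends ",10" after that carriage return, and a terminal shows it as overwriting the start of the line, which would match the ",10,0,1,0,..." output above. A rough sketch of checking for that and working around it (strip_cr_and_rotate is a hypothetical name):

```shell
#!/bin/sh
# Assumption (unconfirmed): the CSV has CR LF line endings.

# strip_cr_and_rotate FILE VALUE   (hypothetical helper)
# Removes carriage returns, then applies the same edit to row 1.
strip_cr_and_rotate() {
    tr -d '\r' < "$1" | sed -e '1s/^[^,]*,//' -e "1s/\$/,$2/"
}

# Build a small test file that really does have CR LF endings.
demo=$(mktemp)
printf '0,0,1\r\n10,11,7\r\n' > "$demo"

# Without stripping: od -c shows a \r sitting in front of ",10".
sed -e '1s/^[^,]*,//' -e '1s/$/,10/' "$demo" | od -c

# With stripping: ",10" ends up at the real end of row 1.
strip_cr_and_rotate "$demo" 10
rm -f "$demo"
```

Running `od -c file.csv | head` on the real file would show whether any \r characters are present at all.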
# 7  
Old 09-10-2008
I'm beginning to think that I should spend a bit more time reviewing sed....

I threw the script together in perl - as an exercise for me, mostly. But if you find value in it, great. It's much longer than the line that danmero provided, and I'm sure that there are perl experts that can turn my code into a "one-liner"...

My script has two inputs <line number> and <new value> - as well as a debug or help flag:
Code:
-bash-3.00$ ./parse.pl -l 5 -v 300
-bash-3.00$ ./parse.pl -d -l 5 -v 300
input:
 41,18,27,9,27,41,59,20,27,54,63,34,28,43,40,131,7,8,19,0,62,16,30,23,25,3,4,9,24,12,19
output:
 18,27,9,27,41,59,20,27,54,63,34,28,43,40,131,7,8,19,0,62,16,30,23,25,3,4,9,24,12,19,300

And here's the code:
Code:
#!/usr/bin/perl -w

################################################################################
# Pragma
use strict;
use Getopt::Std;
use vars qw/ %opt /;

################################################################################
# Forward declaration of subroutines
sub do_init();
sub do_usage();

################################################################################
# Variable declaration
my $input = "./file";
my $output = "./file.output";
my @entireList;
my $entireList;
my $listLineNumber;
my @oneLine;
my $newLine;
my $line;
my $DEBUG;

################################################################################
# You don't need to get your command line variables this way
#   - it's a habit that I got into, and it works for me - your mileage may vary
do_init();

# set DEBUG flag according to command line options
if ( $opt{d} )
{
   $DEBUG = 1;
}
else
{
   $DEBUG = 0;
}

################################################################################
# Begin the MAIN portion of the script
################################################################################

# Open your input file and place the contents into the array @entireList
# - each line is a separate element in the array
open(FILE, "<$input") or die "Cannot open $input for read :$!";
chomp (@entireList = <FILE>);
close( FILE );

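# I haven't really added any error checking for the options. If you don't
# assign a line item or value, the script will still run - it won't substitute
# unless you provide -l <linenumber> AND -v <value>
# defined() is used so that a line number of 0 or a value of 0 is accepted

if ( defined $opt{l} && defined $opt{v} )
{
   # NOTE: -l indexes @entireList directly, so it is zero-based
   # (-l 0 is the first line of the file)
   $listLineNumber = $opt{l};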
   # split string (one line) into a temporary array
   print "input:\n $entireList[$listLineNumber]\n" if $DEBUG;
   @oneLine = split(/,/, $entireList[$listLineNumber]);
   shift @oneLine;
   push @oneLine, $opt{v};
   # put array back into a string - and back into the original array
   $entireList[$listLineNumber] = join(",",@oneLine);
   print "output:\n $entireList[$listLineNumber]\n" if $DEBUG;
}

# Write the file - here I have it writing to a new file - you can have it
# overwrite the original file if you prefer - simply change the variable
# declaration at the top for $output to match $input - or change the code
# below to use $input instead of $output. I recommend changing this in the
# variable declaration - so that you can reuse the script later
open(FILE2, ">$output") or die "Cannot open $output for write :$!";
foreach $line(@entireList)
{
   print FILE2 "$line\n";
}
close( FILE2 );


################################################################################
# Standard handling message - maybe too much for simple utilities - oh well... #
################################################################################
sub do_init()
{
   my $opt_string = 'dhl:v:';
   getopts( "$opt_string", \%opt ) or do_usage();
   do_usage() if $opt{h};
}
sub do_usage()
{
   print "\nusage: $0 [-h] [-d] [-l {line number}] [-v {new value}]\n\n";
   exit;
}
