Format CSV file


 
# 1  
Old 05-14-2009

I have a CSV file which I need to process and export to an xlsx file.

For instance, the csv contains:

John Smith, job-title, hours
John Smith, job-title, hours
Mary Smith, job-title, hours

etc.

I need to import that into a script and get rid of the redundant data, i.e.:

John Smith, job-title, hours (hours for the whole month)
Mary Smith, job-title, hours

and export that to an xlsx file.

Does anybody know how I would go about this?

I have heard that OpenOffice's spreadsheet program has modules that could perhaps do this?

I could also probably do this in php... I mean, if I imported the data to a db and then exported everything as I needed? But would this be overkill?

Thanks in advance for your comments :)
# 2  
Old 05-14-2009
If there are commas between each of the extra hours, you can remove the extraneous data by running:
$ awk -F, '{ print $1 "," $2 "," $3 }' file.csv > outputfile.csv

Excel can import "*.csv" files and save them as "*.xlsx" files, but I presume you want this automated?
# 3  
Old 05-14-2009
Quote:
Originally Posted by TonyFullerMalv
If there are commas between each of the extra hours, you can remove the extraneous data by running:
$ awk -F, '{ print $1 "," $2 "," $3 }' file.csv > outputfile.csv

Excel can import "*.csv" files and save them as "*.xlsx" files, but I presume you want this automated?
Thanks for the reply. Yes, I need this to be automated. Also, I can't simply drop the extra hours: each line is the total for one week, and I need to add these together to get a total for the month.

The PHP and database route just seems a little much, but I'm not sure of another way to do it.
# 4  
Old 05-14-2009
Quote:
Originally Posted by _tina_
John Smith, job-title, hours
John Smith, job-title, hours
Mary Smith, job-title, hours

John Smith, job-title, hours (hours for the whole month)
Mary Smith, job-title, hours
As I understand it, you want to sum the third field over all lines that share the same first field, and output a single line for each distinct first field with the sum replacing the third field.
Code:
awk -F, -v OFS=, 'FNR==NR{a[$1]+=$3;next}$3=a[$1]{print$0;a[$1]=0}' file.csv file.csv > result.csv

Yes, you have to pass the name of the input file to awk twice in a row.
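
The reason the file is named twice is that awk then reads it twice, and `FNR==NR` is only true during the first read: `FNR` restarts at 1 for each file argument while `NR` keeps counting. A small sketch to make that visible (the function name `show_passes` is just for illustration, it is not part of the command above):

```shell
# show_passes FILE: read FILE twice and label each record with the pass
# it belongs to, followed by the per-file (FNR) and overall (NR) counters.
show_passes() {
    awk '{
        # FNR == NR holds only while the first file argument is being read
        pass = (FNR == NR) ? "pass1" : "pass2"
        print pass, FNR, NR
    }' "$1" "$1"
}
```

For a two-line file this prints `pass1 1 1`, `pass1 2 2`, `pass2 1 3`, `pass2 2 4`, which is why the summing can happen on the first read and the printing on the second.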

Sorry, I don't know what a .xlsx file is. Perhaps an XML file formatted by/for Excel?
# 5  
Old 05-14-2009
Quote:
Originally Posted by colemar
As I understand it, you want to sum the third field over all lines that share the same first field, and output a single line for each distinct first field with the sum replacing the third field.
Code:
awk -F, -v OFS=, 'FNR==NR{a[$1]+=$3;next}$3=a[$1]{print$0;a[$1]=0}' file.csv file.csv > result.csv

Yes, you have to pass the name of the input file to awk twice in a row.

Sorry, I don't know what a .xlsx file is. Perhaps an XML file formatted by/for Excel?
Thanks for the reply.

A .xlsx file is a set of XML documents packaged in a ZIP archive to keep the size down. Basically though, I have to output to a spreadsheet.

The other problem is that I have to format the output for each month. For example:

John Smith, job-title, hours, 01-01-09, 01-15-09
John Smith, job-title, hours, 01-16-09, 01-31-09
Mary Smith, job-title, hours, 01-01-09, 01-31-09
...
John Smith, job-title, hours, 02-01-09, 02-15-09
...

The first two lines (John Smith's hours for January) have to be added up and displayed separately from John Smith's hours for February and separately from Mary Smith's.
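
One way the per-month totals described above could be produced is to group on the name plus the month and year taken from the first date. This is only a sketch under some assumptions that the post does not confirm: every row has commas between all fields, the fourth field is the period start written MM-DD-YY, no period spans two months, and the job title stays the same within a group. The function name `monthly_hours` and the output layout are made up for the example.

```shell
# monthly_hours FILE: print one line per (name, month) with the summed hours.
# Assumed row layout: name, job-title, hours, start-date, end-date
monthly_hours() {
    awk -F' *, *' -v OFS=', ' '
        NF >= 4 {                              # skip blank or incomplete lines
            split($4, d, "-")                  # d[1] = month, d[3] = year
            key = $1 SUBSEP d[1] "-" d[3]      # group by name and month
            if (!(key in hours)) order[++n] = key   # remember first-seen order
            hours[key] += $3
            title[key] = $2
        }
        END {
            for (i = 1; i <= n; i++) {
                split(order[i], part, SUBSEP)  # part[1] = name, part[2] = month
                print part[1], title[order[i]], part[2], hours[order[i]]
            }
        }
    ' "$1"
}
```

Called as `monthly_hours file.csv > result.csv`, two January rows of 40 and 42 hours for John Smith would come out as a single line `John Smith, job-title, 01-09, 82`, with his February rows on a separate line.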
# 6  
Old 05-14-2009
Quote:
Originally Posted by colemar
As I understand it, you want to sum the third field over all lines that share the same first field, and output a single line for each distinct first field with the sum replacing the third field.
Code:
awk -F, -v OFS=, 'FNR==NR{a[$1]+=$3;next}$3=a[$1]{print$0;a[$1]=0}' file.csv file.csv > result.csv

Yes, you have to pass the name of the input file to awk twice in a row.

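Sorry, I don't know what a .xlsx file is. Perhaps an XML file formatted by/for Excel?
This does not work if the total comes to 0: the pattern $3=a[$1] is then false, so no line is printed for that name. A single pass with an END block avoids that: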
Code:
awk -F, -v OFS=, '{sum[$1]+=$3;rec[$1]=$1 OFS $2} END {for(i in sum) print rec[i], sum[i]}' file.csv > result.csv

# 7  
Old 05-14-2009
Quote:
Originally Posted by vgersh99
This does not work if the total results in '0'.
Good point. I was aware of this problem, but assumed that an input record with zero hours would be entirely omitted and that negative hours do not make sense as input.
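
For completeness, the two-pass command can be made to cope with zero totals by tracking which names have already been printed instead of using the total itself as the test. The following is a sketch rather than a tested replacement; `sum_hours` and the `seen` array are illustrative names. Unlike the `for (i in sum)` loop, whose order awk does not guarantee, it keeps the names in the order they first appear in the file.

```shell
# sum_hours FILE: print the first line for each distinct first field,
# with the third field replaced by the total over all lines for that name.
sum_hours() {
    awk -F, -v OFS=, '
        # First read of the file: accumulate the hours per name.
        FNR == NR { total[$1] += $3; next }
        # Second read: act only on the first line seen for each name,
        # so a total of 0 is still printed.
        !seen[$1]++ { $3 = total[$1]; print }
    ' "$1" "$1"
}
```

Usage is the same as before, for example `sum_hours file.csv > result.csv`.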