Perl - extracting data from .csv files


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Perl - extracting data from .csv files
# 1  
Old 10-08-2003
Perl - extracting data from .csv files

PROJECT: Extracting data from an employee timesheet. The timesheets are done in excel (for user ease) and then converted to .csv files that look like this (see color code key below):

Code:
,,,,,,,,,,,,,,,,,,,
9/14/2003,<-- Week Ending,,,,,,,,,,,,,,,,,,
Craig Brennan,,,,,,,,,,,,,,,,,,,
83-401-5654,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,
,Monday,,Tuesday,,Wednesday,,Thursday,,Friday,,Saturday,,Sunday,,Totals,,,,
,9/8/2003,,9/9/2003,,9/10/2003,,9/11/2003,,9/12/2003,,9/13/2003,,9/14/2003,,,,,,
Project Number,REG,O.T.,REG,O.T.,REG,O.T.,REG,O.T.,REG,O.T.,REG,O.T.,REG,O.T.,Reg,O.T.,,,
99010 (Vac),,,,,,,,,,,,,,,0.0,0.0,,,
99030 (Hol),,,,,,,,,,,,,,,0.0,0.0,,,
99040 (Pers),,,,,,,,,,,,,,,0.0,0.0,,,
99100 (Admin),5.0,,3.0,,1.0,,1.0,,1.0,,,,,,11.0,0.0,,,
99728 (R&D),3.0,,5.0,,6.0,,6.0,,6.0,,,,,,26.0,0.0,,,
GQ09,,,,,1.0,,1.0,,1.0,,,,,,3.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
,,,,,,,,,,,,,,,0.0,0.0,,,
TOTAL,8.0,0.0,8.0,0.0,8.0,0.0,8.0,0.0,8.0,0.0,0.0,0.0,0.0,0.0,40.0,0.0,,,
,,,,,,,,,,,,,,,,,,,
,,SPECIAL PROJECT CODES,,,,,,,,ACCOUNTING USE ONLY,,,,,,,,,
,,,,,,,,,,HOURS PAID,,,,,,,,,
,,99010 - VACATION,,,,,,,,REG,,O.T.,,DISTRIBUTION,,,,,
,,99030 - HOLIDAY,,,,,,,,,,,,,,,,,
,,99040 - PERSONAL,,,,,,,,,,,,,,,,,
,,99050 - DOWN TIME,,,,,,,,,,,,,,,,,
,,99099 - UNASSIGNED,,,,,,,,,,,,,,,,,
,,99100 - ADMINISTRATION,,,,,,,,,,,,,,,,,
,,99728 - RES & DEVELOP.,,,,,,,,,,,,,,,,,
,,99740 - UNASSIGNED - MCS,,,,,,,,,,,,,,,,,
,,99250 - TECH SUPPORT,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,
,,,,,,,,,TOTAL,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,

Most of this stuff I do not need. Here is what I do need:
Week Ending - converted to this format: 20030914
Project Codes - that do not have a null value.
Total hours for the week for that particular project
Total overtime hours for the week for that particular project
Employee Number - the 83-401-5654 number.

The output, ideally, would look similar to this in an already existing text file with headings:
Code:
EMP,Proj,hours,ot_hours,WE
83-401-5654,99100,11.00,0.00,20030914
83-401-5654,99728,26.00,0.00,20030914
83-401-5654,GQ09,3.00,0.00,20030914
44-807-5541,3578,5.00,0.00,20030914
44-807-5541,99100,6.00,0.00,20030914

Here's what I have so far for Perl code. Don't know if this will help at all...
Code:
#!/usr/bin/perl -w
use strict;

my ($ts, $mon, $month, $day, $dy, $year, $mmddyyyy, $filename, $emp, @emp, @temp_array);

@emp = ('CBrennan');
$filename = "TimeSheet";
$mon = (localtime(time))[4];
$mon++;
$dy = (localtime(time))[3];
$year = (localtime(time))[5];
$year += 1900;

if ($mon <= 9) {
        $month = '0'."$mon";
        } else {
        $month = $mon;
        }

if ($dy <= 9) {
        $day = '0'."$dy";
        } else {
        $day = $dy;
        }

#!/usr/bin/perl -w
use strict;

my ($ts, $mon, $month, $day, $dy, $year, $mmddyyyy, $filename, $emp, @emp, @temp_array);

@emp = ('CBrennan');
$filename = "TimeSheet";
$mon = (localtime(time))[4];
$mon++;
$dy = (localtime(time))[3];
$year = (localtime(time))[5];
$year += 1900;

if ($mon <= 9) {
        $month = '0'."$mon";
#!/usr/bin/perl -w
use strict;

my ($ts, $mon, $month, $day, $dy, $year, $mmddyyyy, $filename, $emp, @emp, @temp_array);
############ INITIALIZE VARIABLES/PULL DATE AND TIME FROM SERVER ##################
@emp = ('CBrennan');
##### FOR TESTING PURPOSES, I'M THE ONLY MEMBER OF @emp #####
$filename = "TimeSheet";
$mon = (localtime(time))[4];
$mon++;
$dy = (localtime(time))[3];
$year = (localtime(time))[5];
$year += 1900;
############ APPEND 0 TO SINGLE DIGIT MONTHS AND DAYS #############
if ($mon <= 9) {
        $month = '0'."$mon";
        } else {
        $month = $mon;
        }

if ($dy <= 9) {
        $day = '0'."$dy";
        } else {
        $day = $dy;
        }

########## TEST PRINTS FOR PULLING DATE #####
##### print $month;
##### print $day;
##### print $year;
##### print "\n\n";
#############################################

########## SET DATE FORMAT ##################
$mmddyyyy = $month . $day . $year;

######### READ CONETENTS OF ARRAY WITH LIST OF EMPLOYEES AND FOR EACH EMPLOYEE, OPEN .csv TIMESHEET ########
foreach $emp (@emp) {
        $ts = "/home/cbrennan/"."$emp"."_"."$filename"."$mmddyyyy".".csv";
        open(TIMESHEET, "< $ts") || die "Couldn't open $ts. Please check permissions. \n";
        @temp_array=<TIMESHEET>;
        close TIMESHEET;
        ####### DEAL WITH THE CONTENTS OF TIMESHEET (@temp_array) ##########
        ####### FROM HERE DOWN IS WHERE I AM STUCK ###########
}

Thanks much in advance for the help.

Last edited by kregh99; 10-09-2003 at 10:10 AM..
# 2  
Old 10-08-2003
What exactly is your question ???
# 3  
Old 10-09-2003
Sorry... guess I didn't make it very clear.

My question is: How can I pull only the highlighted data in all the .csv files in a directory and output the data in a single text file as shown (the text file already exists with the EMP, Proj, hours,ot_hours,WE headings)...

Code:
EMP,Proj,hours,ot_hours,WE
83-401-5654,99100,11.00,0.00,20030914
83-401-5654,99728,26.00,0.00,20030914
83-401-5654,GQ09,3.00,0.00,20030914
44-807-5541,3578,5.00,0.00,20030914
44-807-5541,99100,6.00,0.00,20030914

If you look at the comments at the bottom of the Perl code, you can see where I can open a spreadsheet, dump the contents into an array, but I don't know how to deal with it from there.

Thanks.

- Craig
# 4  
Old 10-09-2003
Quote:
Originally posted by kregh99

Code:
EMP,Proj,hours,ot_hours,WE
83-401-5654,99100,11.00,0.00,20030914
83-401-5654,99728,26.00,0.00,20030914
83-401-5654,GQ09,3.00,0.00,20030914
44-807-5541,3578,5.00,0.00,20030914
44-807-5541,99100,6.00,0.00,20030914

If you look at the comments at the bottom of the Perl code, you can see where I can open a spreadsheet, dump the contents into an array, but I don't know how to deal with it from there.

Thanks.

- Craig
This type of work involves pretty basic Perl commands. After you open your file, read each line and call the split() function on each line in order to split the line into fields, delimited by the commas. For example:

-----------------------------------------
open(TIMESHEET, "< $ts") || die "Couldn't open $ts. Please check permissions. \n";

while ($line =<TIMESHEET>) {
chomp $line; # removes the carriage return
@array = split(/,/ , $line); # breaks the line up into fields

## NOW PROCESS EACH LINE HERE

}
close TIMESHEET;
-----------------------------------------

Since each line does NOT have the same format you will have to add code to test for specific lines. For example, the following line could be used to detect the line with the employee number:

if ($array[0] =~ /\d\d\-\d+\-\d+/) {
$emp_number = $&;
}

You'd have to detect other lines in similar way. This is all elementary Perl. If you aren't familiar with Perl, I'd suggest picking up a book such as "Learning Perl".
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Extracting part of data from files

Hi All, I have log files as below. log1.txt <table name="content_analyzer" primary-key="id"> <type="global" /> </table> <table name="content_analyzer2" primary-key="id"> <type="global" /> </table> Time taken: 1.008 seconds ID = gd54321bbvbvbcvb <table name="content_analyzer"... (7 Replies)
Discussion started by: ROCK_PLSQL
7 Replies

2. Shell Programming and Scripting

Extracting data from specific rows and columns from multiple csv files

I have a series of csv files in the following format eg file1 Experiment Name,XYZ_07/28/15, Specimen Name,Specimen_001, Tube Name, Control, Record Date,7/28/2015 14:50, $OP,XYZYZ, GUID,abc, Population,#Events,%Parent All Events,10500, P1,10071,95.9 Early Apoptosis,1113,11.1 Late... (6 Replies)
Discussion started by: pawannoel
6 Replies

3. Shell Programming and Scripting

Compare 2 files of csv file and match column data and create a new csv file of them

Hi, I am newbie in shell script. I need your help to solve my problem. Firstly, I have 2 files of csv and i want to compare of the contents then the output will be written in a new csv file. File1: SourceFile,DateTimeOriginal /home/intannf/foto/IMG_0713.JPG,2015:02:17 11:14:07... (8 Replies)
Discussion started by: refrain
8 Replies

4. Programming

Python script for extracting data using two files

Hello, I have two files. File 1 is a list of interested IDs Ex1 Ex2 Ex3File 2 is the original file with over 8000 columns and 20 millions rows and is a compressed file .gz Ex1 xx xx xx xx .... Ex2 xx xx xx xx .... Ex2 xx xx xx xx ....Now I need to extract the information for all the IDs of... (4 Replies)
Discussion started by: nans
4 Replies

5. Shell Programming and Scripting

Script for extracting data from csv file based on column values.

Hi all, I am new to shell script.I need your help to write a shell script. I need to write a shell script to extract data from a .csv file where columns are ',' separated. The file has 5 columns having values say column 1,column 2.....column 5 as below along with their valuesm.... (3 Replies)
Discussion started by: Vivekit82
3 Replies

6. Shell Programming and Scripting

Copy Data from CSV file to Excel Sheet using Perl

Hi All, Firstly I will like to wish A Happy New Year to all. Now my issue is I have one csv file say(data.csv) and one excel file say(result.xls) The result.xls contains two sheet name Sheet1 and Sheet2, Now What I am trying to do is to First I want to delete that data of Sheet2 if present any,... (6 Replies)
Discussion started by: adisky123
6 Replies

7. UNIX for Dummies Questions & Answers

Extracting data from PDF files into CSV file

Hi, I have several hundreds of PDFfiles number 01.pdf, 02.pdf, 03.pdf, etc in one folder. These are vey long documentd with a lot of information (text, tables, figures, etc). I need to extract the information asociated with one disease in particular (Varicella). The information I need to... (5 Replies)
Discussion started by: Xterra
5 Replies

8. UNIX for Dummies Questions & Answers

Extracting data from many compressed files

I have a large number (50,000) of pretty large compressed files and I need only certain lines of data from them (each relevant line contains a certain key word). Each file contains 300 such lines. The individual file names are indexed by file number (file_name.1, file_name.2, ... ,... (1 Reply)
Discussion started by: Boltzmann
1 Replies

9. Shell Programming and Scripting

extracting data from a .csv file

I have a .csv file equipment,bandtype abc,aws def,mmds ghi,umts jkl,mmds I can get the equipment from `hostname`. In my script i want to check what is the hostname. then see if it exists in the.csv file. if it does then i want to store the second parameter(bandtype) for the corresponding... (3 Replies)
Discussion started by: lassimanji
3 Replies

10. Shell Programming and Scripting

extracting data from files..

frnds, I m having prob woth doing some 2-3 task simultaneously... what I want is... I have lots ( lacs ) of files in a dir... I want.. these info from arround 2-3 months files filename convention is - abc20080403sdas.xyz ( for todays files ) I want 1. total no of files for 1 dec... (1 Reply)
Discussion started by: clx
1 Replies
Login or Register to Ask a Question