Insterting column in csv from multiple files


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Insterting column in csv from multiple files
# 1  
Old 02-15-2014
Insterting column in csv from multiple files

Hello,


I have a spec file that contains a lot of strings that looks like this:

Code:
PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
  Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
   Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 4GB (1x4GB)      DDR3, PC3-1600MHz, 750GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr   Basic Warranty NBD on site"CSV template

These strings need to be converted to an html table and then be inserted into a master .csv for uploading
The master .csv looks like
Code:
price,product code, SPECS,other things,
  300.00,CODE 2112334,    ,OTHER STRINGS,
  500.00,CODE 2222222,    ,OTHER STRINGS,Desired .csv output:

And the final .csv file should look like this:

Code:
price,product code, SPECS,other things,
  300.00,CODE 2112334, <table style="width:300px"><tr><td>Proccessor</td><td>Intel i3 3220 (Dual Core, 3.30GHz</td></tr><tr><td>Memmory</td><td> 2GB (1x2GB) DDR3 PC3-1600MHz</td>tr><td>Hard Disk</td><td>500GB HDD SATA III 7200rpm</td></tr><tr><td>VGA</td><td>HD2500 Graphics</td></tr><tr><td>Warranty</td><td>5Yr Basic Warranty NBD on site</td></tr><tr><td>Ohter features</td><td>THIS IS NOT FROM THE SPECFILE</td></tr><tr><td>Ohter features 2</td><td>THIS IS ALSO NOT FROM THE SPECFILE</td></tr></tr></table>,OTHER STRINGS,
  500.00,CODE 2222222, <table style="width:300px"><tr><td>Proccessor</td><td>Intel i5 3470 (Quad Core 3.20GHz)</td></tr><tr><td>Memmory</td><td> 4GB (1x4GB) DDR3 PC3-1600MHz</td>tr><td>Hard Disk</td><td>750GB HDD SATA III 7200rpm</td></tr><tr><td>VGA</td><td>HD2500 Graphics</td></tr><tr><td>Warranty</td><td>5Yr Basic Warranty NBD on site</td></tr><tr><td>Ohter features</td><td>THIS IS NOT FROM THE SPECFILE</td></tr><tr><td>Ohter features 2</td><td>THIS IS ALSO NOT FROM THE SPECFILE</td></tr></tr></table>,OTHER STRINGS,

This is what I have done so far:

Fist I need to "clean" the specs and get proper csv strings using the following script

Code:
#ls is probably a bad idea
 for f in $(ls *.csv)
 do
 #fix newline from file
 sed -i ':a;{N;s/NBD   \n/NBD,/};ba;s/"//g;' "$f" 

 #fix csv & and remove unessesery strings
 sed -i 's/"PC/PC/g;s/Core\,/Core/g;s/3\,/3./g;s/3MB\,//g;s/6MB\,//g;s/6MB//g;s/w   \///g;s/7,200/7200/g;s/site\"/site/g;s/3MB//g;s/3\,/3\./g;s/w\///g;s/3\,/3\./g;s/Cache\,)/Cache/g;s/ Internal Dell Business Audio Speaker\,//g;' "$f"

#don't know how to remove symbols with sed using awk instead
awk 'NR==FNR {a[$1]=$2;next} {for ( i in a) gsub(i,a[i])}1' template $f >temp.txt
mv temp.txt $f
done

Then using this script to populate the html table

Code:
#!/bin/bash

#ls is probably a bad idea
for f in $(ls *.csv)
do
#split csv into 1 line .csv files
split --additional-suffix=.csv -d -l 1 "$f" output/data_

#populate html file and create .html files
for file in $(ls output/*.csv)
do

IFS=","
while read f1 f2 f3 f4 f5 f6 f7 f8 f9 f10
do

echo "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> " 
echo "<tbody>"  
echo "<tr>  "   
echo "<td class=\"specsTitle\">Box</td> "
echo "<td class=\"specsDescript stripeBottom\">$f2</td> "
echo "</tr>     "   
echo "<tr>  "   
<snip>
done <$file  > output/temp.txt
mv output/temp.txt $file.html
done
done
#remove not important .csv
rm output/*.csv

So at this point I have several .html files in the output folder that need to go into the final .csv.
This is something I am not sure on how to do.
Any comments/imporvments on the above scripts are also welcome :-)

Last edited by Scott; 02-15-2014 at 07:44 AM.. Reason: Please use CODE tags, not ICODE tags for code blocks. Thanks.
# 2  
Old 02-17-2014
Take a look at the "join" command but you will need to assign a key to each record so that it will join 1 to 1, 2 to 2 and so on.

You could also do it all in one script, this is a quick little perl script as an example
:
Code:
my $in_file_1   =  '/temp/tmp/m';             ## file with quoted strings
my $in_file_2   =  '/temp/tmp/t';             ## csv file with text to join info from file_1 to
#my $out_file    =  '/temp/tmp/new_file.txt';
my $line_f1;
my $line_f2;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache'
);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
#open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f2    =  <$in_file_fh_2>;
    chomp $line_f2;
    # perform hash replacements
  ( $out_line =  $line_f2 )  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $out_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove begininng and ending double quote
  $out_line  =~  s!^\"!!;
  $out_line  =~  s!\"$!!;

  $line_f1   =  <$in_file_fh_1>;
    chomp $line_f1;
  # You could then split line($out_line) into array and write to out_file formatted as desired with with data from
  # csv file, html, etc.
  print $line_f1 . $out_line . "\n";
#  print $out_file_fh $out_line . "\n";

}


Last edited by Franklin52; 02-18-2014 at 07:51 AM.. Reason: Please use code tags
# 3  
Old 02-19-2014
Hi,

Thank you for helping. But I must me doing something wrong.
Using a specfile.txt that looks like
Code:
"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

and a csvfile.csv input file like this
Code:
,243720,1,
,243721,2,
,244773,3,

I modified your script a bit that now looks like:
Code:
#!/bin/perl

my $in_file_1   =  'specfile.txt';             ## file with quoted strings
my $in_file_2   =  'csvfile.csv';             ## csv file with text to join info from file_1 to
my $out_file    =  'outputfile.txt';
my $line_f1;
my $line_f2;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache',
 '0\)'        => '0',
 'Graphics\)' => ' ',
 'GHz GHz\)'  => ' ',
 'GHz) GHz\)' => ' ',
 'Non-ECCz'  => ' ',
 'Turbo'     => ' ',
 '12,04'     => '12.04',
 '86W\)'      => ' ',
 'Warranty\:' => ' '

);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't Open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f2    =  <$in_file_fh_2>;
    chomp $line_f2;
    # perform hash replacements
  ( $out_line =  $line_f2 )  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $out_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove begininng and ending double quote
  $out_line  =~  s!^\"!!;
  $out_line  =~  s!\"$!!;

  $line_f1   =  <$in_file_fh_1>;
    chomp $line_f1;
  # You could then split line($out_line) into array and write to out_file formatted as desired with with data from
  # csv file, html, etc.
  print $line_f1 . $out_line . "\n";
  print $out_file_fh $out_line . "\n";

}

The output that I get is
Code:
,243720,1,
,243721,2,
,2447733

and if I use > goodcsv as an output I get
Code:
"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   ,243720,1,
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site",243721,2,
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   ,2447733

And the desired output should look more than this:
Code:
'<p><font size=""2""><span style=""font-weight: bold;"">Περιγραφή</span></font></p>
 Ο DellTM OptiPlexTM 380 παρέχει οικονομικές, ισχυρές και επεκτάσιμες λύσεις επιτραπέζιου υπολογιστή σε οργανισμούς με απλά δίκτυα.Τα κύρια χαρακτηριστικά περιλαμβάνουν τα εξής: <br /> Βασικά χαρακτηριστικά σε προσιτό κόστος<br/> Μεγάλος κύκλος ζωής συστημάτων <br />Βασικές δυνατότητες ασφάλειας που προστατεύουν τα δεδομένα ζωτικής σημασίας 
<br/> <hr/> <br/>
<table cellspacing="0" cellpadding="0" border="0" width="100%" class="col4"><tbody>
<snip>
</table>    
',243720,1
'<p><font size=""2""><span style=""font-weight: bold;"">Περιγραφή</span></font></p>
 Ο DellTM OptiPlexTM 380 παρέχει οικονομικές, ισχυρές και επεκτάσιμες λύσεις επιτραπέζιου υπολογιστή σε οργανισμούς με απλά δίκτυα.Τα κύρια χαρακτηριστικά περιλαμβάνουν τα εξής: <br /> Βασικά χαρακτηριστικά σε προσιτό κόστος<br/> Μεγάλος κύκλος ζωής συστημάτων <br />Βασικές δυνατότητες ασφάλειας που προστατεύουν τα δεδομένα ζωτικής σημασίας 
<br/> <hr/> <br/>
<table cellspacing="0" cellpadding="0" border="0" width="100%" class="col4"><tbody>
<tr> <td class="prodTitlespec">    Προδιαγραφές προϊόντος    </td> 
</tr> 
</tbody> 
</table> 
<table cellspacing="0" cellpadding="0" border="0" width="100%"> 
<snip>     
</table>    
',243721,2
,244773,3

I guess I do something wrong but I don;t know what

Kind Regards

Last edited by Franklin52; 02-19-2014 at 03:30 AM.. Reason: Please use CODE tags, not ICODE tags for code blocks. Thanks.
# 4  
Old 02-19-2014
Are the quoted strings in file(csvfile.csv) each on a separate record?
The reason I ask is because your code works for me
:


$ cat csvfile.csv
Code:
"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

$ cat specfile.txt
Code:
,243720,1,
,243721,2,
,244773,3,

-- Your code
Code:
#!/usr/bin/perl -w
# Script: testit.pl
use strict;
use warnings;

my $in_file_1   =  '/temp/tmp/specfile.txt';            ## file with quoted strings
my $in_file_2   =  '/temp/tmp/csvfile.csv';             ## csv file with text to join info from file_1 to
#my $out_file    =  'outputfile.txt';
my $line_f1;
my $line_f2;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache',
 '0\)'        => '0',
 'Graphics\)' => ' ',
 'GHz GHz\)'  => ' ',
 'GHz) GHz\)' => ' ',
 'Non-ECCz'  => ' ',
 'Turbo'     => ' ',
 '12,04'     => '12.04',
 '86W\)'      => ' ',
 'Warranty\:' => ' '

);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't Open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
#open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f2    =  <$in_file_fh_2>;
  chomp $line_f2;
  # perform hash replacements
  ( $out_line =  $line_f2 )  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $out_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove begininng and ending double quote
  $out_line  =~  s!^\"!!;
  $out_line  =~  s!\"$!!;

  $line_f1   =  <$in_file_fh_1>;
  chomp $line_f1;
  # You could then split line($out_line) into array and write to out_file formatted as desired with with data from
  # csv file, html, etc.
  print $line_f1 . $out_line . "\n";
#  print $out_file_fh $out_line . "\n";

}


-- output from your code
$ testit.pl
Code:
,243720,1,PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i3 3220 (Dual Core 3.30GHz, HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site
,243721,2,PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i5 3470 (Quad Core 3.20GHz,HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site

# 5  
Old 02-20-2014
Hi, thank you for taking the time to help me,

The csvfile.csv is like
Code:
,243720,1,
,243721,2,
,244773,3,

And the specfile is
Code:
"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"
"PC DELL OptiPlex 3010MT i5 3470/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i5 3470 (Quad Core, 3.20GHz Turbo,6MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

You posted the other way around. (probably a typo)
Regarding the specfile keep in mind that there a normaly about 10 double lines (I didn't post them all to save space) and each double line is 1 spec that continues from NBD
eg. double line
Code:
"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD   
Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

is 1 specline like
Code:
"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

If that causes a problem I don't mind editing the specfile manually to have each line right.
So that's not a big deal for me
But one magor thing that is missing from this script is that before the final output each specline has to be inserted inside a table so line
Code:
"PC DELL OptiPlex 3010MT i3 3220/2GB/500GB/DVD-RW/FREE DOS / 5Y NBD  Intel i3 3220 (Dual Core, 3.30GHz, 3MB, w/ HD2500 Graphics), 2GB (1x2GB) DDR3 PC3-1600MHz, 500GB HDD SATA III 7200rpm, DVD+/-RW (16x), FREE DOS, Warranty: 5Yr Basic Warranty NBD on site"

should be converted to an html table before merging with the final .csv
I do this with a script like

Code:
#!/bin/bash

for f in $(ls *.csv)
do
#split csv into 1line .csv files
split --additional-suffix=.csv -d -l 1 "$f" output/data_

#populate html file and create .html files
for file in $(ls output/*.csv)
do

IFS=","
while read f1 f2 f3 f4 f5 f6 f7 f8 f9 f10
do

echo "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> "    
echo "<tbody>"    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    Case    </td> "
echo "<td class=\"specsDescript stripeBottom\">    Standard Desktop Tower Chassis     </td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle highlightRow\">    CPU    </td> "
echo "<td class=\"specsDescript stripeBottom highlightRow\">$f2</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    Operating System    </td> "
echo "<td class=\"specsDescript stripeBottom\">$f7</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle highlightRow\">    Hard disk    </td> "
echo "<td class=\"specsDescript stripeBottom highlightRow\">$f5</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    Memory     </td>"
echo "<td class=\"specsDescript stripeBottom\">$f4</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle highlightRow\">    Optical Drive    </td> "
echo "<td class=\"specsDescript stripeBottom highlightRow\">$f6</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    VGA    </td> "
echo "<td class=\"specsDescript stripeBottom\">    Mobile Intel® Graphics Media Accelerator $f3</td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle highlightRow\">    Keyboard    </td> "
echo "<td class=\"specsDescript stripeBottom highlightRow\">    Dell Keyboard    </td> "
echo "</tr>     "    
echo "<tr>     "    
echo "<td class=\"specsTitle\">    Mouse    </td> "
echo "<td class=\"specsDescript stripeBottom\">    Dell USB mouse    </td> "
echo "</tr>     "            
echo "</tbody>     "    
echo "</table>    "

done <$file  > output/temp.txt
mv output/temp.txt $file.html
done
done
#remove not important .csv
rm output/*.csv

The above script produces several .html files in the output folder. The content from these .html files should be then merged with the csvfile.csv file

Kind Regards
# 6  
Old 02-20-2014
Ok, Code has been updated to handle quoted strings on 2 lines from file(specfile) and to create formatted file(i.e. html) as output:
Code:
use strict;
use warnings;

my $in_file_1   =  '/temp/tmp/specfile.txt';            ## file with quoted strings
my $in_file_2   =  '/temp/tmp/csvfile.csv';             ## csv file with text to join info from specfile to
my $out_file    =  '/temp/tmp/outputfile.txt';          ## formatted file
my $line_f1;
my $line_f2;
my $work_line;
my @fields;
my $fields_index;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache',
 '0\)'        => '0',
 'Graphics\)' => ' ',
 'GHz GHz\)'  => ' ',
 'GHz) GHz\)' => ' ',
 'Non-ECCz'  => ' ',
 'Turbo'     => ' ',
 '12,04'     => '12.04',
 '86W\)'      => ' ',
 'Warranty\:' => ' '

);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't Open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

LINE: while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f1    =  <$in_file_fh_1>;       # Read first part of quoted string from file that is on two lines
  chomp $line_f1;
  $work_line = $work_line . $line_f1;   # Place line just read at end of variable
  next LINE if $line_f1 =~ /^\"/;       # Read next line to get 2nd part of quoted string
  # perform hash replacements
  $work_line  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $work_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove begininng and ending double quote
  $work_line  =~  s!^\"!!;
  $work_line  =~  s!\"$!!;

  $line_f2    =  <$in_file_fh_2>;           # Read matching line from csvfile
  chomp $line_f2;
  $work_line = $line_f2 . $work_line;       # Place csvfile line just read at beginning of variable
  
  print $work_line . "\n";                # Print complete line for visual reference/debugging

  # Split line into fields array and print each element for visual reference/debugging
  @fields  =  split( ( ',' ),$work_line );
  for $fields_index ( 0 .. $#fields ) {
      print $fields[$fields_index] . "\n";
  }
  print "\n";

  # Create output formatted lines with text and fields from array
  for $fields_index ( 0 .. $#fields ) {
    $out_line = "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> \n" .
                "<tbody>\n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Case    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Standard Desktop Tower Chassis     </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    CPU    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[3]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Operating System    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">$fields[8]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Hard disk    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[6]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Memory     </td>\n" .
                "<td class=\"specsDescript stripeBottom\">$fields[5]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Optical Drive    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[7]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    VGA    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Mobile Intel® Graphics Media Accelerator $fields[4]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Keyboard    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">    Dell Keyboard    </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Mouse    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Dell USB mouse    </td> \n" .
                "</tr>     \n" .
                "</tbody>     \n" .
                "</table>    \n";
  }
  print $out_file_fh $out_line . "\n";     # Print formatted line to outfile
  $work_line = "";
}
close $in_file_fh_1;
close $in_file_fh_2;
close $out_file_fh;

Hopefully this is an example to show how you can do all in one script!

Last edited by Don Cragun; 02-21-2014 at 09:50 PM.. Reason: Remove font tags.
# 7  
Old 02-21-2014
Hi,

Seems that I managed to get the result I wantend :-D
Mainly an outputfile.csv file that I can import to my website
I decided to manually edit the specfile from a 2line spec to a 1line spec So I can use this script on other products too.
Although it seems to work fine ( I opened the outputfile.csv file with libreoffice and seems to be OK) I hope that I could get a final comment from you regading this script as this is my first attemt with perl and have almost no clue of what I am doing :-)
So here it goes:

Code:
#!/usr/bin/perl -w

my $in_file_2   =  'specfile.txt';             ## file with quoted strings
my $in_file_1   =  'csvfile.csv';             ## csv file with text to join info from file_1 to
my $out_file    =  'outputfile.csv';
my $line_f1;
my $line_f2;
my $out_line;

# hash for values that have a replacement with value
my %replacement_values = (
 'PC'        => 'PC',
 'Core,'     => 'Core',
 '3,'        => '3',
 '7,200'     => '7200',
 'site'      => 'site',
 'Cache,\)'  => 'Cache',
 '0\)'        => '0',
 'Graphics)' => ' ',
 'GHz GHz\)'  => ' ',
 'GHz) GHz\)' => ' ',
 'Non-ECCz'  => ' ',
 'Turbo'     => ' ',
 '12,04'     => '12.04',
 '86W\)'      => ' ',
 'Warranty:' => ' ',
 'Intel'      => ',Intel',
 '00)'      => ',00',
 '(D'      => 'D'

);

open ( my $in_file_fh_1, '<', $in_file_1  ) or die "Can't Open $in_file_1 $!\n";
open ( my $in_file_fh_2, '<', $in_file_2  ) or die "Can't open $in_file_2 $!\n";
open ( my $out_file_fh,  '>', $out_file   ) or die "Can't open $out_file $!\n";

while ( ! eof( $in_file_fh_1 ) and ! eof( $in_file_fh_2 ) ) {
  $line_f2    =  <$in_file_fh_2>;
    chomp $line_f2;
    # perform hash replacements
  ( $out_line =  $line_f2 )  =~  s/(@{[join '|', map { quotemeta($_) } keys %replacement_values]})/$replacement_values{$1}/g;
  # Remove specific text from line
  $out_line  =~  s![0-9]MB,*| *w *\/ *| Internal Dell Business Audio Speaker,| {2,}!!g;
  # Remove begininng and ending double quote
  $out_line  =~  s!^\"!!;
  $out_line  =~  s!\"$!!;

  $line_f1   =  <$in_file_fh_1>;
    chomp $line_f1;
  # You could then split line($out_line) into array and write to out_file formatted as desired with with data from
  # csv file, html, etc.
  print $line_f1 . $out_line . "\n";
  print $out_file_fh $line_f1 . "\' \n";
  $work_line = $line_f1 . $out_line . "\n";


  # Split line into fields array and print each element for visual reference/debugging
  @fields  =  split( ( ',' ),$work_line );
  for $fields_index ( 0 .. $#fields ) {
      print $fields[$fields_index] . "\n";
  }
  print "\n";
  # Create output formatted lines with text and fields from array
  for $fields_index ( 0 .. $#fields ) {
    $out_line = "<table cellspacing=\"0\" cellpadding=\"0\" border=\"0\" width=\"100%\"> \n" .
                "<tbody>\n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Case    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Standard Desktop Tower Chassis     </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    CPU    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[3]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Operating System    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">$fields[8]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Hard disk    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[6]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Memory     </td>\n" .
                "<td class=\"specsDescript stripeBottom\">$fields[5]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Optical Drive    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">$fields[7]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    VGA    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Mobile Intel® Graphics Media Accelerator $fields[4]</td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle highlightRow\">    Keyboard    </td> \n" .
                "<td class=\"specsDescript stripeBottom highlightRow\">    Dell Keyboard    </td> \n" .
                "</tr>     \n" .
                "<tr>     \n" .
                "<td class=\"specsTitle\">    Mouse    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">    Dell USB mouse    </td> \n" .
                "</tr>     \n" .
		 "<tr>     \n" .
                "<td class=\"specsTitle\">    Warranty    </td> \n" .
                "<td class=\"specsDescript stripeBottom\">  $fields[9] </td> \n" .
                "</tr>     \n" .
                "</tbody>     \n" .
                "</table>    \n \'";
  }
  print $out_file_fh $out_line . "\n";     # Print formatted line to outfile

}

Kind Regards
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Export Oracle multiple tables to multiple csv files using UNIX shell scripting

Hello All, just wanted to export multiple tables from oracle sql using unix shell script to csv file and the below code is exporting only the first table. Can you please suggest why? or any better idea? export FILE="/abc/autom/file/geo_JOB.csv" Export= `sqlplus -s dev01/password@dEV3... (16 Replies)
Discussion started by: Hope
16 Replies

2. UNIX for Beginners Questions & Answers

How do I extract specific column in multiple csv files?

file1: Name,Threshold,Curr Samples,Curr Error%,Curr ART GETHome,100,21601,0.00%,47 GETregistry,100,21592,0.00%,13 GEThomeLayout,100,30466,0.00%,17 file2: Name,Threshold,Curr Samples,Curr Error%,Curr ART GETHome,100,21601,0.00%,33 GETregistry,100,21592,0.00%,22... (6 Replies)
Discussion started by: Raghuram717
6 Replies

3. UNIX for Beginners Questions & Answers

How to copy a column of multiple files and paste into new excel file (next to column)?

I have data of an excel files as given below, file1 org1_1 1 1 2.5 100 org1_2 1 2 5.5 98 org1_3 1 3 7.2 88 file2 org2_1 1 1 2.5 100 org2_2 1 2 5.5 56 org2_3 1 3 7.2 70 I have multiple excel files as above shown. I have to copy column 1, column 4 and paste into a new excel file as... (26 Replies)
Discussion started by: dineshkumarsrk
26 Replies

4. Shell Programming and Scripting

Need awk or Shell script to compare Column-1 of two different CSV files and print if column-1 matche

Example: I have files in below format file 1: zxc,133,joe@example.com cst,222,xyz@example1.com File 2 Contains: hxd hcd jws zxc cst File 1 has 50000 lines and file 2 has around 30000 lines : Expected Output has to be : hxd hcd jws (5 Replies)
Discussion started by: TestPractice
5 Replies

5. Shell Programming and Scripting

Row bind multiple csv files having different column headers

All, I guess by this time someone asked this kind of question, but sorry I am unable to find after a deep search. Here is my request I have many files out of which 2 sample files provided below. File-1 (with A,B as column headers) A,B 1,2 File-2 (with C, D as column headers) C,D 4,5 I... (7 Replies)
Discussion started by: ks_reddy
7 Replies

6. UNIX for Dummies Questions & Answers

Merge two csv files using column name

Hi all, I have two separate csv files(comma delimited) file 1 and file 2. File 1 contains PAN,NAME,Salary AAAAA5467D,Raj,50000 AAFAC5467D,Ram,60000 BDCFA5677D,Kumar,90000 File 2 contains PAN,NAME,Dept,Salary ASDFG6756T,Karthik,ABC,450000 QWERT8765Y,JAX,CDR,780000... (5 Replies)
Discussion started by: Nivas
5 Replies

7. Shell Programming and Scripting

Compare 2 files of csv file and match column data and create a new csv file of them

Hi, I am newbie in shell script. I need your help to solve my problem. Firstly, I have 2 files of csv and i want to compare of the contents then the output will be written in a new csv file. File1: SourceFile,DateTimeOriginal /home/intannf/foto/IMG_0713.JPG,2015:02:17 11:14:07... (8 Replies)
Discussion started by: refrain
8 Replies

8. Shell Programming and Scripting

Sar -u generates multiple column headers in csv file

Hi All, The below sar -u command generates multiple column headers in csv file Expected output should print column headers only once in the csv file shell script: $cat sar_cpu_EBS.sh #!/bin/bash while ; do sar -u 15 1 | awk '/^/ {print $1,$2,$4,$6,$7}' | tr -s ' ' ',' >>... (6 Replies)
Discussion started by: a1_win
6 Replies

9. Shell Programming and Scripting

Combine Multiple text or csv files column-wise

Hi All I am trying to combine columns from multiple text files into a single file using paste command but the record length being unequal in the different files the data is running over to the closest empty cell on the left. Please see below. What can i do to resolve this ? File 1 File... (15 Replies)
Discussion started by: venky_ibm
15 Replies

10. Shell Programming and Scripting

Update a column value in csv files

Hi all I am new to scripting and i have an application from which i will export into a csv file as follows Column1, Column2, Column3 Sno1, Folder\Test.txt, Fail Sno2, Folder\Test1.txt, Pass Sno3, Folder\Test2.txt, Fail Now i need to change the column2 in all the rows from "Folder\" ... (1 Reply)
Discussion started by: rajeshrp
1 Replies
Login or Register to Ask a Question