Turning files into an array


# 1  
Old 03-14-2009
I have several files like this

file A
Code:
               Good    Bad      Fair
Strawberry    1         4         5     
Banana       23       12        4  
Plantain       0        0         1 
Orange        0        0         0

file B
Code:
Strawberry    1         1         3  
Banana       2         1         0   
Plantain      0         0         0
Orange       0         0         0

file C
Code:
Strawberry    1         0         0  
Banana       2         1         4  
Plantain      5         0         7 
Orange       0         7         0

Now I want to regroup the data in the following output format

OUTPUT FILE
GOOD
A B C
Strawberry 1 1 1
Banana 23 2 2
Plantain 0 0 5
Orange

BAD
A B C
Strawberry
Banana
Plantain
Orange

FAIR
A B C
Strawberry
Banana
Plantain
Orange

Does anybody have an idea how this can be achieved, while also computing the column sums at the bottom of each output block? The first column of the input files is the same in all of them.

Thanks.

Last edited by Yogesh Sawant; 04-10-2009 at 04:42 AM.. Reason: added code tags
# 2  
Old 03-14-2009
Based on this example you should be able to figure out how to total the columns in the output files for yourself.
====================================
Code:
#set -vx
echo -n "Working..."

# # set variables
grades="Good Bad Fair"
files="A B C"
names="Strawberry Banana Plantain Orange"

# # process files
for grade in $grades; do
  echo "$grade" > ${grade}.txt
  echo "A B C" >> ${grade}.txt
  for name in $names; do
    typeset -i file_count=1
    for file in $files; do
      if [[ $grade = Good ]]; then
        value=$(grep $name $file | cut -d" " -f2)
      elif [[ $grade = Bad ]]; then
        value=$(grep $name $file | cut -d" " -f3)
      elif [[ $grade = Fair ]]; then
        value=$(grep $name $file | cut -d" " -f4)
      fi
      if (( $file_count == 1 )); then
        val1=$value
      elif (( $file_count == 2 )); then
        val2=$value
      elif (( $file_count == 3 )); then
        val3=$value
      fi
      (( file_count += 1 ))
    done
    echo "$name $val1 $val2 $val3" >> ${grade}.txt
  done
done

# # display each output file (summing the columns is left to the reader)
for grade in $grades; do
  echo ""
  cat ${grade}.txt
done


Last edited by ldapswandog; 03-14-2009 at 03:50 PM..
# 3  
Old 03-14-2009
Would you mind explaining how the BAD and FAIR paragraphs should be populated? What's the logic?
# 4  
Old 03-14-2009
Based on the variable grades="Good Bad Fair", you know which column to get the data from: Good is field 2, Bad is field 3 and Fair is field 4.

grade=Good
name=Strawberry
file=A

value=$(grep $name $file | cut -d" " -f2)

value=$(grep Strawberry A | cut -d" " -f2)

This gets the line from file A that contains the text Strawberry, cuts that line into fields on the space character (-d" ") and takes field 2 (-f2).
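
The grade-to-column lookup described in this post can be sketched as a pair of small shell functions. This is only a sketch: the names field_for_grade and lookup_value are made up for illustration, and awk is swapped in for grep | cut so that the fruit name has to match the first column exactly. It assumes Good, Bad and Fair are always fields 2, 3 and 4.

```shell
# field_for_grade: hypothetical helper, maps a grade name to its column number.
field_for_grade() {
  case $1 in
    Good) echo 2 ;;
    Bad)  echo 3 ;;
    Fair) echo 4 ;;
    *)    echo "unknown grade: $1" >&2; return 1 ;;
  esac
}

# lookup_value: hypothetical helper, prints one value from one file.
# Usage: lookup_value GRADE NAME FILE
lookup_value() {
  field=$(field_for_grade "$1") || return 1
  # Print the requested column of the first line whose first field is NAME.
  awk -v name="$2" -v col="$field" '$1 == name { print $col; exit }' "$3"
}
```

For example, lookup_value Bad Banana A would print the Bad count for Banana from file A.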
# 5  
Old 03-14-2009
Here is a solution in Perl. It should be more efficient than the shell version, since it reads each input file once instead of running grep and cut for every value:
Code:
#!/usr/bin/perl -w

# # set variables
@files="A B C";

# # open output files
open (GOOD, "> good.txt");
open (BAD, "> bad.txt");
open (FAIR, "> fair.txt");

# # for each file
while ($file = <@files>) {
  # # open file
  open (INPUT, "< $file");
  # # process each line of file
  while ($line = <INPUT>) {
    chomp $line;
    # # IF line contain strawberries
    if ($line =~ m/Strawberry/) {
      # # Split line into each value
      ($name,$good,$bad,$fair) = split (/ /, $line);
      if ($file =~ m/A/) {
        $sA_good=$good;
        $sA_bad=$bad;
        $sA_fair=$fair;
      }
      elsif ($file =~ m/B/) {
        $sB_good=$good;
        $sB_bad=$bad;
        $sB_fair=$fair;
      }
      elsif ($file =~ m/C/) {
        $sC_good=$good;
        $sC_bad=$bad;
        $sC_fair=$fair;
      }
    }
    # # ELSE IF line contain bananas
    elsif ($line =~ m/Banana/) {
      # # Split line into each value
      ($name,$good,$bad,$fair) = split (/ /, $line);
      if ($file =~ m/A/) {
        $bA_good=$good;
        $bA_bad=$bad;
        $bA_fair=$fair;
      }
      elsif ($file =~ m/B/) {
        $bB_good=$good;
        $bB_bad=$bad;
        $bB_fair=$fair;
      }
      elsif ($file =~ m/C/) {
        $bC_good=$good;
        $bC_bad=$bad;
        $bC_fair=$fair;
      }
    }
    # # ELSE IF line contain plantains
    elsif ($line =~ m/Plantain/) {
      # # Split line into each value
      ($name,$good,$bad,$fair) = split (/ /, $line);
      if ($file =~ m/A/) {
        $pA_good=$good;
        $pA_bad=$bad;
        $pA_fair=$fair;
      }
      elsif ($file =~ m/B/) {
        $pB_good=$good;
        $pB_bad=$bad;
        $pB_fair=$fair;
      }
      elsif ($file =~ m/C/) {
        $pC_good=$good;
        $pC_bad=$bad;
        $pC_fair=$fair;
      }
    }
    # # ELSE IF line contain oranges
    elsif ($line =~ m/Orange/) {
      # # Split line into each value
      ($name,$good,$bad,$fair) = split (/ /, $line);
      if ($file =~ m/A/) {
        $oA_good=$good;
        $oA_bad=$bad;
        $oA_fair=$fair;
      }
      elsif ($file =~ m/B/) {
        $oB_good=$good;
        $oB_bad=$bad;
        $oB_fair=$fair;
      }
      elsif ($file =~ m/C/) {
        $oC_good=$good;
        $oC_bad=$bad;
        $oC_fair=$fair;
      }
    }
  }
}

# # calc totals
$tgA=($sA_good + $bA_good + $pA_good + $oA_good);
$tgB=($sB_good + $bB_good + $pB_good + $oB_good);
$tgC=($sC_good + $bC_good + $pC_good + $oC_good);
$tbA=($sA_bad + $bA_bad + $pA_bad + $oA_bad);
$tbB=($sB_bad + $bB_bad + $pB_bad + $oB_bad);
$tbC=($sC_bad + $bC_bad + $pC_bad + $oC_bad);
$tfA=($sA_fair + $bA_fair + $pA_fair + $oA_fair);
$tfB=($sB_fair + $bB_fair + $pB_fair + $oB_fair);
$tfC=($sC_fair + $bC_fair + $pC_fair + $oC_fair);

# # print data to output files
print GOOD "GOOD\n";
print GOOD "A B C\n";
print GOOD "Strawberry\t$sA_good\t$sB_good\t$sC_good\n";
print GOOD "Banana    \t$bA_good\t$bB_good\t$bC_good\n";
print GOOD "Plantain  \t$pA_good\t$pB_good\t$pC_good\n";
print GOOD "Orange    \t$oA_good\t$oB_good\t$oC_good\n";
print GOOD "===================================\n";
print GOOD "Totals    \t$tgA\t$tgB\t$tgC\n";

print BAD "BAD\n";
print BAD "A B C\n";
print BAD "Strawberry\t$sA_bad\t$sB_bad\t$sC_bad\n";
print BAD "Banana    \t$bA_bad\t$bB_bad\t$bC_bad\n";
print BAD "Plantain  \t$pA_bad\t$pB_bad\t$pC_bad\n";
print BAD "Orange    \t$oA_bad\t$oB_bad\t$oC_bad\n";
print BAD "===================================\n";
print BAD "Totals    \t$tbA\t$tbB\t$tbC\n";

print FAIR "FAIR\n";
print FAIR "A B C\n";
print FAIR "Strawberry\t$sA_fair\t$sB_fair\t$sC_fair\n";
print FAIR "Banana    \t$bA_fair\t$bB_fair\t$bC_fair\n";
print FAIR "Plantain  \t$pA_fair\t$pB_fair\t$pC_fair\n";
print FAIR "Orange    \t$oA_fair\t$oB_fair\t$oC_fair\n";
print FAIR "===================================\n";
print FAIR "Totals    \t$tfA\t$tfB\t$tfC\n";

# # close files
close (GOOD);
close (BAD);
close (FAIR);
close (INPUT);

exit 0;

# 6  
Old 03-15-2009
Thanks a lot, ldapswandog. I think the Perl code should work for me, though I am still working on it. I would be very grateful for any further hints about your code. I quickly ran the Perl code just now and got zero for every column sum.

To be clearer, what I need is output like this:
(1)
The first column is fixed, and it can have more than 4 items, e.g.
Strawberry
Banana
Plantain
Orange

(2)

The 2nd column will be fileA_column2
The 3rd column will be fileB_column2
The 4th column will be fileC_column2

while the last row will be
TOTAL, sum_of_col_2, sum_of_col_3, sum_of_col_4

(3)
The output goes into 3 files (i.e. Good, Fair, Bad)

Thanks a lot, I really appreciate your code. I am still working on it.
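
One way to meet these three requirements in a single pass is an awk program wrapped in a shell function. This is a rough sketch, not a tested solution for the real data: the function name regroup and the output names Good.txt, Bad.txt and Fair.txt are assumptions, and it assumes every data row is a name followed by three integer counts. Rows whose second field is not a number (such as the header in file A) are skipped, and the list of names is taken from the files themselves, so it can be longer than 4 items.

```shell
# regroup: hypothetical helper. Usage: regroup FILE...
# Writes Good.txt, Bad.txt and Fair.txt with one column per input file.
regroup() {
  awk '
    BEGIN { split("Good Bad Fair", grade, " ") }
    FNR == 1 { file[++nfiles] = FILENAME }            # remember input order
    NF < 4 || $2 !~ /^[0-9]+$/ { next }               # skip header and blank lines
    !($1 in seen) { seen[$1] = 1; name[++nnames] = $1 }
    { for (g = 1; g <= 3; g++) val[g, $1, FILENAME] = $(g + 1) }
    END {
      for (g = 1; g <= 3; g++) {
        out = grade[g] ".txt"
        print toupper(grade[g]) > out
        line = ""
        for (f = 1; f <= nfiles; f++) line = line (f > 1 ? " " : "") file[f]
        print line > out
        for (n = 1; n <= nnames; n++) {
          line = name[n]
          for (f = 1; f <= nfiles; f++) {
            v = val[g, name[n], file[f]] + 0          # missing entries count as 0
            line = line " " v
            sum[g, f] += v
          }
          print line > out
        }
        line = "TOTAL"
        for (f = 1; f <= nfiles; f++) line = line " " (sum[g, f] + 0)
        print line > out
        close(out)
      }
    }
  ' "$@"
}
```

Calling regroup A B C in the directory that holds the three files should write the three output files. With the sample data from the first post, the last line of Good.txt comes out as TOTAL 24 3 8.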
# 7  
Old 03-15-2009
I used the Perl code and got some output errors: the output files good.txt, fair.txt and bad.txt have empty columns, except for the fixed first one.
For good.txt I have:

GOOD
A B C
Strawberry
Banana
Plantain
Orange
===================================
Totals 0 0 0
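
A likely cause of the empty columns, assuming the real input files are padded with several spaces between columns as in the first post: the Perl script's split (/ /, $line) splits at every single space, so the padding produces empty fields and the totals add up to zero. Perl's split(' ', $line) splits on runs of whitespace instead. The same difference can be shown in shell with cut and awk. The sample line below is copied from file A, and the variable names are only for illustration.

```shell
# One data line from file A, with the original padding between columns.
line='Strawberry    1         4         5'

# cut treats every single space as a delimiter, so field 2 is the empty
# string between the first and second space.
single=$(printf '%s\n' "$line" | cut -d' ' -f2)

# awk splits on runs of blanks by default, so field 2 is the Good count.
runs=$(printf '%s\n' "$line" | awk '{ print $2 }')

printf 'cut field 2: [%s]\n' "$single"   # prints: cut field 2: []
printf 'awk field 2: [%s]\n' "$runs"     # prints: awk field 2: [1]
```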