Joining files in a complex way
Post 302401483 by durden_tyler on Saturday 6th of March 2010 10:08:56 AM
Thanks for the explanation and example.
The base code remains the same; I've added the ability to accept arguments from the command line. Every argument except the last is stored in an array of "input1" files, and each file in that array is processed in turn. The last argument is taken as the "input2" file.
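
For anyone who wants the argument handling on its own, here is a minimal sketch of the same idea using pop instead of index arithmetic. The sub name split_args is hypothetical and is not part of combine.pl; the file names are just the ones from the example run.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in combine.pl): split an argument list into
# the "input1" file names and the trailing "input2" file name.
sub split_args {
    my @args = @_;
    return if @args < 2;          # need at least one input1 file plus input2
    my $infile2 = pop @args;      # the last argument is the "input2" file
    return (\@args, $infile2);    # everything before it is an "input1" file
}

my ($infile1, $infile2) = split_args(qw(input1a input1b input1c input2));
print "input1 files: @$infile1\n";
print "input2 file:  $infile2\n";
```

With fewer than two arguments the helper returns an empty list, which corresponds to the case where combine.pl prints its usage message and exits with code 1.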

Code:
$ 
$ cat input1a
"aphab"    "S1"    "S2"    "S3"
"a"    "A/A"    "A/A"    "A/A"
"b"    "A/G"    "A/G"    "A/A"
"c"    "A/A"    "G/G"    "A/A"
"d"    "G/G"    "A/G"    "A/G"
"e"    "A/G"    "G/G"    "A/G"
"f"     "A/A"    "G/G"    "A/G"
"g"    "A/A"    "G/G"    "G/G"
"h"    "A/A"    "G/G"    "G/G"
"I"     "A/A"    "G/G"    "G/G"
$ 
$ cat input1b
"aphab"    "S1"    "S2"    "S3"
"a"    "A/A"    "A/A"    "A/A"
"b"    "A/G"    "A/G"    "A/A"
"c"    "A/A"    "G/G"    "A/A"
"d"    "G/G"    "A/G"    "A/G"
"e"    "A/G"    "G/G"    "A/G"
"f"     "A/A"    "G/G"    "A/G"
"g"    "A/A"    "G/G"    "G/G"
"h"    "A/A"    "G/G"    "G/G"
"I"     "A/A"    "G/G"    "G/G"
$ 
$ cat input1c
"aphab"    "S1"    "S2"    "S3"
"a"    "A/A"    "A/A"    "A/A"
"b"    "A/G"    "A/G"    "A/A"
"c"    "A/A"    "G/G"    "A/A"
"d"    "G/G"    "A/G"    "A/G"
"e"    "A/G"    "G/G"    "A/G"
"f"     "A/A"    "G/G"    "A/G"
"g"    "A/A"    "G/G"    "G/G"
"h"    "A/A"    "G/G"    "G/G"
"I"     "A/A"    "G/G"    "G/G"
$ 
$ cat combine.pl
#!/usr/bin/perl -w

# check that at least 2 arguments are passed to this program
# exit with error code 1 otherwise
if ($#ARGV < 1) {
  print "Usage:   perl combine.pl <list of input files separated by space> input2\n";
  print "Example: perl combine.pl input1a input1b input1c input2\n";
  exit 1;
}

# now assign the list of "input1" file names to array @infile1
foreach (0..$#ARGV-1) {
  push @infile1, $ARGV[$_];
}
# set the variable $infile2 to the last argument i.e. the "input2" file
$infile2 = $ARGV[$#ARGV];

# define hashes - %chartonum, %numtochar and %mainhash
my %chartonum = qw(A/A 1 A/G 2 G/G 3);
my %numtochar = qw(1 A/A 2 A/G 3 G/G);
my %mainhash;

# first process all "input1" files i.e. all elements of the array @infile1
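foreach $file1 (@infile1) {
  # three-argument open with a lexical filehandle; $. restarts at 1 for each
  # file because the handle is explicitly closed at the end of the loop body
  open(my $in, '<', $file1) or die "Can't open $file1: $!";
  while (<$in>) {
    chomp;
    s/"//g;        # strip the double quotes
    s/[ ]+/ /g;    # squeeze runs of spaces into a single space
    if ($. == 1) {
      @x = split / /;    # header line: column names (S1, S2, ...)
    } else {
      @y = split / /;
      # count each (column name, genotype code) pair, e.g. "S1,1"
      foreach $i (1..$#y) {
        $mainhash{$x[$i].",".$chartonum{$y[$i]}}++;
      }
    }
  }
  close($in) or die "Can't close $file1: $!";
}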

# print the header
printf("%-12s%-12s%-12s%-12s%-12s%-s\n","\"ID\"","\"Label\"","\"StYPE\"","\"Ntype\"","\"Stype_No\"","\"log\"");
# now start processing the "input2" file
open(my $in2, '<', $infile2) or die "Can't open $infile2: $!";
while (<$in2>) {
  if ($. > 1) {    # skip the header line
    chomp;
    s/"//g;
    s/[ ]+/ /g;
    @z = split / /;
    # $num numbers the rows within a run of identical IDs;
    # it restarts at 1 whenever the ID changes
    if (!defined $prev or $z[0] ne $prev) {$num = 1} else {$num++}
    $prev = $z[0];
    printf("%-12s%-12s%-12s%-12s%-12s%-s\n",
           "\"$z[0]\"",
           "\"$z[1]\"",
           defined $numtochar{$num} ? "\"$numtochar{$num}\"" : "\"NULL\"",
           exists $numtochar{$num} ? $num : "\"null\"",
           defined $mainhash{$z[0].",".$num} ? $mainhash{$z[0].",".$num} : "\"null\"",
           $z[2]
          );
  }
}
close($in2) or die "Can't close $infile2: $!";

$ 
$ # Error checking - incorrect number of arguments
$ 
$ perl combine.pl
Usage:   perl combine.pl <list of input files separated by space> input2
Example: perl combine.pl input1a input1b input1c input2
$ 
$ perl combine.pl input1a
Usage:   perl combine.pl <list of input files separated by space> input2
Example: perl combine.pl input1a input1b input1c input2
$ 
$ perl combine.pl input2
Usage:   perl combine.pl <list of input files separated by space> input2
Example: perl combine.pl input1a input1b input1c input2
$ 
$ echo $?
1
$ 
$ # Successful run
$ 
$ perl combine.pl input1a input1b input1c input2
"ID"        "Label"     "StYPE"     "Ntype"     "Stype_No"  "log"
"S1"        "xxx"       "A/A"       1           18          2.8
"S1"        "xxx"       "A/G"       2           6           3
"S1"        "xxx"       "G/G"       3           3           4
"S2"        "yyy"       "A/A"       1           3           6.8
"S2"        "yyy"       "A/G"       2           6           7
"S2"        "yyy"       "G/G"       3           18          7.4
"S2"        "yyy"       "NULL"      "null"      "null"      8
"S3"        "zzz"       "A/A"       1           9           12
"S3"        "zzz"       "A/G"       2           9           14
"S3"        "zzz"       "G/G"       3           9           16
"S3"        "zzz"       "NULL"      "null"      "null"      18
"S3"        "zzz"       "NULL"      "null"      "null"      20
$ 
$ echo $?
0
$ 
$
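
In case the counting step is hard to follow inside the full script, here is a small standalone sketch of just that part. It works on an in-memory list of lines rather than on files. The sub name tally_lines is hypothetical, and skipping genotypes that are missing from the mapping is my assumption here; combine.pl itself does not do that check.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same genotype-to-code mapping as in combine.pl.
my %chartonum = ('A/A' => 1, 'A/G' => 2, 'G/G' => 3);

# Hypothetical helper: tally "column,code" pairs from the lines of one
# input1-style file. The first line is treated as the header.
sub tally_lines {
    my ($counts, @lines) = @_;
    my @header;
    for my $line (@lines) {
        (my $clean = $line) =~ s/"//g;      # drop the double quotes
        my @fields = split ' ', $clean;     # split on any run of whitespace
        if (!@header) { @header = @fields; next; }
        for my $i (1 .. $#fields) {
            my $code = $chartonum{ $fields[$i] };
            next unless defined $code;      # assumption: ignore unknown genotypes
            $counts->{"$header[$i],$code"}++;
        }
    }
    return $counts;
}

# First three data rows of the input1 files shown above.
my @sample = (
    '"aphab"    "S1"    "S2"    "S3"',
    '"a"    "A/A"    "A/A"    "A/A"',
    '"b"    "A/G"    "A/G"    "A/A"',
    '"c"    "A/A"    "G/G"    "A/A"',
);

my %counts;
tally_lines(\%counts, @sample);
print "$_ => $counts{$_}\n" for sort keys %counts;
```

For these three rows it prints S1,1 => 2, S1,2 => 1, S2,1 => 1, S2,2 => 1, S2,3 => 1 and S3,1 => 3, one pair per line. Calling tally_lines again with the same hash reference keeps adding to the totals, which is how the counts from several "input1" files accumulate in %mainhash.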

HTH,
tyler_durden

Last edited by durden_tyler; 03-06-2010 at 03:19 PM.
 
