Perl - multiple keys and merging two files
Post 302854625 by Lokesha on Tuesday 17th of September 2013 07:18:18 PM

Hi,

I'm not a regular coder, but I sometimes write basic Perl scripts, so Perl is still a bit difficult for me.

I'm merging two files a.txt and b.txt into c.txt:

Code:
a.txt
------
x001;frtb70;xyz;109
x001;frvt65;sec;239
x003;wqax34;jul;659
x004;yhud43;yhn;760

b.txt
------
x001;abcd80;xyz;193
x001;crrp28;xse;456
x002;lmno10;xyz;784
x002;jfds65;jfd;739
x002;juop88;jup;879
x003;yulo90;rem;542
x003;kihl98;dnt;312
x004;urel25;ewb;342


c.txt [output]
------
x001;frtb70;xyz;109
x001;frvt65;sec;239
x002;lmno10;xyz;784
x002;jfds65;jfd;739
x003;wqax34;jul;659
x004;yhud43;yhn;760




The only condition is that every line from a.txt must go into c.txt.
A line from b.txt goes into c.txt only if its key is not already present in a.txt; if the key exists in a.txt, that b.txt line must be skipped. In all the files the first column is the key, but it may contain duplicates, and that is what makes this challenging for me.
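
To make the rule concrete, here is a rough sketch on in-memory data rather than files. The name `merge_lines` and the shortened sample lists are made up for illustration. It keeps every a.txt line, then appends only the b.txt lines whose key never occurs in a.txt. Under that reading every b.txt line with key x002 is kept, and the result is a.txt lines followed by b.txt lines, not sorted by key as in the c.txt listing.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: returns all lines of @$a_lines, followed by those
# lines of @$b_lines whose key (field $key_idx) never occurs in @$a_lines.
sub merge_lines {
    my ($a_lines, $b_lines, $key_idx, $delimiter) = @_;

    # Set of keys seen in a.txt. Duplicate keys collapse here, which is fine
    # because the set is only used for lookups, not for storing lines.
    my %a_keys;
    for my $line (@$a_lines) {
        my @fields = split /\Q$delimiter\E/, $line;
        $a_keys{ $fields[$key_idx] } = 1;
    }

    # Keep a b.txt line only when its key is absent from a.txt.
    my @kept_b = grep {
        my @fields = split /\Q$delimiter\E/, $_;
        !exists $a_keys{ $fields[$key_idx] };
    } @$b_lines;

    return (@$a_lines, @kept_b);
}

# Shortened sample data taken from a.txt and b.txt above.
my @a = ('x001;frtb70;xyz;109', 'x001;frvt65;sec;239',
         'x003;wqax34;jul;659', 'x004;yhud43;yhn;760');
my @b = ('x001;abcd80;xyz;193', 'x002;lmno10;xyz;784',
         'x002;jfds65;jfd;739', 'x003;yulo90;rem;542');

print "$_\n" for merge_lines(\@a, \@b, 0, ';');
```

Because the lines themselves live in arrays, duplicate keys in either file are never lost; the hash only answers "does this key exist in a.txt?".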

Below is the script I've written. The problem is that, because I use a hash for both input files, lines that share the same key value are lost. I need to keep all the a.txt lines even when their keys are the same. The same is true for b.txt, except that its lines should be skipped if the key is already present in a.txt.

Code:
#!/usr/bin/env perl

sub prepareHash {
  #my ($in_file, $primary_Key, $delimiter) = @_;
  my $in_file   = shift;
  my $key       = shift;
  my $delimiter = shift;

  my @line_tokens;
  my %FILE_Hash;
  open( IN_FILE, "< $in_file" ) or die "Can't open $in_file : $!";

  while (<IN_FILE>) {
    my $in_line = $_;
    chomp($in_line);
    @line_tokens = split(/$delimiter/, $in_line);
    $FILE_Hash{$line_tokens[$key]} = $in_line;
  }

  close IN_FILE;

  return %FILE_Hash;
}

my $input1 = "/export/home/a.txt";
my $input2 = "/export/home/b.txt";
my $output = "/export/home/c.txt";

my %A_Hash  = prepareHash($input1, 0 , ";" );
my %B_Hash  = prepareHash($input2, 0 , ";" );

open( OUT_FILE, "> $output" ) or die "Can't open $output : $!";

for my $a_key ( sort keys %A_Hash ) {
   $a_key =~ s/\s+$//;
   my $a_line = $A_Hash{$a_key};
   print OUT_FILE $a_line . "\n";
}

# Compare a.txt and b.txt. Only write the extra b.txt lines whose key is not in a.txt into c.txt
for my $b_key ( sort keys %B_Hash ) {
   $b_key =~ s/\s+$//;

   if ( ! exists $A_Hash{$b_key} ) {
      my $b_line = $B_Hash{$b_key};
      print OUT_FILE $b_line . "\n";
   } else {
      print "$B_Hash{$b_key} is already written into c.txt using a.txt, hence skipping\n";
   }
}

close OUT_FILE;
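
The duplicate-key loss described above can be reproduced in a few lines. This is only a sketch on hard-coded sample lines, and the names `%last_only` and `%all_lines` are made up: assigning a scalar per key, as `prepareHash` does, keeps just the last line for each key, whereas pushing onto an array per key (a hash of arrays) keeps all of them.

```perl
use strict;
use warnings;

my @lines = ('x001;frtb70;xyz;109', 'x001;frvt65;sec;239', 'x003;wqax34;jul;659');

my (%last_only, %all_lines);
for my $line (@lines) {
    my ($key) = split /;/, $line;        # first field is the key
    $last_only{$key} = $line;            # like the script: a repeated key overwrites the earlier line
    push @{ $all_lines{$key} }, $line;   # hash of arrays: every line for the key is kept
}

print scalar(keys %last_only), " distinct keys\n";
print "x001 (scalar value): $last_only{x001}\n";
print "x001 (array value):  $_\n" for @{ $all_lines{x001} };
```

Whether a hash of arrays or a plain array of lines plus a key lookup is the better fit depends on whether per-key grouping is needed later; for a straight merge, a plain array is probably enough.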

Can any of you help me please?
 
