Sort flat file by 3rd column in perl


 
# 8  
Old 11-14-2011
Code:
$
$
$ cat f40
9924873|20111114|00000000000013013|130|13|10/15/2010 12:36:22|W860944|N|00
9924873|20111114|00000000000013009|130|09|10/15/2010 12:36:22|W860944|N|00
9924873|20111114|00000000000029207|292|07|05/29/2001 10:35:32|DADS_JAMESA|N|00
$
$
$ perl -lne 'push @x,$_; END {print for (sort {substr($a,17,17) <=> substr($b,17,17)} @x)}' f40
9924873|20111114|00000000000013009|130|09|10/15/2010 12:36:22|W860944|N|00
9924873|20111114|00000000000013013|130|13|10/15/2010 12:36:22|W860944|N|00
9924873|20111114|00000000000029207|292|07|05/29/2001 10:35:32|DADS_JAMESA|N|00
$
$
$
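The one-liner above relies on the 3rd field always starting at character offset 17 and being 17 characters wide. If the earlier fields can vary in width, a split on the pipe delimiter may be more robust. The following is only a sketch of that alternative; the helper name third_field and the in-memory @records array are assumptions made for illustration, not part of the original post.

```perl
use strict;
use warnings;

# Sample records copied from the f40 listing above.
my @records = (
    '9924873|20111114|00000000000013013|130|13|10/15/2010 12:36:22|W860944|N|00',
    '9924873|20111114|00000000000013009|130|09|10/15/2010 12:36:22|W860944|N|00',
    '9924873|20111114|00000000000029207|292|07|05/29/2001 10:35:32|DADS_JAMESA|N|00',
);

# Hypothetical helper: return the 3rd pipe-delimited field of a record.
sub third_field {
    my ($record) = @_;
    return (split /\|/, $record)[2];
}

# Numeric comparison, so the leading zeros do not matter.
my @sorted = sort { third_field($a) <=> third_field($b) } @records;
print "$_\n" for @sorted;
```

Splitting every record on each comparison is wasteful for very large files; caching the extracted field per record would avoid the repeated work.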

tyler_durden
# 9  
Old 11-15-2011
Thanks a lot !!!!!!

---------- Post updated at 01:47 AM ---------- Previous update was at 01:43 AM ----------

Thanks a lot, Skrynesaver, not only for solving the issue but also for the explanation.

Could you please be kind enough to help me read this line:
$link_strength{$1}=$_ if /(?:[^|]+\|){2}([^|]+)/;

Is it storing the 3rd column as the hash value?

How are the keys generated in the last line?
print $link_strength{$_} for (sort {$a<=>$b} keys %link_strength);
# 10  
Old 11-15-2011
Quote:
Originally Posted by Pratik4891
Thanks a lot !!!!!!

---------- Post updated at 01:47 AM ---------- Previous update was at 01:43 AM ----------

Thanks a lot, Skrynesaver, not only for solving the issue but also for the explanation.

Could you please be kind enough to help me read this line:
$link_strength{$1}=$_ if /(?:[^|]+\|){2}([^|]+)/;

Is it storing the 3rd column as the hash value?
No, it's actually creating a hash entry that uses the 3rd field as the key and the entire record as the value.

At this point we have slurped the entire file into an array, and because we are stepping through the array ( for (@array){ ), the default variable $_ holds the entire record.

If the record matches the regex, the first capturing parenthesis matches the third field, so the third field is stored in $1.

We now store the record in the hash with the 3rd field as key.
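To make the capture concrete, here is a small sketch that runs the same statement against a single record from the sample data posted earlier in the thread. The variable names $record and $key are assumptions added for illustration.

```perl
use strict;
use warnings;

my %link_strength;
my $record = '9924873|20111114|00000000000013009|130|09|10/15/2010 12:36:22|W860944|N|00';

for ($record) {
    # (?:[^|]+\|){2} skips two "field|" groups without capturing them;
    # ([^|]+) then captures the 3rd field into $1.
    $link_strength{$1} = $_ if /(?:[^|]+\|){2}([^|]+)/;
}

# The hash now holds one entry: 3rd field => whole record.
my ($key) = keys %link_strength;
print "key:   $key\n";
print "value: $link_strength{$key}\n";
```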
Quote:
Originally Posted by Pratik4891
How are the keys generated in the last line?
print $link_strength{$_} for (sort {$a<=>$b} keys %link_strength);
We retrieve the keys of the hash (field 3) and sort them numerically and process this list in the provided order. For each key we then print the value which is the stored record.
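Putting the two statements together, the whole flow might look like the sketch below. It is a reconstruction from the two lines quoted in this post, not the original script: the in-memory @array stands in for the slurped file, and collecting the result in @sorted (rather than printing directly) is an assumption made so the sorted list can be reused.

```perl
use strict;
use warnings;

# Stand-in for the slurped file; records copied from the earlier sample.
my @array = (
    '9924873|20111114|00000000000013013|130|13|10/15/2010 12:36:22|W860944|N|00',
    '9924873|20111114|00000000000013009|130|09|10/15/2010 12:36:22|W860944|N|00',
    '9924873|20111114|00000000000029207|292|07|05/29/2001 10:35:32|DADS_JAMESA|N|00',
);

# Build the hash: 3rd field => whole record.
my %link_strength;
for (@array) {
    $link_strength{$1} = $_ if /(?:[^|]+\|){2}([^|]+)/;
}

# Sort the keys numerically, then look up the record stored under each key.
my @sorted = map { $link_strength{$_} } sort { $a <=> $b } keys %link_strength;
print "$_\n" for @sorted;
```

Note that a record whose 3rd field repeats an earlier one would overwrite the earlier hash entry, which is why this approach needs unique 3rd fields.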

As I said above, this method depends on the 3rd field being unique to each record. You could instead use the record as the key and the extracted 3rd field as the value:
$link_strength{$_}=$1 if /(?:[^|]+\|){2}([^|]+)/;
This would mean that each record has to be unique, but the same link strength could appear in several records. Then cycle through the hash with a sort function that compares the values, something like:
for (sort {$link_strength{$a} <=> $link_strength{$b}} keys %link_strength){
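A minimal sketch of that inverted variant follows. The fourth record is hypothetical, added only to show that two records sharing a 3rd field both survive; the tie-breaker on the record text is also an assumption, included so that records with equal link strengths come out in a predictable order.

```perl
use strict;
use warnings;

my @array = (
    '9924873|20111114|00000000000013013|130|13|10/15/2010 12:36:22|W860944|N|00',
    '9924873|20111114|00000000000013009|130|09|10/15/2010 12:36:22|W860944|N|00',
    '9924873|20111114|00000000000029207|292|07|05/29/2001 10:35:32|DADS_JAMESA|N|00',
    # Hypothetical record: same 3rd field as the one above, different 5th field.
    '9924873|20111114|00000000000013009|130|10|10/15/2010 12:36:22|W860944|N|00',
);

# Build the hash: whole record => 3rd field.
my %link_strength;
for (@array) {
    $link_strength{$_} = $1 if /(?:[^|]+\|){2}([^|]+)/;
}

# Sort the records by their stored 3rd field; break ties on the record text.
my @sorted = sort {
    $link_strength{$a} <=> $link_strength{$b}
        or $a cmp $b
} keys %link_strength;

print "$_\n" for @sorted;
```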

Last edited by Skrynesaver; 11-15-2011 at 04:19 AM..
# 11  
Old 11-17-2011
Thanks a lot, everyone, for helping me.

Can you please let me know how to write the array to a file? After sorting the data in the two files, I need to compare them.

So the arrays need to be stored in files before comparing.

I have tried the code below, but it's not working:

Code:
 
open my $fh1, '<', @sorted or die "Can't open $file1: $!";
print $fh3;

You guys helped me a lot!!!!thanks a ton for that!!!!!!!!!
# 12  
Old 11-17-2011
Quote:
Originally Posted by Pratik4891
...
I have tried the code below, but it's not working:
Code:
 
open my $fh1, '<', @sorted or die "Can't open $file1: $!";
print $fh3;

...
The call to the open function is incorrect. Have a look at the online Perl documentation:

open - perldoc.perl.org

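Since you want to open a file, the third argument should be the file name, not the array @sorted. The mode '<' also opens the file for reading; to write to it you need '>'. Finally, $fh3 is undefined in the print statement; the filehandle you opened is $fh1.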
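As a rough sketch of what a corrected version could look like, the code below writes a sorted array to a file and reads it back. The file name sorted_f40.txt and the handle names $out and $in are assumptions chosen for illustration, and the contents of @sorted are taken from the sample output earlier in the thread.

```perl
use strict;
use warnings;

# Already-sorted records, as produced earlier in the thread.
my @sorted = (
    '9924873|20111114|00000000000013009|130|09|10/15/2010 12:36:22|W860944|N|00',
    '9924873|20111114|00000000000013013|130|13|10/15/2010 12:36:22|W860944|N|00',
    '9924873|20111114|00000000000029207|292|07|05/29/2001 10:35:32|DADS_JAMESA|N|00',
);

my $out_file = 'sorted_f40.txt';    # hypothetical output file name

# '>' opens for writing (truncating any existing file);
# the third argument is the file name.
open my $out, '>', $out_file or die "Can't open $out_file for writing: $!";
print {$out} "$_\n" for @sorted;
close $out or die "Can't close $out_file: $!";

# Read the file back into an array, one record per element.
open my $in, '<', $out_file or die "Can't open $out_file for reading: $!";
chomp(my @lines = <$in>);
close $in;

print scalar(@lines), " records written to $out_file\n";
```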

tyler_durden
# 13  
Old 11-18-2011
Hello Tyler,

Thanks a lot for the explanation.

What I am trying to do here is copy a sorted array to a file and then compare it to another file.

My objective is to compare two huge files, so I want to sort them first and then compare them. (Will that optimize the search?)

You guys already showed me how to sort the input file into an array, so do I have to copy the array to a file and then compare, or can I compare the two arrays directly?

Please find the attached script for file comparison.

In the while loop, is it possible to pass the array instead of the file?

Please help.