Sponsored Content
Top Forums Web Development Perl join two files by "common" column Post 302494415 by z1dane on Monday 7th of February 2011 08:30:19 AM
Old 02-07-2011
Hi yifangt,

No problems. chomp by the way operates on the default input ($_), so you can just specify chomp instead of chomp $_ Smilie

The computation will take a while as you pointed out. I think it might be worth fixing up the first file so that everything is systematic e.g. having a standardised assembly notation, so you don't need to use a regular expression. Once that is fixed up, you can just use a hash to see if the key exists.

As for your second approach of deleting elements in the hash, look up the delete() function.

Good luck and happy coding!

Dave
This User Gave Thanks to z1dane For This Post:
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

"Join" or "Merge" more than 2 files into single output based on common key (column)

Hi All, I have working (Perl) code to combine 2 input files into a single output file using the join function that works to a point, but has the following limitations: 1. I am restrained to 2 input files only. 2. Only the "matched" fields are written out to the "matched" output file and... (1 Reply)
Discussion started by: Katabatic
1 Replies

2. Shell Programming and Scripting

Join multiple files based on 1 common column

I have n files (for ex:64 files) with one similar column. Is it possible to combine them all based on that column ? file1 ax100 20 30 40 ax200 22 33 44 file2 ax100 10 20 40 ax200 12 13 44 file2 ax100 0 0 4 ax200 2 3 4 (9 Replies)
Discussion started by: quincyjones
9 Replies

3. Shell Programming and Scripting

awk command to replace ";" with "|" and ""|" at diferent places in line of file

Hi, I have line in input file as below: 3G_CENTRAL;INDONESIA_(M)_TELKOMSEL;SPECIAL_WORLD_GRP_7_FA_2_TELKOMSEL My expected output for line in the file must be : "1-Radon1-cMOC_deg"|"LDIndex"|"3G_CENTRAL|INDONESIA_(M)_TELKOMSEL"|LAST|"SPECIAL_WORLD_GRP_7_FA_2_TELKOMSEL" Can someone... (7 Replies)
Discussion started by: shis100
7 Replies

4. UNIX for Dummies Questions & Answers

how to join two files using "Join" command with one common field in this problem?

file1: Toronto:12439755:1076359:July 1, 1867:6 Quebec City:7560592:1542056:July 1, 1867:5 Halifax:938134:55284:July 1, 1867:4 Fredericton:751400:72908:July 1, 1867:3 Winnipeg:1170300:647797:July 15, 1870:7 Victoria:4168123:944735:July 20, 1871:10 Charlottetown:137900:5660:July 1, 1873:2... (2 Replies)
Discussion started by: mindfreak
2 Replies

5. Shell Programming and Scripting

Substituting comma "," for dot "." in a specific column when comma"," is a delimiter

Hi, I'm dealing with an issue and losing a lot of hours figuring out how i would solve this. I have an input file which looks like this: ('BLABLA +200-GRS','Serviço ','TarifaçãoServiço','wap.bla.us.0000000121',2985,0,55,' de conversão em escada','Dia','Domingos') ('BLABLA +200-GRR','Serviço... (6 Replies)
Discussion started by: poliver
6 Replies

6. UNIX for Dummies Questions & Answers

How to use the the join command to join multiple files by a common column

Hi, I have 20 tab delimited text files that have a common column (column 1). The files are named GSM1.txt through GSM20.txt. Each file has 3 columns (2 other columns in addition to the first common column). I want to write a script to join the files by the first common column so that in the... (5 Replies)
Discussion started by: evelibertine
5 Replies

7. Shell Programming and Scripting

Problem of Perl's "join" function

$ perl -e '@f=("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa","1","911"); print join("\t",@f)."\n";' aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa ... (5 Replies)
Discussion started by: carloszhang
5 Replies

8. UNIX for Dummies Questions & Answers

How to join 2 .txt files based on a common column?

Hi all, I'm trying to join two .txt file tab delimitated based on a common column. File 1 transcript_id gene_id length effective_length expected_count TPM FPKM IsoPct comp1000201_c0_seq1 comp1000201_c0 337 183.51 0.00 0.00 0.00 0.00 comp1000297_c0_seq1 ... (1 Reply)
Discussion started by: alisrpp
1 Replies

9. Shell Programming and Scripting

Delete all log files older than 10 day and whose first string of the first line is "MSH" or "<?xml"

Dear Ladies & Gents, I have a requirement to delete all the log files in /var/log/test directory that are older than 10 days and their first line begin with "MSH" or "<?xml" or "FHS". I've put together the following BASH script, but it's erroring out: for filename in $(find /var/log/test... (2 Replies)
Discussion started by: Hiroshi
2 Replies

10. Shell Programming and Scripting

Join, merge, fill NULL the void columns of multiples files like sql "LEFT JOIN" by using awk

Hello, This post is already here but want to do this with another way Merge multiples files with multiples duplicates keys by filling "NULL" the void columns for anothers joinning files file1.csv: 1|abc 1|def 2|ghi 2|jkl 3|mno 3|pqr file2.csv: 1|123|jojo 1|NULL|bibi... (2 Replies)
Discussion started by: yjacknewton
2 Replies
ExtUtils::Packlist(3perl)				 Perl Programmers Reference Guide				 ExtUtils::Packlist(3perl)

NAME
ExtUtils::Packlist - manage .packlist files SYNOPSIS
use ExtUtils::Packlist; my ($pl) = ExtUtils::Packlist->new('.packlist'); $pl->read('/an/old/.packlist'); my @missing_files = $pl->validate(); $pl->write('/a/new/.packlist'); $pl->{'/some/file/name'}++; or $pl->{'/some/other/file/name'} = { type => 'file', from => '/some/file' }; DESCRIPTION
ExtUtils::Packlist provides a standard way to manage .packlist files. Functions are provided to read and write .packlist files. The original .packlist format is a simple list of absolute pathnames, one per line. In addition, this package supports an extended format, where as well as a filename each line may contain a list of attributes in the form of a space separated list of key=value pairs. This is used by the installperl script to differentiate between files and links, for example. USAGE
The hash reference returned by the new() function can be used to examine and modify the contents of the .packlist. Items may be added/deleted from the .packlist by modifying the hash. If the value associated with a hash key is a scalar, the entry written to the .packlist by any subsequent write() will be a simple filename. If the value is a hash, the entry written will be the filename followed by the key=value pairs from the hash. Reading back the .packlist will recreate the original entries. FUNCTIONS
new() This takes an optional parameter, the name of a .packlist. If the file exists, it will be opened and the contents of the file will be read. The new() method returns a reference to a hash. This hash holds an entry for each line in the .packlist. In the case of old- style .packlists, the value associated with each key is undef. In the case of new-style .packlists, the value associated with each key is a hash containing the key=value pairs following the filename in the .packlist. read() This takes an optional parameter, the name of the .packlist to be read. If no file is specified, the .packlist specified to new() will be read. If the .packlist does not exist, Carp::croak will be called. write() This takes an optional parameter, the name of the .packlist to be written. If no file is specified, the .packlist specified to new() will be overwritten. validate() This checks that every file listed in the .packlist actually exists. If an argument which evaluates to true is given, any missing files will be removed from the internal hash. The return value is a list of the missing files, which will be empty if they all exist. packlist_file() This returns the name of the associated .packlist file EXAMPLE
Here's "modrm", a little utility to cleanly remove an installed module. #!/usr/local/bin/perl -w use strict; use IO::Dir; use ExtUtils::Packlist; use ExtUtils::Installed; sub emptydir($) { my ($dir) = @_; my $dh = IO::Dir->new($dir) || return(0); my @count = $dh->read(); $dh->close(); return(@count == 2 ? 1 : 0); } # Find all the installed packages print("Finding all installed modules... "); my $installed = ExtUtils::Installed->new(); foreach my $module (grep(!/^Perl$/, $installed->modules())) { my $version = $installed->version($module) || "???"; print("Found module $module Version $version "); print("Do you want to delete $module? [n] "); my $r = <STDIN>; chomp($r); if ($r && $r =~ /^y/i) { # Remove all the files foreach my $file (sort($installed->files($module))) { print("rm $file "); unlink($file); } my $pf = $installed->packlist($module)->packlist_file(); print("rm $pf "); unlink($pf); foreach my $dir (sort($installed->directory_tree($module))) { if (emptydir($dir)) { print("rmdir $dir "); rmdir($dir); } } } } AUTHOR
Alan Burlison <Alan.Burlison@uk.sun.com> POD ERRORS
Hey! The above document had some coding errors, which are explained below: Around line 39: =back without =over perl v5.14.2 2010-12-30 ExtUtils::Packlist(3perl)
All times are GMT -4. The time now is 01:38 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy