Paste columns based on common column: multiple files
Post 303009519 by genome on Friday 15th of December 2017 09:27:31 AM
Code:
for i in {1..22}
do
    #--iterate over chromosomes
    SAVEtemp=""

    #---collect the matching files, one per line, into a sorted array (mapfile needs bash 4+)---
    mapfile -t files_info < <(find "$input_dir" -name "*_CHR$i.info" | sort)
    
    # join needs both inputs sorted on the join field (column 2)
    join -j 2 -o 1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,1.10,1.11,1.12,2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,2.10,2.11,2.12 "${files_info[0]}" "${files_info[1]}" > "$output_dir/tempCHR_$i.info"

    SAVEtemp="$output_dir/tempCHR_$i.info"
    printf '%s joined for first two files\n' "$i"

    for (( x=2; x<${#files_info[@]}; x++ ))
    do
        # Only columns 1-12 of the running result are carried forward; that is
        # enough here because only the key in column 2 is read from it later.
        join -j 2 -o 1.1,1.2,1.3,1.4,1.5,1.6,1.7,1.8,1.9,1.10,1.11,1.12,2.1,2.2,2.3,2.4,2.5,2.6,2.7,2.8,2.9,2.10,2.11,2.12 "$SAVEtemp" "${files_info[$x]}" > "$output_dir/tempchr${i}_$x.info"

        SAVEtemp="$output_dir/tempchr${i}_$x.info"
    done
    mv "$SAVEtemp" "$output_dir/joined_CHR$i.info"
    SAVEtemp="$output_dir/joined_CHR$i.info"
    printf 'CHR %s is done for joining\n' "$i"
    
    for w in $(awk '{print $2}' "$SAVEtemp" | grep -v "rs_id")
    do
        st="" # start with an empty string to concatenate onto

        for (( x=0; x<${#files_info[@]}; x++ ))
        do
            #--loop through files to grep the string
            temp_st=$(grep -w "$w" "${files_info[$x]}")
            st="$st $temp_st"
        done
        echo "$st" >> "$output_dir/cols_joined_CHR$i.info"
    done

    printf 'Processed files for chromosome %s!\n' "$i"
done

I left it running last evening, and the script still has not finished processing chromosome 1. Terrible.
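
The slow part is most likely the last loop, which runs one grep per rs_id per file, so every file is rescanned once for each ID. A minimal sketch of a single-pass alternative in awk is below. The function name paste_common is made up for illustration. It assumes the key is in column 2, that header rows carry the literal rs_id in that column, and that all rows fit in memory. Unlike the grep loop, it keeps only the first row per key in each file and matches the key in column 2 only.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): for every key in column 2
# that occurs in ALL input files, print the matching rows side by side.
paste_common() {
    awk '
        FNR == 1 { nfile++ }                 # a new input file starts
        $2 == "rs_id" { next }               # skip header rows
        !(($2, nfile) in seen) {             # first time this key appears in this file
            seen[$2, nfile] = 1
            count[$2]++
            row[$2] = (count[$2] == 1) ? $0 : row[$2] " " $0
            if (nfile == 1) order[++n] = $2  # remember key order from the first file
        }
        END {
            # print only keys that were seen in every file
            for (k = 1; k <= n; k++)
                if (count[order[k]] == nfile) print row[order[k]]
        }
    ' "$@"
}
```

Inside the chromosome loop this could be called as paste_common "${files_info[@]}" > "$output_dir/cols_joined_CHR$i.info". Each file is then read once, and neither the join steps nor sorted input are needed for this part.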
 
