Make Separated files from a single matrix - Perl


 
# 1  
Old 06-07-2013

Hey Masters,

Here is my input:

Code:
fragmentID	chromosome	start	end	HEL25E	TRIP1
r5GATC2L00037	chr2L	5301	6026	0.03	0.036
r5GATC2L00038	chr2L	6023	6882	-0.025	-0.041
r5GATC2L00040	chr2R	6921	7695	-0.031	0.005
r5GATC2L00042	chr2R	7715	8554	-0.006	-0.024
r5GATC2L00043	chr3L	8551	8798	0.042	0
r5GATC2L00044	chr3L	8795	9696	0.025	0.042
r5GATC2L00047	chr3R	9801	12443	-0.028	-0.027
r5GATC2L00048	chr4	12440	12695	0.004	0.08
r5GATC2L00050	chrX	12713	13068	0.01	0.1
r5GATC2L00051	chrX	13065	13295	0.137	-0.009

I want to make an output file for each column, from the 5th column to the end.

If we assume this is the entire input, I want to make two output files as shown below:

HEL25E.wig
Code:
chr2L	5301	6026	0.03
chr2L	6023	6882	-0.025
chr2R	6921	7695	-0.031
chr2R	7715	8554	-0.006
chr3L	8551	8798	0.042
chr3L	8795	9696	0.025
chr3R	9801	12443	-0.028
chr4	12440	12695	0.004
chrX	12713	13068	0.01
chrX	13065	13295	0.137

TRIP1.wig
Code:
chr2L	5301	6026	0.036
chr2L	6023	6882	-0.041
chr2R	6921	7695	0.005
chr2R	7715	8554	-0.024
chr3L	8551	8798	0
chr3L	8795	9696	0.042
chr3R	9801	12443	-0.027
chr4	12440	12695	0.08
chrX	12713	13068	0.1
chrX	13065	13295	-0.009


P.S. I would prefer Perl code so I can learn more Perl. If that's not possible, shell also works for me. The input is .txt and the desired outputs are .wig.

Thanks for your help.
# 2  
Old 06-07-2013
This will work for any number of columns you have after col #4.

Code:
#! /usr/bin/perl -w
use strict;

# Open input.txt for reading on filehandle FILE; stop if it cannot be opened
open(FILE, '<', 'input.txt') or die "Cannot open input.txt: $!";
my $totCols = 0;
my @hdrFields;
my @fields;
my @refs;

# Read each line of input.txt from the filehandle
while (<FILE>) {
    # Remove the new line character from the line using chomp
    chomp;
    # If it's the first line being read then...
    if ($. == 1) {
        # Split the first line according to whitespaces and store the fields in array @hdrFields
        @hdrFields = split /\s+/;
        # Store the total number of columns in $totCols.
        # $#hdrFields refers to the last index number of array @hdrFields
        $totCols = $#hdrFields + 1;
        
        # First 4 columns are fixed. So with 5th column as 1, iterate till the last column.
        # In the given example, there are 2 columns after col #4.
        # The for-loop below opens that many files for writing the output. In the given example it's 2.
        for (1 .. ($totCols - 4)) {
            # References to filehandles are stored in array @refs
            open ($refs[$_], '>', "$hdrFields[4 - 1 + $_].wig") or die "Cannot open $hdrFields[4 - 1 + $_].wig: $!";
        }
        
        # After parsing the first line and opening the file handles for writing skip to the next line of file.
        # This will prevent the below code from being executed.
        next;
    }
    
    # For all lines after the header line, do the below:
    # First split the line according to whitespaces and store the fields in array @fields.
    @fields = split /\s+/;
    
    # Again, loop once for each column after the 4th (the first 4 columns are fixed)
    for (1 .. ($totCols - 4)) {
        # Retrieve the reference to file from array @refs
        my $fh = $refs[$_];
        # Print to the corresponding file: columns 2 to 4 (chromosome, start, end), then the corresponding value column. In this case, either col #5 or col #6.
        print $fh join("\t", @fields[1..3]) . "\t$fields[4 - 1 + $_]\n";
    }
}

# Finally close all the files that were opened for writing
for (1 .. ($totCols - 4)) {
    close $refs[$_];
}

# Finally close the input file that was opened for reading
close FILE;

After running this script, find the files in the current directory:
Code:
[user@host ~]$ ls -l HEL25E.wig TRIP1.wig
-rw-r--r-- 1 user domain Users 226 Jun  7 16:32 HEL25E.wig
-rw-r--r-- 1 user domain Users 221 Jun  7 16:32 TRIP1.wig
[user@host ~]$
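
One quick way to double-check the result is to compare line counts: each .wig file should have exactly one line fewer than input.txt, because only the header line is dropped. A minimal shell sketch of that check follows; `check_wig` is a made-up helper name, not something from the script above.

```shell
#!/bin/sh
# check_wig is a hypothetical helper name used only for this sketch.
# Usage: check_wig input.txt HEL25E.wig TRIP1.wig
check_wig() {
    infile=$1
    shift
    # Number of data rows = input lines minus the header line
    expected=$(( $(wc -l < "$infile") - 1 ))
    for f in "$@"; do
        # The arithmetic expansion strips any padding that wc may add
        actual=$(( $(wc -l < "$f") ))
        if [ "$actual" -eq "$expected" ]; then
            echo "$f: OK ($actual lines)"
        else
            echo "$f: expected $expected lines, found $actual"
            return 1
        fi
    done
}
```

Calling `check_wig input.txt HEL25E.wig TRIP1.wig` prints one line per file and returns non-zero at the first mismatch. It only compares line counts, so it will not catch a wrong column.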


Last edited by balajesuri; 06-07-2013 at 08:11 AM..
# 3  
Old 06-07-2013
Hi,

For example (shell script):
Code:
oldIFS=$IFS;IFS=$'\n';for linea in $(cat input.txt);do echo "`echo $linea| awk '{print $1,$2,$3,$4}'`" >> salida.out;\
 echo "`echo $linea| awk '{print $1,$2,$3,$5}'`">>salida2.out ;done ;IFS=$oldIFS

I have no idea how to convert it to .wig

Regards
JJ

Last edited by vbe; 06-07-2013 at 08:58 AM.. Reason: code tags next time please
# 4  
Old 06-07-2013
balajesuri!! OMG! When I saw all your comments I was stunned! Man, I appreciate what you do! Big one!
I ran it on the sample input file and it works perfectly. I am pretty sure it works for the bigger file as well.
Once more, thanks a lot for your help, and even more for your time!

---------- Post updated at 01:33 PM ---------- Previous update was at 01:26 PM ----------

Dear macorval, thanks for the script that you provided. But it doesn't give what I needed, and I have many more columns in my input file!
# 5  
Old 06-07-2013
A bit clunky, but you could try:-
Code:
#!/bin/ksh

InFile=input.txt                                # Input file (assumed name, adjust as needed)

AllCols=`head -1 $InFile|wc -w`                 # Count all the columns to control the loop

CurCol=5                                        # Set first interesting column
while [ $CurCol -le $AllCols ]                  # Start loop 
do
   OutFile=`head -1 $InFile | \
               cut -f$CurCol`.wig               # Get output filename (cut splits on tabs by default)

   sed 1d $InFile | cut -f2-4,$CurCol > $OutFile # Skip header; write columns 2-4 plus the current column
   ((CurCol=$CurCol+1))                         # Move counter to next column
done
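
The loop above reads the input once per column. If that gets slow with many columns, a single-pass awk version is one alternative. This is only a sketch under a few assumptions: the input is tab-separated, the first line is the header, and `split_columns` is a hypothetical function name chosen for the example.

```shell
#!/bin/sh
# split_columns is a hypothetical helper name, not part of the thread.
# Usage: split_columns input.txt
split_columns() {
    awk -F'\t' -v OFS='\t' '
        NR == 1 {
            # Header line: remember one output file name per column from 5 onwards
            for (i = 5; i <= NF; i++) out[i] = $i ".wig"
            ncol = NF
            next
        }
        {
            # Data lines: write chromosome, start, end plus the i-th value
            for (i = 5; i <= ncol; i++) print $2, $3, $4, $i > out[i]
        }
    ' "$1"
}
```

Run it as `split_columns input.txt` in the directory where the .wig files should appear. It keeps one output file open per value column, so with a very wide matrix some awk implementations may hit the open-file limit; in that case the per-column loop is the safer choice.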


I hope that this helps


Robin
Liverpool/Blackburn