How to extract some parts of a file to create some outfile


 
# 8  
Old 05-08-2009
Hello durden_tyler,
your Perl code works. Thanks a lot. But there is still one problem.
As I mentioned, in my input file there is a varying number of spaces in front of the desired lines.
Is there any way to get rid of these spaces directly?
Now it gives:
Code:
mitra:~ mitra$ cat outfile1.txt
          Gammaproteobacteria: 2767
       Alphaproteobacteria: 4123
         Deltaproteobacteria: 1343
                         Epsilonproteobacteria: 26
     Betaproteobacteria: 397
                        unclassified Proteobacteria: 48
                  Spirochaetes (class): 15
        Nitrospira (class): 1
        Bacilli: 25
  Not assigned: 1445
  No hits: 220253

Thank you very much for your help.
Best Wishes,
Mitra.

Last edited by Franklin52; 05-10-2009 at 01:07 PM.. Reason: adding code tags
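If an output file has already been written with the leading spaces in it, one possible after-the-fact cleanup is a `sed` substitution. This is only a sketch: `sample.txt` and `sample_trimmed.txt` are hypothetical file names, and the data is a stand-in modeled on the output above.

```shell
# Create a small stand-in file with uneven leading spaces
# (hypothetical data modeled on the output shown above).
printf '%s\n' '          Gammaproteobacteria: 2767' \
              '       Alphaproteobacteria: 4123' \
              '  Not assigned: 1445' > sample.txt

# Delete any run of whitespace at the start of each line.
sed 's/^[[:space:]]*//' sample.txt > sample_trimmed.txt

cat sample_trimmed.txt
```

Writing to a second file keeps the original intact, so the result can be compared before anything is overwritten.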
# 9  
Old 05-08-2009
Sorry, I don't know why all the spaces disappear here. But there are several spaces (not the same number on every line) in front of the desired lines.
# 10  
Old 05-08-2009
Hello ghostdog74,
I will try this modification and see if it works.
Thank you very much.
Best,
Mitra.
# 11  
Old 05-09-2009
Quote:
Originally Posted by iammitra
...
As I mentioned, in my input file there is a varying number of spaces in front of the desired lines.
Is there any way to get rid of these spaces directly?
...
Here's one way to do it:

Code:
perl -ne '{$/=""; $i=1;
  while (/^Taxonomy:.(.*?)\+{11}/msgi) {
    $x = $1; $x =~ s/(^|\n)\s+/\1/g;
    open(OUT,">outfile".$i++.".txt"); print OUT $x; close(OUT);
  }}' input.txt

Testing on sample data:

Code:
$ 
$ cat input.txt
Number of disabled taxa: 9
Loading mapping file: ncbi.map
Load mapping:                 
taxId2TaxLevel: 469951        
--- Subsample reads (20%): 66680 of 334386
Processing: tree-from-summary             
Running tree-from-summary algorithm       
Taxonomy:                                 
    Gammaproteobacteria: 2767             
Alphaproteobacteria: 4123                 
  Deltaproteobacteria: 1343               
     Epsilonproteobacteria: 26            
 Not assigned: 1445                       
    No hits: 220253                       
+++++++++++End of summary for file: B-Red-sum.txt
--- Subsample reads (20%): 67037 of 334386       
Processing: tree-from-summary                    
Running tree-from-summary algorithm
Taxonomy:
      Gammaproteobacteria: 2809
  Alphaproteobacteria: 4001
        Deltaproteobacteria: 1208
    Epsilonproteobacteria: 15
Not assigned: 299
    No hits: 461890
+++++++++++End of summary for file: B-Red-sum.txt
::::: and so on
$
$ perl -ne '{$/=""; $i=1;
  while (/^Taxonomy:.(.*?)\+{11}/msgi) {
    $x = $1; $x =~ s/(^|\n)\s+/\1/g;
    open(OUT,">outfile".$i++.".txt"); print OUT $x; close(OUT);
  }}' input.txt
$
$
$ cat outfile1.txt
Gammaproteobacteria: 2767
Alphaproteobacteria: 4123
Deltaproteobacteria: 1343
Epsilonproteobacteria: 26
Not assigned: 1445
No hits: 220253
$
$ cat outfile2.txt
Gammaproteobacteria: 2809
Alphaproteobacteria: 4001
Deltaproteobacteria: 1208
Epsilonproteobacteria: 15
Not assigned: 299
No hits: 461890
$
$

Hope that helps,
tyler_durden

____________________________________________________________
"This is your life and it's ending one minute at a time."
# 12  
Old 05-09-2009
Quote:
Originally Posted by iammitra
Sorry, I don't know why all the spaces disappear here. ...
The spaces disappear here because you do not enclose your file data or code within the "code" tags. (Notice how the actual code posted by the forum members has a nice little box around it with the title "Code:" at the top.)

If you enclose the desired text in "code" tags, with no spaces between "[", "/", "code" and "]":

[ code ] <your_text_here> [ / code ]

then the leading spaces will be preserved.

Alternatively, if you don't feel like typing the "code" tags yourself, you can do this:
(a) select the desired text, and
(b) click on the "#" icon right above the response area of your Message Box.
The script on the web page will insert the "code" tags for you.

HTH,
tyler_durden

# 13  
Old 05-10-2009
Hello durden_tyler,
First, I want to thank you for your help. Thanks a lot. I am very new to scripting. Can you please explain the part (.*?)\+{11}/msgi) of your code in this thread?

I am trying to learn, so it would be really helpful. One more question: how can I make this script executable?

My attempt was:
#!/usr/bin/perl -w

$#ARGV==1 or die "Usage: 2ndprocess-script 1st-output-file-as-inputfile\n";

$input=shift;

perl -ne '{$/=""; $i=1;
while (/^Taxonomy:.(.*?)\+{11}/msgi) {
$x = $1; $x =~ s/(^|\n)\s+/\1/g;
open(OUT,">outfile".$i++.".txt"); print OUT $x; close(OUT);
}}' $1;

-----------------------------
which didn't work.
Can you please help me to learn this?
Thank you very much once again.
Have a nice time.
Best wishes,
Mitra
# 14  
Old 05-10-2009
Hello durden_tyler,
First, I want to thank you for your help. Thanks also for the help with posting; now I can use that. Thanks a lot. I am very new to scripting. Can you please explain the part (.*?)\+{11}/msgi) of your code in this thread?

I am trying to learn, so it would be really helpful. One more question: how can I make this script executable?

My attempt was:

Quote:
#!/usr/bin/perl -w

$#ARGV==1 or die "Usage: 2ndprocess-script 1st-output-file-as-inputfile\n";

$input=shift;

perl -ne '{$/=""; $i=1;
while (/^Taxonomy:.(.*?)\+{11}/msgi) {
$x = $1; $x =~ s/(^|\n)\s+/\1/g;
open(OUT,">outfile".$i++.".txt"); print OUT $x; close(OUT);
}}' $1;
-----------------------------
which didn't work.
Can you please help me to learn this?
Thank you very much once again.
Have a nice time.
Best wishes,
Mitra