The UNIX and Linux Forums  

05-13-2009
iammitra

Please help to write an executable script for extracting some parts of a file

Hi All,
I am very new to programming and need some help.
I have an input file like this:


Code:
Number of disabled taxa: 9
Loading mapping file: ncbi.map
Load mapping:
taxId2TaxLevel: 469951
--- Subsample reads (20%): 66680 of 334386
Processing: tree-from-summary
Running tree-from-summary algorithm
Taxonomy:
          Gammaproteobacteria: 2767
       Alphaproteobacteria: 4123
         Deltaproteobacteria: 1343
                         Epsilonproteobacteria: 26
     Betaproteobacteria: 397
                        unclassified Proteobacteria: 48
                  Spirochaetes (class): 15
        Nitrospira (class): 1
        Bacilli: 25
  Not assigned: 1445
  No hits: 220253
+++++++++++End of summary for file: B-Red-sum.txt
--- Subsample reads (20%): 67037 of 334386
Processing: tree-from-summary
Running tree-from-summary algorithm
Taxonomy:
    Gammaproteobacteria: 2809
                Alphaproteobacteria: 4001
       Deltaproteobacteria: 1208
Epsilonproteobacteria: 15
Not assigned: 299
No hits: 461890
+++++++++++End of summary for file: B-Red-sum.txt

... and so on.

From this I want to create one output file per block. outfile1.txt should contain everything from the line after the first "Taxonomy:" up to, but not including, the next "+++++++++++End" line, with the leading spaces removed from every line; outfile2.txt should contain the second block, and so on.

So the desired output is:
outfile1.txt
Gammaproteobacteria: 2767
Alphaproteobacteria: 4123
Deltaproteobacteria: 1343
Epsilonproteobacteria: 26
Betaproteobacteria: 397
unclassified Proteobacteria: 48
Spirochaetes (class): 15
Nitrospira (class): 1
Bacilli: 25
Not assigned: 1445
No hits: 220253

outfile2.txt
Gammaproteobacteria: 2809
Alphaproteobacteria: 4001
Deltaproteobacteria: 1208
Epsilonproteobacteria: 15
Not assigned: 299
No hits: 461890

and so on.

I have already received a lot of help on this forum; thanks to durden_tyler, ghostdog74 and summer_cherry.
There is still a small problem, though, so I am posting again.
Can anybody please help me with this?

The Perl code below works without the marked line, but the output files still have the spaces at the start of each line. I tried to remove them with the marked line, but that line causes an error.


--------------------------------------------------------------------------

Code:
#!/usr/bin/perl -w

# Expect exactly one argument: the input file
$#ARGV==0 or die "Usage: 2ndprocess-megan-script 1st-output-file-as-inputfile\n";

$i=1;    # number of the next output file
$f=0;    # 1 while inside a Taxonomy block
while (<>){
    chomp;

    if (/Taxonomy:/ ) {    # start of a block
        $x = $1; $x =~ s/^\s+|\s+$//g;    ## this line causes the error; without it the script works
        open(OUT,">>","output_".$i++) or die "Cannot open for writing:$!\n";
        $f=1; next;
    }

    if (/\+*End of summary for file/ ){    # end of a block
        $f=0; close(OUT); next;
    }
    if ($f) { print OUT $_."\n"; }    # copy the lines inside a block
}
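
A note on the marked line, offered as a sketch rather than a definitive fix: `$1` is only set by a capturing group in a successful match, and `/Taxonomy:/` has no group, so `$x = $1` copies an undefined value and the substitution then runs on `$x` instead of on the data lines in `$_`. The trimming has to be applied to each line inside the block. The same line-by-line logic could look like this in Python; the function names `split_taxonomy_blocks` and `write_blocks` and the `outfileN.txt` naming are assumptions made for illustration.

```python
def split_taxonomy_blocks(lines):
    """Return a list of blocks; each block is a list of stripped lines
    found between a 'Taxonomy:' line and the next '+++++++++++End' line."""
    blocks = []
    current = None              # None while outside a block
    for raw in lines:
        line = raw.strip()      # removes leading spaces and the newline
        if line.startswith("Taxonomy:"):
            current = []        # start a new block, skip the header itself
        elif line.startswith("+++++++++++End"):
            if current is not None:
                blocks.append(current)
            current = None      # block finished
        elif current is not None and line:
            current.append(line)
    return blocks


def write_blocks(blocks, prefix="outfile"):
    """Write block N to <prefix>N.txt (the naming scheme is an assumption)."""
    for n, block in enumerate(blocks, start=1):
        with open("%s%d.txt" % (prefix, n), "w") as out:
            out.write("\n".join(block) + "\n")
```

To use it on a file, something like `write_blocks(split_taxonomy_blocks(open("input.txt")))` should be enough, where `input.txt` stands for the real input file.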

--------------------------------------------------------

Another piece of code, by durden_tyler, works perfectly, but only when typed in the terminal; I failed to turn it into an executable script.
----------------------------------------------------------

Code:
perl -ne '{$/=""; $i=1;
  while (/^Taxonomy:.(.*?)\+{11}/msgi) {
    $x = $1; $x =~ s/(^|\n)\s+/\1/g;
    open(OUT,">outfile".$i++.".txt"); print OUT $x; close(OUT);
  }}' input.txt

I tried it this way (below), but couldn't make it work.

Code:
#!/usr/bin/perl -w

$#ARGV==0 or die "Usage: 2ndprocess-script 1st-output-file-as-inputfile\n";

$input=shift;

perl -ne '{$/=""; $i=1;
while (/^Taxonomy:.(.*?)\+{11}/msgi) {
$x = $1; $x =~ s/(^|\n)\s+/\1/g;
open(OUT,">outfile".$i++.".txt"); print OUT $x; close(OUT);
}}' $1;
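
The attempt above cannot run as written because `perl -ne '...' $1` is a shell command line, not Perl code; inside a script, the body of the one-liner has to become the script itself and read the file named in its first argument. Here is a hedged sketch of the same whole-file regex idea in Python. The names `BLOCK_RE`, `extract_blocks` and `main` are made up for this example, and the pattern assumes every block ends with a line that starts with at least eleven `+` characters, as in the sample above.

```python
import re
import sys

# 'Taxonomy:' at the start of a line, then everything (non-greedy) up to
# the next line starting with eleven '+' characters.
# DOTALL lets '.' cross newlines; MULTILINE lets '^' match at each line start.
BLOCK_RE = re.compile(r"^Taxonomy:[^\n]*\n(.*?)^\+{11}", re.MULTILINE | re.DOTALL)


def extract_blocks(text):
    """Return each block as one string with per-line whitespace trimmed."""
    blocks = []
    for match in BLOCK_RE.finditer(text):
        lines = [ln.strip() for ln in match.group(1).splitlines()]
        blocks.append("\n".join(ln for ln in lines if ln) + "\n")
    return blocks


def main(argv):
    """Write block N of the file named in argv[1] to outfileN.txt."""
    if len(argv) != 2:
        sys.stderr.write("Usage: %s <input file>\n" % argv[0])
        return 1
    with open(argv[1]) as handle:
        text = handle.read()          # read the whole file at once
    for n, block in enumerate(extract_blocks(text), start=1):
        with open("outfile%d.txt" % n, "w") as out:
            out.write(block)
    return 0
```

Saved with a `#!/usr/bin/env python3` first line and a final `sys.exit(main(sys.argv))`, then made executable with `chmod +x`, this should behave like an ordinary command that takes the input file as its only argument.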

-------------------------------------------------------------------

With the Python suggestion I couldn't get it to work properly either:
% code.py
Usage: code.py <input file>
%

---------------------------------------------------------

Code:
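#! /usr/bin/python
import sys

# Expect exactly one argument: the input file
if len(sys.argv) != 2:
    sys.exit("Usage: code.py <input file>")

f=0;i=0    # f: 1 while inside a block; i: output file counter
for line in open(sys.argv[1]):
    line=line.strip()    # drop leading spaces and the newline
    if line.startswith("+++++++++++"):    # end of a block
        if f: o.close()
        f=0
        continue
    if "Taxonomy:" in line:    # start of a new block
        f=1;i=i+1
        o=open("out_"+str(i)+".txt","w")
        continue    # do not copy the "Taxonomy:" line itself
    if f:
        o.write(line+"\n")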

Please help with this. I want to end up with one executable script (Perl, Python or bash).
Many thanks in advance.
Best wishes,
Mitra
 
