How to extract some parts of a file to create some outfile


 
# 15  
Old 05-10-2009
If you want to use Perl, here's another version that is more "understandable", as it relies less on regular expressions.
Code:
$i=0; $f=0;
while (<>){
 chomp;
 if (/\+*End of summary for file/ ){   # end of a block: stop copying
    $f=0;close(FH);next;
 }
 if (/Taxonomy:/ ) {                   # start of a block: open the next output file
     open(FH,">>","output_".$i++) or die "Cannot open for writing:$!\n";
     $f=1; next;
 }
 if ($f) {                             # inside a block: copy the line
    s/^\s+//; #get rid of spaces in front
    print FH $_."\n";
  }
}

To use the script:
Code:
# perl myscript.pl file
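
If you would rather call the script as ./myscript.pl file, the usual route is a #! line at the top of the file plus chmod. A minimal sketch, assuming a POSIX shell; hello.sh is a made-up stand-in so that the demo runs without Perl. For myscript.pl the first line would be #!/usr/bin/perl instead (assuming perl is installed at that path, which `which perl` can confirm).

```shell
# Sketch: turn a script file into a command that can be called as ./name input-file.
# "hello.sh" is a hypothetical stand-in, not a file from this thread.
cat > hello.sh <<'EOF'
#!/bin/sh
# "$1" is the first argument given on the command line
echo "input file is: $1"
EOF

chmod +x hello.sh        # mark the file as executable (needed only once)
./hello.sh input.txt     # the kernel reads the #! line and starts /bin/sh
```

Without the chmod step the shell typically refuses to run the file with "Permission denied".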


Last edited by ghostdog74; 05-11-2009 at 09:42 AM..
# 16  
Old 05-11-2009
Dear ghostdog74,
My main problem is that I am very new to programming and am still learning, so I am not familiar with either Perl or Python; both are new to me. Can you please help me understand how to make these files executable, like a script? With the other replies too, the code works when I type it directly into the terminal, but in every case I am still unable to turn it into an executable script that takes an input file as an argument, like $1.
Can you or anyone else please help me in this matter?
Thanks a lot for your help.
With best regards,
Mitra.
# 17  
Old 05-11-2009
The Perl code below should help you.

Code:
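open my $fh,"<","a.txt" or die "Cannot open a.txt: $!\n";
my ($flag,$n)=(0,0);
while(<$fh>){
	if(/Taxonomy:/){	# start of a block: open the next output file
		$n++;
		my $file=sprintf("outfile%d.txt",$n);
		open FH,">",$file or die "Cannot open $file: $!\n";
		$flag=1;
		next;
	}
	if(/\++/){	# a line containing '+' ends the block
		$flag=0;
		next;
	}
	print FH $_ if $flag==1;
}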

# 18  
Old 05-11-2009
Dear All,
Thanks for your replies, code and advice.
My main problem is that I am very new to programming and am still learning, so I am not familiar with either Perl or Python; both are new to me. Can anybody please help me understand how to make these files executable, like a script that I can call afterwards? Suppose I name the script code.perl or something similar.
Every time, I want to run it as ./code.perl input.txt
My first try was:
Code:
#!/usr/bin/perl -w

$#ARGV==1 or die "Usage: 2ndprocess-script 1st-output-file-as-inputfile\n";

$name=shift;

$inputfile="`pwd`/$name";

perl -ne '{$/=""; $i=1;
  while (/^Taxonomy:.(.*?)\+{11}/msgi) {
    $x = $1; $x =~ s/(^|\n)\s+/\1/g;
    open(OUT,">outfile".$i++.".txt"); print OUT $x; close(OUT);
  }}' inputfile;

and my second try was:
Code:
#!/usr/bin/perl -w

$#ARGV==1 or die "Usage: 2ndprocess-script 1st-output-file-as-inputfile\n";

$name=shift;

$inputfile="`pwd`/$name";

open $fh,"<", $inputfile;
my ($flag,$n)=(0,0);
while(<$fh>){
	if(/Taxonomy:/){
		$n++;
		$file=sprintf("outfile%s.txt",$n);
		open FH,"+>$file";
		$flag=1;
		next;
	}
	if(/\++/){
		$flag=0;
		next;
	}
	print FH $_ if $flag==1;
}

But neither of them worked as desired.
Can anybody please help me?
With best regards and many thanks,
Mitra.
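
Two things in the attempts above would stop them from running as intended. In Perl, $#ARGV is the index of the last argument, so with one file name it is 0, and the check $#ARGV==1 dies unless two arguments are given. The first attempt also pastes a shell command (perl -ne '...') into a Perl file, where it is not valid Perl. Since $1 is shell notation, one option is to keep the whole thing as a shell function or script. A minimal sketch, with several assumptions: awk is swapped in for the Perl loop, split_taxonomy is a made-up name, and sample.txt is invented test data that only imitates the "Taxonomy:" and "End of summary for file" markers used in this thread.

```shell
# Hypothetical helper: one output file per "Taxonomy:" block.
split_taxonomy() {
    # In sh, $# is the NUMBER of arguments (1 for a single file name).
    [ $# -eq 1 ] || { echo "Usage: split_taxonomy inputfile" >&2; return 1; }
    awk '
        # a block starts: pick the next output file name
        /Taxonomy:/               { n++; out = "outfile" n ".txt"; f = 1; next }
        # a block ends: close the current file and stop copying
        /End of summary for file/ { if (f) close(out); f = 0; next }
        # inside a block: drop leading blanks and copy the line
        f                         { sub(/^[ \t]+/, ""); print > out }
    ' "$1"
}

# Invented sample input, only for trying the function out.
cat > sample.txt <<'EOF'
Taxonomy:
    Gammaproteobacteria: 2767
        Alphaproteobacteria: 4123
+++++++++++End of summary for file one
Taxonomy:
    Bacilli: 25
+++++++++++End of summary for file two
EOF

split_taxonomy sample.txt
cat outfile1.txt
```

To call it as ./code.sh input.txt instead, the body of the function could go into a file that starts with #!/bin/sh and is made executable with chmod +x.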
# 19  
Old 05-11-2009
ghostdog74,
Thank you for your help. Your latest script works, but it still produces files with spaces in front of the lines. How can I get rid of the spaces?
The output looks like this:
mitra:testNextPart mitra$ more output_0
Code:
  Gammaproteobacteria: 2767
        Alphaproteobacteria: 4123
          Deltaproteobacteria: 1343
          Epsilonproteobacteria: 26
        Betaproteobacteria: 397
        unclassified Proteobacteria: 48
          Elusimicrobium: 2
        candidate division WWE1: 9
          Flavobacteria: 2358
          Sphingobacteria: 136
          Bacteroidia: 162
          environmental samples: 21
          Chlorobia: 77
        Planctomycetacia: 40
        Spirochaetes (class): 15
        Nitrospira (class): 1
        Bacilli: 25
  Not assigned: 1445
  No hits: 220253


Sorry to disturb you again and again.
Thanks a lot.
With best regards,
Mitra.

Last edited by iammitra; 05-11-2009 at 06:59 AM..
# 20  
Old 05-11-2009
Dear All,
I tried the code below to get rid of the spaces at the start of each line (see the previous post).
Code:
#!/usr/bin/perl -w

$#ARGV==0 or die "Usage: 2ndprocess-megan-script 1st-output-file-as-inputfile\n";

$i=1;
while (<>){
chomp;
   
 if (/Taxonomy:/ ) { 
     $x = $1; $x =~ s/^\s+|\s+$//g;
     open(OUT,">>","output_".$i++) or die "Cannot open for writing:$!\n";
     $f=1; next;
 }
 
 if (/\+*End of summary for file/ ){
    $f=0;close(OUT);next;
 }
 if ($f) { print OUT $_."\n";}
}

But it's not working.

Can anybody please help me get the output in this form:
Gammaproteobacteria: 2767
Alphaproteobacteria: 4123
Deltaproteobacteria: 1343
Epsilonproteobacteria: 26
Betaproteobacteria: 397
unclassified Proteobacteria: 48
Elusimicrobium: 2
candidate division WWE1: 9
Flavobacteria: 2358
Sphingobacteria: 136
Bacteroidia: 162
environmental samples: 21
Chlorobia: 77
Planctomycetacia: 40
Spirochaetes (class): 15
Nitrospira (class): 1
Bacilli: 25
Not assigned: 1445
No hits: 220253

Thanks a lot.
Best,
Mitra.
# 21  
Old 05-11-2009
Try this:

Code:
awk '{$1=$1}1' file
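
Why this works: assigning to any field ($1=$1) makes awk rebuild the line from its fields, joined by single spaces, so leading and trailing whitespace disappears; the trailing 1 is an always-true pattern whose default action is to print the line. Note that it also squeezes runs of spaces inside the line. A small sketch comparing it with a sed command that removes leading whitespace only; strip_all and strip_leading are made-up helper names, and the input line is adapted from the output shown earlier in the thread.

```shell
# Hypothetical helpers for comparing the two approaches.
strip_all()     { awk '{ $1 = $1 } 1' "$@"; }        # rebuild each line from its fields
strip_leading() { sed 's/^[[:space:]]*//' "$@"; }    # remove leading whitespace only

printf '        unclassified   Proteobacteria: 48\n' | strip_all
# prints: unclassified Proteobacteria: 48
printf '        unclassified   Proteobacteria: 48\n' | strip_leading
# prints: unclassified   Proteobacteria: 48
```

For the output files in this thread either one should do, since the names contain only single spaces.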
