Deleting files that don't contain particular text strings / more than one instance of a string
Post 302368398 by kmkocot on Wednesday 4th of November 2009 07:52:40 PM

Hi all,

I have a directory containing many subdirectories each named like KOG#### where # represents any digit 0-9. There are several files in each KOG#### folder but the one I care about is named like KOG####_final.fasta. I am trying to write a script to copy all of the KOG####_final.fasta files to the same directory and then apply some filters to them.
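A minimal sketch of the copy step described above, assuming the KOG#### folders sit directly under one parent directory and a POSIX shell is available. The function name collect_final_fasta and its two arguments (source and destination directory) are made up for illustration and do not come from the post.

```shell
#!/bin/sh
# Hypothetical helper: copy every KOG####_final.fasta found inside a
# KOG#### subdirectory of SRC into the single directory DEST.
collect_final_fasta() {
    src=$1
    dest=$2
    mkdir -p "$dest" || return 1
    # The glob requires "KOG" plus exactly four digits for folder and file.
    for f in "$src"/KOG[0-9][0-9][0-9][0-9]/KOG[0-9][0-9][0-9][0-9]_final.fasta; do
        [ -f "$f" ] || continue    # skip the literal pattern when nothing matches
        cp -- "$f" "$dest"/ || return 1
    done
}
```

Called as `collect_final_fasta /path/to/kogs /path/to/all_fasta` (both paths are placeholders), this leaves the originals untouched, so any later filtering only ever deletes copies.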

For the filters, I want to go through each of the KOG####_final.fasta files and remove any of them that don't contain at least 10 different text strings that are specified in a text file or somewhere in the script. I'd also like to have a filter that removes files that have more than one instance of any one string.
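One way the two filters could be sketched in POSIX shell. This is an illustration under stated assumptions, not a definitive answer: the strings live in a text file with one fixed string per line, an "instance" means a line containing the string (grep -c counts matching lines, not individual occurrences), and matching is by substring. The names keep_file and filter_dir are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: exit status 0 when FILE should be kept.
# LIST holds one fixed string per line; MIN is the required number of
# distinct strings (defaults to 10, the figure given in the post).
# Assumption: an "instance" is a line that contains the string.
keep_file() {
    file=$1
    list=$2
    min=${3:-10}
    found=0
    while IFS= read -r s || [ -n "$s" ]; do
        [ -n "$s" ] || continue                   # ignore blank lines in LIST
        # grep -c prints 0 but exits 1 on no match, hence the "|| :".
        n=$(grep -cF -e "$s" -- "$file" || :)
        if [ "$n" -gt 1 ]; then return 1; fi      # string repeated: reject
        if [ "$n" -eq 1 ]; then found=$((found + 1)); fi
    done < "$list"
    [ "$found" -ge "$min" ]
}

# Hypothetical helper: delete every *_final.fasta in DIR that fails keep_file.
filter_dir() {
    dir=$1
    list=$2
    min=${3:-10}
    for f in "$dir"/*_final.fasta; do
        [ -f "$f" ] || continue
        keep_file "$f" "$list" "$min" || rm -- "$f"
    done
}
```

A call such as `filter_dir /path/to/all_fasta strings.txt 10` (placeholder names) would apply both filters in one pass. Replacing `rm --` with `echo` first gives a dry run that only lists the files that would be deleted.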

I know this is a lot but I'm really stumped as to where to start on this one. Any assistance in getting started with this would be much appreciated!

Thanks!
Kevin
 

Bio::AlignIO::fasta(3pm)        User Contributed Perl Documentation        Bio::AlignIO::fasta(3pm)

NAME
       Bio::AlignIO::fasta - fasta MSA Sequence input/output stream

SYNOPSIS
       Do not use this module directly. Use it via the Bio::AlignIO class.

DESCRIPTION
       This object can transform Bio::SimpleAlign objects to and from fasta flat file databases.

       This is for the fasta alignment format, not for the FastA sequence analysis program. To process the alignments from FastA (FastX, FastN, FastP, tFastA, etc) use the Bio::SearchIO module.

FEEDBACK
   Support
       Please direct usage questions or support issues to the mailing list:

       bioperl-l@bioperl.org

       rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address it. Please include a thorough description of the problem with code and data examples if at all possible.

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution. Bug reports can be submitted via the web:

       https://redmine.open-bio.org/projects/bioperl/

AUTHORS
       Peter Schattner

APPENDIX
       The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _

   next_aln
       Title   : next_aln
       Usage   : $aln = $stream->next_aln
       Function: returns the next alignment in the stream.
       Returns : Bio::Align::AlignI object - returns 0 on end of file or on error
       Args    : -width => optional argument to specify the width sequence will be written (60 chars by default)

       See Bio::Align::AlignI

   write_aln
       Title   : write_aln
       Usage   : $stream->write_aln(@aln)
       Function: writes the $aln object into the stream in fasta format
       Returns : 1 for success and 0 for error
       Args    : Bio::Align::AlignI object

       See Bio::Align::AlignI

   _get_len
       Title   : _get_len
       Usage   :
       Function: determine number of alphabetic chars
       Returns : integer
       Args    : sequence string

   width
       Title   : width
       Usage   : $obj->width($newwidth)
                 $width = $obj->width;
       Function: Get/set width of alignment
       Returns : integer value of width
       Args    : on set, new value (a scalar or undef, optional)

perl v5.14.2                              2012-03-02                       Bio::AlignIO::fasta(3pm)