In each line of the file, some words consist of exactly the same letters as the first word, but with zero or more "_+" inserted. I am interested in those words and want to remove all the others.
Example:
I want to get this:
Hi, All
I have a huge file, 450G in size. It is tab-delimited, in the format below:
x1 A 50020 1
x1 B 50021 8
x1 C 50022 9
x1 A 50023 10
x2 D 50024 5
x2 C 50025 7
x2 F 50026 8
x2 N 50027 1
:
:
Now, I want to extract a subset from this file. In this subset, column 1 is x10, column 2 is... (3 Replies)
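For the first condition in the post above (column 1 equal to x10), a minimal sketch with awk could look like this. The function name `extract_by_col1` and the file names in the usage note are hypothetical, and the column 2 condition is left out because the post is cut off before stating it.

```shell
# Print rows of a tab-delimited file whose first column equals a given key.
# extract_by_col1 is a hypothetical helper name. awk streams the input line
# by line, so memory use stays flat even for a very large file.
extract_by_col1() {
    key=$1
    file=$2
    # Appending "" forces a string comparison, so "x10" never matches "x100".
    awk -F '\t' -v key="$key" '($1 "") == (key "")' "$file"
}
```

Usage would be along the lines of `extract_by_col1 x10 huge.tsv > subset.tsv`. A further condition on column 2 could be added to the awk pattern with `&&` once it is known.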
Hello everyone. I'm new to the boards, I hope I can get and possibly give some help through these forums.
I need some help.
I have two CSV files, let's call them File A and File B.
This is the structure for File A:
ID, VAR1, VAR2, VAR3 - VAR50 (where the VAR 1-VAR50 are either 0 or 1)
... (1 Reply)
I have a file that has the words I want to find in other files (but let's say I just want to find my words in a single file). Those words are IDs, so if my word is ZZZ4, outputs like aaZZZ4, ZZZ4bb, aaZZZ4bb, ZZ4, ZZZ, ZyZ4, ZZZ4.8 (or anything like that) WON'T BE USEFUL.
I need the whole word... (6 Replies)
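One way to approach the whole-word requirement above is to compare complete whitespace-delimited tokens instead of using a regex word boundary. `grep -w` treats `.` as a boundary, so it would still report ZZZ4.8 for the ID ZZZ4. The sketch below assumes one ID per line in the ID file and whitespace-separated tokens in the searched file; `match_whole_ids` is a hypothetical name.

```shell
# Print lines of a target file that contain at least one ID as a complete
# whitespace-delimited token. match_whole_ids is a hypothetical helper name.
match_whole_ids() {
    ids=$1       # file with one ID per line (assumption)
    target=$2    # file to search
    awk '
        # First file: remember every non-empty ID.
        FILENAME == ARGV[1] { if ($1 != "") wanted[$1]; next }
        # Second file: print the line as soon as one field is a wanted ID.
        {
            for (i = 1; i <= NF; i++)
                if ($i in wanted) { print; break }
        }
    ' "$ids" "$target"
}
```

Because the comparison is an exact token match, an ID followed directly by punctuation (for example `ZZZ4,`) is not reported either. Whether that is desirable depends on the data.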
Dear all,
I have a file like the one below: number of rows = 420, number of letters in each row = 100000. There are no spaces between the letters.
What I want is the 75000th to the 85000th letter in each row.
How can I do that? Thanks a lot!
... (2 Replies)
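A character range per row, as asked above, can be sketched with `cut`. This assumes each letter is a single byte (plain ASCII); the wrapper name `slice_columns` is hypothetical.

```shell
# Print characters START through END (1-based, inclusive) of every line.
# slice_columns is a hypothetical helper name; single-byte letters assumed.
slice_columns() {
    start=$1
    end=$2
    file=$3
    cut -c "${start}-${end}" "$file"
}
```

For the case in the post this would be called as `slice_columns 75000 85000 input.txt`, where `input.txt` stands in for the real file.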
I am compiling a Fortran program using gfortran, and the result looks as below.
I want to write a bash or awk script that will scan the information and output
only the problems within a range of line numbers.
Example: If I specify the file createmodl.f08, start line 1000 and end line 1100, I will... (8 Replies)
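A rough sketch of such a filter follows. It assumes each diagnostic starts with a location line of the form `file:LINE:COL:` (or `file:LINE.COL:` in older gfortran releases), and that the lines after it belong to that diagnostic until the next location line. The compiler output itself is not shown in the post, so this format is an assumption, and `filter_diagnostics` is a hypothetical name.

```shell
# Print only the diagnostics for one source file whose line number lies in
# [first, last]. filter_diagnostics is a hypothetical helper name.
filter_diagnostics() {
    src=$1      # source file name as it appears in the compiler output
    first=$2    # first line number of the range
    last=$3     # last line number of the range
    log=$4      # saved compiler output
    awk -v src="$src" -v first="$first" -v last="$last" '
        # Location line: "<file>:<line>" followed by ":" or "." and a column.
        /^[^ :]+:[0-9]+[:.][0-9]+/ {
            n = index($0, ":")
            name = substr($0, 1, n - 1)
            rest = substr($0, n + 1)
            match(rest, /^[0-9]+/)
            line = substr(rest, 1, RLENGTH) + 0
            show = (name == src && line >= first + 0 && line <= last + 0)
        }
        show    # print the location line and everything up to the next one
    ' "$log"
}
```

With the compiler output saved to a file, for example via `gfortran createmodl.f08 2> build.log`, the call for the example in the post would be `filter_diagnostics createmodl.f08 1000 1100 build.log`.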
Hi. I have a large data file. The first column has unique identifiers. I have approximately 5 of these files, and they have a varying number of columns in their rows. I need to extract ~300 of the rows into a separate file. I'm not looking for something that would do all 5 files at once, but... (7 Replies)
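Pulling a known set of rows by their first-column identifier can be sketched as a two-file awk pass. The sketch assumes the wanted identifiers are listed one per line in a separate file and that columns are whitespace-separated; `extract_rows_by_id` is a hypothetical name.

```shell
# Print the rows of a data file whose first column appears in an ID list.
# extract_rows_by_id is a hypothetical helper name.
extract_rows_by_id() {
    ids=$1     # file with one identifier per line (assumption)
    data=$2    # data file; rows may have any number of columns
    awk '
        # First file: remember every non-empty identifier.
        FILENAME == ARGV[1] { if ($1 != "") wanted[$1]; next }
        # Second file: print rows whose first column is a wanted identifier.
        $1 in wanted
    ' "$ids" "$data"
}
```

Since only the first field is inspected, rows with different column counts are handled the same way. The function can be run once per data file.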
I need to know if file1 is a subset of file2, i.e. whether all the contents of file1 are present in file2.
Here is how I would do it:
Read file1 line by line and grep every line in file2 in a for loop. Any failing grep would mean that it is not a subset.
Is there a quicker or easier way... (3 Replies)
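A single `grep` call can replace the per-line loop described above. The sketch treats each file as a set of whole lines and uses only standard options (`-F`, `-x`, `-v`, `-q`, `-f`); the function name `is_subset` is hypothetical.

```shell
# Return 0 when every line of file1 also occurs as a whole line in file2,
# 1 when some line is missing, 2 on a grep error.
# is_subset is a hypothetical helper name.
is_subset() {
    file1=$1
    file2=$2
    status=0
    # -F fixed strings, -x whole-line match, -f patterns from file2,
    # -v invert, -q quiet: grep exits 0 if file1 has a line absent from file2.
    grep -Fxvq -f "$file2" "$file1" || status=$?
    case $status in
        0) return 1 ;;   # found a line of file1 that is not in file2
        1) return 0 ;;   # no such line: file1 is a subset of file2
        *) return 2 ;;   # grep itself failed (for example, unreadable file)
    esac
}
```

It could be used as `if is_subset file1 file2; then echo "subset"; fi`. For very large files, comparing sorted copies with `comm -23` is a common alternative that avoids loading all of file2 as patterns.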
To check and print whether file2 is a subset of file1, I do the below:
var1=$(cat /tmp/file1 | sort -u | wc)
var2=$(cat /tmp/file2 /tmp/file1 | sort -u | wc)
if [ "$var1" = "$var2" ]; then
echo "file2 is a subset of file1 because var1 and var2 have the same values."
fi
However, I get the following error ... (1 Reply)
GO-FILTER-SUBSET(1p) User Contributed Perl Documentation GO-FILTER-SUBSET(1p)
NAME
go-filter-subset.pl - extracts a subgraph from an ontology file
SYNOPSIS
go-filter-subset.pl -id GO:0003767 go.obo
go-filter-subset.pl -id GO:0003767 -to png go.obo | xv -
go-filter-subset.pl -filter_code 'sub{shift->name =~ /transcr/}' go.obo
DESCRIPTION
Exports a subset of an ontology from a file. The subset can be based on a specified set of IDs, a preset "subset" filter in the ontology
file (e.g. a GO "slim" or subset), or a user-defined filter.
The subset can be exported in any format, including a graphical image.
ARGUMENTS
-id ID
ID to use as leaf node in subgraph. All ancestors of this ID are included in the exported graph (unless -partial is set)
Multiple IDs can be passed
-id ID1 -id ID2 -id ID3 ....etc
-subset SUBSET_ID
Extracts a named subset from the ontology file. (only works with obo format files). For example, a specific GO slim
ONLY terms belonging to the subset are exported - the -partial option is automatically set
-namespace NAMESPACE
only terms in this namespace
-filter_code SUBROUTINE
advanced option
A subroutine with which the GO::Model::Term object is tested for inclusion in the subgraph (all ancestors are automatically included)
You should have an understanding of the go-perl object model before using this option
Example:
go-filter-subset -filter_code 'sub {shift->namespace eq "molecular_function"}' go.obo
(the same thing can be achieved with the -namespace option)
-partial
If this is set, then only terms that match the user query are included. Parentage is set to the next recursive parent node in the filter.
For example, with the -subset option: if X and Y belong to the subset, and Z does not, and X is_a Z is_a Y, then the exported graph will have X is_a Y.
-use_cache
If this switch is specified, then caching mode is turned on.
In caching mode, the first time you parse a file, an additional file is exported in a special format that is fast to parse.
This file will have the same filename as the original file, except it will have the ".cache" suffix.
The next time you parse the file, this program will automatically check for the existence of the ".cache" file. If it exists, and is
more recent than the file you specified, this is parsed instead. If it does not exist, it is rebuilt.
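The freshness rule described above can be illustrated in a few lines of shell. This is only a sketch of the decision as the text states it, not the actual go-perl implementation; `choose_parse_target` is a hypothetical name.

```shell
# Print the path that would be parsed under the caching rule described above.
# choose_parse_target is a hypothetical helper, not part of go-perl.
choose_parse_target() {
    file=$1
    cache="$file.cache"
    # -nt ("newer than") is provided by the test builtin of bash, dash and ksh.
    if [ -e "$cache" ] && [ "$cache" -nt "$file" ]; then
        printf '%s\n' "$cache"    # cache exists and is more recent
    else
        printf '%s\n' "$file"     # cache missing or not newer: use the original
    fi
}
```

For example, `choose_parse_target go.obo` would print `go.obo.cache` only when that file exists and has a later modification time than `go.obo`.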
DOCUMENTATION
<http://www.godatabase.org/dev>
perl v5.14.2 2010-05-12 GO-FILTER-SUBSET(1p)