Extract sequences based on the list
Post 302756939 by tukuyomi on Wednesday 16th of January 2013 05:54:29 PM
Using awk:
Code:
unix.com$ awk 'NR==FNR{A[NR]=$1;next}{for(i in A){if($0~A[i]){print;getline;print}}}' file2 file1

Using grep:
Code:
~/unix.com$ grep -A1 -Ff file2 file1
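
A minimal sketch for trying both approaches side by side. The thread's real input files are not shown here, so the sample data below is hypothetical: it assumes file1 holds ">header" lines each followed by a single sequence line, and file2 holds one ID per line. The awk variant in this sketch swaps the regex match for index(), so IDs containing characters such as "|" or "." are compared literally; that is a substitution for illustration, not the command posted above.

```shell
# Hypothetical sample data (assumed layout: header line, then one sequence line).
tmp=$(mktemp -d)
printf '%s\n' '>seq1' 'ACGT' '>seq2' 'GGCC' '>seq3' 'TTAA' > "$tmp/file1"
printf '%s\n' 'seq1' 'seq3' > "$tmp/file2"

# awk: load the IDs from file2, then print each matching line of file1
# together with the line that follows it. index() does a literal substring test.
awk_out=$(awk 'NR==FNR { ids[$1]; next }
  { for (id in ids) if (index($0, id)) { print; if ((getline) > 0) print; break } }' \
  "$tmp/file2" "$tmp/file1")

# grep: -F treats each line of file2 as a fixed string, -A1 adds one trailing line.
# grep prints "--" between non-adjacent groups, so that separator is filtered out.
grep_out=$(grep -A1 -Ff "$tmp/file2" "$tmp/file1" | grep -v '^--$')

printf '%s\n' "$awk_out"
rm -rf "$tmp"
```

Both variants match substrings, so an ID such as seq1 would also select a header like >seq10. If that matters for the real data, the IDs in file2 need to be made unambiguous first.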

 

h5diff(1)                General Commands Manual                h5diff(1)

NAME
       h5diff - Compares two HDF5 files and reports the differences.

SYNOPSIS
       h5diff file1 file2 [OPTIONS] [object1 [object2 ] ]

DESCRIPTION
       h5diff is a command line tool that compares two HDF5 files, file1
       and file2, and reports the differences between them.

       Optionally, h5diff will compare two objects within these files. If
       only one object, object1, is specified, h5diff will compare object1
       in file1 with object1 in file2. If two objects, object1 and object2,
       are specified, h5diff will compare object1 in file1 with object2 in
       file2. These objects must be HDF5 datasets.

       object1 and object2 must be expressed as absolute paths from the
       respective file's root group.

       Additional information, with several sample cases, can be found in
       the document H5diff Examples.

OPTIONS
       file1 file2
              The HDF5 files to be compared.

       -h     Print all differences.

       -r     Print only the names of objects that differ; do not print the
              differences. These objects may be HDF5 datasets, groups, or
              named datatypes.

       -n count
              Print differences up to count differences, then stop. count
              must be a positive integer.

       -d delta
              Print only differences that are greater than the limit delta.
              delta must be a positive number. The comparison criterion is
              whether the absolute value of the difference of two
              corresponding values is greater than delta (e.g.,
              |a-b| > delta, where a is a value in file1 and b is a value
              in file2).

       -p relative
              Print only differences that are greater than a relative
              error. relative must be a positive number. The comparison
              criterion is whether the absolute value of the difference
              between 1 and the ratio of two corresponding values is
              greater than relative (e.g., |1-(b/a)| > relative, where a
              is a value in file1 and b is a value in file2).

       object1 object2
              Specific object(s) within the files to be compared.

EXAMPLES
       The following h5diff call compares the object /a/b in file1 with the
       object /a/c in file2:

              h5diff file1 file2 /a/b /a/c

       This h5diff call compares the object /a/b in file1 with the same
       object in file2:

              h5diff file1 file2 /a/b

       And this h5diff call compares all objects in both files:

              h5diff file1 file2

SEE ALSO
       h5dump(1), h5ls(1), h5repart(1), h5import(1), gif2h5(1), h52gif(1),
       h5perf(1)
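
The two comparison criteria described for -d and -p in the h5diff page above can be illustrated with plain awk arithmetic. This is only a sketch of the formulas |a-b| > delta and |1-(b/a)| > relative; it does not call h5diff or read HDF5 files. The value pairs and both limits are hypothetical, and skipping pairs where a is 0 in the relative test is an assumption of this sketch, not documented h5diff behaviour.

```shell
# Hypothetical value pairs: first column is a (file1), second is b (file2).
pairs='100 100.5
0.01 0.02
4.0 4.0'

delta=0.1      # assumed limit, as it would be given to -d
relative=0.1   # assumed limit, as it would be given to -p

# Absolute criterion: report the pair when |a-b| > delta.
abs_hits=$(printf '%s\n' "$pairs" |
  awk -v d="$delta" '{ x = $1 - $2; if (x < 0) x = -x; if (x > d) print $1, $2 }')

# Relative criterion: report the pair when |1-(b/a)| > relative.
# Pairs with a == 0 are skipped here (assumption, to avoid dividing by zero).
rel_hits=$(printf '%s\n' "$pairs" |
  awk -v r="$relative" '$1 != 0 { x = 1 - $2 / $1; if (x < 0) x = -x; if (x > r) print $1, $2 }')

printf 'absolute: %s\nrelative: %s\n' "$abs_hits" "$rel_hits"
```

With these sample values the two criteria flag different pairs: a difference of 0.5 is large in absolute terms but small relative to 100, while a difference of 0.01 is small in absolute terms but doubles the value 0.01.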