Hi there,
I'm wanting to produce a shell script that will check through some file names and identify a skip in sequence (four digit seq num in file name).
I have played with the idea of having a file that has a sorted list of file names which I can read a line at a time and cut out the sequence... (1 Reply)
A developer of mine has this requirement - I couldn't tell her quickly how to do it with UNIX commands or a quick script so she's writing a quick program to do it - but that got my curiosity up and I thought I'd ask here for advice.
In a text file, there are some records (about half of them)... (4 Replies)
Hi,
I have data as follows:
1 400
2 239
3 871
4 219
5 543
6 ...
7 ...
.. ...
.. ...
99 818
100 991
I want to replace the sequence numbers (column 1) so that they start from 150. The output should look like this:
150 400
151 239 (3 Replies)
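One way to approach the renumbering asked about above is a short awk filter. This is only a sketch: `renumber` is a made-up helper name, and it assumes whitespace-separated columns with the old sequence number always in column 1.

```shell
# Hypothetical helper: replace column 1 with a counter starting at $1 (default 150).
# Reads records from stdin; assigning to $1 makes awk rebuild each line with single spaces.
renumber() {
    awk -v start="${1:-150}" '{ $1 = start + NR - 1; print }'
}

# Sample rows taken from the post above.
printf '1 400\n2 239\n3 871\n' | renumber 150
```

Because the counter is derived from `NR`, gaps or duplicates in the old numbering do not matter; only the line order does.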
Hi,
I have a string like:
DBMS stats (Number Used | Percentage of total): 10 | 1.00%
I have a sed command to extract numbers from this string:
sed "s/[^0-9]//g;s/^$/-1/;"
Output: 10100
However what I want the sed command to return is only the first number(regardless of its size) i.e.... (3 Replies)
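A possible way to return only the first number is to anchor the match at the start of the line and capture the first run of digits. The sketch below assumes one string per input line and keeps the `-1` fallback from the command in the post; `first_number` is a made-up name.

```shell
# Hypothetical helper: print the first run of digits on each line, or -1 if there is none.
# The bare "t" command skips the fallback substitution when the first one succeeded.
first_number() {
    sed -e 's/^[^0-9]*\([0-9][0-9]*\).*$/\1/' -e 't' -e 's/.*/-1/'
}

# String taken from the post above.
printf '%s\n' 'DBMS stats (Number Used | Percentage of total): 10 | 1.00%' | first_number
```

The three `-e` expressions are kept separate because some sed implementations do not accept a semicolon directly after `t`.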
Dear Perl users,
I need your help to solve my problem below.
I want to print the sequence number without missing number within the range.
E.g. my sequence number :
1 2 3 4 5 6 7 8 11 12 13 14
my desired output:
1 -8 , 11-14
My code is below, but the result is still wrong:
1 - 14
1 -... (2 Replies)
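The poster asked for Perl, and the original code is cut off, so the following is not a fix for it. It is an awk sketch of the same idea: walk the numbers in order and start a new range whenever the next value is not the previous one plus 1. It assumes sorted integer input; `collapse_ranges` and `emit` are made-up names.

```shell
# Hypothetical helper: collapse sorted integers from stdin into "a-b" ranges.
collapse_ranges() {
    awk '
        # Append the current range (or a single number) to the output string.
        function emit(   range) {
            range = (start == prev) ? start : (start "-" prev)
            out = (out == "") ? range : (out ", " range)
        }
        {
            for (i = 1; i <= NF; i++) {
                n = $i + 0
                if (!seen)              { start = prev = n; seen = 1 }
                else if (n == prev + 1) { prev = n }
                else                    { emit(); start = prev = n }
            }
        }
        END { if (seen) { emit(); print out } }
    '
}

# Sequence taken from the post above.
echo '1 2 3 4 5 6 7 8 11 12 13 14' | collapse_ranges
```

For the sample sequence this prints `1-8, 11-14`.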
I am using KSH on AIX...
I have the files called
MMRR0106.DAT
MMRR0206.DAT
MMRR0406.DAT
MMRR0506.DAT
MMRR0806.DAT
....
...
MMRR3006.DAT
MMRR0207.DAT
These files are in one directory, /venky.
I want output like this:
Missing files are :
MMRR0306.DAT
MMRR0606.DAT... (7 Replies)
Hi Forum:
I have struggled with this and ended up checking it by eye.
Basically, I am looking for a sequence of dates inside a file.
If a date in the sequence repeats 2-3 times or the sequence skips once, it is still considered a match.
input text file:
Sep 6 A
Sep 6 A
Sep 10 A
Sep 7 B
Sep 8... (7 Replies)
Hi,
I need to add a sequence number to a CSV file; please find the
actual requirement below.
Input file
ABC,500
XXQ,700
ADF,400,
ART,200
Output file should be
1,ABC,500
2,XXQ,700
3,ADF,400,
4,ART,200 (3 Replies)
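For the CSV case above, awk's built-in line counter `NR` may be enough. A minimal sketch, assuming every input line is one record and numbering starts at 1; `add_seq` is a made-up name.

```shell
# Hypothetical helper: prefix each line from stdin with its line number and a comma.
# The rest of the line, including any trailing comma, is passed through untouched.
add_seq() {
    awk '{ print NR "," $0 }'
}

# Rows taken from the post above.
printf 'ABC,500\nXXQ,700\nADF,400,\nART,200\n' | add_seq
```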
Hello, here I am posting my query again with modified data input files.
My query is:
I have two input files, file1 and file2.
file1 is smalldata.fasta
>gi|546671471|gb|AWWX01449637.1| Bubalus bubalis breed Mediterranean WGS:AWWX01:contig449636, whole genome shotgun sequence... (20 Replies)
Discussion started by: harpreetmanku04
seqfmt
SEQFMT(5) User Manuals SEQFMT(5)
NAME
seqfmt - Sequences formats
DESCRIPTION
This document illustrates some common formats used for sequence representation.
EMBL
ID MMVASPHOS standard; RNA; EST; 140 BP.
AC X97897;
DE M.musculus mRNA for protein homologous to
DE vasodilator-stimulated phosphoprotein
SQ Sequence 140 BP; 25 A; 58 C; 39 G; 17 T; 1 other;
ttctcccaga agctgactct atggngaccc cgagagagac tgagcagaac 60
ccccgcaccc ctgcacttcc aatcaggggc gccccgggag cactccccgt 120
ccgccctccg cgcagccatg 140
//
FASTA
>MMVASPHOS
ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
ccgccctccgcgcagccatg
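As an illustration of how the FASTA layout above can be read (a `>` header line followed by one or more sequence lines), here is a small awk sketch that reports the length of each record. It is not part of squizz; `fasta_lengths` is a made-up name, and it assumes the identifier is the first word after `>`.

```shell
# Hypothetical helper: print "identifier length" for every record in a FASTA stream.
fasta_lengths() {
    awk '
        # A header line closes the previous record and opens a new one.
        /^>/ {
            if (seen) print name, len
            seen = 1; name = substr($1, 2); len = 0
            next
        }
        # Any other line is sequence data; ignore blanks, tabs and carriage returns.
        { gsub(/[ \t\r]/, ""); len += length($0) }
        END { if (seen) print name, len }
    '
}

fasta_lengths <<'EOF'
>MMVASPHOS
ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
ccgccctccgcgcagccatg
EOF
```

For the sample record this prints `MMVASPHOS 140`, which agrees with the `140 BP` stated in the EMBL entry.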
GCG
!!NA_SEQUENCE 1.0
(No documentation)
dna1.txt Length: 88 Nov 22, 2001 14:38 Type: N Check: 3818 ..
1 TAGTCGTAGT CGGAGCGATG CTGACGATGA CGATGACGAT CGTAGCTGAT
51 CGATCGAGCT GATGCTGATC GAGCTAGCTG ATCGATCG
GDE
#sample1
TTCAAGAGAAACAGCGGCCAAGGAAAAGACTCGGCATGATTGTCCATAGCTTACAAAGCG
#sample2
TTCAAGAGAAACAGCGGCTGGGGGAAAGACTCGTCCTGATTGCCTGTAGATGGTAAAGCG
GENBANK
LOCUS HUMHBV1 130 bp DNA PRI 17-JUN-1993
DEFINITION Human DNA/endogenous Hepatitis B virus (HBV) DNA, left
host viral junction.
ACCESSION M15770
BASE COUNT 32 a 43 c 29 g 26 t
ORIGIN
1 agcgggcagt gcagctgctt ggacagcagg ggtgtttctt caacccaggc
61 ctcctgtcac aacaggccca ttcaattctg aacctgcaag ccaactccaa
121 cctcttttcc cagggggaac caaaaaccct
//
IG
; comment
U03518
AACCTGCGGAAGGATCATTACCGAGTGCGGGTCCTTTGGGCCCAACCTCCCATCCGTGTC
TATTGTACCCTGTTGCTTCGGCGGGCCCGCCGCTTGTCGGCCGCCGGGGGGGCGCCTCTG
TGAGTTGATTGAATGCAATCAGTTAAAACTTTCAACAATGGATCTCTTGGTTCCGGC1
NBRF
>P1;CCHU
cytochrome c [validated] - human
MGDVEKGKKIFIMKCSQCHTVEKGGKHKTGPNLHGLFGRKTGQAPGYSYTAANKNKGIIW
GEDTLMEYLENPKKYIPGTKMIFVGIKKKEERADLIAYLKKATNE*
PIR
ENTRY CCHU #type complete
TITLE cytochrome c [validated] - human
ACCESSIONS A31764; A05676; I55192; A00001
SUMMARY #length 105 #molecular-weight 11749 #checksum 3247
SEQUENCE
5 10 15 20 25 30
1 M G D V E K G K K I F I M K C S Q C H T V E K G G K H K T G
31 P N L H G L F G R K T G Q A P G Y S Y T A A N K N K G I I W
61 G E D T L M E Y L E N P K K Y I P G T K M I F V G I K K K E
91 E R A D L I A Y L K K A T N E
///
RAW
ttctcccagaagctgactctatggngaccccgagagagactgagcagaacctggagccag
ccccgcacccctgcacttccaatcaggggcgccccgggagcactccccgtggcgcgccgc
ccgccctccgcgcagccatg
Warning: This format cannot handle more than one sequence per file.
SWISSPROT
ID 100K_RAT STANDARD; PRT; 149 AA.
AC Q62671;
DE 100 kDa protein (EC 6.3.2.-).
SQ SEQUENCE 149 AA; 17004 MW; D06484B8BC29112E CRC64;
MMSARGDFLN YALSLMRSHN DEHSDVLPVL DVCSLKHVAY VFQALIYWIK
PQLERKRTRE LLELGIDNED SEHENDDDTS QSATLNDKDD ESLPAETGQN
SITIRPPDDQ HLPTANTCIS RLYVPLYSSK QILKQKLLLA IKTKNFGFV
//
SEE ALSO
squizz(1), alifmt(5)
AUTHOR
Nicolas Joly (njoly@pasteur.fr), Institut Pasteur.
Unix 2009-05-19 SEQFMT(5)