Still not clear. If you want to remove the line count, all white space including new lines, and then extract some chars, this might help:
This works for the sample you gave us; not sure how it behaves on large files.
Dear Colleagues
I have to extract some patterns from a raw text file using Perl.
The input will be raw text.
Pattern to get - sequences of capitalized words (e.g. he is working in Center for Perl Studies. He will come tomorrow...); from those I have to extract sequences like "Center for Perl... (5 Replies)
Hi,
I have a one-line file consisting of a sequence of 660 letters. I would like to extract 9-letter blocks iteratively:
ASDFGHJKLQWERTYUIOPZXCVBNM
first block: ASDFGHJKL
2nd block: SDFGHJKLQ
What I have so far only gives me the first block, can anyone please explain why?
cat... (7 Replies)
Hi,
Is there a tool somewhat parallel to rev, but which randomizes instead of reverses?
I've tried rl, but I can only get it to randomize words.
I was hoping for something like this
echo "hello" | ran
leolh
Less simple solutions are also welcome.
Sorry if the question is... (21 Replies)
Hi all,
I have a file like this
ID 3BP5L_HUMAN Reviewed; 393 AA.
AC Q7L8J4; Q96FI5; Q9BQH8; Q9C0E3;
DT 05-FEB-2008, integrated into UniProtKB/Swiss-Prot.
DT 05-JUL-2004, sequence version 1.
DT 05-SEP-2012, entry version 71.
FT COILED 59 140 ... (1 Reply)
Hello,
I am running Solaris 8. When issuing the command "stty lcase", all text which is output to the terminal is capitalized. Letters that are supposed to be capitals are preceded by a backslash during output. All text which is input is converted to lower case. This is the expected behaviour... (5 Replies)
I have a file with hundreds of lines in it. I wanted to extract anything that matches the following:
KR followed by 4 digits:
example KR1201
cat list | sed "s///g"
Is the closest I've come, and obviously it is not what I want. This would remove all of the items that I want and leave me... (2 Replies)
Hi,
I want to match the sequence ID (the substring of a line starting with '>') and extract the information up to the next '>' line. Please help.
input
> fefrwefrwef X900
AGAGGGAATTGG
AGGGGCCTGGAG
GGTTCTCTTC
> fefrwefrwef X932
AGAGGGAATTGG
AGGAGGTGGAG
GGTTCTCTTC
> fefrwefrwef X937... (2 Replies)
Hi there,
First of all, this is not homework... this is a new type of exercise for practicing vocabulary with my students.
I have a file consisting of two columns, separated by a tab, each line consisting of a word and its definition, separated by a line break.
What I need is to replace a... (15 Replies)
Discussion started by: eldeingles
LEARN ABOUT ULTRIX
uuencode
uuencode(5) File Formats Manual uuencode(5)
Name
uuencode - format of an encoded uuencode file
Description
Files output by uuencode consist of a header line, followed by a number of body lines, and a trailer line. The uudecode command ignores any lines preceding
the header or following the trailer. Lines preceding a header must not, of course, look like a header.
The header line is distinguished by having as its first six characters the word ``begin'' followed by a space. The next item on the line
is a mode (in octal) and a string which names the remote file. A space separates the three items in the header line.
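
As an illustration of the header layout just described, the following Python sketch splits a header line into its mode and file name. The function name parse_header is hypothetical and not part of any uuencode tool; the sketch assumes the three items are separated by single spaces, as stated above.

```python
def parse_header(line):
    """Split a uuencode header line into (mode, name).

    Illustrative helper only; the name and return shape are assumptions.
    """
    # The first six characters must be the word "begin" followed by a space.
    if not line.startswith("begin "):
        raise ValueError("not a uuencode header line")
    # Split once on the space after the mode, so that a file name
    # containing spaces stays in one piece.
    mode_text, name = line[6:].rstrip("\n").split(" ", 1)
    # The mode is written in octal.
    return int(mode_text, 8), name
```

For example, parse_header("begin 644 notes.txt\n") returns the pair (0o644, "notes.txt").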
The body consists of a number of lines, each at most 62 characters long including the trailing new line. These consist of a character
count, followed by encoded characters, followed by a new line. The character count is a single printing character and represents an integer,
the number of bytes the rest of the line represents. Such integers are always in the range from 0 to 63 and can be determined by
subtracting the character space (octal 40) from the character.
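
The count calculation can be sketched in Python as follows. The function name line_byte_count is hypothetical; the only fact used is the subtraction of the space character (octal 40) described above.

```python
def line_byte_count(line):
    """Return the number of bytes that a uuencode body line represents.

    Illustrative helper only; the name is an assumption.
    """
    # The first character of a body line is the count character.
    # Subtracting the space character (octal 40) recovers the integer.
    return ord(line[0]) - 0o40
```

A line starting with "M" therefore represents 45 bytes, and a line consisting of a single space represents none.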
Groups of 3 bytes are stored in 4 characters, with 6 bits per character. All are offset by a space to make the characters print. The last
line may be shorter than the normal 45 bytes. If the size is not a multiple of 3, this fact can be determined by the value of the count on
the last line. Extra dummy characters are included to make the character count a multiple of 4. The body is terminated by a line with a
count of zero. This line consists of one ASCII space.
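
The body encoding above can be sketched in Python. This is a minimal illustration under stated assumptions, not the source of any uuencode implementation: the function name encode_line is hypothetical, the dummy padding is assumed to be zero bytes, and a zero value is written as a space, following the description in this page.

```python
def encode_line(data):
    """Encode up to 45 bytes as one uuencode body line.

    Illustrative helper only; the name and the zero-byte padding
    are assumptions.
    """
    if len(data) > 45:
        raise ValueError("a body line represents at most 45 bytes")
    # Pad with dummy zero bytes so the input length is a multiple of 3.
    padded = data + b"\0" * (-len(data) % 3)
    # The count character records the real, unpadded length.
    out = [chr(0o40 + len(data))]
    for i in range(0, len(padded), 3):
        b0, b1, b2 = padded[i:i + 3]
        # Regroup 3 bytes (24 bits) into 4 values of 6 bits each.
        sextets = (
            b0 >> 2,
            ((b0 & 0x03) << 4) | (b1 >> 4),
            ((b1 & 0x0F) << 2) | (b2 >> 6),
            b2 & 0x3F,
        )
        # Offset each value by a space so the character prints.
        out.extend(chr(0o40 + s) for s in sextets)
    return "".join(out) + "\n"
```

With this sketch, encode_line(b"Cat") yields "#0V%T" followed by a new line, and encode_line(b"") yields the terminating line: one space followed by a new line. A full 45-byte input gives a line of 62 characters, matching the limit stated above.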
The trailer line consists of "end" on a line by itself.
See Also
mail(1), uucp(1c), uudecode(1c), uuencode(1c), uusend(1c)