Extract distinct sequence of letters
Post 302920977 by Don Cragun on Tuesday 14th of October 2014 03:45:30 AM
Quote:
Originally Posted by kamcamonty
Hello,
I need to extract a distinct sequence of letters, for example from 136 to 193.
The files are quite big, so I would prefer not to use "fold -w1".
Thank you very much.

The input file looks like this:
Code:
       1 cttttacctt catgtgtttt tgcagatatt tgttcataat aacatcttct ttttaagtta
      61 ttaaaatctt ttttaaagtt attaacattt ttttgtcttt tagatcctat atagatccta
      121 aaagatccta aaagatccta aaagatcccc gtttttgtta aagcatatgt gataaggttt
      181 tatagtactt taagattcac tatagtcagt aaaacgttca ctatagtcag taaaacgttc

I'm lost.
  1. Please use CODE tags.
  2. From 136 to 193 what? You have shown us lines with a variable-length digit string followed by 5 or 6 groups of 10 letters. The only thing you have shown us between 136 and 193 is the digit string 181 marked in red above.
  3. What do you mean by "Files are quite big"? How long will the longest lines be in your input files? What is the maximum size (in bytes) of your input files?
  4. What output are you expecting from the above sample input?
  5. I understand that you don't want to use fold -w1, but I don't understand why you would say that. I don't see how using fold -w1 to put each character in your input files on a single line would help solve this problem. (But, maybe that is just because I can't figure out what you're trying to do.)
  6. What OS and shell are you using?

Last edited by Don Cragun; 10-14-2014 at 04:48 AM. Reason: Add request for OS and shell information.
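
The request is still ambiguous, so what follows is only a sketch of one possible reading, not a confirmed answer. It assumes that the leading number on each line is a 1-based position counter for the first letter on that line, that spaces between the letter groups do not count, and that "from 136 to 193" means letters 136 through 193 inclusive. The function name extract_range and its argument order are hypothetical. The script reads the file line by line, keeps only the overlapping part of each line, and stops reading once the end position has been passed, so it never holds the whole sequence in memory and does not need fold -w1.

```shell
# Hypothetical helper: extract_range START END FILE
# Prints letters START..END (1-based, inclusive), counting letters only.
# Assumption: each line is "<position counter> <groups of letters>".
extract_range() {
    awk -v start="$1" -v end="$2" '
    {
        sub(/^[ \t]*[0-9]+/, "")       # drop the leading position counter
        gsub(/[^A-Za-z]/, "")          # drop the blanks between letter groups
        len = length($0)
        lo = pos + 1                   # first letter position on this line
        hi = pos + len                 # last letter position on this line
        if (hi >= start && lo <= end) {
            # Keep only the part of this line that overlaps START..END.
            from = (start > lo ? start : lo) - pos
            to   = (end < hi ? end : hi) - pos
            out = out substr($0, from, to - from + 1)
        }
        pos = hi
        if (pos >= end) exit           # nothing more to read; END still runs
    }
    END { print out }
    ' "$3"
}
```

With the sample saved as a file (the name input.txt is an assumption), `extract_range 136 193 input.txt` would print a single line of 58 letters, provided the file contains at least 193 letters. If the intended meaning of 136 and 193 is something else, the overlap test in the middle is the part that would need to change.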
 
