Removing repeated sequences
Post 302461164 by hravisankar on Friday 8th of October 2010 05:22:56 PM

Hi,
How can I remove the repeated 'Chr' entries across different sequences? In the example below, Chr19 is repeated in two samples
with the same number, i.e. +52245923. I want to keep only one of these entries (in either sample) and give a range for each
Chr: the value minus 20 as the minimum and the value plus 120 as the maximum. For Chr19, +52245923 is given, so the minimum
is 52245903 and the maximum is 52246043, and it is displayed as Chr19:52245903-52246043 in the output file.
The sign (+ or -) in the input data has no importance. The final output is also given for easy understanding.


INPUT FILE:
>sample1:1:1:1058:8130#0 5 830
Chr19 +52245923 1
Chr17 +69679873 1
Chr23 +52121254 1
>sample1:1:1:1060:5177#0 5 67
Chr19 +52245923 1
Chr17 -69679873 1
Chr15 +82202352 1
Chr5 +30440548 1

OUTPUT FILE:
>sample1:1:1:1058:8130#0 5 830
Chr19:52245903-52246043
Chr17:69679853-69679993
Chr23:52121234-52121374

>sample1:1:1:1060:5177#0 5 67
Chr15:82202332-82202472
Chr5:30440528-30440668

That is, Chr19 and Chr17 are removed from the second sample because they are repeated. For every Chr, the value is replaced with a range in the format shown in the output above.
Please help me write a shell script for this; it would help a lot in bioinformatics research.
Thanks in advance.
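
One possible approach, as a minimal sketch rather than a definitive answer: a small awk program wrapped in a shell function. It assumes that an entry counts as "repeated" when both the Chr name and the number (ignoring the sign) have already appeared in an earlier line, that header lines start with `>`, and that the first occurrence is the one to keep. The function name `dedupe_ranges` and the `MIN_OFFSET`/`MAX_OFFSET` variables are illustrative names, not something from the post.

```shell
# Keep the first occurrence of each (Chr, position) pair across all samples,
# ignoring the sign, and print it as Chr:(pos-20)-(pos+120).
# dedupe_ranges, MIN_OFFSET and MAX_OFFSET are hypothetical names.
dedupe_ranges() {
    awk -v lo="${MIN_OFFSET:-20}" -v hi="${MAX_OFFSET:-120}" '
        /^>/ {
            # Sample header: put a blank line before every header except
            # the first, then print the header unchanged.
            if (headers++) print ""
            print
            next
        }
        NF >= 2 {
            pos = $2
            sub(/^[+-]/, "", pos)        # the sign is not significant
            key = $1 SUBSEP pos
            if (seen[key]++) next        # already printed for an earlier line
            printf "%s:%d-%d\n", $1, pos - lo, pos + hi
        }
    ' "$@"
}
```

After defining the function in the current shell, it could be run as `dedupe_ranges input.txt > output.txt` (both file names are placeholders). With no file argument it reads standard input. Blank lines in the input are skipped because they have fewer than two fields.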
 
