Removing duplicate sequences and modifying a text file
Post 302964554 by RavinderSingh13 on Friday 15th of January 2016 12:33:17 PM
Hello 4galaxy7,

Could you please try the following and let me know if it helps?
Code:
awk '/^>TCONS/{gsub(/TCONS_.*gene/,"gene",$0);split($0, A,"_");A[2]=++i;printf("%s %06d\n", A[1], A[2]);next} 1'  Input_file

The output will be as follows.
Code:
>gene=XLOC 000001
AATTGTGGTGAAATGACTTCTGTTAACGGAGACATCGATGATTGTTGTTACTATTTGTTCTCAGGATTCA
TTTGTCCGGTTCATACCCCGGACGGCGCCCCTTGCGGGCTGCTCAATCACCTGACAATGAACTGTATCGT
CACGAAGCATCCGGATCGCAAATTAAAGGCTGCGCTACCAACGGTGCTGGTGGATCTAGGAATGCTTCCG
TTGTCTGTTGCGAATAATTGGAAGGACTCGTACACGGTAATGCTGAATGGTAAAGTGATCGGCCTGATCG
AAGATAATATTGTTGATAAGGTGGCCCGCAAACTAAGGCAGCTGAAGATAATTGGTGAAGAGGTGCCGAA
CACGTTGGAGATCGCGCTGGTGCCGAAGAGGAAGG
>gene=XLOC 000002
CGGATGTATATCGTGCCGTGCTTTGATCGTTTATTTGATGTCCCATTTGCTGTTGGACTTGCGGCGGTAT
TGCCGTTGTTCTCGGCCTTGGTCGTGGCCGTGTGTCTTGCGTGTTTAGGTCCGGGCTGTCTTGAGCACCA
ACTTCCAGTGTCGGTAGTGGAGCTCGTGGTTGCAGGGTTTGCTGCCGAGTTCGTTGGGGCGTTTTGATTG
TTAGGCCTCGTGAACTCGTTTTTTTCGACGCAGATATTGATTTCGAAGGTGTGTGTCTCCTTTCCTGCGG
TTGTTTCGTTTGTTTTGTCGTCGACGGCTCGACGTATTTCGTTGTACTTGAGGTGTCTTTGTTTTGTCGA
TCTTTGTTTCGATCGAGTATATTCCCAACGTTGTGGACGTTGGTCTTCATTCTTCTTATTTCAAATATTA
TATTTTTCCGGCGTTCCTCAAGATATTGGAGGCACCGTTGTTCTCTTTCGCGAAGTCGCGTGAACTCTTC
>gene=XLOC 000003
TGGGTGAAGGTGCTGTGAGCCGTAAAACTTGTAAAAAGTGGTTTCAGAAGTTTCGGAATGGCGATTTCGA
TCTTACTGATCGCGAACGCAGTGGAATGCCGAGAAAAGTTGAAGACGAGGAACTGGAGCAACTATTGAAC
GAGAATCCTTGTAAGACGCAACAAGAACTTGCTGAGCAACTTGGTGTAACTCAACAAGCTATTTCCGTTC
GCTTAAAAAAGCTTGGAAGAATTTCCAAGGCAGGCCGTTGGGTTCCTCATGTGTTCAGCCCCAAACACAA
AGCGAGACGCTGTGACATTAGAATAACTAACCATGGTCAGTCAGTTTGCTTACGGCTTATGTCTTAAAGC
AAGGTTGTAAACAAGAACTTATCTCTTGTCTATGATCTTGCTTTAAAATATAAATAGTAATTAAATTGAC
CAACTACGATCGTTTATTGGAAGAATAATCGATCGTGGTTGGTTAGGTTATGTTTCACAATACGTCGTAT
GTCGCTGTCGG
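
For anyone who wants to see what the one-liner does step by step, here is a rough expanded sketch of the same logic. The header layout in the sample (`>TCONS_00000001 gene=XLOC_000007`) is only an assumption inferred from the regular expression, since the input file is not shown in this post, and the function name `renumber_headers` and the variable names are made up for illustration. Note that it only rewrites and renumbers the header lines; it does not drop any sequences.

```shell
# Sketch only: expanded form of the one-liner, wrapped in a hypothetical helper.
# Reads the files given as arguments, or standard input if there are none.
renumber_headers() {
  awk '
    /^>TCONS/ {
      gsub(/TCONS_.*gene/, "gene")        # drop the transcript ID, keep "gene=..."
      split($0, parts, "_")               # parts[1] becomes ">gene=XLOC"
      printf "%s %06d\n", parts[1], ++n   # running counter, zero-padded to 6 digits
      next                                # header handled, skip the default print
    }
    { print }                             # sequence lines pass through unchanged
  ' "$@"
}

# Assumed header layout, with shortened made-up sequences
renumber_headers <<'EOF'
>TCONS_00000001 gene=XLOC_000007
AATTGTGG
>TCONS_00000002 gene=XLOC_000007
CGGATGTA
EOF
```

With that sample the two headers come out as `>gene=XLOC 000001` and `>gene=XLOC 000002`, and the sequence lines are printed as they are. The original XLOC number is discarded and replaced by the counter, so the numbering simply follows the order of the headers in the file.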

Thanks,
R. Singh
 
