Full Discussion: Parsing the test file
Shell Programming and Scripting: Post 302901351 by BCW_123 on Tuesday 13th of May 2014 12:56:49 PM

Hello,

I want to keep one row per distinct count (column 4) for every ref gene (column 7), choosing between rows on the basis of strand (column 8) and tss (column 5):
If several rows of a ref gene have the same count and the gene is on the negative strand, keep the row with the highest tss.
If several rows of a ref gene have the same count and the gene is on the positive strand, keep the row with the lowest tss.

I am working on data in this format:
Code:
CHR	TSS-25bp	TSS+25bp	count 	tss	Ensemble transcript	refgene	strand
chr15	79554474	79554524	2	79554499	ENSMUST00000089311	Sun2	-
chr15	79554475	79554525	2	79554500	ENSMUST00000100439	Sun2	-
chr15	79554477	79554527	2	79554502	ENSMUST00000046259	Sun2	-
chr15	79569054	79569104	1	79569079	ENSMUST00000159660	Sun2	-
chr15	79570243	79570293	4	79570268	ENSMUST00000160355	Sun2	-
chr17	44914075	44914125	2	44914100	ENSMUST00000050630	Supt3h	+
chr17	44914248	44914298	3	44914273	ENSMUST00000130623	Supt3h	+
chr17	44914319	44914369	3	44914344	ENSMUST00000127798	Supt3h	+
chr11	87551028	87551078	2	87551053	ENSMUST00000152700	Supt4h1	+
chr11	87551029	87551079	2	87551054	ENSMUST00000141169	Supt4h1	+
chr7	29099891	29099941	2	29099916	ENSMUST00000003527	Supt5h	-
chr11	78020504	78020554	3	78020529	ENSMUST00000108314	Supt6h	-



I would expect this in the output:
Code:
CHR	TSS-25bp	TSS+25bp	count 	tss	Ensemble transcript	refgene	strand
chr15	79554477	79554527	2	79554502	ENSMUST00000046259	Sun2	-
chr15	79569054	79569104	1	79569079	ENSMUST00000159660	Sun2	-
chr15	79570243	79570293	4	79570268	ENSMUST00000160355	Sun2	-
chr17	44914075	44914125	2	44914100	ENSMUST00000050630	Supt3h	+
chr17	44914248	44914298	3	44914273	ENSMUST00000130623	Supt3h	+
chr11	87551028	87551078	2	87551053	ENSMUST00000152700	Supt4h1	+
chr7	29099891	29099941	2	29099916	ENSMUST00000003527	Supt5h	-
chr11	78020504	78020554	3	78020529	ENSMUST00000108314	Supt6h	-
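
For reference, the selection rule stated at the top can be sketched as a single awk pass. This is only a sketch under assumptions: the file is tab-separated with one header line, the columns are laid out as in the sample above, and every row of a gene carries the same strand. The function name `pick_rows` is made up for illustration.

```shell
#!/bin/bash
# pick_rows FILE (hypothetical helper): print the header, then one row
# per (refgene, count) pair, in the order each pair first appears.
pick_rows() {
    awk -F'\t' '
        NR == 1 { print; next }          # pass the header line through
        {
            key = $7 SUBSEP $4           # refgene (col 7) + count (col 4)
            if (!(key in best)) {
                order[++n] = key         # remember first-seen order
                best[key] = $0
                tss[key] = $5 + 0
            } else if (($8 == "-" && $5 + 0 > tss[key]) ||
                       ($8 == "+" && $5 + 0 < tss[key])) {
                # negative strand keeps the highest tss,
                # positive strand keeps the lowest tss
                best[key] = $0
                tss[key] = $5 + 0
            }
        }
        END { for (i = 1; i <= n; i++) print best[order[i]] }
    ' "$1"
}
```

It would be called as `pick_rows Workbook4.txt`. Keeping the first-seen order of each gene/count pair avoids a separate sort step, and the whole file is read only once.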


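So far I have this:
Code:
#!/bin/bash

example=Workbook4.txt
# Loop over every ref gene (column 7), skipping the header line.
for gene in $(tail -n +2 "$example" | cut -f7 | sort -u)
do
    # Strand (column 8) of this gene.
    sign=$(grep -w "$gene" "$example" | cut -f8 | sort -u)
    for count in $(grep -w "$gene" "$example" | cut -f4 | sort -u)
    do
        if [ "$sign" = "-" ]
        then
            # Negative strand: keep the row with the highest tss (column 5).
            grep -w "$gene" "$example" | awk -F'\t' -v c="$count" '$4 == c' | sort -t$'\t' -k5,5n | tail -1
        else
            # Positive strand: keep the row with the lowest tss.
            grep -w "$gene" "$example" | awk -F'\t' -v c="$count" '$4 == c' | sort -t$'\t' -k5,5n | head -1
        fi
    done
done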

I am not sure about the two pipeline lines ending in head -1 and tail -1. It would be nice if you could help me solve this.

Thanks for your time
Kirthi

Moderator's Comments:
Please use code tags next time for your code and data. Thanks

Last edited by BCW_123; 05-13-2014 at 02:16 PM.. Reason: code tags
 
