09-18-2012
Parse tab delimited file, check condition and delete row
I am fairly new to programming and am trying to solve this problem. I have a tab-delimited file like this:
CHROM POS REF ALT 10_sample.bam 11_sample.bam 12_sample.bam 13_sample.bam 14_sample.bam 15_sample.bam 16_sample.bam
tg93 77 T C T T T T T
tg93 79 C - C C C - -
tg93 79 C G C C C C G C
tg93 80 G A G G G G A A G
tg93 81 A C A A A A C C C
tg93 86 C A C C A A A A C
tg93 105 A G A A A A A G A
tg93 108 A G A A A A G A A
tg93 114 T C T T T T T C T
tg93 131 A C A A A A A A A
tg93 136 G C C G C C G G G
tg93 150 CTCTC - CTCTC - CTCTC CTCTC
In this file, the header columns are:
CHROM - name
POS - position
REF - reference
ALT - alternate
10_sample.bam to 16_sample.bam - samples

I want to count how many times the letters in the REF and ALT columns occur across the samples. If either of them occurs fewer than two times, I need to delete that row.

For example, in the first row I have 'T' in REF and 'C' in ALT. Across the 7 samples there are 5 T's, 2 blanks and no C, so I need to delete this row. In the second row, REF is 'C' and ALT is '-'. Across the seven samples we have 3 C's, 2 '-'s and 2 blanks, so we keep this row, since C and '-' each occur at least two times. Blanks are always ignored while counting.
The final file after filtering is:
#CHROM POS REF ALT 10_sample.bam 11_sample.bam 12_sample.bam 13_sample.bam 14_sample.bam 15_sample.bam 16_sample.bam
tg93 79 C - C C C - -
tg93 80 G A G G G G A A G
tg93 81 A C A A A A C C C
tg93 86 C A C C A A A A C
tg93 108 A G A A A A G A A
tg93 136 G C C G C C G G G
I am able to read the columns into arrays and display them in my code, but I am not sure how to set up the loops to read each base, count its occurrences, and keep or drop the row. Can anyone tell me how I should proceed? It would also be helpful if you have any example code I can build on.
Thank you for the help!
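One possible starting point, as a minimal Python sketch rather than a definitive solution. It assumes the file is truly tab-delimited (so blank cells survive as empty fields), that the first line is the header, and that "fewer than two times" means REF and ALT must each appear at least twice among the sample columns. The names keep_row, filter_lines and MIN_COUNT are made up for illustration, and the positions of the blank cells in the sample rows are guesses.

```python
from collections import Counter

MIN_COUNT = 2  # assumed threshold: REF and ALT must each occur at least twice


def keep_row(fields, min_count=MIN_COUNT):
    """Return True if both REF (column 3) and ALT (column 4) occur at least
    min_count times among the sample columns (column 5 onward).
    Blank cells are ignored while counting."""
    if len(fields) < 4:
        return False  # malformed or empty line: nothing to count
    ref, alt = fields[2], fields[3]
    counts = Counter(cell.strip() for cell in fields[4:] if cell.strip())
    return counts[ref] >= min_count and counts[alt] >= min_count


def filter_lines(lines, min_count=MIN_COUNT):
    """Yield the header line unchanged, then every data line that passes keep_row."""
    for index, line in enumerate(lines):
        line = line.rstrip("\n")  # keep trailing tabs so blank cells are preserved
        if index == 0 or keep_row(line.split("\t"), min_count):
            yield line


# Small in-memory demo; blank-cell positions here are assumptions.
sample = [
    "CHROM\tPOS\tREF\tALT\t10_sample.bam\t11_sample.bam\t12_sample.bam\t"
    "13_sample.bam\t14_sample.bam\t15_sample.bam\t16_sample.bam",
    "tg93\t77\tT\tC\tT\tT\t\tT\tT\t\tT",   # 5 T, 0 C -> dropped
    "tg93\t79\tC\t-\tC\tC\t\tC\t-\t\t-",   # 3 C, 2 '-' -> kept
    "tg93\t79\tC\tG\tC\tC\tC\tC\tG\tC\t",  # 5 C, 1 G -> dropped
    "tg93\t80\tG\tA\tG\tG\tG\tG\tA\tA\tG", # 5 G, 2 A -> kept
]

for kept in filter_lines(sample):
    print(kept)
```

To run it on a real file, pass an open file handle to filter_lines instead of the sample list, for example `with open("input.txt") as handle:` followed by the same loop (the file name is a placeholder). Note that with this threshold a row with six A's and a single G, which is how position 108 appears in the listing above, would be dropped, so the rule or the data may deserve a second look.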