Thanks for your reply. I may not have fully explained what I want. I am interested in the values in the Mouse column that are unique (within a row) when compared to the other species. I don't see that being addressed in your code.
Thanks for your fast reply. This might be confusing, but I am more interested in the value underneath Mouse, i.e. 1/1, or whatever it is when it is unique compared to the other species. When it is unique, I am interested in columns such as CHROM, POS, REF and ALT.
I need help extracting data from a tab-delimited file that looks like this:
I am only interested in what is unique to the mouse when compared to the other species (the actual file has more species).
The desired output
I will be grateful for your help.
N.B. The data file is 5 GB in size.
Thanks
You say that the file is tab delimited, but there are single space characters (rather than tabs) between fields in your sample file. Which is it?
How long is the longest line in your file? What operating system are you using, and what is the LINE_MAX limit on your system? I.e., what is the output from the commands:
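The commands themselves did not survive in this copy of the post. Commands along the following lines would answer these three questions; this is a sketch, not the original text. The function name `longest_line` is made up for illustration, and the data file name is whatever you pass to it.

```shell
# Operating system name, release, and hardware.
uname -a

# Longest input line (in bytes) that POSIX text utilities must accept.
getconf LINE_MAX

# Print the length in bytes of the longest line in the given file(s),
# or on standard input when no file is named.
# LC_ALL=C makes awk count bytes rather than multibyte characters.
longest_line() {
    LC_ALL=C awk '
    length($0) > max { max = length($0) }
    END { print max + 0 }' "$@"
}
```

Usage: `longest_line yourfile` (with `yourfile` standing in for the real 5 GB file). If the result is larger than the `getconf LINE_MAX` value, some text utilities may not handle the file reliably.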
Are you saying that you want to print lines where the contents of the 9th field on the line differ from the contents of the 6th, 7th, 8th, and 10th fields? Is it always the 9th field that matters, is it always the field with the label Mouse in the 1st line in the file that matters, or is there some other way that you will let your script know which field matters?
Are the 1st five fields always ignored when comparing fields, or do the fields to be ignored vary?
Do you really want to print the entire line, or do you just want to print the 1st (#CHROM), 2nd (POS), 4th (REF), and 5th (ALT) fields from lines with unique Mouse data as indicated in your last message? Are those fields always in the same columns?
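Pending answers to the questions above, here is a minimal awk sketch of one possible reading of the request. It rests on several assumptions, since the sample data is not shown in this thread: the file is truly tab-delimited, the first line is a header, the species columns start at field 6, the column of interest is the one whose header is exactly `Mouse`, and "unique" means the Mouse value differs from every other species value on that line. The function name `mouse_unique` and the `label` and `first` variables are made up for illustration.

```shell
# Print #CHROM, POS, REF, ALT and the Mouse value for every line whose
# Mouse value matches no other species column on that line.
# Assumptions: tab-delimited input, header on line 1, species in fields 6..NF.
mouse_unique() {
    awk -F'\t' -v OFS='\t' -v label="Mouse" -v first=6 '
    NR == 1 {
        # Locate the Mouse column by its header label.
        for (i = first; i <= NF; i++)
            if ($i == label) col = i
        if (!col) {
            print "no " label " column found in header" > "/dev/stderr"
            exit 1
        }
        print $1, $2, $4, $5, $col
        next
    }
    {
        # Skip the line as soon as any other species has the same value.
        for (i = first; i <= NF; i++)
            if (i != col && $i == $col) next
        print $1, $2, $4, $5, $col
    }' "$@"
}
```

Usage: `mouse_unique yourfile > mouse_only.txt`. awk reads one line at a time, so the 5 GB size should not be a memory problem. If the fields turn out to be separated by spaces rather than tabs, drop the `-F'\t'` option.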
This User Gave Thanks to Don Cragun For This Post:
Hi,
I have multiple files that each contain four columns of strings:
File1:
Code:
123 abc gfh 273
456 ddff jfh 837
789 ghi u4u 395
File2:
Code:
123 abc dd fu
456 def 457 nd
891 384 djh 783
I want to compare the strings in Column 1 of File 1 with each other file and Print in... (3 Replies)
I would like to merge two tables based on column 1:
File 1:
1 today
1 green
2 tomorrow
3 red
File 2:
1 a lot
1 sometimes
2 at work
2 at home
2 sometimes
3 new
4 a lot
5 sometimes
6 at work (4 Replies)
Data file example
I look for primary and * to isolate the interesting slot number.
slot=$(sed '/^primary$/,/\*/!d' filename | tail -n 1 | sed 's/\*//' | awk '{print $1" "$2}')
Now I want to get the Touch line for only the associate slot number, in this case, because the asterisk... (2 Replies)
Hi, I have tab-delimited data similar to the following:
dot is-big 2
dot is-round 3
dot is-gray 4
cat is-big 3
hot in-summer 5
I want to count the frequency of each individual "unique" value in the 1st column. Thus, the desired output would be as follows:
dot 3
cat 1
hot 1
is... (5 Replies)
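The preview above is cut off, so any further requirements are unknown. For the part that is visible, counting how often each value occurs in the 1st column, a minimal awk sketch might look like this. It assumes tab-delimited input and prints the counts in order of first appearance, as in the desired output shown; the function name `count_first_column` is made up for illustration.

```shell
# Count occurrences of each distinct value in column 1 of tab-delimited
# input and print "value<TAB>count" in order of first appearance.
count_first_column() {
    awk -F'\t' -v OFS='\t' '
    !($1 in count) { order[++n] = $1 }   # remember first-seen order
    { count[$1]++ }
    END {
        for (i = 1; i <= n; i++)
            print order[i], count[order[i]]
    }' "$@"
}
```

Usage: `count_first_column yourfile`. The `order` array is needed because awk does not guarantee any particular order when iterating with `for (key in array)`.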
Hello friends,
I have a text file with many columns (the number of columns varies from row to row) separated by spaces. I need to collect all the values from the 18th column to the end of each line, group them as pairs, and then number them like below:
1. 18th-col-value 19th-col-value 2. 20th-col-value ... (5 Replies)
Is it possible to modify file like this.
1. Remove all the duplicate names in a given column, i.e. the 4th column
2. Count the number of unique names separated by ";" and print it as a 5th column
thanx in advance!!
Q
input
c1 30 3 Eh2
c10 96 3 Frp
c41 396 3 Ua5;Lop;Kol;Kol
c62 2 30 Fmp;Fmp;Fmp
... (5 Replies)
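The rest of that post is cut off, so the exact desired output is unknown. Going only by the two numbered steps and the input lines shown, one way to sketch it in awk is below. It assumes whitespace-separated fields and names in the 4th field separated by ";"; the function name `dedupe_names` is made up for illustration.

```shell
# For each line: drop repeated names from the ";"-separated 4th field
# (keeping the first occurrence of each) and append the number of
# distinct names as an extra field.
dedupe_names() {
    awk '
    {
        n = split($4, names, ";")
        split("", seen)              # portable way to empty an array
        out = ""
        uniq = 0
        for (i = 1; i <= n; i++) {
            if (names[i] in seen) continue
            seen[names[i]] = 1
            if (uniq++)
                out = out ";" names[i]
            else
                out = names[i]
        }
        $4 = out                     # rebuilds the line with single spaces
        print $0, uniq
    }' "$@"
}
```

With the visible input lines, `c41 396 3 Ua5;Lop;Kol;Kol` would become `c41 396 3 Ua5;Lop;Kol 3`. Note that assigning to `$4` rejoins the fields with single spaces, so any original tabs or runs of spaces are not preserved.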
Dear all,
Greetings.
I would like to ask for your help extracting lines that contain specific words, together with the 2 lines before and after them, using awk or sed.
For example, the input file is:
1 ak1 abc1.0
1 ak2 abc1.0
1 ak3 abc1.0
1 ak4 abc1.0
1 ak5 abc1.1
1 ak6 abc1.1
1 ak7... (7 Replies)
Hi all
I have a file which looks like this
1234|1|Jon|some text|some text
1234|2|Jon|some text|some text
3453|5|Jon|some text|some text
6533|2|Kate|some text|some text
4567|3|Chris|some text|some text
4567|4|Maggie|some text|some text
8764|6|Maggie|some text|some text
My third column is my... (9 Replies)
Hi All,
Below is the sample data of my files:
O|A|571000689|D|S|PNH|S|SI
sadm|ibscml1x|
I|A|571000689|P|S|PNH|S|SI
sadm|ibscml1x|
O|A|571000689|V|S|PNH|S|SI
sadm|ibscml1x|
S|C|CAM|D|S|PNH|R|ZOA|2004
bscml1x| ... (3 Replies)