Extracting only unique data between two columns


 
# 1  
Old 02-02-2012

Hi there,

I am trying to filter the rows of a tab-delimited file with a number of columns, based on whether the values in two specific columns differ.

The input file is as follows:
Code:
5       rs1       70        A        C       7       1       1        Blue
5       rs9       66        A        E       7        0       2 Green
1       rs0       41        B        R       3        0      0        Red
1       rs2        30        B        R       3        2      1       Red

I want to drop the rows where the 7th and 8th columns hold the same value and keep all the rows where the two values differ, like this:

Output
Code:
5       rs9       66        A        E       7        0       2 Green
1       rs2        30        B        R       3        2      1       Red

N.B. The values in the 7th and 8th columns are always 0, 1, or 2.
# 2  
Old 02-05-2012
Code:
awk -F\\t '$7!=$8' input
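
For illustration, here is a minimal sketch of that filter on a genuinely tab-delimited copy of the sample rows. The temporary file and the variable names `sample` and `filtered` are assumptions made for this example, not part of the original post.

```shell
# Build a small, genuinely tab-delimited sample in a temporary file
# (the file and variable names are hypothetical).
sample=$(mktemp)
printf '5\trs1\t70\tA\tC\t7\t1\t1\tBlue\n'   >  "$sample"
printf '5\trs9\t66\tA\tE\t7\t0\t2\tGreen\n'  >> "$sample"
printf '1\trs0\t41\tB\tR\t3\t0\t0\tRed\n'    >> "$sample"
printf '1\trs2\t30\tB\tR\t3\t2\t1\tRed\n'    >> "$sample"

# Keep only the rows whose 7th and 8th fields differ.
filtered=$(awk -F'\t' '$7 != $8' "$sample")
printf '%s\n' "$filtered"

rm -f "$sample"
```

With this sample, only the rs9 and rs2 rows should be printed, since the rs1 and rs0 rows have equal values in columns 7 and 8.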

---------- Post updated at 09:24 PM ---------- Previous update was at 09:14 PM ----------

Please note that the example you pasted is not <tab> delimited: it contains a lot of spaces and only one tab, just before "Green".

You can reformat your input to turn each run of spaces into a single tab:

Code:
sed 's/  */        /g' input >input.tab_delimited

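Note that there are 2 <space> characters before the asterisk, so the pattern matches one or more spaces. The replacement must be a literal tab, which may display as a run of spaces here.
To type it, press <Ctrl>+<V> before pressing the <tab> key. To be clearer, here are the keys you will have to press: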

Code:
's/<space><space>*/<Ctrl><V><tab>/g'
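
If typing a literal tab is awkward, one possible variation is to let `printf` produce the tab and store it in a shell variable. This is only a sketch: the variable names `tab`, `line`, and `converted` are assumptions, and the pattern is widened to `[[:space:]]` so that a run mixing spaces and a tab also collapses to one tab, which differs slightly from the command above.

```shell
# Put a real tab character in a variable so no special key sequence is needed.
tab=$(printf '\t')

# Hypothetical sample line: runs of spaces plus one tab before the last field.
line='5       rs9       66        A        E       7        0       2'"$tab"'Green'

# Collapse every run of spaces and/or tabs into a single tab.
converted=$(printf '%s\n' "$line" | sed "s/[[:space:]][[:space:]]*/$tab/g")
printf '%s\n' "$converted"
```

Double quotes around the sed script are needed here so the shell expands `$tab` before sed sees it.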

Then run your extraction on the properly formatted file:

Code:
awk -F\\t '$7!=$8' input.tab_delimited
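
As an alternative sketch, assuming that no field contains embedded spaces, the conversion step can be skipped entirely: without `-F`, awk splits fields on any run of spaces and tabs. The temporary file and the variable names `sample` and `kept` are hypothetical.

```shell
# Hypothetical sample that mixes runs of spaces with a tab, like the pasted input.
sample=$(mktemp)
printf '5       rs1       70        A        C       7       1       1        Blue\n'  >  "$sample"
printf '5       rs9       66        A        E       7        0       2\tGreen\n'     >> "$sample"
printf '1       rs0       41        B        R       3        0      0        Red\n'   >> "$sample"
printf '1       rs2        30        B        R       3        2      1       Red\n'   >> "$sample"

# No -F option: awk's default splitting treats any run of blanks as one separator.
kept=$(awk '$7 != $8' "$sample")
printf '%s\n' "$kept"

rm -f "$sample"
```

The matching lines are printed unchanged, with their original spacing, because awk only rebuilds a record when a field is assigned.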

# 3  
Old 02-17-2012

Thanks ctsgnb,

I managed to do it in Excel in the end, but I will try your code and let you know.

Cheers