Print lines based upon unique values in Nth field

Tags
awk, sort, uniq

# 1  
Old 12-28-2017

For some reason I am having difficulty with what should be a fairly easy task. I would like to print only those lines of a file whose first field occurs exactly once in the file. For example, here is an excerpt from a large data set:

Code:
PS003,001 MZMWR/ L-DWD// *
PS003,001 B-!!BRX[/+W M(N-PN(H/J >BCLWM// BN/+W *
PS004,001 L-(H-1M]]NYX[/ B-NGJN(H/WT MZMWR/ L-DWD// *
PS005,001 L-(H-1M]]NYX[/ >L H-NXJLWT/(W(T MZMWR/ L-DWD// *
PS006,001 L-(H-1M]]NYX[/ B-NGJN(H/WT <L H-CMJNJ/T MZMWR/ L-DWD// *
PS007,001 CGJWN/ L-DWD// >CR C(JR[ L-JHWH// *
PS007,001 <L DBR/J KWC=// BN/ JMJNJ/ *
PS008,001 L-(H-1M]]NYX[/ <L H-GTJT/ MZMWR/ L-DWD// *
PS009,001 L-(H-1M]]NYX[/ <LMWT/ L-(H-BN/ MZMWR/ L-DWD// *
PS011,001 L-(H-1M]]NYX[/ L-DWD// B-JHWH// XS)HJ[TJ >JK !T!>MR[W L-NPC/+J *
PS011,001 !!NWD[)JW HR/+KM YPWR/ *

The output I desire is this:
Code:
PS004,001 L-(H-1M]]NYX[/ B-NGJN(H/WT MZMWR/ L-DWD// *
PS005,001 L-(H-1M]]NYX[/ >L H-NXJLWT/(W(T MZMWR/ L-DWD// *
PS006,001 L-(H-1M]]NYX[/ B-NGJN(H/WT <L H-CMJNJ/T MZMWR/ L-DWD// *
PS008,001 L-(H-1M]]NYX[/ <L H-GTJT/ MZMWR/ L-DWD// *
PS009,001 L-(H-1M]]NYX[/ <LMWT/ L-(H-BN/ MZMWR/ L-DWD// *

I have tried 'sort' with what I thought were the appropriate flags, but I cannot get it to work. For example:

Code:
sort -u -k1,1

I have also tried an 'awk' solution:

Code:
awk '!a[$1]++'

Both commands keep the first line of each group of repeated values in $1 instead of dropping the whole group, for example:

Code:
PS003,001 MZMWR/ L-DWD// *
PS004,001 L-(H-1M]]NYX[/ B-NGJN(H/WT MZMWR/ L-DWD// *
PS005,001 L-(H-1M]]NYX[/ >L H-NXJLWT/(W(T MZMWR/ L-DWD// *
PS006,001 L-(H-1M]]NYX[/ B-NGJN(H/WT <L H-CMJNJ/T MZMWR/ L-DWD// *
PS007,001 CGJWN/ L-DWD// >CR C(JR[ L-JHWH// *
PS008,001 L-(H-1M]]NYX[/ <L H-GTJT/ MZMWR/ L-DWD// *
PS009,001 L-(H-1M]]NYX[/ <LMWT/ L-(H-BN/ MZMWR/ L-DWD// *
PS011,001 L-(H-1M]]NYX[/ L-DWD// B-JHWH// XS)HJ[TJ >JK !T!>MR[W L-NPC/+J *

However, this is not what I want: lines whose $1 is repeated (PS003, PS007, PS011) should not appear at all. Any help would be greatly appreciated.
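To make the difference concrete, here is a small sketch on a made-up three-line sample (the keys K1 and K2 are hypothetical, not from the data above). It shows that `awk '!a[$1]++'` deduplicates, keeping one line per key, whereas the task needs the keys whose count is exactly 1.

```shell
# Hypothetical sample: key K1 appears twice, key K2 once.
sample=$(mktemp)
printf '%s\n' 'K1 first' 'K1 second' 'K2 only' > "$sample"

# Deduplication keeps the first line seen for each key, repeated or not.
dedup=$(awk '!a[$1]++' "$sample")
echo "$dedup"

# Counting occurrences per key shows which keys are really unique (count 1).
counts=$(awk '{ n[$1]++ } END { for (k in n) print k, n[k] }' "$sample" | sort)
echo "$counts"

rm -f "$sample"
```

Only K2 has a count of 1, so only the `K2 only` line should survive the filter being asked for.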
# 2  
Old 12-28-2017
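With a little more work you could do this in awk alone, without the sort, but this works (the sort is there because awk's for( a in A ) loop visits the keys in no particular order):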
Code:
$ awk '{A[$1]++; L[$1]=$0} END { for( a in A ) if( A[a] == 1 ) print L[a] }' file | sort
PS004,001 L-(H-1M]]NYX[/ B-NGJN(H/WT MZMWR/ L-DWD// *
PS005,001 L-(H-1M]]NYX[/ >L H-NXJLWT/(W(T MZMWR/ L-DWD// *
PS006,001 L-(H-1M]]NYX[/ B-NGJN(H/WT <L H-CMJNJ/T MZMWR/ L-DWD// *
PS008,001 L-(H-1M]]NYX[/ <L H-GTJT/ MZMWR/ L-DWD// *
PS009,001 L-(H-1M]]NYX[/ <LMWT/ L-(H-BN/ MZMWR/ L-DWD// *

(Note that $1 is the whole of PS004,001, not just PS004, because a comma is not a field separator by default.)
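As one possible way to drop the sort, a two-pass awk can read the file twice: the first pass counts each $1, the second prints the lines whose $1 was counted once, in their original order. This is only a sketch; the function name `unique_by_first_field` and the sample lines are made up for illustration, and it assumes the input is a regular file (a pipe cannot be read twice).

```shell
# Hypothetical helper; "$1" here is the shell argument (a file name).
unique_by_first_field() {
    # Pass 1 (NR == FNR): count each first-field value.
    # Pass 2: print lines whose first field was seen exactly once.
    awk 'NR == FNR { count[$1]++; next } count[$1] == 1' "$1" "$1"
}

# Made-up, deliberately unsorted sample data.
sample=$(mktemp)
printf '%s\n' 'PS011,001 x' 'PS004,001 a' 'PS011,001 y' 'PS003,001 b' > "$sample"

result=$(unique_by_first_field "$sample")
echo "$result"
rm -f "$sample"
```

Only the counts are held in memory, not the lines themselves, and the output keeps the input order (PS004,001 before PS003,001 in this sample).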
# 3  
Old 12-28-2017
Thanks Scott! Worked like a charm!
# 4  
Old 12-28-2017
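Depending on your uniq version (-w is a GNU extension), and provided lines with the same key are adjacent, this might also work. Note that -w5 compares only the first five characters (PS003), not the whole of $1:
Code:
$ uniq -uw5 file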
PS004,001 L-(H-1M]]NYX[/ B-NGJN(H/WT MZMWR/ L-DWD// *
PS005,001 L-(H-1M]]NYX[/ >L H-NXJLWT/(W(T MZMWR/ L-DWD// *
PS006,001 L-(H-1M]]NYX[/ B-NGJN(H/WT <L H-CMJNJ/T MZMWR/ L-DWD// *
PS008,001 L-(H-1M]]NYX[/ <L H-GTJT/ MZMWR/ L-DWD// *
PS009,001 L-(H-1M]]NYX[/ <LMWT/ L-(H-BN/ MZMWR/ L-DWD// *
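If the file is not already grouped by key, or the full nine-character key (such as PS003,001) should be compared, something along these lines may be closer. It is a sketch that assumes GNU uniq, since -w is not in POSIX, and it uses made-up sample lines.

```shell
# Assumes GNU uniq: -w N (compare only the first N characters) is not POSIX.
# Made-up unsorted sample; keys are 9 characters wide, e.g. PS003,001.
sample=$(mktemp)
printf '%s\n' 'PS011,001 x' 'PS004,001 a' 'PS011,001 y' 'PS003,001 b' > "$sample"

# uniq only compares adjacent lines, so group equal keys first with sort.
result=$(sort -k1,1 "$sample" | uniq -u -w 9)
echo "$result"
rm -f "$sample"
```

The fixed width only works because every key in this data has the same length; for variable-length keys an awk solution is the safer choice.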

# 5  
Old 01-03-2018
Provided the input file is sorted on column 1 (or at least has equal keys on adjacent lines), the following awk works as well. It keeps only one line in memory at a time, which helps with a big input file:
Code:
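awk '
{
  # A new key starts: print the previous line if its key occurred only once
  if ($1!=p1) {
    if (c==1) print p0
    c=0
  }
  c++              # count lines seen with the current key
  p1=$1; p0=$0     # remember the current key and line
}
END {
  # Handle the last key in the file
  if (c==1) print p0
}
' file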
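For input that is not sorted, the same streaming idea can be fed from sort through a pipe, so awk itself still holds only one line at a time. A minimal sketch with made-up sample lines and renamed variables (`prev_key`, `prev_line` and `n` are my names, not from the post above):

```shell
# Made-up unsorted sample data.
sample=$(mktemp)
printf '%s\n' 'PS011,001 x' 'PS004,001 a' 'PS011,001 y' 'PS003,001 b' > "$sample"

# sort groups equal keys; awk prints a line only if its key group has size 1.
result=$(sort -k1,1 "$sample" | awk '
$1 != prev_key { if (n == 1) print prev_line; n = 0 }   # key changed: flush previous group
{ n++; prev_key = $1; prev_line = $0 }                  # track the current group
END { if (n == 1) print prev_line }                     # flush the last group
')
echo "$result"
rm -f "$sample"
```

The sort still has to process the whole file, but it can spill to temporary files on disk, whereas an awk array of every key lives entirely in memory.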
