Compare Tab Separated Field with AWK to all and print lines of unique fields.
Hi.
I have a tab-separated file that contains a couple of nearly identical lines. When I run:
Code:
sort file | uniq > file.new
the nearly identical lines pass through because, well, as whole lines they are still unique.
a)
I want to look only at field x for uniqueness. If the content of field x is the same as field x in any other line, move all the duplicate lines to a new file called file.duplicates.
b)
I also want to be able to look only at field x for uniqueness and, if the content of field x is the same as field x in any other line, remove the later lines that repeat that field x.
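One way to approach both (a) and (b) with awk, as a sketch rather than a definitive answer. The sample data, the key field number x=2 and the output name file.first are made up for illustration. For (a) it assumes that "all the duplicate lines" means every line whose field x value occurs more than once, including the first occurrence.

```shell
# Sample input (hypothetical): three tab-separated fields, field 2 is the key.
printf 'a\tk1\t1\nb\tk2\t2\nc\tk1\t3\nd\tk3\t4\n' > file

x=2   # field number to test for uniqueness (1-based, as awk counts fields)

# (a) Read the file twice. Pass 1 (NR == FNR) counts each value of field x.
#     Pass 2 sends every line whose value occurred more than once to
#     file.duplicates and all remaining lines to file.new.
awk -F'\t' -v x="$x" '
    NR == FNR     { count[$x]++; next }
    count[$x] > 1 { print > "file.duplicates"; next }
                  { print > "file.new" }
' file file

# (b) Single pass: keep the first line seen for each value of field x and
#     drop every later line that repeats it. Input order is preserved.
awk -F'\t' -v x="$x" '!seen[$x]++' file > file.first
```

Neither command needs the input to be sorted. The two-pass version reads the file twice, so it will not work on a pipe.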
Hello,
I need a command that compares columns based on the last n characters of a given line, i.e.
I have 2 columns, A and B, containing strings of arbitrary characters (the lengths may differ from line to line, so using substr is out)
A B
ewewewabc nbgujnnabc... (3 Replies)
Hi friends,
I have multiple files. For now, let's say I have two of the following style
cat 1.txt
cat 2.txt
output.txt
Please note that my files are not sorted and in the output file I need another extra column that says the file from which it is coming. I have more than 100... (19 Replies)
Hi,
Is there any short method to print from a particular field through another field using awk?
Example File:
File1
====
1|2|acv|vbc|......|100|342
2|3|afg|nhj|.......|100|346
Expected output:
File2
====
acv|vbc|.....|100
afg|nhj|.....|100 (8 Replies)
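A possible sketch for this kind of field-range printing. The helper name field_range and its two arguments are hypothetical, and the sample lines are shortened stand-ins for the elided data above: it prints fields FIRST through (NF - DROP) of each '|'-separated line.

```shell
# field_range FIRST DROP (hypothetical helper): print fields FIRST..(NF - DROP)
# of each '|'-separated input line, joined with '|' again.
field_range() {
    awk -F'|' -v first="$1" -v drop="$2" '{
        last = NF - drop
        out = ""
        for (i = first; i <= last; i++)
            out = out (i > first ? FS : "") $i   # add separator before all but the first
        print out
    }'
}

# Fields 3 through the second-to-last field:
printf '1|2|acv|vbc|100|342\n2|3|afg|nhj|100|346\n' | field_range 3 1
```

For a fixed range where both ends are known field numbers, cut -d'|' -f3-6 would be shorter; the loop is only needed when the end is counted from the last field.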
Hi experts,
I need to print the first field first, then the last two fields, and then the rest of the fields.
Input :
a1,abc,jsd,fhf,fkk,b1,b2
a2,acb,dfg,ghj,b3,c4
a3,djf,wdjg,fkg,dff,ggk,d4,d5
Expected output:
a1,b1,b2,abc,jsd,fhf,fkk... (6 Replies)
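A minimal sketch of that reordering, assuming every line has at least three comma-separated fields. The function name reorder is made up for illustration.

```shell
# reorder (hypothetical helper): print field 1, then the last two fields,
# then fields 2..(NF-2) in their original order.
reorder() {
    awk -F, -v OFS=, '{
        out = $1 OFS $(NF-1) OFS $NF
        for (i = 2; i <= NF - 2; i++)
            out = out OFS $i
        print out
    }'
}

printf 'a1,abc,jsd,fhf,fkk,b1,b2\na2,acb,dfg,ghj,b3,c4\n' | reorder
```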
I am trying to re-format a .csv file using awk. I have 6 fields in the .csv file. Some of the fields are enclosed in double quotes and contain commas inside the quotes. awk is breaking these into multiple fields.
Sample lines from the .csv file:
Device Name,Personnel,Date,Solution... (1 Reply)
I am trying to use awk to print the unique entries in $2
So in the example below there are 3 lines but 2 of the lines match in $2 so only one is used in the output.
File.txt
chr17:29667512-29667673 NF1:exon.1;NF1:exon.2;NF1:exon.38;NF1:exon.4;NF1:exon.46;NF1:exon.47 703.807... (5 Replies)
Trying to print the unique values in $2 before the -, currently the count is displayed. Hopefully, the below is close. Thank you :).
file
chr2:46603668-46603902 EPAS1-902|gc=54.3 253.1
chr2:211471445-211471675 CPS1-1205|gc=48.3 264.7
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 195.8... (3 Replies)
In the awk below I am trying to print the entire line, along with the header row, if $2 is SNV or MNV or INDEL. If that condition is met or is true, and $3 is less than or equal to 0.05, then in $7 the sub pattern :GMAF= is found and the value after the = sign is checked. If that value is less than... (0 Replies)
For some reason I am having difficulty performing what should be a fairly easy task. I would like to print lines of a file that have a unique value in the first field. For example, I have a large data-set with the following excerpt:
PS003,001 MZMWR/ L-DWD// *
PS003,001... (4 Replies)
Discussion started by: jvoot
uniq
UNIQ(1) FSF UNIQ(1)
NAME
uniq - remove duplicate lines from a sorted file
SYNOPSIS
uniq [OPTION]... [INPUT [OUTPUT]]
DESCRIPTION
Discard all but one of successive identical lines from INPUT (or standard input), writing to OUTPUT (or standard output).
Mandatory arguments to long options are mandatory for short options too.
-c, --count
prefix lines by the number of occurrences
-d, --repeated
only print duplicate lines
-D, --all-repeated[=delimit-method]
print all duplicate lines
delimit-method={none(default),prepend,separate}
Delimiting is done with blank lines.
-f, --skip-fields=N
avoid comparing the first N fields
-i, --ignore-case
ignore differences in case when comparing
-s, --skip-chars=N
avoid comparing the first N characters
-u, --unique
only print unique lines
-w, --check-chars=N
compare no more than N characters in lines
--help
display this help and exit
--version
output version information and exit
A field is a run of whitespace, then non-whitespace characters. Fields are skipped before chars.
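As a rough illustration of field skipping (the file names and sample data here are made up): with -f 1 the first field is ignored and everything after it is compared. Note that uniq only compares adjacent lines and can only skip leading fields, so the input has to be grouped on the remaining text first.

```shell
# Sample input (hypothetical): two tab-separated fields, already grouped on field 2.
printf 'a\tk1\nb\tk1\nc\tk2\n' > sample.tsv

# Ignore field 1 when comparing: keeps the first line of each run.
uniq -f 1 sample.tsv > kept.tsv        # a k1, c k2

# Same comparison, but print one line only for runs that repeat.
uniq -f 1 -d sample.tsv > repeated.tsv # a k1
```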
AUTHOR
Written by Richard Stallman and David MacKenzie.
REPORTING BUGS
Report bugs to <bug-coreutils@gnu.org>.
COPYRIGHT
Copyright (C) 2002 Free Software Foundation, Inc.
This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
SEE ALSO
The full documentation for uniq is maintained as a Texinfo manual. If the info and uniq programs are properly installed at your site, the
command
info uniq
should give you access to the complete manual.
uniq (coreutils) 4.5.3 February 2003 UNIQ(1)