The purpose is to go over the same file twice: the first pass counts the number of occurrences of each value of field 1, and the second pass prints only the lines whose field 1 occurs exactly once.
-F,
Set the input field separator to comma
NR==FNR
When the first file is being read (only then are FNR and NR equal)
C[$1]++
create an (associative) array element with the first field as the index and increment its value by 1
next
skip the remaining rules and start reading the next record
C[$1]==1
(while reading the second file, which in this case is the first file read a second time) if the count is equal to 1, i.e. field 1 of this record appears exactly once in the input file, then print the record (line).
infile infile
read infile followed by infile
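Putting the pieces above together, the command probably looks like the sketch below. The command line is reassembled from the fragments explained above, and the sample data is hypothetical, made up only for illustration.

```shell
# Hypothetical sample data, written to a temporary file standing in for "infile".
infile=$(mktemp)
cat > "$infile" <<'EOF'
1,3000,5000
1,4000,6000
2,4000,600
3,60000,4000
4,7000,7777
EOF

# Pass 1 (NR==FNR): count how often each value of field 1 occurs, then skip to the next record.
# Pass 2: the pattern C[$1]==1 has no action, so matching records are printed by default.
result=$(awk -F, 'NR==FNR { C[$1]++; next } C[$1]==1' "$infile" "$infile")
printf '%s\n' "$result"
```

Because the file is named twice on the command line, awk has to be able to open it twice, so this approach works on a regular file but not on data piped in through standard input.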
The same can be done without arrays and with only a single pass, but then the input file needs to be sorted on field 1:
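The code for this variant is not shown here, so the following is only one possible sketch of a single-pass approach, not necessarily what the poster had in mind. The variable names prev, n and line, and the sample data, are assumptions made for illustration.

```shell
# Hypothetical, unsorted sample data in a temporary file.
infile=$(mktemp)
cat > "$infile" <<'EOF'
3,60000,4000
1,3000,5000
2,4000,600
1,4000,6000
2,5000,700
4,7000,7777
EOF

# Sort on field 1 so that equal keys are adjacent, then make a single pass:
# remember the last line of the current group and how many lines it has,
# and print that line when the group ends with exactly one member.
result=$(sort -t, -k1,1 "$infile" | awk -F, '
  NR > 1 && $1 != prev {          # field 1 changed: the previous group is complete
      if (n == 1) print line
      n = 0
  }
  { prev = $1; line = $0; n++ }   # extend the current group
  END { if (n == 1) print line }  # flush the final group
')
printf '%s\n' "$result"
```

Only one line and one counter are kept in memory, which is the point of requiring sorted input. The trade-off is that the output comes out in sorted order rather than in the original file order.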
In C[$1]++, does the $1 refer to field 1 or line 1? Does C[$1]++ read through all the fields or lines and then jump to C[$1]==1, or does it jump to C[$1]==1 after each increment? Does the 1 in C[$1]==1 mean true or something else?
What does an array like this mean? I've seen a few awk arrays like this.
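A small trace may help with these questions. In awk, $1 is the first field of the current record (line), every rule is tried for each record in turn, and the 1 in C[$1]==1 is simply the number one being compared with the stored count. The sketch below prints NR, FNR, $1 and C[$1] for every record; the three-line input is hypothetical.

```shell
# Hypothetical three-line input: the key "a" occurs twice, "b" once.
infile=$(mktemp)
printf 'a,1\nb,2\na,3\n' > "$infile"

# First rule: runs for every record of the first read (NR==FNR), then "next"
# skips the second rule. Second rule: runs only during the second read.
trace=$(awk -F, '
  NR==FNR { C[$1]++
            print "pass 1: NR=" NR " FNR=" FNR " $1=" $1 " C[$1]=" C[$1]
            next }
  { print "pass 2: NR=" NR " FNR=" FNR " $1=" $1 " C[$1]=" C[$1] }
' "$infile" "$infile")
printf '%s\n' "$trace"
```

The trace shows that the whole file is counted before any record reaches the second rule: FNR restarts at 1 for the second read while NR keeps growing, so NR==FNR is false from then on, and by that point C holds the final count for each value of field 1.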