Hi All,
I have a problem with the sort and duplicate-filter command I am using in one of my scripts. I have a '|'-delimited file and want to sort it and remove duplicates on fields 1, 2, and 15. These fields constitute the primary key of the table I will be loading the data into. But I see that some... (4 Replies)
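One possible approach to the sort-and-deduplicate step described in this post, as a sketch only. It assumes a POSIX `sort`, takes the '|' delimiter and key fields 1, 2 and 15 from the post, and uses a hypothetical helper name `sort_unique_key`. With `-u`, `sort` keeps one line per distinct key; which of the duplicate lines survives should not be relied on.

```shell
# Sort a '|'-delimited file on fields 1, 2 and 15 and keep one line per key.
# sort_unique_key is a hypothetical helper name, not from the original post.
sort_unique_key() {
  # -t'|'  : field separator
  # -kN,N  : restrict each key to exactly field N
  # -u     : emit only one line for each distinct key combination
  LC_ALL=C sort -t'|' -k1,1 -k2,2 -k15,15 -u "$@"
}

# Demo: the first two lines share fields 1, 2 and 15, so only one of them is kept.
printf '%s\n' \
  'a|b|3|4|5|6|7|8|9|10|11|12|13|14|k|first' \
  'a|b|x|x|x|x|x|x|x|x|x|x|x|x|k|second' \
  'a|c|3|4|5|6|7|8|9|10|11|12|13|14|k|third' |
  sort_unique_key
```

Writing each key as `-kN,N` matters: a bare `-k1` would make the key run from field 1 to the end of the line.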
Hi ,
I have 5 columns in total and want to search columns 3-5 and exclude (as grep -v would) any lines that match 'BBB_0123', 'BVG_0895', or 'BSD_0987'.
Does anyone know how to do this? I tried combining grep -v with grep -e, but it didn't work.
Thanks! (5 Replies)
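A minimal sketch of one way to do this with awk instead of grep. It assumes whitespace-separated columns (the post does not say which delimiter is used) and treats the three strings as regular-expression matches, like grep would. `filter_cols` is a hypothetical helper name.

```shell
# Print only the lines in which none of columns 3, 4 or 5 matches an unwanted ID.
# filter_cols is a hypothetical name; whitespace-separated input is an assumption.
filter_cols() {
  awk '
    {
      # Skip the line as soon as one of columns 3-5 matches.
      for (i = 3; i <= 5; i++)
        if ($i ~ /BBB_0123|BVG_0895|BSD_0987/) next
      print
    }
  ' "$@"
}

# Demo: the third line is kept because the match is in column 1, not in 3-5.
printf '%s\n' \
  'a b BBB_0123 d e' \
  'a b c d e' \
  'BVG_0895 b c d e' \
  'a b c d BSD_0987' |
  filter_cols
```

A plain `grep -v` looks at the whole line, so it cannot tell a match in column 1 from a match in column 3; testing individual fields avoids that.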
I have a file which consists of 1000 entries. Out of the 1000 entries, I have 500 duplicate entries. I want to remove the first duplicate entry (i.e., the entire line) in the file.
The example of the File is shown below:
8244100010143276|MARISOL CARO||MORALES|HSD768|CARR 430 KM 1.7 ... (1 Reply)
Hi,
I wish to use a column, specified by the user on the command line, for pattern matching.
awk file:
{
    if ($1 ~ /^8/)
    {
        print $0 > "temp2.csv"
    }
}
Something like this, but I want '$1' to be any column selected by the user on the command line.
... (1 Reply)
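One way this could be done, as a sketch: pass the column number into awk with `-v` and refer to the field as `$col`. The helper name `match_col` and the variable `col` are hypothetical, the `/^8/` pattern is taken from the post, and the default whitespace field separator is assumed (add `-F,` for comma-separated input).

```shell
# Print lines whose user-selected column starts with "8".
# Usage: match_col COLUMN_NUMBER [FILE...]
# match_col and col are hypothetical names, not from the original post.
match_col() {
  col=$1
  shift
  # -v col=... makes the shell value available inside awk as the variable col;
  # $col then means "the field whose number is stored in col".
  awk -v col="$col" '$col ~ /^8/ { print }' "$@"
}

# Demo: select column 2, so lines 1 and 3 match.
printf '%s\n' '1 85' '8 12' '3 8x' | match_col 2
```

To write the matches to a file as in the original script, the output can be redirected in the shell, for example `match_col 2 input.csv > temp2.csv`.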
Hello Gurus,
I need to remove lines from a file if they match specific criteria. Here is what I am trying to resolve:
Users of AppRuntime: (Total of 10 licenses issued; Total of 6 licenses in use)
buih02 dsktp501 AppGui 1 (compute_lic/27006 3122), start Mon 2/22 7:58
dingj1... (3 Replies)
I have a 5 GB input file that contains duplicate records, and I have to remove the duplicates while retaining the first instance of each record.
The duplicates have to be identified based on 5 fields.
Please help me write a Unix script.
Thanks
Asim (11 Replies)
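A common awk idiom for "keep the first instance" is sketched below. The post does not say which 5 fields form the key or what the delimiter is, so fields 1 to 5 and '|' are assumptions here, and `dedupe_first` is a hypothetical helper name.

```shell
# Keep only the first line seen for each combination of the key fields.
# Assumptions: '|' delimiter and key = fields 1-5; adjust both to the real data.
dedupe_first() {
  # seen[key]++ evaluates to 0 (false) the first time a key appears, so
  # !seen[key]++ is true exactly once per key and awk prints that line.
  awk -F'|' '!seen[$1, $2, $3, $4, $5]++' "$@"
}

# Demo: the second line repeats the key of the first and is dropped.
printf '%s\n' \
  'a|b|c|d|e|1' \
  'a|b|c|d|e|2' \
  'a|b|c|d|x|3' |
  dedupe_first
```

This preserves the input order but holds every distinct key in memory, which may matter for a 5 GB file with many unique records.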
Hello Help,
2356798 7689867 999 000
123678 20385907 9797 666
17978975 87468976 968978 98798
I am trying to produce output that looks for the third-column value 9797 and then inserts a line after it with the first and second column values exactly as in the previous line, and replaces the third... (3 Replies)
I have a script that builds a database: a ~30 million line, ~3.7 GB .csv file. After multiple optimizations, it takes about 62 min to bring in and parse all the files, and it used to take 10 min to remove duplicates until I was asked to add another column. I am using this highly optimized awk code:
awk... (34 Replies)
Hi There,
I have input that looks like this:
1 2 3 4 5
1 2 3 4 6
4 7 8 9 9
5 6 7 8 9
I would like the output to be:
1 2 3 4 5
1 2 3 4 6
That is, print only the consecutive lines where $1, $2, $3, and $4 match.
Is there a command or a small awk script to do this?
Thanks, (12 Replies)
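A small awk sketch for this, assuming whitespace-separated fields as in the sample. It remembers the previous line and its key, and prints a run of lines once two neighbours agree on $1 to $4. The helper name `consecutive_dups` and the variable names are hypothetical.

```shell
# Print every line that belongs to a run of consecutive lines sharing $1..$4.
# consecutive_dups is a hypothetical helper name.
consecutive_dups() {
  awk '
    {
      key = $1 FS $2 FS $3 FS $4
      if (NR > 1 && key == prevkey) {
        # The previous line starts the run; print it once, then the current line.
        if (!printed) print prevline
        print
        printed = 1
      } else {
        printed = 0
      }
      prevkey = key
      prevline = $0
    }
  ' "$@"
}

# Demo with the sample input from the post.
printf '%s\n' '1 2 3 4 5' '1 2 3 4 6' '4 7 8 9 9' '5 6 7 8 9' | consecutive_dups
# prints:
# 1 2 3 4 5
# 1 2 3 4 6
```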
Discussion started by: Indra2011
JOIN(1) General Commands Manual JOIN(1)
NAME
join - relational database operator
SYNOPSIS
join [ options ] file1 file2
DESCRIPTION
Join forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If file1 is `-', the standard
input is used.
File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the first in
each line.
There is one line in the output for each pair of lines in file1 and file2 that have identical join fields. The output line normally
consists of the common field, then the rest of the line from file1, then the rest of the line from file2.
Fields are normally separated by blank, tab or newline. In this case, multiple separators count as one, and leading separators are
discarded.
These options are recognized:
-an In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2.
-e s Replace empty output fields by string s.
-jn m Join on the mth field of file n. If n is missing, use the mth field in each file.
-o list
Each output line comprises the fields specified in list, each element of which has the form n.m, where n is a file number and m is a
field number.
-tc Use character c as a separator (tab character). Every appearance of c in a line is significant.
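As an illustration of how these options combine, here is a sketch using the spelling accepted by current POSIX-style join implementations (`-a 1`, `-t'|'`, and so on), which may differ slightly from the V7 syntax above. The sample data, the helper name `join_demo`, and the use of `mktemp -d` for scratch files are assumptions made for the example.

```shell
# Join two '|'-separated files on their first field, keeping unpairable
# lines from the first file and filling missing fields with NONE.
# join_demo and the sample data are hypothetical.
join_demo() {
  dir=$(mktemp -d)
  # Both inputs must already be sorted on the join field.
  printf '%s\n' '1|alice' '2|bob' '3|carol' > "$dir/names"
  printf '%s\n' '1|NY' '3|LA' > "$dir/cities"
  # -t'|'  : field separator
  # -a 1   : also output lines from file 1 that have no match in file 2
  # -o     : output file1 field 1, file1 field 2, file2 field 2
  # -e     : string to substitute for empty output fields
  join -t'|' -a 1 -e NONE -o 1.1,1.2,2.2 "$dir/names" "$dir/cities"
  rm -r "$dir"
}

join_demo
# prints:
# 1|alice|NY
# 2|bob|NONE
# 3|carol|LA
```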
SEE ALSO
sort(1), comm(1), awk(1)
BUGS
With default field separation, the collating sequence is that of sort -b; with -t, the sequence is that of a plain sort.
The conventions of join, sort, comm, uniq, look and awk(1) are wildly incongruous.
JOIN(1)