08-28-2017
Are all of your .csv files sorted on field 38? If not, your code won't work. (You will get artificially high counts for the number of distinct values in a file because uniq produces a line of output for each case where a field 38 value changes from the value found on the previous line.)
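To see the effect, here is a minimal sketch using a made-up one-column sample in place of field 38 (the data and variable names are only for illustration):

```shell
# Hypothetical sample: the value "a" appears twice, but not on adjacent lines.
sample=$(printf '%s\n' a b a c)

# uniq only collapses adjacent duplicates, so the second "a" is counted again.
unsorted_count=$(printf '%s\n' "$sample" | uniq | wc -l | tr -d ' ')

# Sorting first puts equal values next to each other, giving the true count.
sorted_count=$(printf '%s\n' "$sample" | sort | uniq | wc -l | tr -d ' ')

echo "without sort: $unsorted_count, with sort: $sorted_count"
```

The unsorted pipeline reports 4 values and the sorted one reports 3.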
Does each distinct value in field 38 appear in only one of your input files? If not, your code won't work. (You don't have any way to determine which distinct values in a single file also appear in one or more of the other files.)
Are you always processing 3 files?
Having 3 files of a megabyte each should not cause any problem producing a single merged or sorted combined file. Why are you unable to merge them?
Why not just use a single awk script to read all of your files once and produce the output you want for each input file and for the combined input from all of the input files?
Do you really want the number of distinct field 38 values in each input file? Or, do you really just want the number of distinct field 38 values in the merged input files?
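As a rough sketch of the single-awk approach, something like the following might work. It assumes plain comma-separated fields with no quoted commas and no header line; the function name count_distinct and the file names in the usage note are made up for illustration:

```shell
# count_distinct COL FILE...
# Prints the number of distinct values in field COL for each file,
# then the number of distinct values across all files combined.
# Assumes simple comma-separated input (no quoted fields, no header row).
count_distinct() {
    col=$1
    shift
    awk -F, -v col="$col" '
        {
            # First time this value is seen in the current file.
            if (!((FILENAME, $col) in per_file)) {
                per_file[FILENAME, $col] = 1
                if (!(FILENAME in n)) order[++nfiles] = FILENAME
                n[FILENAME]++
            }
            # First time this value is seen in any file.
            if (!($col in seen)) {
                seen[$col] = 1
                total++
            }
        }
        END {
            for (i = 1; i <= nfiles; i++)
                printf "%s: %d\n", order[i], n[order[i]]
            printf "combined: %d\n", total
        }
    ' "$@"
}
```

It would be called as count_distinct 38 file1.csv file2.csv file3.csv. The input does not need to be sorted, and each file is read only once.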
join(1) General Commands Manual join(1)
NAME
join - relational database operator
SYNOPSIS
join [options] file1 file2
DESCRIPTION
join forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If file1 or file2 is -, the standard
input is used.
file1 and file2 must be sorted in increasing collating sequence (see Environment Variables below) on the fields on which they are to be
joined; normally the first in each line.
The output contains one line for each pair of lines in file1 and file2 that have identical join fields. The output line normally consists
of the common field followed by the rest of the line from file1, then the rest of the line from file2.
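The behavior described above can be tried with a small sketch. The two files and their contents are made up for illustration and are already sorted on the first field:

```shell
# Hypothetical sample data, already sorted on the first field.
workdir=$(mktemp -d)
printf '%s\n' '1 alice' '2 bob' '3 carol' > "$workdir/names"
printf '%s\n' '1 sales' '3 support' '4 finance' > "$workdir/depts"

# Join on the first field of each file (the default).
# Keys 2 and 4 appear in only one file, so they produce no output line.
joined=$(join "$workdir/names" "$workdir/depts")
printf '%s\n' "$joined"
# 1 alice sales
# 3 carol support
```

Each output line is the common field, then the rest of the line from the first file, then the rest of the line from the second.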
The default input field separators are space, tab, or new-line. In this case, multiple separators count as one field separator, and
leading separators are ignored. The default output field separator is a space.
Some of the options below use the argument n. This argument should be a 1 or a 2, referring to either file1 or file2, respectively.
Options
-a n  In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2.
-e s  Replace empty output fields by string s.
-j m  Join on field m of both files. The argument m must be delimited by space characters. This option and the following two are provided for backward compatibility. Use of the -1 and -2 options (see below) is recommended for portability.
-j1 m  Join on field m of file1.
-j2 m  Join on field m of file2.
-o list  Each output line comprises the fields specified in list, each element of which has the form n.m, where n is a file number and m is a field number. The common field is not printed unless specifically requested.
-t c  Use character c as a separator (tab character). Every appearance of c in a line is significant. The character c is used as the field separator for both input and output.
-v file_number  Instead of the default output, produce a line only for each unpairable line in file_number, where file_number is 1 or 2.
-1 f  Join on field f of file 1. Fields are numbered starting with 1.
-2 f  Join on field f of file 2. Fields are numbered starting with 1.
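A short sketch of how the unpairable-line, output-list, and separator options combine, using made-up comma-separated files that are already sorted on the first field (file names and contents are assumptions for illustration only):

```shell
# Hypothetical comma-separated sample data, sorted on the first field.
workdir=$(mktemp -d)
printf '%s\n' '1,alice' '2,bob' '3,carol' > "$workdir/names.csv"
printf '%s\n' '1,sales' '3,support' '4,finance' > "$workdir/depts.csv"

# -v 1: print only the lines of file1 that have no match in file2.
only_names=$(join -t, -v 1 "$workdir/names.csv" "$workdir/depts.csv")
printf '%s\n' "$only_names"
# 2,bob

# -a 1 with -o and -e: also keep unpairable file1 lines,
# and fill the output field that has no value with the string NONE.
left_join=$(join -t, -a 1 -e NONE -o 1.1,1.2,2.2 \
    "$workdir/names.csv" "$workdir/depts.csv")
printf '%s\n' "$left_join"
# 1,alice,sales
# 2,bob,NONE
# 3,carol,support
```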
EXTERNAL INFLUENCES
Environment Variables
LC_COLLATE determines the collating sequence join expects from input files.
LC_CTYPE determines the alternative blank character as an input field separator, and the interpretation of data within files as single- and/or multi-byte characters. LC_CTYPE also determines whether the separator defined through the -t option is a single- or multi-byte character.
If LC_COLLATE or LC_CTYPE is not specified in the environment or is set to the empty string, the value of LANG is used as a default for each unspecified or empty
variable. If LANG is not specified or is set to the empty string, a default of ``C'' (see lang(5)) is used instead of LANG. If any internationalization
variable contains an invalid setting, join behaves as if all internationalization variables are set to ``C'' (see environ(5)).
International Code Set Support
Single- and multi-byte character code sets are supported with the exception that multi-byte-character file names are not supported.
EXAMPLES
The following command line joins the password file and the group file, matching on the numeric group ID, and outputting the login name, the
group name, and the login directory. It is assumed that the files have been sorted in the collating sequence defined by the LC_COLLATE or LANG environment
variable on the group ID fields.
    join -1 4 -2 3 -o 1.1,2.1,1.6 -t: /etc/passwd /etc/group
The following command produces an output consisting of all possible combinations of lines that have identical first fields in the two sorted
files sf1 and sf2, with each line consisting of the first and third fields from sf1 and the second and fourth fields from sf2:
    join -j1 1 -j2 1 -o 1.1,2.2,1.3,2.4 sf1 sf2
WARNINGS
With default field separation, the collating sequence is that of sort -b; with -t, the sequence is that of a plain sort.
The conventions of join, sort, comm, uniq, and awk are incongruous.
Numeric filenames may cause conflict when the -o option is used immediately before listing filenames.
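Because join expects its inputs in a particular collating sequence, one cautious pattern is to sort both files on the join field and run join under the same locale. The sketch below assumes comma-separated input and forces the C locale; the file names and data are made up for illustration:

```shell
# Hypothetical unsorted comma-separated sample data.
workdir=$(mktemp -d)
printf '%s\n' 'b,2' 'a,1' 'c,3' > "$workdir/left.csv"
printf '%s\n' 'c,z' 'a,x' > "$workdir/right.csv"

# Sort each file on its first field, using the same separator join will use.
LC_ALL=C sort -t, -k1,1 "$workdir/left.csv" > "$workdir/left.sorted"
LC_ALL=C sort -t, -k1,1 "$workdir/right.csv" > "$workdir/right.sorted"

# Run join under the same locale so it agrees with sort on the ordering.
result=$(LC_ALL=C join -t, "$workdir/left.sorted" "$workdir/right.sorted")
printf '%s\n' "$result"
# a,1,x
# c,3,z
```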
AUTHOR
join was developed by OSF and HP.
SEE ALSO
awk(1), comm(1), sort(1), uniq(1).
STANDARDS CONFORMANCE
join(1)