Full Discussion: Comparing two huge files
Post 302231731 by kmkbuddy_1983 on Wednesday 3rd of September 2008 04:15:53 AM
Hi Rahul,

Thank you for your reply. The comm command gives the wrong output with the above files, because I need to compare field 1 of file A against field 5 of file B and output the current and previous line from file A.

comm compares the files line by line, and none of the whole lines will ever match; the only correspondence is between field 1 of file A and field 5 of file B.
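What I am after is something like this two-pass awk (a rough sketch only — fileA and fileB stand in for the real file names, and fields are assumed to be whitespace-separated; add -F'\t' or similar if they are not):

    awk 'NR==FNR { keys[$5]; next }          # first pass: remember field 5 of every file B line
         {
             if ($1 in keys) {               # field 1 of this file A line matches file B?
                 if (prev != "") print prev  # print the previous file A line (if any)
                 print                       # and the current one
             }
             prev = $0
         }' fileB fileA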

Regards,
Mahesh k
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

comparing Huge Files - Performance is very bad

Hi all, can you please help me resolve the following problem? My requirement is this: 1) I have two files, YESTERDAY_FILE and TODAY_FILE, each holding nearly two million records. 2) I need to check each record of TODAY_FILE against YESTERDAY_FILE. If it exists we can skip that by... (5 Replies)
Discussion started by: madhukalyan
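A single awk pass is usually far faster than looping grep for a lookup of this size; a rough sketch, assuming whole-record comparison and that YESTERDAY_FILE fits in memory (the output file name is arbitrary):

    awk 'NR==FNR { seen[$0]; next }   # first pass: load every YESTERDAY_FILE record
         !($0 in seen)                # second pass: print TODAY_FILE records not seen yesterday
        ' YESTERDAY_FILE TODAY_FILE > new_records.txt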

2. UNIX for Dummies Questions & Answers

Difference between two huge files

Hi, as per my requirement, I need to take the difference between two big files (around 6.5 GB) and write the difference to an output file without any line numbers or '<' or '>' in front of each new line. As the diff command won't work for big files, I tried to use bdiff instead. I am getting incorrect... (13 Replies)
Discussion started by: pyaranoid
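If the goal is simply the lines present in one file but not the other, sorting both and using comm sidesteps diff's memory limits and prints bare lines; a sketch with placeholder file names (sort needs enough temporary disk space for files this size):

    sort file1 > file1.sorted
    sort file2 > file2.sorted
    comm -23 file1.sorted file2.sorted > only_in_file1.txt   # lines present only in file1
    comm -13 file1.sorted file2.sorted > only_in_file2.txt   # lines present only in file2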

3. UNIX for Advanced & Expert Users

Huge files manipulation

Hi, I need a fast way to delete duplicate entries from very huge files (>2 GB); the files are plain text. I tried all the usual methods (awk / sort / uniq / sed / grep ...) but it always ended with the same result (a memory core dump). I am using large HP-UX servers. Any advice will... (8 Replies)
Discussion started by: Klashxx
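Since sort does an external merge sort, it can deduplicate files far bigger than memory; the usual trick is to point its temporary space at a filesystem with enough room. A sketch (the paths are placeholders, -T is not universal, and the original line order is not preserved):

    sort -u -T /scratch/tmp bigfile.txt > bigfile.dedup   # -T: temp dir (some sorts use TMPDIR instead)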

4. Shell Programming and Scripting

Compare 2 folders to find several missing files among huge amounts of files.

Hi, all: I've got two folders, say, "folder1" and "folder2". Under each there are thousands of files, and it's quite obvious that some files are missing from each; I would just like to find them. I believe this can be done with the "diff" command. However, if I change the above question a... (1 Reply)
Discussion started by: jiapei100
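Comparing sorted name lists with comm is a cheap way to spot the missing files without reading any file contents; a sketch (the list locations are arbitrary, and only names are compared, not contents):

    ( cd folder1 && find . -type f | sort ) > /tmp/folder1.list
    ( cd folder2 && find . -type f | sort ) > /tmp/folder2.list
    comm -23 /tmp/folder1.list /tmp/folder2.list   # names present only in folder1
    comm -13 /tmp/folder1.list /tmp/folder2.list   # names present only in folder2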

5. Shell Programming and Scripting

Comparing two huge files on field basis.

Hi all, I have two large files and I want a field-by-field comparison for each record in them. All fields are tab-separated. file1: Email SELVAKUMAR RAMACHANDRAN Email SHILPA SAHU Web NIYATI SONI Web NIYATI SONI Email VIINII DOSHI Web RAJNISH KUMAR Web ... (4 Replies)
Discussion started by: Suman Singh
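One way to get a record-by-record, field-by-field report is to hold the first file in an awk array keyed by record number and walk the second file against it; a sketch that assumes both files have the same record order and that file1 fits in memory (file2 is a placeholder name):

    awk -F'\t' '
        NR == FNR { line[FNR] = $0; next }         # slurp file1, keyed by record number
        {
            n = split(line[FNR], f, "\t")          # fields of the matching file1 record
            m = (n > NF) ? n : NF
            for (i = 1; i <= m; i++)
                if (f[i] != $i)
                    printf "record %d, field %d: %s <> %s\n", FNR, i, f[i], $i
        }' file1 file2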

6. Shell Programming and Scripting

Comparing 2 huge text files

I have these 2 files: k5login sanwar@systems.nyfix.com jjamnik@systems.nyfix.com nisha@SYSTEMS.NYFIX.COM rdpena@SYSTEMS.NYFIX.COM service/backups-ora@SYSTEMS.NYFIX.COM ivanr@SYSTEMS.NYFIX.COM nasapova@SYSTEMS.NYFIX.COM tpulay@SYSTEMS.NYFIX.COM rsueno@SYSTEMS.NYFIX.COM... (11 Replies)
Discussion started by: linuxgeek

7. Shell Programming and Scripting

Perl: Need help comparing huge files

What do I need to do to have the Perl program below load 205-million-record files into the hash? It currently works on smaller files, but not on huge files. Any idea what I need to modify to make it work with huge files: #!/usr/bin/perl $ot1=$ARGV; $ot2=$ARGV; open(mfileot1,... (12 Replies)
Discussion started by: mrn6430
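A 205-million-key hash is a lot to ask of one perl process; a common workaround is to skip the in-memory hash altogether and let sort and join do the matching on disk. A sketch of that alternative, with placeholder file names and the key assumed to be in field 1 of each file:

    sort -k1,1 file1 > file1.srt
    sort -k1,1 file2 > file2.srt
    join file1.srt file2.srt > in_both.txt   # records whose field-1 key appears in both files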

8. Shell Programming and Scripting

awk to parse huge files

Hello all, I have a situation as below: (1) read a source file (a single file with 1.2 million rows in it); (2) read the destination files one by one and replace the content (a few fields in it) with the corresponding matching field from the source file. I tried as below: (please note I am not... (4 Replies)
Discussion started by: panyam
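The usual awk pattern for this is to load the source file into a lookup array on the first pass and rewrite each destination file on the second; a sketch only, since the key and field positions here are guesses (field 1 as the key, source field 2 replacing destination field 3, tab-delimited, made-up file names):

    awk -F'\t' -v OFS='\t' '
        NR == FNR { repl[$1] = $2; next }   # source: key in field 1, replacement value in field 2
        ($1 in repl) { $3 = repl[$1] }      # destination: overwrite field 3 on a key match
        { print }
    ' source.txt destination.txt > destination.new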

9. Shell Programming and Scripting

Work with huge Zipped files

Hello dear members, I have one general and one specific question, and I would be very grateful if you could help me with them. Let's start with my general question: 1. I am working on a cluster computer shared with other people, and I need to manipulate a big zipped text file of 13 GB. There is... (1 Reply)
Discussion started by: Homa
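For a file that size the usual approach is to stream it through a pipe rather than uncompress it to disk; a sketch, assuming gzip compression (use unzip -p for a real .zip archive) and with made-up file names and filter:

    gzip -dc big.txt.gz | wc -l                                           # e.g. count the records
    gzip -dc big.txt.gz | awk -F'\t' '$3 == "X"' | gzip > subset.txt.gz   # filter and recompress on the fly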

10. Shell Programming and Scripting

Aggregation of Huge files

Hi friends! I am facing a hash-total issue while working over a set of files of huge volume. Command used: tail -n +2 <File_Name> |nawk -F"|" -v '%.2f' qq='"' '{gsub(qq,"");sa+=($156<0)?-$156:$156}END{print sa}' OFMT='%.5f' The file is pipe-delimited and column 156 is used for hash totalling.... (14 Replies)
Discussion started by: Ravichander
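The quoted command looks as if it was mangled in posting (the -v assignment and the OFMT setting have run together); a cleaned-up sketch of the same idea, with a placeholder file name and printf used for the rounding instead of OFMT:

    tail -n +2 file.dat |
    nawk -F'|' -v qq='"' '{
            gsub(qq, "")                       # strip embedded double quotes
            sa += ($156 < 0) ? -$156 : $156    # accumulate the absolute value of column 156
    } END {
            printf "%.2f\n", sa                # print the hash total to two decimal places
    }'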
JOIN(1)                       General Commands Manual                       JOIN(1)

NAME
     join - relational database operator

SYNOPSIS
     join [ options ] file1 file2

DESCRIPTION
     Join forms, on the standard output, a join of the two relations
     specified by the lines of file1 and file2.  If file1 is `-', the
     standard input is used.

     File1 and file2 must be sorted in increasing ASCII collating sequence
     on the fields on which they are to be joined, normally the first in
     each line.

     There is one line in the output for each pair of lines in file1 and
     file2 that have identical join fields.  The output line normally
     consists of the common field, then the rest of the line from file1,
     then the rest of the line from file2.

     Fields are normally separated by blank, tab or newline.  In this
     case, multiple separators count as one, and leading separators are
     discarded.

     These options are recognized:

     -an      In addition to the normal output, produce a line for each
              unpairable line in file n, where n is 1 or 2.

     -e s     Replace empty output fields by string s.

     -jn m    Join on the mth field of file n.  If n is missing, use the
              mth field in each file.

     -o list  Each output line comprises the fields specified in list,
              each element of which has the form n.m, where n is a file
              number and m is a field number.

     -tc      Use character c as a separator (tab character).  Every
              appearance of c in a line is significant.

SEE ALSO
     sort(1), comm(1), awk(1)

BUGS
     With default field separation, the collating sequence is that of
     sort -b; with -t, the sequence is that of a plain sort.

     The conventions of join, sort, comm, uniq, look and awk(1) are
     wildly incongruous.

                                                                             JOIN(1)