10-22-2012
Diff on 1 GB files
Hey Guys,
I need to compare two different files of about 1 GB each and get the lines that are not common to both. I planned to use the sdiff command, which generally works perfectly for me, but in this case I am getting an error saying:
"diff: memory exhausted"
Can anyone please explain this?
Also, can anyone suggest a Unix command for comparing huge files that reports both the common and the uncommon lines between the two files?
Thanks,
Abhishek S.
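
One possible approach, offered as a sketch and not a definitive answer: sdiff runs diff underneath, and GNU diff reads both files into memory, which is the likely cause of "memory exhausted" on 1 GB inputs. If the order and position of lines do not matter, sorting both files and running comm avoids that, because sort spills to temporary files on disk and comm streams its input. The file names and sample data below are hypothetical stand-ins for the real files.

```shell
# Hypothetical sample data; substitute the real 1 GB inputs.
printf '123\n1234\n123456\n' > file1.txt
printf '123\n2345\n23456\n' > file2.txt

# comm requires sorted input. LC_ALL=C makes sort and comm agree on a
# plain byte-order collation.
LC_ALL=C sort file1.txt > file1.sorted
LC_ALL=C sort file2.txt > file2.sorted

# comm prints three columns: lines only in the first file, lines only
# in the second file, and lines common to both. The digits after the
# dash name the columns to suppress.
LC_ALL=C comm -12 file1.sorted file2.sorted > common.txt
LC_ALL=C comm -23 file1.sorted file2.sorted > only_in_file1.txt
LC_ALL=C comm -13 file1.sorted file2.sorted > only_in_file2.txt
```

The results come back in sorted order, not in the order of the original files, so this suits a "which lines differ" question more than a positional diff. sort also needs enough free space in its temporary directory for files of this size.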
SDIFF(1) User Commands SDIFF(1)
NAME
sdiff - side-by-side merge of file differences
SYNOPSIS
sdiff [OPTION]... FILE1 FILE2
DESCRIPTION
Side-by-side merge of file differences.
-o FILE --output=FILE
Operate interactively, sending output to FILE.
-i --ignore-case
Consider upper- and lower-case to be the same.
-E --ignore-tab-expansion
Ignore changes due to tab expansion.
-b --ignore-space-change
Ignore changes in the amount of white space.
-W --ignore-all-space
Ignore all white space.
-B --ignore-blank-lines
Ignore changes whose lines are all blank.
-I RE --ignore-matching-lines=RE
Ignore changes whose lines all match RE.
--strip-trailing-cr
Strip trailing carriage return on input.
-a --text
Treat all files as text.
-w NUM --width=NUM
Output at most NUM (default 130) columns per line.
-l --left-column
Output only the left column of common lines.
-s --suppress-common-lines
Do not output common lines.
-t --expand-tabs
Expand tabs to spaces in output.
-d --minimal
Try hard to find a smaller set of changes.
-H --speed-large-files
Assume large files and many scattered small changes.
--diff-program=PROGRAM
Use PROGRAM to compare files.
-v --version
Output version info.
--help Output this help.
If a FILE is `-', read standard input.
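
As a rough illustration of a few of the options above, the following sketch compares two small files. The file names and contents are hypothetical, and the exact column layout of the output may vary between versions.

```shell
# Hypothetical sample files that differ only in the case of one line.
printf 'alpha\nbeta\ngamma\n' > left.txt
printf 'alpha\nBETA\ngamma\n' > right.txt

# -s suppresses common lines; -w 40 limits output to 40 columns.
# sdiff exits with status 1 when the files differ, so record the status
# without treating it as a failure.
status=0
sdiff -s -w 40 left.txt right.txt > changed.txt || status=$?

# -i treats upper and lower case as the same, so with -s nothing is left
# to print and the exit status is 0.
nocase_status=0
sdiff -s -i left.txt right.txt > changed_nocase.txt || nocase_status=$?
```

Here changed.txt holds only the differing pair of lines, separated by a `|` marker, while changed_nocase.txt is empty.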
AUTHOR
Written by Thomas Lord.
REPORTING BUGS
Report bugs to <bug-gnu-utils@gnu.org>.
COPYRIGHT
Copyright (C) 2002 Free Software Foundation, Inc.
This program comes with NO WARRANTY, to the extent permitted by law. You may redistribute copies of this program under the terms of the
GNU General Public License. For more information about these matters, see the file named COPYING.
SEE ALSO
The full documentation for sdiff is maintained as a Texinfo manual. If the info and sdiff programs are properly installed at your site,
the command
info diff
should give you access to the complete manual.
diffutils 2.8.1 April 2002 SDIFF(1)