09-16-2008
If the little file contains even one line that is not present in the big file, the original script goes into an infinite loop. fpmurphy's fix stops the infinite loop, but the output is probably still wrong: after encountering such a line, my script will output the remainder of the big file. The one case that I guess is OK is when the little file has a copy of the big file's final line and all of its extra lines follow that final line.
BTW, if the little file has any extra lines, it is not a subset at all. A proper subset would have fewer lines than the big file. An "improper" or "non-proper" subset would be an exact copy of all of its lines.
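One way to check for such extra lines before running either script is sketched below. This is only an illustration, not anything posted in the thread: the function names `not_in_big` and `is_subset` and their argument order are made up here, and it assumes lines are compared as exact, whole-line strings. Note that grep reads the whole big file as its pattern list, so this may be slow or memory-hungry for very large files.

```shell
#!/bin/sh
# not_in_big LITTLE BIG  (hypothetical helper, not from the thread)
# Print every line of LITTLE that has no exact match in BIG.
not_in_big() {
    # -F: patterns are fixed strings, -x: match whole lines only,
    # -v: select the lines that do NOT match, -f: read patterns from BIG
    grep -Fxv -f "$2" "$1"
}

# is_subset LITTLE BIG  (hypothetical helper, not from the thread)
# Succeed only when every line of LITTLE also occurs in BIG.
is_subset() {
    if not_in_big "$1" "$2" >/dev/null; then
        return 1        # grep selected at least one extra line
    else
        [ $? -eq 1 ]    # 1 = no extra lines; 2 = grep itself failed
    fi
}
```

Called as `is_subset little.txt big.txt || not_in_big little.txt big.txt`, it lists the offending lines only when there are any.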
rmgdiff
RMGDIFF(1x) RMGDIFF(1x)
NAME
rmgdiff - use almost any graphical file difference browser to recursively view the differences between two directories.
SYNOPSIS
rmgdiff [-b] [-c] [-d] [-g gui] [-n] dir1 dir2
DESCRIPTION
rmgdiff is an awk script that works in conjunction with almost any graphical file difference browser. It is known to work with mgdiff,
tkdiff, and xdiff.
Unless I am mistaken, most of the GUI difference viewers (except for emacs) do not have built-in support for recursing down two
directories, but diff does. Based on diff's output, rmgdiff decides when to invoke the graphical difference viewer.
In addition, rmgdiff also collates diff's output. As soon as a new difference is encountered in a text file, rmgdiff will print to
standard output the name of the file that both directories have in common. It will then start the GUI and block until the user exits. As more
text files with differences are found, the GUI will be started up again.
In the interim, rmgdiff will keep track of differences in binary (non-text) files. It organizes the binary files as executables, shared
libraries, static libraries, object files, and other. Only after all the text files have been displayed will rmgdiff report the binary
differences.
It also keeps track of files and directories that diff reports as being only in one directory or another. rmgdiff organizes these entries
by directory. Thus, files in one directory will be reported in one block, and files that are in the other directory will be reported in a
different block.
In addition to printing the name of the files that are different, rmgdiff defaults to printing the relevant portion of the output from the
file command. This has the unfortunate side-effect of slowing things down; however, I find this information to be invaluable. If you're
just looking for a fast way to collate diff's output, try piping it into sort instead.
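The suggestion to pipe diff's output into sort might look like the sketch below. It is an assumption about what is meant, not taken from rmgdiff itself: the function name `collate_diff` is hypothetical, and it assumes a diff that supports -q (GNU and BSD diff do).

```shell
#!/bin/sh
# collate_diff DIR1 DIR2  (hypothetical helper, not part of rmgdiff)
# Print a sorted one-line-per-file summary of how two directory trees differ.
collate_diff() {
    # -r recurses into subdirectories; -q reports only that files differ.
    # diff exits 1 when it finds differences, which is expected here, so
    # only a real failure (status 2 or higher) is left as an error.
    { diff -rq "$1" "$2" || [ $? -eq 1 ]; } | sort
}
```

Sorting groups the "Files ... differ" lines together and the "Only in ..." lines together by directory, which is a rough stand-in for the collation described above, without the file-type information.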
COMMAND LINE OPTIONS
-b Sets the basic reporting mode. In basic mode, rmgdiff reports only file names. It does not report the file types involved.
-c By default, files relating to CVS are ignored by rmgdiff. If you want to include CVS files, use this option.
-d Sets rmgdiff to print way too much debugging information.
-g gui Tells rmgdiff which GUI you would like to use for viewing differences. By default, mgdiff is used. You can also set $RMGDIFF_GUI
in your environment, but it can be overridden with this option.
-n rmgdiff will not invoke the GUI. This is useful if you only want to view the collated output.
AUTHOR
Paul Serice (paul@serice.net)
RMGDIFF(1x)