Post 302234783 by pyaranoid on Wednesday 10th of September 2008 10:10:21 AM (forum: UNIX for Dummies Questions & Answers)
Difference between two huge files

Hi,

I need to take the difference between two big files (around 6.5 GB) and write it to an output file without any line numbers or '<' or '>' in front of each line.

As the diff command won't work on files this big, I tried to use bdiff instead.

With bdiff I am getting an incorrect number of records.

I have done the following test:

I have a .dat file with a few million records in it, and to generate a second file I used sed '1,100d' oldfile > newfile

Then I ran bdiff oldfile newfile | sed -n '/^</p' > DIFF.DAT

The output (DIFF.DAT) should have 100 records in it, but the output I get does not have 100 records.

Could anyone help me out of this situation?

Thanks

Sue
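
A small-scale version of the test described above can be sketched as follows. It is only an illustration: the sample data is 1000 lines rather than millions, the helper name removed_lines is made up here, and plain diff is used in place of bdiff. The bdiff manual pages describe it as splitting both files into fixed-size segments and running diff on each pair, so once 100 lines are removed the segments may no longer line up, which is one plausible reason for the unexpected record count. Note also that sed -n '/^</p' keeps the '<' marker, while the substitution below strips it.

```shell
#!/bin/sh
# Sketch only: file names, sample data and the helper name are illustrative.

# Print the lines that diff reports as present only in the first file,
# with the leading "< " marker stripped and no line-number headers.
removed_lines() {
    diff "$1" "$2" | sed -n 's/^< //p'
}

workdir=$(mktemp -d)

# Build a 1000-line "oldfile", then drop its first 100 lines as in the post.
awk 'BEGIN { for (i = 1; i <= 1000; i++) print "record " i }' > "$workdir/oldfile"
sed '1,100d' "$workdir/oldfile" > "$workdir/newfile"

removed_lines "$workdir/oldfile" "$workdir/newfile" > "$workdir/DIFF.DAT"

# Count the records reported as removed; 100 is expected for this sample.
wc -l < "$workdir/DIFF.DAT"

rm -rf "$workdir"
```

This does not address the size problem itself, since diff may still fail on multi-gigabyte inputs. If record order does not matter, comparing sorted copies with comm -23 is a commonly used alternative, but that is a different technique from the one in the post.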
 

trbsd(1)                     General Commands Manual                     trbsd(1)

NAME
       trbsd - Translates characters

SYNOPSIS
       trbsd [-Acs] string1 string2
       trbsd -d [-Ac] string1

       The trbsd command copies characters from the standard input to the standard output with substitution or deletion of selected characters.

OPTIONS
       -A   Translates on a byte-by-byte basis. When you specify this option, trbsd does not support extended characters.

       -c   Complements (inverts) the set of characters in string1 with respect to the universe of characters whose codes are 001 through 377 octal if you specify -A, and all characters if you do not specify -A.

       -d   Deletes all characters in string1 from output.

       -s   Changes characters that are repeated output characters in string2 into single characters.

DESCRIPTION
       Input characters from string1 are replaced with the corresponding characters in string2. The trbsd command cannot handle an ASCII NUL (00) in string1 or string2; it always deletes NUL from the input. The tr command is a System V compatible version of trbsd.

       Abbreviations such as a-z, standing for a string of characters whose ASCII codes run from character a to character z, inclusive, can be used to introduce ranges of characters. Note that brackets are not special characters.

       Use the escape character (backslash) to remove the special meaning from any character in a string. Use the backslash followed by 1, 2, or 3 octal digits for the code of a character.

       If a given character appears more than once in string1, the character in string2 corresponding to its last appearance in string1 will be used in the translation.

EXAMPLES
       To translate braces into parentheses, enter:

            trbsd '{}' '()' <textfile >newfile

       This translates each { (left brace) to a ( (left parenthesis) and each } (right brace) to a ) (right parenthesis). All other characters remain unchanged.

       To translate lowercase ASCII characters to uppercase, enter:

            trbsd a-z A-Z <textfile >newfile

       The two strings can be of different lengths:

            trbsd 0-9 # <textfile >newfile

       This translates each digit to a # (number sign); if string2 is too short, it is padded to the length of string1 by duplicating its last character.

       To translate each string of digits to a single # (number sign), enter:

            trbsd -s 0-9 # <textfile >newfile

       To translate all ASCII characters that are not specified, enter:

            trbsd -c ' -~' 'A-_' <textfile >newfile

       This translates each nonprinting ASCII character to the corresponding control key letter (01 translates to A, 02 to B, and so on). ASCII DEL (177), the character that follows ~ (tilde), translates to a ? (question mark).

SEE ALSO
       Commands: ed(1), sh(1), tr(1)

       Files: ascii(5)

trbsd(1)
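
The translation, range, octal-escape and deletion behaviour described above can be tried out with a short script. This is a sketch under assumptions: trbsd is unlikely to be installed on current systems, so the standard tr utility is used as a stand-in on the assumption that it agrees with trbsd for these simple equal-length cases, and the function names are invented for the example.

```shell
#!/bin/sh
# Sketch only: tr stands in for trbsd; function names are hypothetical.
LC_ALL=C
export LC_ALL

# Translate braces into parentheses (string1 and string2 have equal length).
braces_to_parens() { tr '{}' '()'; }

# Translate lowercase ASCII letters to uppercase using character ranges.
to_upper() { tr 'a-z' 'A-Z'; }

# Replace each tab (octal code 011) with a space via a backslash-octal escape.
tabs_to_spaces() { tr '\011' ' '; }

# Delete every digit from the input (the -d form takes only string1).
strip_digits() { tr -d '0-9'; }

printf 'f{x}\n'   | braces_to_parens   # prints f(x)
printf 'hello\n'  | to_upper           # prints HELLO
printf 'a\tb\n'   | tabs_to_spaces     # prints a b
printf 'a1b22c\n' | strip_digits       # prints abc
```

The examples with a short string2 (such as 0-9 translated to #) are left out on purpose, because padding of string2 is the kind of detail where tr implementations can differ from trbsd.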
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.