Creating DELTA file in UNIX


 
# 8  
Old 04-15-2010
OK, then it's simple (pseudocode)
Code:
get file.today
sort file.today
diff file.yesterday file.today | grep '^>' to get new/changed lines
mv file.today file.yesterday
sleep 24h
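For anyone who wants to try it, here is one way the pseudocode above might look as runnable shell. It is only a sketch: the function name make_delta and the scratch files are made up for the demo, and in practice a scheduler such as cron would normally replace the "sleep 24h" step.

```shell
#!/bin/sh
# Sketch of a single daily run. make_delta and the demo files are
# hypothetical names, not part of any real system.
make_delta() {
    yesterday=$1   # sorted snapshot kept from the previous run
    today=$2       # freshly received file
    delta=$3       # output: lines diff reports as added or changed

    sort -o "$today" "$today"                 # sort today's file in place
    diff "$yesterday" "$today" | grep '^>' > "$delta"
    mv "$today" "$yesterday"                  # today's file becomes the next baseline
}

# Demo in a scratch directory
dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$dir/file.yesterday"
printf 'd\na\nc\n' > "$dir/file.today"
make_delta "$dir/file.yesterday" "$dir/file.today" "$dir/delta"
cat "$dir/delta"
```

The delta still carries diff's "> " prefix at this point, exactly as in the pseudocode.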

# 9  
Old 04-15-2010
Code:
 
awk 'NR==FNR {a[$0]; next} !($0 in a)' file1 file2

I'm not sure about the performance, since it holds every line of file1 in memory.

You can also use "comm", as below, to get the lines present or changed in file2 only:

Code:
 
comm -13 file1 file2

Make sure both files are sorted.
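A small self-contained sketch of the comm approach, using made-up sample data in a scratch directory. The -13 option drops the lines unique to the first file and the lines common to both, which leaves the lines found only in the second file. The locale is pinned so that sort and comm agree on ordering.

```shell
#!/bin/sh
dir=$(mktemp -d)
printf 'line 3\nline 1\nline 2\n' > "$dir/file1"
printf 'line 2\nline 4\nline 1\n' > "$dir/file2"

# comm expects both inputs sorted in the same collation order
LC_ALL=C sort "$dir/file1" > "$dir/file1.sorted"
LC_ALL=C sort "$dir/file2" > "$dir/file2.sorted"

# -1 hides lines only in file1, -3 hides lines present in both
only_in_file2=$(LC_ALL=C comm -13 "$dir/file1.sorted" "$dir/file2.sorted")
echo "$only_in_file2"
```

Unlike the diff approach, the output has no "> " prefix to strip.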
# 10  
Old 04-15-2010
Quote:
Originally Posted by pludi
OK, then it's simple (pseudocode)
Code:
get file.today
sort file.today
diff file.yesterday file.today | grep '^>' to get new/changed lines
mv file.today file.yesterday
sleep 24h

No, this assumes that all the records that were sent yesterday are also sent today.
Code:
get file.today
sort file.today
merge file.yesterday file.today >new.master
diff file.yesterday new.master |grep '^>' >delta
mv new.master file.yesterday

And this doesn't work either: the merge cannot tell a changed record from a new record.
You have to define a key portion of the record so that a changed record can be compared to an existing record.
Write a correct sequential update program. You know, the kind that used a tape master file as input, a sorted card deck, and a tape out.
my.safaribooksonline.com/9780471722618/the_balanced_line_algorithm_for_sequenti#X2ludGVybmFsX0ZsYXNoUmVhZGVyP3htbGlkPTk3ODA0NzE3MjI2MTgvNTY3JmltYWdlcGFnZT01Njc=
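To illustrate the point about a key, here is a rough awk sketch. It is not the balanced line algorithm from the book, just an in-memory lookup, and it assumes pipe-delimited records with the key in field 1. The field layout and sample data are invented for the example.

```shell
#!/bin/sh
dir=$(mktemp -d)
printf '1|alice|10\n2|bob|20\n3|carol|30\n' > "$dir/file.yesterday"
printf '1|alice|10\n2|bob|25\n4|dave|40\n' > "$dir/file.today"

# First file: remember each record under its key (field 1, an assumption).
# Second file: an unknown key means a new record; a known key with
# different content means a changed record.
classified=$(awk -F'|' '
    NR == FNR     { old[$1] = $0; next }
    !($1 in old)  { print "NEW " $0; next }
    old[$1] != $0 { print "CHANGED " $0 }
' "$dir/file.yesterday" "$dir/file.today")
echo "$classified"
```

Record 3 is missing from today's file and is not reported here; detecting deletions would need a second pass over the old keys.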

Last edited by pludi; 04-15-2010 at 09:12 AM.. Reason: corrected url
# 11  
Old 04-15-2010
Quote:
Originally Posted by jgt
No, this assumes that all the records that were sent yesterday are also sent today.
No, it does not. Grepping for lines starting with '>' returns only the lines that diff considers inserted. Example:
Code:
$ cat file1
This is line 1
This is line 2
This is line 3
This is line 4
$ cat file2
This is line 1
This is line 2
This is line 4
This is line 5
$ diff file1 file2
3d2
< This is line 3
4a4
> This is line 5
$ diff file1 file2 | grep '^>'
> This is line 5

This will work for the assumptions I've made here and that were confirmed as correct by the OP, as he only requires the lines added since the last run (and diff thinks of changed lines as one remove and one insert operation) without needing them in any special order.
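A quick check of the changed-line case, with invented sample records: when a line is edited, diff shows the old version with '<' and the new version with '>', so the grep picks up the new version.

```shell
#!/bin/sh
dir=$(mktemp -d)
printf 'id1 red\nid2 green\nid3 blue\n' > "$dir/file1"
printf 'id1 red\nid2 yellow\nid3 blue\n' > "$dir/file2"

# diff reports the edit as one change hunk:
# "< id2 green" for the old line and "> id2 yellow" for the new one
added=$(diff "$dir/file1" "$dir/file2" | grep '^>')
echo "$added"
```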

---------- Post updated at 14:14 ---------- Previous update was at 14:10 ----------

Also, the URL you gave might even lead to the holy grail of sequential file processing, but it only presents a very short preview for those not in the illustrious circle of Safari Books members. Could you perhaps post an excerpt of the relevant parts?
# 12  
Old 04-15-2010
Quote:
Also, the URL you gave might even lead to the holy grail of sequential file processing, but it only presents a very short preview for those not in the illustrious circle of Safari Books members. Could you perhaps post an excerpt of the relevant parts?
"holy grail" = "doing it right"
"illustrious circle" = "those prepared to pay for knowledge"
"relevant parts" = "about 200 lines of code, which I have in COBOL and FORTRAN V, but unfortunately only on paper."
The OP is welcome to discuss this with me.
# 13  
Old 04-15-2010
assumption correct

Thanks to all for the great input.

The assumption that all the records that were sent yesterday are also sent today is correct. The system will always work this way.

Also, what is the purpose of sorting today's file?

sort file.today

Does diff work only on sorted files?

---------- Post updated at 10:32 AM ---------- Previous update was at 10:29 AM ----------


Also, the produced file should contain only the records, not the leading > symbol that comes from the diff command. I guess I will need some command to remove the initial '> ' from every line.
# 14  
Old 04-15-2010
No, sorting is not required for diff. But if you have to trace a problem, it'll be easier if similar records are close to each other.

To remove the angle bracket and the space at the beginning, just cut the line from character 3 onward:
Code:
diff file1 file2 | grep '^>' | cut -c 3-
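As an alternative sketch, sed can do the filtering and the stripping in one step. The sample files below repeat the example from earlier in the thread, written to a scratch directory.

```shell
#!/bin/sh
dir=$(mktemp -d)
printf 'This is line 1\nThis is line 2\nThis is line 3\nThis is line 4\n' > "$dir/file1"
printf 'This is line 1\nThis is line 2\nThis is line 4\nThis is line 5\n' > "$dir/file2"

# -n turns off automatic printing; the p flag prints only the lines
# where the leading "> " was actually removed
delta=$(diff "$dir/file1" "$dir/file2" | sed -n 's/^> //p')
echo "$delta"
```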
