Post 302978579 by Saanvi1 on Monday 1st of August 2016 12:45:22 PM
Compare files to pull changed records only

Hi,
I am using Sun Solaris (SunOS). I have two fixed-width files, shown below. For records whose key column matches in both files, I am trying to find the records that have changed in NewFile.txt. The first column is the key column (example: A123).
If records have been added to or deleted from NewFile.txt, I do NOT want them in the output.
All I am trying to achieve is to output the changed records where the key column matches.

File 1: OldFile.txt
Code:
A123   Sim   Firstname1          Lastname1  123 JESSE DR.     Atlanta   GA32839   Sampleemail@YAHOO.COM
B234   TWD   Firstname2          Lastname2  123 FORTHILL1 DR. Atlanta   GA32839   Sampleemail2@YAHOO.COM
C567   TWD   Firstname3          Lastname3  123 FORTHILL2 DR. Atlanta   GA32839   Sampleemail3@YAHOO.COM
D89012 TWD   Firstname3          Lastname3  123 FORTHILL2 DR. Atlanta   GA32839   Sampleemail3@YAHOO.COM

File 2: NewFile.txt
Code:
A123   Sim  UpdatedNewFirstName1 UpdatedNewLastname1  123 JESSE DR.      Atlanta   GA32839  sampleemail@YAHOO.COM
B234   TWD  Firstname2           Lastname2            123 FORTHILL1 DR.  Atlanta   GA32839  Sampleemail2@YAHOO.COM
C5676  TWD  Firstname3           Lastname3            123 FORTHILL2 DR.  Atlanta   GA32839  Sampleemail3@YAHOO.COM
Z12345 TWD  Firstname3           Lastname4            123 FORTHILL2 DR.  Atlanta   GA32839  Sampleemail3@YAHOO.COM

So in the above example, the output would be as shown below, because the first and last names for key A123 have changed (UpdatedNewFirstName1 and UpdatedNewLastname1).
A123 Sim UpdatedNewFirstName1 UpdatedNewLastname1 123 JESSE DR. Atlanta GA32839 sampleemail@YAHOO.COM

The two records below should be ignored:
D89012 TWD Firstname3 Lastname3 123 FORTHILL2 DR. Atlanta GA32839 Sampleemail3@YAHOO.COM -- This record was dropped from the new file. I do not want it in my output.

Z12345 TWD Firstname3 Lastname4 123 FORTHILL2 DR. Atlanta GA32839 Sampleemail3@YAHOO.COM -- This record was added in the new file. I do not want it either.

PS: Please ignore the exact format of the files; I created the sample files above myself, so the column alignment might be slightly off.

All I need is the CHANGED records where the first-field keys match.

Thanks

Last edited by Don Cragun; 08-01-2016 at 02:37 PM.. Reason: Change HTML tags to CODE tags and remove EMAIL tags.
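
One possible approach to the request above, offered as a minimal sketch rather than a definitive answer. It assumes that the key is the first whitespace-separated field, that keys are unique within each file, and that differences in column spacing should not count as changes (the sample files are aligned differently, and the post says to ignore the format). The function name changed_records is made up for illustration. On Solaris the default /usr/bin/awk is the old awk, so nawk or /usr/xpg4/bin/awk would likely be needed in place of awk.

```shell
#!/bin/sh
# changed_records OLD NEW   (hypothetical helper name)
# Prints records from NEW whose key (field 1) also exists in OLD
# but whose remaining content differs. Records that exist in only
# one of the two files are ignored.
changed_records() {
    awk '
        NF == 0 { next }            # skip blank lines
        { $1 = $1 }                 # rebuild the record with single spaces
        NR == FNR {                 # first file: remember each old record by key
            old[$1] = $0
            next
        }
        ($1 in old) && old[$1] != $0   # second file: same key, different content
    ' "$1" "$2"
}
```

Usage would be along the lines of: changed_records OldFile.txt NewFile.txt > Changed.txt. Because each record is rebuilt with single spaces before the comparison, the output is whitespace-normalized, as in the expected output shown in the post. If the real files are strictly fixed width and spacing matters, the "$1 = $1" line can be dropped so that records are compared byte for byte.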
 

All times are GMT -4.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.