Fixed width file comparision not working: Post 302769123 by drl on Monday 11th of February 2013 09:52:33 AM
Hi.

This solution relies on two components, docdiff and a short perl script:
Code:
#!/usr/bin/env bash

# @(#) s2	Demonstrate differences at character level.

pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
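# Optional: "context" is a local utility (it appears to print the environment
# and version banner seen in the output); the line is skipped if it is absent.
C=$HOME/bin/context && [ -f "$C" ] && "$C" perl docdiff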

f1=data1
f2=data2
FILES="$f1 $f2"

pl " Input files $FILES"
head $FILES

pl " perl extraction helper script:"
cat p1

pl " Results, wdiff format, $f1, $f2:"
docdiff --wdiff --char $f1 $f2

pl " Results, wdiff format, $f1, $f2, extracted diff with labels:"
docdiff --wdiff --char $f1 $f2 |
./p1 $f1 $f2

pl " Results, wdiff format, $f2, $f1, extracted diff with labels:"
docdiff --wdiff --char $f2 $f1 |
./p1 $f2 $f1

exit 0

producing:
Code:
% ./s2

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
bash GNU bash 3.2.39
perl 5.10.0
docdiff 0.3.4

-----
 Input files data1 data2
==> data1 <==
orange
123456789xa
X-klystron

==> data2 <==
orange
123456780xb
Y-klystron

-----
 perl extraction helper script:
#!/usr/bin/env perl

# @(#) p1	Demonstrate wdiff difference format extraction with labels.

$f1 = shift || die " Missing first label.\n";
$f2 = shift || die " Missing second label.\n";

while (<>) {
  @a = m/\[-(.*?)-\]/xmsg;
  print "$f1: ", join( "", @a ), "\n" if @a;
  @b = m/\{\+(.*?)\+\}/xmsg;
  print "$f2: ", join( "", @b ), "\n" if @b;
}

exit(0);

-----
 Results, wdiff format, data1, data2:
orange
12345678[-9-]{+0+}x[-a-]{+b+}
[-X-]{+Y+}-klystron

-----
 Results, wdiff format, data1, data2, extracted diff with labels:
data1: 9a
data2: 0b
data1: X
data2: Y

-----
 Results, wdiff format, data2, data1, extracted diff with labels:
data2: 0b
data1: 9a
data2: Y
data1: X

The idea is that docdiff can report differences at a resolution down to individual characters. The perl script then processes the wdiff-style output. The data files were extended so that the test covers several differing lines as well as lines that are identical.
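
For readers who want to explore the marker format without docdiff installed, here is a minimal sketch that works on wdiff-style lines like the ones shown above. It is not part of the original solution: the function name wdiff_extract is hypothetical, it uses awk instead of perl, and it assumes the [-...-] and {+...+} markers never nest.

```shell
#!/bin/sh
# Sketch: pull the deleted ([-...-]) and inserted ({+...+}) text out of
# wdiff-style lines read from stdin. wdiff_extract is a hypothetical name;
# $1 and $2 are the labels to print, as in the perl helper.
wdiff_extract() {
  awk -v l1="$1" -v l2="$2" '
    # Concatenate everything found between op and cl markers in s.
    function grab(s, op, cl,    out, i, j) {
      out = ""
      while ((i = index(s, op)) > 0) {
        s = substr(s, i + length(op))
        j = index(s, cl)
        if (j == 0) break
        out = out substr(s, 1, j - 1)
        s = substr(s, j + length(cl))
      }
      return out
    }
    {
      d = grab($0, "[-", "-]")
      a = grab($0, "{+", "+}")
      if (d != "") print l1 ": " d
      if (a != "") print l2 ": " a
    }'
}

# Feed it the three wdiff lines from the sample run.
printf '%s\n' 'orange' '12345678[-9-]{+0+}x[-a-]{+b+}' '[-X-]{+Y+}-klystron' |
  wdiff_extract data1 data2
```

Run on those three lines, this prints the same four labelled lines as the "extracted diff with labels" section (data1: 9a, data2: 0b, data1: X, data2: Y); identical lines such as "orange" produce nothing.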

The docdiff utility is written in ruby, is available in Debian-based GNU/Linux repositories, and can also be found at SourceForge.net (project page "DocDiff: Compare text word by word").

See man pages for details.

Best wishes ... cheers, drl (125)

---------- Post updated at 08:52 ---------- Previous update was at 08:10 ----------

Hi.

An all-perl solution:
Code:
#!/usr/bin/env perl

# @(#) p2	Demonstrate character differences in same-length lines.

use warnings;
use strict;

my (
  $f1, $f2, $file1, $file2, $i,       @a, @b,
  $s1, $s2, $t1,    $t2,    $changed, $debug
);

$f1 = shift || die " Missing first file.\n";
$f2 = shift || die " Missing second file.\n";

$debug = 1;
$debug = 0;

open( $file1, "<", $f1 ) || die " Cannot open file $f1\n";
open( $file2, "<", $f2 ) || die " Cannot open file $f2\n";
while ( $t1 = <$file1> ) {
  chomp($t1);
  @a = split "", $t1;
  $t2 = <$file2>;
  last unless defined $t2;    # stop if the second file has fewer lines
  chomp($t2);
  @b = split "", $t2;
  print "file1,2 = ", join "", @a, " ", join "", @b, "\n" if $debug;
  $changed = 0;
  $s1 = $s2 = "";

  for ( $i = 0; $i <= $#a; $i++ ) {
    if ( $a[$i] ne $b[$i] ) {
      $s1 = "$f1: " if not $changed;
      $s2 = "$f2: " if not $changed;
      $s1 .= $a[$i];
      $s2 .= $b[$i];
      $changed++;
    }
  }
  print "$s1\n" if $changed;
  print "$s2\n" if $changed;
}

exit(0);

producing, using the data files noted above:
Code:
% ./p2 data1 data2
data1: 9a
data2: 0b
data1: X
data2: Y

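With fixed-width data it is often useful to know where the differing characters sit, not just what they are. The following is a sketch along the same lines as the perl script, written with awk for brevity. The function name coldiff and the output layout are assumptions of this example, not something from the scripts above. Unlike the perl version, it treats a missing character (lines of unequal length, or a shorter second file) as a difference; extra lines at the end of the second file are not reported.

```shell
#!/bin/sh
# Sketch: report the 1-based columns at which corresponding lines differ.
# coldiff is a hypothetical helper; usage: coldiff file1 file2
coldiff() {
  awk -v f2="$2" '
    {
      a = $0
      # Read the matching line of the second file; use "" if it ran out.
      if ((getline b < f2) <= 0) b = ""
      n = (length(a) > length(b)) ? length(a) : length(b)
      cols = ""
      for (i = 1; i <= n; i++)
        if (substr(a, i, 1) != substr(b, i, 1))
          cols = cols (cols == "" ? "" : ",") i
      if (cols != "") print "line " NR ": columns " cols
    }' "$1"
}

# Demo with the same sample data as above, written to a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' orange 123456789xa X-klystron > "$tmp/data1"
printf '%s\n' orange 123456780xb Y-klystron > "$tmp/data2"
coldiff "$tmp/data1" "$tmp/data2"
rm -rf "$tmp"
```

For the sample files this reports "line 2: columns 9,11" and "line 3: columns 1", which lines up with the 9a/0b and X/Y differences shown above.
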
Best wishes ... cheers, drl
 
