To get Non matching records for current day: Post 302880121 by drl on Tuesday 17th of December 2013 01:18:43 PM
Hi.

Here is a demonstration of the failure and a work-around using a perl version of comm:
Code:
#!/usr/bin/env bash

# @(#) s1       Demonstrate comparison of comm, perl comm on Solaris.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
# "$@" keeps file names with spaces intact and still reads stdin when no argument is given.
untab() { perl -wp -e 's/\t/        /g' "$@" ; }
flip() { perl -wp -e 's/\r//g' "$@" ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C pll

rm -f f[12]
pl " Input data files data[12]"
file data[12]
wc data[12]

pl " Data file f[12] after conversion:"
cp data1 f1
cp data2 f2

flip f1 > x1 ; mv x1 f1
flip f2 > x1 ; mv x1 f2

file f[12]
wc f[12]

pl " Script ./comm.pl:"
what ./comm.pl

pl " Results, perl comm:"
./comm.pl -13 <(sort f1) <(sort f2) > f4
wc f4
file f4
untab f4 | pll 78

pl " Results, standard comm:"
comm -13 <(sort f1) <(sort f2) > f3
wc f3
file f3
untab f3 | pll 78

pl " Short lines in small files, result from perl comm:"
head data[34]
pe
./comm.pl -13 <( sort data3 ) <( sort data4 )

exit 0

producing:
Code:
$ ./s1

Environment: LC_ALL = POSIX, LANG = POSIX
(Versions displayed with local utility "version")
OS, ker|rel, machine: SunOS, 5.10, i86pc
Distribution        : Solaris 10 10/08 s10x_u6wos_07b X86
bash GNU bash 3.00.16
pll (local) 1.19

-----
 Input data files data[12]
data1:          ascii text
data2:          ascii text
       2    3760   74398 data1
       2    2918   58115 data2
       4    6678  132513 total

-----
 Data file f[12] after conversion:
f1:             ascii text
f2:             ascii text
       2    3760   74396 f1
       2    2918   58113 f2
       4    6678  132509 total

-----
 Script ./comm.pl:
comm.pl Compare two sorted files line by line, perl.

-----
 Results, perl comm:
       1      51     305 f4
f4:             ascii text
 (Longest line: 640; fit into lines of length 78)
         1         2         3     ...     61        62        63        
12345678901234567890123456789012345...45678901234567890123456789012345678
        246        martin        Pa...NULL        NULL        6        NU

-----
 Results, standard comm:
      15    1485   29004 f3
f3:             ascii text
 (Longest line: 2364; fit into lines of length 78)
         1         2         3     ... 233        234        235        2
12345678901234567890123456789012345...89012345678901234567890123456789012
243        Williamss        Serena ...</AcctId>\n    <EventId>\n      <Va
>300852</Value>\n      <Modified>fa...\n    <Val3>285963</Val3>\n    <Whe
009-04-27T09:57:42Z</When>\n    <Wh...eue_item>\n  <queue_item xmlns="urn
j.api.facebook.com">\n    <AcctId>2...ified>false</Modified>\n    </Event
\n    <Msg>Goedendag, op 18 maart j...pe>\n    <Val1>3</Val1>\n    <Val2>
Val2>\n    <Val3>285661</Val3>\n   ...<queue_item xmlns="urn:obj.api.face
k.com">\n    <AcctId>243</AcctId>\n...jn facebook scherp abonnement verle
. Mij...</Msg>\n    <Type>4</Type>\..."urn:obj.api.facebook.com">\n    <A
Id>243</AcctId>\n    <EventId>\n   ...ype>\n    <Val1>1</Val1>\n    <Val2
</Val2>\n    <Val3>287325</Val3>\n ...:59Z</When>\n    <WhenLT>2009-04-28
:47:59+02:00</WhenLT>\n    <Modifie.../Msg>\n    <Type>4</Type>\n    <Val
</Val1>\n    <Val2>88</Val2>\n    <...d>\n      <Value>310313</Value>\n  
 <Modified>false</Modified>\n    </...<Val1>1</Val1>\n    <Val2>131</Val2
    <Val3>286913</Val3>\n    <When>...   <EventId>\n      <Value>312158</
246        martin        Paul      ...NULL        NULL        6        NU

-----
 Short lines in small files, result from perl comm:
==> data3 <==
1 a
2 b

==> data4 <==
2 b
1 c

        1 c

Some comments. The sort looks OK; possibly sort is managing long lines on its own. The trouble is with comm, which appears to mangle long lines. The perl version of comm can handle long lines, at the added cost of the overhead of an interpreted language.

This output, while busy, shows the input files, then the same files after a perl script in a shell function, flip, has removed the carriage returns. The Solaris version of file is not as useful as the Linux version, just showing ascii as opposed to:
Code:
data1: UTF-8 Unicode text, with very long lines, with CRLF line terminators
data2: ASCII text, with very long lines, with CRLF line terminators

for example.
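Since file on Solaris does not flag the CRLF terminators, a direct check can stand in for it. The following is only a minimal sketch, assuming bash; count_cr_lines and strip_cr are hypothetical helper names that are not part of the script above, and strip_cr uses tr instead of the perl-based flip:

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the original script).

# Print how many lines of a file contain a carriage return.
count_cr_lines() {
  grep -c $'\r' "$1"
}

# Write the file to stdout with every carriage return removed,
# the same effect as the perl substitution s/\r//g in flip.
strip_cr() {
  tr -d '\r' < "$1"
}
```

Something like `count_cr_lines data1` before and after the conversion would show whether the step did anything. The wc figures in the output above tell the same story: data1 goes from 74398 to 74396 bytes over 2 lines, which is consistent with one carriage return removed per line.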

The converted files are then sorted and fed into comm and comm.pl. The result is that comm appears to split long lines into chunks, whereas comm.pl handles them without such a flaw.
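One way to notice this kind of splitting without reading the output by eye is to compare line counts: a comm -13 result can only contain lines taken from the second input, so it should never have more lines than that input. This is a rough sketch, assuming bash; max_line_len and looks_unsplit are hypothetical names, not part of the script above:

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not from the original script).

# Print the length, in characters, of the longest line in a file.
max_line_len() {
  local line max=0
  while IFS= read -r line || [ -n "$line" ]; do
    if [ "${#line}" -gt "$max" ]; then
      max=${#line}
    fi
  done < "$1"
  printf '%s\n' "$max"
}

# Succeed if the result file ($2) has no more lines than the
# second input file ($1); fail if it has more, which suggests
# that long lines were broken into pieces.
looks_unsplit() {
  local in_lines out_lines
  in_lines=$(( $(wc -l < "$1") ))    # arithmetic strips wc padding
  out_lines=$(( $(wc -l < "$2") ))
  [ "$out_lines" -le "$in_lines" ]
}
```

With the files from the run above, `looks_unsplit f2 f3` would fail, because f2 has 2 lines while f3 has 15, whereas `looks_unsplit f2 f4` would succeed with f4 at 1 line.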

The long lines are presented in an abbreviated style by a local code, pll, that we use here. The TABS were converted to runs of blanks by another perl code in function untab.

The final run just shows that comm.pl -13 works as one expects comm -13 to work.

You can find comm.pl at http://cpansearch.perl.org/src/CWEST.../comm/comm.mjd along with many other perl versions of common *nix commands. There are also many GNU-style commands in /usr/sfw/bin/ on Solaris systems, but comm does not appear to be among them.

Best wishes ... cheers, drl

PS Just so that you can see the entire sample input files, here is the result of the display of converted files f1 and f2 above:
Code:
$ untab f1 | pll 78 
 (Longest line: 58137; fit into lines of length 78)
         1         2         3     ...0        5811        5812        58
12345678901234567890123456789012345...12345678901234567890123456789012345
242        Mandella        Martina ...ent_queue_arr>        213        NU
243        Williamss        Serena ...ent_queue_arr>        216        NU

$ untab f2 | pll 78 
 (Longest line: 58137; fit into lines of length 78)
         1         2         3     ...0        5811        5812        58
12345678901234567890123456789012345...12345678901234567890123456789012345
243        Williamss        Serena ...ent_queue_arr>        216        NU
246        martin        Paul      ...NULL        NULL        6        NU


Last edited by drl; 12-18-2013 at 07:14 AM.. Reason: Minor typo.
 
