How to compare 1, 1.0 and 1.00 ?


 
# 1  
Old 01-22-2016

Hi,

I have a file containing a list of codes, followed by a description, such as below:

Code:
008.0,ESCHERICHIA COLI
008.1,ARIZONA
008.2,AEROBACTER AEROGENES
008.6,ENTERITIS DUE TO SPECIFIED VIRUS
008.8,OTHER ORGANISM, NOT ELSEWHERE CLASSIFIED
008,INTESTINAL INFECTIONS DUE TO OTHER ORGANISMS

I have a second list of codes such as:

Code:
008.0,ESCHERICHIA COLI
008.00,OTHER SPECIFIED BACTERIA
008.1,ARIZONA
008.2,AEROBACTER AEROGENES
008.6,ENTERITIS DUE TO SPECIFIED VIRUS
008.8,OTHER ORGANISM, NOT ELSEWHERE CLASSIFIED
008,INTESTINAL INFECTIONS DUE TO OTHER ORGANISMS

I want to compare these lists for differences and make a file listing only the codes present in the second list but not in the first.

To do this I have:

Code:
FILE1=Codes1_Final.txt
FILE2=Codes2_Final.txt

# make a copy of the file, with only the first column, this will be used for comparison
awk -F, '{print $1}' $FILE1 > col_F1.txt
awk -F, '{print $1}' $FILE2 > col_F2.txt

# compare - lines existing in FILE2 that are not present in FILE1
grep -Fxv -f col_F1.txt col_F2.txt > tmp.txt

rm -f Codes_in_"$FILE2"_Not_In_"$FILE1"

# output codes and descriptions
NUM_LINES=$(awk 'END {print NR}' tmp.txt) 

for ((i=1; i<=$NUM_LINES; i++)) do

CODE=$(awk -v LINE="$i" 'NR==LINE {print $1}' tmp.txt) 

awk -F, -v CODE="$CODE" '{if($1 == CODE) print $0}' $FILE2 >> Codes_in_"$FILE2"_Not_In_"$FILE1"

done

I realise that this is a bit long-winded (I'm a self-taught, unskilled programmer), but it works. The problem I am having is that awk treats 008, 008.0, and 008.00 as the same value when doing the comparison, because it compares them numerically. This creates errors in my output. I think I need to do the comparison as a string, but I can't work out how to do it.

How can I solve this?
# 2  
Old 01-22-2016
I think this is what you want - much shorter version:
Code:
awk -F, ' FILENAME=="Codes1_Final.txt" {arr[$1]++}
          FILENAME=="Codes2_Final.txt" {if( !($1 in arr) ){ print $0} }' Codes1_Final.txt Codes2_Final.txt

awk has associative arrays and you should consider using them for this kind of data problem.
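
A note on the membership test, offered as a hedged sketch rather than a definitive version: under the POSIX awk precedence rules, unary `!` binds more tightly than `in`, so a negated lookup is safest written with parentheses, as `!($1 in arr)`. The temporary file names and sample rows below are made up for illustration; `ARGV[1]` and `ARGV[2]` stand in for the two hard-coded file names.

```shell
# Hypothetical sample data; the real files would be Codes1_Final.txt and Codes2_Final.txt.
dir=$(mktemp -d)
printf '%s\n' '008.0,ESCHERICHIA COLI' '008,INTESTINAL INFECTIONS' > "$dir/first.csv"
printf '%s\n' '008.0,ESCHERICHIA COLI' '008.00,OTHER SPECIFIED BACTERIA' \
              '008,INTESTINAL INFECTIONS' > "$dir/second.csv"

# Remember every code from the first file as an array key, then print the
# rows of the second file whose code is not among those keys.
missing=$(awk -F, '
    FILENAME == ARGV[1] { arr[$1]++ }
    FILENAME == ARGV[2] { if (!($1 in arr)) print $0 }
' "$dir/first.csv" "$dir/second.csv")

printf '%s\n' "$missing"
rm -r "$dir"
```

With this sample data only the 008.00 row should be reported, because 008 and 008.0 both appear as keys from the first file.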
# 3  
Old 01-22-2016
Would
Code:
awk 'NR==FNR {T[$1]; next} !($1 in T)' file1 file2
008.00,OTHER SPECIFIED BACTERIA

do?

---------- Post updated at 19:27 ---------- Previous update was at 19:22 ----------

Or, even shorter:
Code:
awk 'NR==FNR {T[$1]} !($1 in T)' file1 file2
008.00,OTHER SPECIFIED BACTERIA

In awk, arrays' indexes are of string type, so it will be a string comparison.
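
To make the difference concrete, here is a small sketch (the variable names are only illustrative). It contrasts `==` on two number-like fields with the same test after an empty string has been appended to each side, and then counts how many distinct array keys the three spellings of the code produce.

```shell
# Fields that look like numbers are compared numerically by ==,
# so 008 and 008.00 count as equal here.
numeric=$(printf '008 008.00\n' |
    awk '{ if ($1 == $2) print "equal"; else print "different" }')

# Concatenating an empty string turns each side into a string,
# which forces a character-by-character comparison.
string=$(printf '008 008.00\n' |
    awk '{ if (($1 "") == ($2 "")) print "equal"; else print "different" }')

# Array subscripts are strings, so each spelling becomes its own key.
keys=$(printf '008\n008.0\n008.00\n' |
    awk '{ seen[$1] } END { n = 0; for (k in seen) n++; print n }')

printf 'numeric: %s\nstring: %s\ndistinct keys: %s\n' "$numeric" "$string" "$keys"
```

The same concatenation trick should also work in the original loop, for example by writing the test as `($1 "") == (CODE "")`, although the array version avoids the loop altogether.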

Last edited by RudiC; 01-22-2016 at 02:44 PM..
# 4  
Old 01-22-2016
So you are not confused: the two examples, RudiC's awk example and the earlier post, are really the same thing. awk really has some nifty features once you learn a little about it.
RudiC's example is the more advanced version, and probably how it should be written once you know awk. But if you want somebody to understand what is going on, the earlier one might be better.

-------------------- edit
Actually, come to think of it, to a beginner RudiC's code might look like some kind of code obfuscation. It is perfectly good, BTW.

-- another edit
RudiC's code needs a field delimiter to work, I think. The comma is not a default field separator, so the command needs -F, added.

Last edited by jim mcnamara; 01-22-2016 at 02:35 PM..
# 5  
Old 01-22-2016
Thanks, yes, overlooked that! Add the -F, option right after awk.
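
As a rough illustration of why the separator matters, the sketch below runs the same one-liner with and without `-F,` on made-up data in which one description differs between the two files. The temporary files and their contents are assumptions for the example only.

```shell
# Hypothetical sample files; note the different description for code 008.0.
dir=$(mktemp -d)
printf '%s\n' '008.0,ESCHERICHIA COLI' '008.1,ARIZONA' > "$dir/file1"
printf '%s\n' '008.0,E. COLI' '008.00,OTHER SPECIFIED BACTERIA' '008.1,ARIZONA' > "$dir/file2"

# With -F, the key is the code alone, so only the new code is reported.
with_comma=$(awk -F, 'NR==FNR {T[$1]; next} !($1 in T)' "$dir/file1" "$dir/file2")

# Without -F, awk splits on whitespace, so the key also contains the first
# word of the description and a changed description looks like a new code.
without_comma=$(awk 'NR==FNR {T[$1]; next} !($1 in T)' "$dir/file1" "$dir/file2")

printf 'with -F,:\n%s\n\nwithout -F,:\n%s\n' "$with_comma" "$without_comma"
rm -r "$dir"
```

On the lists in the first post the two variants happen to give the same answer, because the descriptions are identical in both files.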

And yes, it is very terse and not of very high educational value. Feel free to ask about the parts you don't understand.
# 6  
Old 01-22-2016
Thanks for the replies.

Jim,
I don't understand a few parts of your code:
Code:
awk -F, ' FILENAME=="Codes1_Final.txt" {arr[$1]++}
          FILENAME=="Codes2_Final.txt" {if( !($1 in arr) ){ print $0} }' Codes1_Final.txt Codes2_Final.txt

I understand that the first line is reading the .csv file into an array.

I don't fully understand the second line. I understand the if and print statements but not the rest. Is the first part:
Code:
FILENAME=="Codes2_Final.txt"

using the Codes2_Final.txt as the comparison?

Also what does the following section do? Have the files to use not already been specified to awk?
Code:
Codes1_Final.txt Codes2_Final.txt

Would I just add
Code:
> out.txt

on the end to write the results to a file?
# 7  
Old 01-22-2016
Quote:
Originally Posted by carlr
Also what does the following section do? Have the files to use not already been specified to awk?
Actually, no, and you have slightly misunderstood the purpose of both lines.

If you specify several files to awk like this:

Code:
awk '<some awk-program>' file1 file2

awk will first read file1, then file2. Bear that in mind for a moment.

You can specify a logical comparison and, if it evaluates to TRUE, have awk execute a code block (specified in curly braces). In "normal" programming languages you would use an "if" block for the same purpose:

Code:
# shell programming:

if [ a = b ]
then
    cmd1
    cmd2
    ...
fi

# in awk

a == b      {cmd1; cmd2; ... }

So, the purpose of the parts:
Code:
FILENAME=="Codes1_Final.txt" {....}
FILENAME=="Codes2_Final.txt" {....}

is in fact this: IF awk is still processing the Codes1 file, run only the first block and skip the second. IF, on the contrary, the first file is finished and awk is working on the second, skip the first block and execute the second. Notice that "==" compares two values; it does not assign a value to a variable:

Code:
x==y    # is TRUE or FALSE, depending on x and y being equal
x=y     # assigns to x the value of y
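
The file-by-file behaviour described above can be traced with a small sketch. The temporary files and the block labels are invented for the example, and `ARGV[1]` and `ARGV[2]` take the place of the literal file names.

```shell
# Two tiny hypothetical input files.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/file1"
printf 'c\n' > "$dir/file2"

# Each pattern is tested for every input line; a block runs only when
# its pattern is true for the file awk is currently reading.
trace=$(awk '
    FILENAME == ARGV[1] { print "block1 " $0 }
    FILENAME == ARGV[2] { print "block2 " $0 }
' "$dir/file1" "$dir/file2")

printf '%s\n' "$trace"
rm -r "$dir"
```

The trace shows the lines of the first file going through the first block only, followed by the line of the second file going through the second block only.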


Quote:
Originally Posted by carlr
Would I just add
Code:
> out.txt

on the end to write the results to a file?
Yes, exactly.

I hope this helps.

bakunin