The UNIX and Linux Forums  


#1
04-05-2008
Movomito
Registered User
Join Date: Apr 2008
Posts: 27
Bash uniq, diff, and others: I can't figure it out

First off, thank you for any help.
Here is the problem. I have two text files in the same format. I created the first using an ls -d command and then, with the help of the forums, ran awk on it, resulting in the following output.

W00CHZ0103345-I1CZ44
W00E6S1016722-I01JW159
W00E6S1016722-I01JW160
W00E6S1016722-I01JW161
W00EGS10125151-I01JW176
W00EGS10125151-I01JW177
W00EGS10125151-I01JW178
W00EGS10125151-I01JW179
W00EGS10125151-I01JW180
W00EGS1012593-I00EGS1017114

I have a second text file whose format I was trying to match, and managed to. Here it is.

W00CHZ0103345-I1CZ44
W00EGS1016051-I00EGS1016053
W00EGS1016054-I00EGS1016056
W00EGS1016057-I00EGS1016059
W00EGS1016060-I00EGS1016062
W00EGS1016181-I1PD10388
W00EGS1016199-I00EGS1016201
W00EGS1016202-I1GS65488
W00EGS1016210-I00EGS1016212
W00EGS1016213-I00EGS1016216

These lists are nearly 10,000 lines long, and what I need to do is compare them and see which lines occur only in the second list. I have tried sorting, uniq, diff, and everything else I can think of, but I was unable to generate a list of the lines that appear only in txt2. That said, if there are lines that appear in the first file and not the second, I would want to know that too, but I would need to know which list each line came from.

Please help if you can.

Thank you
#2
04-05-2008
era
Forum Advisor
Herder of Useless Cats (On Sabbatical)
Join Date: Mar 2008
Location: /there/is/only/bin/sh
Posts: 3,652
Look at the comm utility.
#3
04-05-2008
cfajohnson
Forum Advisor
Shell programmer, author
Join Date: Mar 2007
Location: Toronto, Canada
Posts: 2,362
Quote:
Originally Posted by Movomito
That said, if there are lines that appear in the first file and not the second, I would want to know that too, but I would need to know which list each line came from.

Use the comm command.
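
As a rough sketch of how comm answers the "which list did it come from" question: with no options it prints three tab-separated columns, and the column tells you the source. The scratch directory and file names below are made up for illustration, and the data is just a few lines borrowed from the first post.

```shell
# Hypothetical scratch files holding a few of the sample lines.
workdir=$(mktemp -d)
printf '%s\n' W00E6S1016722-I01JW159 W00CHZ0103345-I1CZ44 > "$workdir/list1.txt"
printf '%s\n' W00EGS1016051-I00EGS1016053 W00CHZ0103345-I1CZ44 > "$workdir/list2.txt"

# comm expects both inputs to be sorted in the same collating order.
LC_ALL=C sort "$workdir/list1.txt" > "$workdir/sorted1.txt"
LC_ALL=C sort "$workdir/list2.txt" > "$workdir/sorted2.txt"

# No leading tab: line is only in the first file.
# One leading tab: line is only in the second file.
# Two leading tabs: line is in both files.
three_columns=$(LC_ALL=C comm "$workdir/sorted1.txt" "$workdir/sorted2.txt")
printf '%s\n' "$three_columns"

rm -rf "$workdir"
```

The options -1, -2 and -3 each suppress the corresponding column, so combining two of them leaves exactly one column.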
#4
04-06-2008
Movomito
Registered User
Join Date: Apr 2008
Posts: 27
I looked at comm

I looked at comm and I can't seem to get my desired output.
Here is what I am trying; maybe someone can help.

First I run sort text1.txt > newtext1.txt

Then I do the same with text2.txt

Then I try this:
comm -3 newtext1.txt newtxt2.txt

I have tried variations of comm -1, comm -2, -1 -2, -3, pretty much everything I can think of, but I am still getting output that I can find in both files. I want just the lines that occur only in newtext1.txt
#5
04-06-2008
era
Forum Advisor
Herder of Useless Cats (On Sabbatical)
Join Date: Mar 2008
Location: /there/is/only/bin/sh
Posts: 3,652

Code:
comm -23 newtext1.txt newtxt2.txt
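
A slightly fuller sketch of the same idea, in case it helps: -23 keeps only the lines unique to the first file, and -13 keeps only the lines unique to the second, so running both gives one result per list. The scratch file names and the variables only_in_first and only_in_second are assumptions for the example; the data is a handful of lines from the first post, deliberately left unsorted before the sort step.

```shell
# Hypothetical scratch copies of the two lists, unsorted on purpose.
workdir=$(mktemp -d)
printf '%s\n' W00E6S1016722-I01JW160 W00CHZ0103345-I1CZ44 \
    W00E6S1016722-I01JW159 > "$workdir/text1.txt"
printf '%s\n' W00EGS1016051-I00EGS1016053 W00CHZ0103345-I1CZ44 > "$workdir/text2.txt"

# Sort both files under the same locale before handing them to comm.
LC_ALL=C sort "$workdir/text1.txt" > "$workdir/newtext1.txt"
LC_ALL=C sort "$workdir/text2.txt" > "$workdir/newtext2.txt"

# -23 suppresses columns 2 and 3: lines found only in the first file.
only_in_first=$(LC_ALL=C comm -23 "$workdir/newtext1.txt" "$workdir/newtext2.txt")
# -13 suppresses columns 1 and 3: lines found only in the second file.
only_in_second=$(LC_ALL=C comm -13 "$workdir/newtext1.txt" "$workdir/newtext2.txt")

printf 'Only in first list:\n%s\n' "$only_in_first"
printf 'Only in second list:\n%s\n' "$only_in_second"

rm -rf "$workdir"
```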

#6
04-06-2008
Movomito
Registered User
Join Date: Apr 2008
Posts: 27
comm -23 file1 file2 still not working

I tried using comm -23 file1 file2 per the last suggestion and I am still not getting the desired result. Using BBEdit I can do a line count, and there is a difference of approximately 800 lines between file1 and file2, so that is the output I am expecting. I have tried sorting the files before running comm, writing each sort to a new file, and then running comm on the new files. Two strange things are happening. First, after comparing the new files by eye, I can see that they are not sorted the same way. Second, after running comm -23 I get a return of one entry, which I can't find in either file. Any help is appreciated.
#7
04-06-2008
era
Forum Advisor
Herder of Useless Cats (On Sabbatical)
Join Date: Mar 2008
Location: /there/is/only/bin/sh
Posts: 3,652
Can you pinpoint the sort discrepancy, i.e. prune it down to a list of, say, three lines which sort differently? Do you have locales at play? (Still, comm and sort should obey the same locale; but if they don't, what you describe sort of makes sense.) See also the locale manual page. See if adding LC_ALL=C helps.


Code:
LC_ALL=C sort file1 >sorted1.lc_all=c
diff sorted1.lc_all=c newtext1.txt
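
One way to check the sort order directly, as a sketch under assumptions: sort -c exits with status 0 only when its input is already in order for the locale in effect, so running it with LC_ALL=C shows whether a file is in the order that comm will expect under that same setting. The scratch file names and the variables raw_status and sorted_status are made up for the example.

```shell
# Hypothetical scratch file with three sample lines in unsorted order.
workdir=$(mktemp -d)
printf '%s\n' W00EGS1012593-I00EGS1017114 W00CHZ0103345-I1CZ44 \
    W00EGS10125151-I01JW176 > "$workdir/text1.txt"

# Sort with the locale pinned to C (plain byte order).
LC_ALL=C sort "$workdir/text1.txt" > "$workdir/sorted1.txt"

# sort -c only checks the order; it writes no sorted output.
if LC_ALL=C sort -c "$workdir/text1.txt" 2>/dev/null; then
    raw_status=ok
else
    raw_status=unsorted
fi
if LC_ALL=C sort -c "$workdir/sorted1.txt" 2>/dev/null; then
    sorted_status=ok
else
    sorted_status=unsorted
fi

echo "text1.txt: $raw_status, sorted1.txt: $sorted_status"
first_sorted_line=$(head -n 1 "$workdir/sorted1.txt")

rm -rf "$workdir"
```

If both sorted files pass this check, running comm with the same LC_ALL=C prefix keeps sort and comm in agreement about the order.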

The UNIX and Linux Forums Content Copyright ©1993-2009. All Rights Reserved.