compare two files
I have file1 and file2:
file1:
11 xxx kksd ...
22 kkk kdsglg...
33 sss kdfjdksa...
44 kdsf dskjfkas ...
hh kdkf kdkkd..
jg dkf dfkdk ...
...
file2:
jg
22
hh
...
I need to check each line of file1. If field one is in file2, I keep the line; if not, the whole line is discarded. The result file will be:
jg dkf dfkdk ...
22 kkk kdsglg...
hh kdkf kdkkd..
...
Please tell me how I can do this. Thanks!
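A common way to do this kind of first-field lookup is a single awk pass over both files. The following is only a minimal sketch, assuming whitespace-separated fields and one key per line in file2; the function name `filter_by_key` is made up for illustration. Note that it prints the matches in file1 order, not in the file2 order shown in the expected result.

```shell
# filter_by_key KEYFILE DATAFILE  (hypothetical helper name)
# Prints the lines of DATAFILE whose first field appears in KEYFILE.
filter_by_key() {
    awk '
        # First file: remember every non-empty key, print nothing.
        FILENAME == ARGV[1] { if (NF) keep[$1]; next }
        # Second file: print the line only if field one was remembered.
        $1 in keep
    ' "$1" "$2"
}
```

With the files from the question this would be called as `filter_by_key file2 file1 > result`.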
Quote:
...
<object
type="user"
id="000039BF228B"
encryptedPassword=""
maxConnections=""
>
<checkListAttributes>
</checkListAttributes>
</object>
...
<object
type="user"
id="0000E2801BFD"
encryptedPassword=""
>
<checkListAttributes>
</checkListAttributes>
</object>
...
and file2 is a list of ids, as:
...
000039BF228B
0000E2801BFD
...
I want to delete all the blocks whose id is not in file2, and keep those whose id is in file2. I think we can change RS (the record separator) to </object>, but I do not know how to do the whole job. Would you help again?
I think there is a misunderstanding: I only want to keep the blocks whose id has a match in the second file. If I have a block such as:
<object
type="user"
id="999999999999"
encryptedPassword=""
>
<checkListAttributes>
</checkListAttributes>
</object>
and 999999999999 is not in the second file, the whole block should be discarded. But after I run your code, it is still there. Any idea?
I said GNU awk. Are you using GNU awk?
It's hard to troubleshoot unless I can see the entire content of file1 and file2. Could you also post the output of these commands:
Code:
patt="$(printf "id=\"%s\"|" $(<file2))" ; echo "${patt%|}"
Code:
$ awk --version| head -2
GNU Awk 3.1.5
Copyright (C) 1989, 1991-2005 Free Software Foundation.
$ cat file1
<object
type="user"
id="0000E2801BFD"
encryptedPassword=""
>
<checkListAttributes>
</checkListAttributes>
</object>
<object
type="user"
id="999999999999"
encryptedPassword=""
>
<checkListAttributes>
</checkListAttributes>
</object>
$ cat file2
000039BF228B
0000E2801BFD
$ awk '$0 ~ patt{print $0RS}' RS="</object>" patt="${patt%|}" file1
<object
type="user"
id="0000E2801BFD"
encryptedPassword=""
>
<checkListAttributes>
</checkListAttributes>
</object>
Quote:
It's "ugly" and "buggy" code (think what happens if your file2 is big); I'm not able to write good code in 2 minutes. The first command generates your pattern list, joining the ids with "or" ("|"). The second tests every record in file1 (assuming RS="</object>") against it.
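To make the first step concrete, here is a small sketch of how the alternation pattern gets built. It swaps the thread's `$(<file2)` for `$(cat ...)` so it also runs in a plain POSIX sh, and the function name `build_pattern` is hypothetical.

```shell
# build_pattern IDFILE  (hypothetical helper name)
# Joins the ids in IDFILE into one alternation: id="A"|id="B"|...
build_pattern() {
    # printf reuses its format for every argument, so each id gets wrapped;
    # the unquoted $(cat ...) relies on ids containing no spaces or globs.
    patt=$(printf 'id="%s"|' $(cat "$1"))
    # Strip the single trailing "|" left by the last repetition.
    printf '%s\n' "${patt%|}"
}
```

For the two-line file2 shown in this thread, `build_pattern file2` prints `id="000039BF228B"|id="0000E2801BFD"`. If that output is empty or contains stray characters (for example carriage returns from a file edited on Windows), the awk match will not behave as expected.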
Quote:
Code:
patt="$(printf "id=\"%s\"|" $(<file2))"
Code:
awk '$0 ~ patt{print $0RS}' RS="</object>" patt="${patt%|}" file1