Use non-alphanumerics in join
Hi,
I have a problem joining two sorted files with "join".

file1.txt:

    Alnus|123
    ALO140102|234
    ALO 1401 02|345
    ALO-1401-02|456
    Alobar Holoprosencephalies|567

file2.txt:

    1|Alnus|
    1|ALO 1401 02|
    1|ALO-1401-02|
    1|Alobar Holoprosencephalies|

If I join the files as follows:

    join -i -t '|' -1 1 -2 2 file1.txt file2.txt

this doesn't work, seemingly because the join command ignores punctuation, i.e. it checks ALO140102 against file 2 and, when it doesn't find a match, it moves on to Alobar Holoprosencephalies. If ALO140102 IS present in file 2 then the match works fine. Therefore I need to get the join command to recognise non-alphanumerics. Any ideas?
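One quick check, offered as a sketch rather than a diagnosis: see whether file1.txt is in plain byte order, which is one ordering join may be expecting. The post doesn't say which locale the files were sorted under, so that part is an assumption; the scratch directory and the names demo_dir and order_status are only illustrative.

```shell
# Recreate file1.txt from the post in a scratch directory (illustrative path).
demo_dir=$(mktemp -d)
printf '%s\n' 'Alnus|123' 'ALO140102|234' 'ALO 1401 02|345' \
  'ALO-1401-02|456' 'Alobar Holoprosencephalies|567' > "$demo_dir/file1.txt"

# sort -c only checks the order, it does not rewrite the file.
# -f folds lower case to upper case, to mirror the -i given to join.
# LC_ALL=C forces byte-wise comparison, where ' ' and '-' sort before digits.
if LC_ALL=C sort -c -f -t '|' -k1,1 "$demo_dir/file1.txt" 2>/dev/null; then
  order_status="sorted in byte order"
else
  order_status="not sorted in byte order"
fi
echo "file1.txt is $order_status"
```

If the check fails, the file was probably sorted by a locale-aware sort that gives spaces and hyphens little or no weight, so the sort step and the join step may not agree on the order of the keys.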
Long-winded...
I've done it a long-winded way by replacing the punctuation with alphanumeric tags (e.g. REMOVE1), re-sorting the files and then doing the join. This works fine, as the tags are matched exactly whereas the punctuation was not. However, this seems a ridiculous way to do it; there must be a better one!
I think it may have to do with the way UNIX compares strings, which I think you can change with the LC_COLLATE variable, but I'm not sure.
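Following up on the LC_COLLATE idea, here is a minimal sketch, assuming the trouble comes from the files being sorted under a locale-aware collation (the post doesn't confirm this). It sorts both files and runs the join under the C locale, so every step compares keys byte by byte. LC_ALL is used rather than LC_COLLATE because LC_ALL, when set, takes precedence over LC_COLLATE. The scratch directory and the *.sorted and joined.txt names are only illustrative.

```shell
# Recreate the two files from the first post in a scratch directory.
work_dir=$(mktemp -d)
printf '%s\n' 'Alnus|123' 'ALO140102|234' 'ALO 1401 02|345' \
  'ALO-1401-02|456' 'Alobar Holoprosencephalies|567' > "$work_dir/file1.txt"
printf '%s\n' '1|Alnus|' '1|ALO 1401 02|' '1|ALO-1401-02|' \
  '1|Alobar Holoprosencephalies|' > "$work_dir/file2.txt"

# Sort each file on its join field in the C locale.
# -f folds case so the order matches the case-insensitive join -i below.
LC_ALL=C sort -f -t '|' -k1,1 "$work_dir/file1.txt" > "$work_dir/file1.sorted"
LC_ALL=C sort -f -t '|' -k2,2 "$work_dir/file2.txt" > "$work_dir/file2.sorted"

# Join under the same locale: field 1 of file 1 against field 2 of file 2.
LC_ALL=C join -i -t '|' -1 1 -2 2 \
  "$work_dir/file1.sorted" "$work_dir/file2.sorted" > "$work_dir/joined.txt"
cat "$work_dir/joined.txt"
```

With the sample data this should pair up the four keys that appear in both files, including the ones with spaces and hyphens, and leave out ALO140102, which is only in file 1. No tags are needed because the punctuation is compared as ordinary bytes.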