Join of files is incomplete?!
Hi folks,
I am using the join command to join two files on a common field as follows:

File1.txt
Adsorption|H01.181.529.047
Adult|M01.060.116
Children|M01.055

File2.txt
5|Adsorption|C0001674
7|Adult|C000001
6|Children|C00002

join -i -t "|" -a 2 -1 1 -2 2 File1.txt File2.txt

This works fine for some lines but not all. Adult is missed whatever I try to do, e.g. put to lower case etc. This is the output I get:

Adsorption|H01.181.529.047|5|C0001674
7|Adult|C000001
Children|M01.055|6|C00002
There's this from the 'join' manual at www.gnu.org:

'Either file1 or file2 (but not both) can be `-', meaning standard input. file1 and file2 should be already sorted in increasing textual order on the join fields, using the collating sequence specified by the LC_COLLATE locale...'

Another site mentions that:

'However, as a GNU extension, if the input has no unpairable lines the sort order can be any order that considers two fields to be equal if and only if the sort comparison described above considers them to be equal.'

Which suggests to me that experimenting with the LC_COLLATE environment variable may allow the command to work.
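
For anyone who wants to try that experiment, here is a minimal sketch, not a confirmed fix for the original poster's data. It assumes a POSIX shell with `mktemp`, `sort` and `join` available, and it forces the same collation (`LC_ALL=C`, plain byte order) on both the sort and the join so that the two agree on ordering. The scratch directory and the `.sorted` file names are made up for the illustration. `-i` is left out because the sample keys already agree in case; the GNU manual pairs `join -i` with input sorted by `sort -f`.

```shell
# Scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)

# Recreate the two sample files from the post.
printf '%s\n' 'Adsorption|H01.181.529.047' 'Adult|M01.060.116' 'Children|M01.055' > "$workdir/File1.txt"
printf '%s\n' '5|Adsorption|C0001674' '7|Adult|C000001' '6|Children|C00002' > "$workdir/File2.txt"

# Sort each file on its own join field, under one fixed collation.
LC_ALL=C sort -t '|' -k 1,1 "$workdir/File1.txt" > "$workdir/File1.sorted"
LC_ALL=C sort -t '|' -k 2,2 "$workdir/File2.txt" > "$workdir/File2.sorted"

# Join under the same collation that was used for sorting.
joined=$(LC_ALL=C join -t '|' -a 2 -1 1 -2 2 "$workdir/File1.sorted" "$workdir/File2.sorted")
printf '%s\n' "$joined"
```

If the real files still lose the Adult line after this, the next thing worth checking is stray whitespace or other invisible characters in the join field, since those make two keys that look identical compare as different.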
System - SunOS 5.9
I am using Unix join to join the following two files.

FileA
_______________
1,-1
3,-1
5,-1
49,-3
51,-1
52,-1
53,-1
54,-1
56,-2
57,-2
61,-1
62,-2
65,-1
66,-2
71,-1
72,-2
81,-3
82,-3
91,-4
99,-1
100,-5

FileB
________
1,2222
3,3222
5,2342
11,2418
15,1890
16,2445
20,2465
21,1889
30,1588
30,1888
31,2887
40,3423
45,4321
49,2345
51,5567
52,5210
53,4444
54,4567
56,1111
57,5678
61,6754
62,6742
65,1231
66,6765
71,1234
71,1991
72,7168
81,7777
82,8765
91,8766
99,9812
99,9998
100,8888
100,8981

First I sort them as:

sort -b -n -t ',' +0 FileA > A_sort
sort -b -n -t ',' +0 FileB > B_sort

Then I join them as:

join -t ',' -j1 1 -j2 1 -o 0 1.2 2.2 A_sort B_sort

and get:

1,2222,-1
3,3222,-1
5,2342,-1
51,5567,-1
52,5210,-1
53,4444,-1
54,4567,-1
56,1111,-2
57,5678,-2
61,6754,-1
62,6742,-2
65,1231,-1
66,6765,-2
71,1234,-1
71,1991,-1
72,7168,-2
81,7777,-3
82,8765,-3
91,8766,-4
99,9812,-1
99,9998,-1

I miss the following:

49,2345,-3
100,8888,-5
100,8981,-5

Why is this happening? Are they being internally treated as characters even though I specify -n in sort? What do I need to do?

By the way, both LC_COLLATE and LC_CTYPE are set to "". Should I set them to POSIX or C or something?

Many thanks in advance to all the Unix enthusiasts in this forum.
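
One likely cause, offered as a guess since nothing in the thread confirms it: join compares the join field as text, while `sort -n` produces numeric order. In numeric order 49 follows 5 and 100 follows 99, and neither pair is in increasing textual order, which matches the keys that go missing. Below is a minimal sketch, assuming a POSIX shell and a join that accepts the comma-separated `-o` list. It uses a shortened subset of the two files, sorts them textually (no `-n`) under `LC_ALL=C`, joins them, and only then puts the result back into numeric order. The field list `-o 0,2.2,1.2` is my choice to match the column order of the output quoted in the post, and the scratch directory is made up for the illustration.

```shell
# Scratch directory holding a shortened subset of the data from the post.
workdir=$(mktemp -d)
printf '%s\n' '1,-1' '5,-1' '49,-3' '99,-1' '100,-5' > "$workdir/FileA"
printf '%s\n' '1,2222' '5,2342' '11,2418' '49,2345' '99,9812' '99,9998' '100,8888' '100,8981' > "$workdir/FileB"

# Textual sort on field 1 (deliberately no -n), which is the order join expects.
LC_ALL=C sort -t ',' -k 1,1 "$workdir/FileA" > "$workdir/A_sort"
LC_ALL=C sort -t ',' -k 1,1 "$workdir/FileB" > "$workdir/B_sort"

# Join under the same collation, then restore numeric order for display only.
result=$(LC_ALL=C join -t ',' -1 1 -2 1 -o 0,2.2,1.2 "$workdir/A_sort" "$workdir/B_sort" |
    LC_ALL=C sort -t ',' -k 1,1n)
printf '%s\n' "$result"
```

The numeric sort is kept as a final display step, after join has finished, so it cannot disturb the ordering that join depends on.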