Joining Two Files Does not Work as Expected


 
# 1  
Old 08-23-2012

Hi,

I would like some help with a join/awk problem.

I am trying to use the join command to join two files, with no luck.
I need to append the second column of file2.txt to every line of file1.txt that has a matching first column.

It works OK until the matching column (1st column in both files) reaches 1000, then nothing.

Any help would be appreciated,
thanks
Sid


file1.txt


Code:
950 0.0 1612.0
950 212.0 1762.0
950 488.0 1912.0
950 772.0 2024.0
950 1032.0 2199.0
950 1308.0 2474.0
950 1548.0 2799.0
950 1776.0 3062.0
950 2028.0 3324.0
950 2320.0 3524.0
950 3000.0 4000.0
950 3500.0 4000.0
1000 0.0 1612.0
1000 165.0 1855.0
1000 288.0 1887.0
1000 496.0 1949.0
1000 724.0 2024.0
1000 896.0 2124.0
1000 1052.0 2274.0
1000 1320.0 2524.0
1000 1548.0 2880.0
1000 1684.0 3012.0
1000 1880.0 3112.0
1000 2264.0 3337.0
1000 3000.0 4000.0
1000 3500.0 4000.0
1050 0.0 1612.0
1050 152.0 1780.0
1050 248.0 1834.0
1050 359.0 1923.0
1050 488.0 1962.0
1050 652.0 2037.0
1050 808.0 2099.0
1050 948.0 2212.0
1050 1064.0 2324.0
1050 1312.0 2574.0
1050 1428.0 2712.0
1050 1556.0 2837.0
1050 1652.0 2924.0
1050 1752.0 3037.0
1050 1944.0 3187.0
1050 2260.0 3424.0
1050 3000.0 4000.0
1050 3500.0 4000.0
1100 0.0 1612.0
1100 95.0 1719.0
1100 184.0 1834.0
1100 324.0 1937.0
1100 512.0 1974.0
1100 756.0 2062.0
1100 1060.0 2224.0
1100 1292.0 2399.0
1100 1528.0 2624.0
1100 1672.0 2762.0
1100 1848.0 2962.0
1100 2204.0 3423.0
1100 3000.0 4000.0
1100 3500.0 4000.0
1150 0.0 1612.0
1150 139.0 1794.0
1150 244.0 1916.0
1150 333.0 1997.0
1150 460.0 2062.0
1150 608.0 2087.0
1150 736.0 2099.0
1150 868.0 2162.0
1150 1072.0 2299.0
1150 1296.0 2474.0
1150 1484.0 2699.0
1150 1664.0 2974.0
1150 1836.0 3137.0
1150 2120.0 3399.0
1150 3000.0 4000.0
1150 3500.0 4000.0


file2.txt


Code:
950 -163.34
951 -163.33
952 -163.31
953 -163.30
954 -163.29
955 -163.27
956 -163.26
957 -163.24
958 -163.23
959 -163.22
960 -163.20
961 -163.19
962 -163.17
963 -163.16
964 -163.14
965 -163.13
966 -163.12
967 -163.10
968 -163.09
969 -163.07
970 -163.06
971 -163.05
972 -163.03
973 -163.02
974 -163.00
975 -162.99
976 -162.98
977 -162.96
978 -162.95
979 -162.93
980 -162.92
981 -162.90
982 -162.89
983 -162.88
984 -162.86
985 -162.85
986 -162.83
987 -162.82
988 -162.81
989 -162.79
990 -162.78
991 -162.77
992 -162.75
993 -162.74
994 -162.73
995 -162.72
996 -162.70
997 -162.69
998 -162.68
999 -162.66
1000 -162.65
1001 -162.64
1002 -162.63
1003 -162.62
1004 -162.60
1005 -162.59
1006 -162.58
1007 -162.57
1008 -162.56
1009 -162.55
1010 -162.54
1011 -162.53
1012 -162.52
1013 -162.51
1014 -162.50
1015 -162.49
1016 -162.48
1017 -162.47
1018 -162.46
1019 -162.45
1020 -162.44
1021 -162.43
1022 -162.42
1023 -162.41
1024 -162.40
1025 -162.39
1026 -162.38
1027 -162.37
1028 -162.36
1029 -162.36
1030 -162.35
1031 -162.34
1032 -162.33
1033 -162.32
1034 -162.31
1035 -162.31
1036 -162.30
1037 -162.29
1038 -162.28
1039 -162.28
1040 -162.27
1041 -162.26
1042 -162.25
1043 -162.25
1044 -162.24
1045 -162.23
1046 -162.22
1047 -162.21
1048 -162.21
1049 -162.20
1050 -162.19
1051 -162.18
1052 -162.17
1053 -162.17
1054 -162.16
1055 -162.15
1056 -162.14
1057 -162.13
1058 -162.13
1059 -162.12
1060 -162.11
1061 -162.10
1062 -162.10
1063 -162.09
1064 -162.08
1065 -162.07
1066 -162.07
1067 -162.06
1068 -162.05
1069 -162.04
1070 -162.04
1071 -162.03
1072 -162.02
1073 -162.01
1074 -162.00
1075 -162.00
1076 -161.99
1077 -161.98
1078 -161.98
1079 -161.97
1080 -161.96
1081 -161.95
1082 -161.95
1083 -161.94
1084 -161.93
1085 -161.93
1086 -161.92
1087 -161.92
1088 -161.91
1089 -161.91
1090 -161.90
1091 -161.89
1092 -161.89
1093 -161.88
1094 -161.88
1095 -161.87
1096 -161.87
1097 -161.86
1098 -161.86
1099 -161.86
1100 -161.85
1101 -161.85
1102 -161.84
1103 -161.84
1104 -161.84
1105 -161.83
1106 -161.83
1107 -161.83
1108 -161.82
1109 -161.82
1110 -161.82
1111 -161.81
1112 -161.81
1113 -161.81
1114 -161.81
1115 -161.80
1116 -161.80
1117 -161.80
1118 -161.79
1119 -161.79
1120 -161.79
1121 -161.79
1122 -161.78
1123 -161.78
1124 -161.78
1125 -161.77
1126 -161.77
1127 -161.77
1128 -161.76
1129 -161.76
1130 -161.76
1131 -161.75
1132 -161.75
1133 -161.75
1134 -161.74
1135 -161.74
1136 -161.73
1137 -161.73
1138 -161.73
1139 -161.72
1140 -161.72
1141 -161.71
1142 -161.71
1143 -161.71
1144 -161.70
1145 -161.70
1146 -161.69
1147 -161.69
1148 -161.68
1149 -161.68
1150 -161.67
1151 -161.67
1152 -161.66
1153 -161.66
1154 -161.65
1155 -161.65
1156 -161.65
1157 -161.64
1158 -161.64
1159 -161.63
1160 -161.63
1161 -161.62
1162 -161.62
1163 -161.62
1164 -161.61
1165 -161.61
1166 -161.60
1167 -161.60
1168 -161.59
1169 -161.59
1170 -161.59
1171 -161.58
1172 -161.58
1173 -161.57
1174 -161.57
1175 -161.56
1176 -161.56
1177 -161.56
1178 -161.55
1179 -161.55
1180 -161.54
1181 -161.54
1182 -161.54
1183 -161.53
1184 -161.53
1185 -161.52
1186 -161.52
1187 -161.52
1188 -161.51
1189 -161.51
1190 -161.50
1191 -161.50
1192 -161.49
1193 -161.49
1194 -161.48
1195 -161.48
1196 -161.48
1197 -161.47
1198 -161.47
1199 -161.46
1200 -161.46


With join, I only get


Code:
950 0.0 1612.0 -163.34
950 212.0 1762.0 -163.34
950 488.0 1912.0 -163.34
950 772.0 2024.0 -163.34
950 1032.0 2199.0 -163.34
950 1308.0 2474.0 -163.34
950 1548.0 2799.0 -163.34
950 1776.0 3062.0 -163.34
950 2028.0 3324.0 -163.34
950 2320.0 3524.0 -163.34
950 3000.0 4000.0 -163.34
950 3500.0 4000.0 -163.34


I also need matches when the values of column 1 are higher (1000 and up).


using

Code:
awk 'NR==FNR{D[$1]=$0}; (NR!=FNR) && D[$1] && (!S[$1]++) { print D[$1], $0 }' file1.txt file2.txt

gives me

Code:
950 3500.0 4000.0 950 -163.34
1000 3500.0 4000.0 1000 -162.65
1050 3500.0 4000.0 1050 -162.19
1100 3500.0 4000.0 1100 -161.85
1150 3500.0 4000.0 1150 -161.67


but I need all the instances of
Code:
950
1000
1050..


etc.


thanks
Sid

Moderator's Comments:
Mod Comment edit by bakunin: first off, do not hijack other threads. You have your own problem - open your own thread! We happen to have spare unused threads in abundance and you can use one of them free of any extra charge.

Second: Please view this code tag video for how to use code tags when posting code and data.

I have moved this post to its own thread. Thanks for your consideration.

Last edited by bakunin; 08-23-2012 at 07:06 AM..
# 2  
Old 08-23-2012
Probably "awk" is the wrong command for this. Have a look at the "man" page of the "join" command.

I hope this helps.

bakunin
# 3  
Old 08-23-2012
Thanks, I have tried join, as mentioned above, but it stops at 1000, as soon as the width of the joining column changes from 3 characters to 4 characters. I wish join could do it. I could not get it done with join, and was told it cannot do it. I would be happy with any solution. Thanks.
# 4  
Old 08-23-2012
Attaching large files instead of posting more than a few sample lines would make the post much easier to read and follow.
Please post the join command you used.

Reading (and considering) the error message sometimes helps:
Code:
join: file2:51: is not sorted: 1000 -162.65
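
The message appears because join compares keys as strings, not as numbers, so it expects "1000" to come before "950". A quick way to see the difference, as a small sketch (LC_ALL=C is set here only to make the string ordering deterministic):

```shell
# String (lexical) order, which is what join expects: 1000, 950, 999
printf '950\n999\n1000\n' | LC_ALL=C sort

# Numeric order, which is how the posted files are sorted: 950, 999, 1000
printf '950\n999\n1000\n' | sort -n
```

Files that are sorted numerically therefore stop being "sorted" from join's point of view at the first 4-digit key.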

If you can afford it, add leading 0s to the entries below 1000, and it will work:
Code:
.
.
.
0950 2028.0 3324.0 -163.34
0950 2320.0 3524.0 -163.34
0950 3000.0 4000.0 -163.34
0950 3500.0 4000.0 -163.34
1000 0.0 1612.0 -162.65
1000 165.0 1855.0 -162.65
1000 288.0 1887.0 -162.65
1000 496.0 1949.0 -162.65
.
.
.

If not, add them before the join and remove them in the output file with sed, awk, etc.
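
A minimal sketch of that pad, join, strip sequence. The function names pad_key and join_padded are made up for illustration, and the padding width of 4 is an assumption that fits keys up to 9999:

```shell
# pad_key FILE: print FILE with column 1 zero-padded to 4 digits.
pad_key() {
    awk '{ $1 = sprintf("%04d", $1); print }' "$1"
}

# join_padded FILE1 FILE2: join on the padded key, then strip the padding.
join_padded() {
    tmp1=$(mktemp) && tmp2=$(mktemp) || return 1
    pad_key "$1" > "$tmp1"
    pad_key "$2" > "$tmp2"
    # "$1 + 0" turns 0950 back into 950 in the joined output.
    join "$tmp1" "$tmp2" | awk '{ $1 = $1 + 0; print }'
    rm -f "$tmp1" "$tmp2"
}
```

Usage would be something like `join_padded file1.txt file2.txt > joined.txt`. Note that awk rebuilds each line with single spaces between fields, which matches the posted data.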

Last edited by RudiC; 08-23-2012 at 07:32 AM..
# 5  
Old 08-23-2012
Sorry for the long inserts. I could not see the attachment button; I will look for it.
The command I tried was:
Code:
join file1.txt file2.txt

Also
Code:
join -1 1 -2 1 file1.txt file2.txt

I also tried a few other switches; none of them helped. Both my files start at 100, and join seems to handle them up to 1000.
Thanks
Sid


Moderator's Comments:
Mod Comment You were asked by bakunin to use code tags for your code and data. Please do so, thanks. There were also some other notes in bakunin's moderator note that you should obey in the future, thanks.

Last edited by zaxxon; 08-23-2012 at 07:41 AM.. Reason: code tags
# 6  
Old 08-23-2012
crosspost... see above
# 7  
Old 08-23-2012
Thanks RudiC,

I have tried the join command on sorted versions of my input files as well.
They are actually sorted: the first column always increases.
Padding with 0 was suggested to me; unfortunately I cannot find a way to do it.
I cannot do it manually, as I have hundreds of these pairs to join, and reading them into Excel is not a solution either.
I would need some batch command-line solution here.

Sorry everybody for not knowing the rules of posting and inserting; I will review them and use them as appropriate,
regards
Sid
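
One possible command-line approach that avoids join's sorting requirement altogether is an awk lookup table. This is a sketch, not something posted in the thread: it reads file2.txt first and stores its second column by key, then appends the stored value to every line of file1.txt with a matching key. The function name lookup_join is made up for illustration.

```shell
# lookup_join FILE1 FILE2: append column 2 of FILE2 to each matching line of FILE1.
lookup_join() {
    awk '
        NR == FNR { val[$1] = $2; next }   # first file read (FILE2): build the table
        $1 in val { print $0, val[$1] }    # second file read (FILE1): append the value
    ' "$2" "$1"
}
```

For a single pair this would be `lookup_join file1.txt file2.txt > joined.txt`; for hundreds of pairs the same call can be placed in a shell loop over whatever naming scheme the pairs follow. Unlike the awk command in the first post, this keeps every line of file1.txt instead of only the last one per key, because the table is built from file2.txt, where each key occurs once. It assumes file2.txt is not empty.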