File joining and sorting


 
# 1  
Old 06-23-2013

Hi,

I'm having some trouble joining these two files for some reason. Here is what they look like:

Code:
head *

Quote:
==> f1 <==
2L,1
2L,1.00000E+06
2L,1.00000E+07
2L,1E+06
2L,1E+06
2L,1E+06
2L,1E+06
2L,1E+06
2L,1E+07
2L,1E+07

==> f2 <==
10117182,1
10117309,1
10117431,1
10117467,1
10117536,1
10117554,1
10126359,1
10126386,1
10126486,1
10126597,1
I'm using:
Code:
join -a 1 -1 2 -2 1 f1 f2 -t, > joinfile.out

but unfortunately nothing happens. I also noticed that f1 doesn't sort correctly, and I'm not sure why, when I use:
Code:
cat f1 | sort -k 2 -t "," > f1.sort

Any ideas?
# 2  
Old 06-23-2013
For sorting, use the -g option (general numeric sort):
Code:
sort -t, -k2,2g f1

By the way, I don't understand how you want to join your files.
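A minimal sketch of why -g helps here, assuming a sort that supports -g (GNU sort does, and so do recent BSD versions). The sample values are made up for illustration. A plain sort compares the key as text, while -g parses it as a floating-point number, exponent included:

```shell
# Made-up sample data; only the second comma-separated field matters.
tmp=$(mktemp -d)
printf '%s\n' '2L,1E+07' '2L,3' '2L,1.00000E+06' '2L,20' > "$tmp/f1"

# Default sort: field 2 is compared as text, so "20" sorts before "3".
sorted_text=$(LC_ALL=C sort -t, -k2,2 "$tmp/f1")

# General numeric sort: field 2 is parsed as a floating-point number.
sorted_g=$(LC_ALL=C sort -t, -k2,2g "$tmp/f1")

printf '%s\n' "$sorted_g"
rm -rf "$tmp"
```

Note that the numeric order produced by -g is not the order that join expects, so a file sorted this way may need to be re-sorted as text before joining.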
# 3  
Old 06-23-2013
man join:
Quote:
NAME
join - join lines of two files on a common field
There are no common fields in your files, so it's no surprise you're "having some trouble joining these two files" . . .
# 4  
Old 06-23-2013
Hi Yoda,

Your suggestion for sorting worked. Thanks!

As far as the second point is concerned, I want to join column 2 from file f1 and column 1 from file f2.

When I run
Code:
join -a 1 -1 2 -2 1 f1.sort f2.sort -t, > f3

I get
Quote:
usage: join [-a fileno | -v fileno ] [-e string] [-1 field] [-2 field]
[-o list] [-t char] file1 file2
# 5  
Old 06-23-2013
Note how the file operands are the final arguments in the usage message.

Regards.
Alister
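To illustrate the point about argument order, here is a small sketch with made-up files (left and right are hypothetical names). All options, including -t, come first and the two file operands come last. Some join implementations may accept options after the operands, but the usage message above suggests this one does not:

```shell
# Made-up sample files, already sorted on the fields being joined.
tmp=$(mktemp -d)
printf '%s\n' 'a,1' 'b,2' > "$tmp/left"
printf '%s\n' '1,x' '2,y' > "$tmp/right"

# Options (-1, -2, -t) first, file operands last, as in the usage message.
joined=$(LC_ALL=C join -1 2 -2 1 -t, "$tmp/left" "$tmp/right")

printf '%s\n' "$joined"
rm -rf "$tmp"
```

By default the join field is printed first, followed by the remaining fields of each file.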
# 6  
Old 06-23-2013
Thank you, Alister. You pointed out the problem with the joining step.

However, even now that it's working, it's not producing the joined output I am looking for.

From f1.sort and f2.sort which look like this:
Quote:
==> f1.sort <==
2L,1
2L,2
2L,3
2L,4
2L,5
2L,6
2L,7
2L,8
2L,9
2L,10

==> f2.sort <==
10117182,1
10117309,1
10117431,1
10117467,1
10117536,1
10117554,1
10126359,1
10126386,1
10126486,1
10126597,1
The output should merge the two files on that shared column and show which lines carry a "1" tag, like this:

Quote:
===> f3 <===
2L,9
2L,10
...
...
2L,10117182,1
2L,10117183
...
...
2L,10117309,1
The command I'm using is
Code:
join -a 1 -1 2 -2 1 -t, f1.sort f2.sort > f3
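By default join prints the join field first, which does not match the layout shown in f3. A possible way to get that layout is the -o option, sketched below on made-up sample lines. With -a 1, unmatched lines get an empty last field, so the trailing comma is stripped with sed; that cleanup step is an assumption about the desired output, not something join does itself:

```shell
# Made-up sample data; the keys have equal length, so text order and
# numeric order coincide here.
tmp=$(mktemp -d)
printf '%s\n' '2L,10117182' '2L,10117183' '2L,10117309' > "$tmp/f1.sort"
printf '%s\n' '10117182,1' '10117309,1' > "$tmp/f2.sort"

# -o lists the output fields as FILE.FIELD; 2.2 is empty for unmatched lines.
joined=$(LC_ALL=C join -a 1 -1 2 -2 1 -t, -o 1.1,1.2,2.2 \
    "$tmp/f1.sort" "$tmp/f2.sort" | sed 's/,$//')

printf '%s\n' "$joined"
rm -rf "$tmp"
```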



I've also tried converting the two input files, f1.sort and f2.sort, from comma-delimited to tab-delimited. I'm on a Mac, where \t doesn't work with sed, so I've been typing a literal tab with Ctrl-V followed by Tab:
Code:
for i in *.sort ; do sed 's/,/cntrl v + tab/g' $i > $i.tab; done
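As an alternative to typing the tab by hand, a literal tab can be produced with printf and passed to sed, or the conversion can be left to tr, which understands \t itself. This is a sketch on a made-up two-line file; since join accepts -t, directly, the conversion may not be needed at all:

```shell
tmp=$(mktemp -d)
printf '%s\n' '2L,1' '2L,2' > "$tmp/f1.sort"

# printf emits a real tab character; command substitution keeps it.
tab=$(printf '\t')
sed "s/,/$tab/g" "$tmp/f1.sort" > "$tmp/f1.sort.tab"
from_sed=$(cat "$tmp/f1.sort.tab")

# tr interprets the \t escape on its own, so no workaround is needed.
from_tr=$(tr ',' '\t' < "$tmp/f1.sort")

printf '%s\n' "$from_sed"
rm -rf "$tmp"
```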

but the join command still isn't working... Do you think it's failing because of the way the files are sorted? I used Yoda's suggestion but join may not know I used -g with sort...

---------- Post updated at 07:21 PM ---------- Previous update was at 05:50 PM ----------

Okay, I figured it out. It was just a sorting issue.

But this now leads me to a strange problem. I had over 1,000 lines to join from f2 to f1, but only 15 joined. Why do you suppose this is happening? File f1 has ~70000000 lines and file f2 has ~1000 lines. I am not sure why only 15 lines match when all 1000 exist in file f1.
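One possible explanation, offered as a guess rather than a diagnosis: join compares the key fields as text and expects both inputs sorted in that same text order, whereas sort -g leaves f1 in numeric order. In addition, a key written as 1E+06 is not the same string as 1000000. The sketch below uses made-up sample data, rewrites the keys as plain integers with awk, sorts both files as text in the C locale, and then joins them:

```shell
# Made-up sample data mixing plain and scientific notation.
tmp=$(mktemp -d)
printf '%s\n' '2L,1E+07' '2L,9' '2L,1.00000E+06' '2L,10' > "$tmp/f1"
printf '%s\n' '10,1' '1000000,1' '10000000,1' > "$tmp/f2"

# Rewrite field 2 as a plain integer, then sort on it as text (no -g, no -n).
awk -F, -v OFS=, '{ $2 = sprintf("%.0f", $2); print }' "$tmp/f1" |
    LC_ALL=C sort -t, -k2,2 > "$tmp/f1.sort"
LC_ALL=C sort -t, -k1,1 "$tmp/f2" > "$tmp/f2.sort"

# Join in the same locale that was used for sorting.
joined=$(LC_ALL=C join -a 1 -1 2 -2 1 -t, -o 1.1,1.2,2.2 \
    "$tmp/f1.sort" "$tmp/f2.sort" | sed 's/,$//')

printf '%s\n' "$joined"
rm -rf "$tmp"
```

The output comes out in text order (10, 1000000, 10000000, 9), so a final numeric sort may be wanted afterwards.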
# 7  
Old 06-23-2013
I would suggest using awk instead. There are several examples of using awk to join 2 files in this forum.
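A minimal sketch of the awk approach, assuming the smaller file (f2 here) fits in memory; the sample lines are made up. The tags from f2 are loaded into an array keyed on its first field, then f1 is streamed and each line is printed with the tag appended when its second field is a known key. No sorting is required, but keys are compared as strings, so 1E+06 and 1000000 would still count as different keys:

```shell
# Made-up sample data; f1 is deliberately not in text order.
tmp=$(mktemp -d)
printf '%s\n' '2L,9' '2L,10117182' '2L,10117183' > "$tmp/f1"
printf '%s\n' '10117182,1' > "$tmp/f2"

joined=$(awk -F, -v OFS=, '
    NR == FNR { tag[$1] = $2; next }    # first file (f2): remember tag by key
    ($2 in tag) { print $1, $2, tag[$2]; next }   # f1 line with a match
    { print $1, $2 }                    # f1 line without a match
' "$tmp/f2" "$tmp/f1")

printf '%s\n' "$joined"
rm -rf "$tmp"
```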
 