Full Discussion: File joining and sorting
Forum: UNIX for Dummies Questions & Answers
Post 302825139 by verse123 on Sunday 23rd of June 2013 07:21:53 PM
Thank you, Allister. You pinpointed the problem with the joining step.

However, even now that it's working, it's not producing the joined output I am looking for.

The inputs, f1.sort and f2.sort, look like this:
Quote:
==> f1.sort <==
2L,1
2L,2
2L,3
2L,4
2L,5
2L,6
2L,7
2L,8
2L,9
2L,10

==> f2.sort <==
10117182,1
10117309,1
10117431,1
10117467,1
10117536,1
10117554,1
10126359,1
10126386,1
10126486,1
10126597,1
The output should merge the two files on their shared column and show which lines carry a "1" tag in the new output file, like this:

Quote:
===> f3 <===
2L,9
2L,10
...
...
2L,10117182,1
2L,10117183
...
...
2L,10117309,1
The command I'm using is:
Code:
join -a 1 -1 2 -2 1 -t, f1.sort f2.sort > f3



I've also tried converting the two input files, f1.sort and f2.sort, from comma-delimited to tab-delimited. I'm on a Mac, where \t doesn't work in sed, so I've been typing Ctrl-V followed by Tab to enter a literal tab:
Code:
# <Ctrl-V><Tab> stands for a literal tab typed as Ctrl-V, then Tab
for i in *.sort ; do sed 's/,/<Ctrl-V><Tab>/g' "$i" > "$i.tab"; done

But the join command still isn't working. Do you think it's failing because of the way the files are sorted? I used Yoda's suggestion, but join may not know that I sorted with -g.

---------- Post updated at 07:21 PM ---------- Previous update was at 05:50 PM ----------

Okay, I figured it out. It was just a sorting issue.
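
For reference, here is a minimal sketch of the sort-then-join sequence, assuming the comma-delimited layout shown above. The sample data and the names f1.txt and f2.txt are made up for illustration. join compares keys as strings, so both inputs need the same plain lexical ordering on the join field rather than the numeric ordering produced by sort -g; setting LC_ALL=C on every command keeps the collation consistent.

```shell
# Work in a scratch directory so nothing real is overwritten.
tmp=$(mktemp -d)

# Hypothetical sample data in the same layout as f1.sort and f2.sort.
printf '2L,%s\n' 8 9 10 11 12 > "$tmp/f1.txt"
printf '%s,1\n' 9 11 > "$tmp/f2.txt"

# Sort each file lexically on its own join field
# (field 2 of the first file, field 1 of the second).
LC_ALL=C sort -t, -k2,2 "$tmp/f1.txt" > "$tmp/f1.sort"
LC_ALL=C sort -t, -k1,1 "$tmp/f2.txt" > "$tmp/f2.sort"

# -a 1 keeps unpaired lines from the first file.
# By default join prints the join field first, then the remaining fields.
LC_ALL=C join -t, -a 1 -1 2 -2 1 "$tmp/f1.sort" "$tmp/f2.sort" > "$tmp/f3"
cat "$tmp/f3"
```

With this ordering the keys come out in string order (10, 11, 12, 8, 9), not numeric order. If numeric order is needed in the final file, it can be restored by sorting the output afterwards.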

But this now leads me to a strange problem. I had over 1,000 lines to join from f2 to f1, but only 15 joined. Why do you suppose this is happening? File f1 has ~70,000,000 lines and file f2 has ~1,000 lines. I'm not sure why only 15 lines match when all 1,000 exist in f1.
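
One way to narrow this down, offered as a sketch under assumptions rather than a diagnosis: an awk lookup does not depend on sort order at all, so if it finds many more matches than join does, the ordering of the inputs is still the likely culprit. The sketch also strips a trailing carriage return, another common reason that keys which look identical fail to compare equal. The sample data and the names a.txt, b.txt, and joined.txt are hypothetical.

```shell
tmp=$(mktemp -d)

# Hypothetical data: a.txt plays the role of f1 (key in field 2);
# b.txt plays the role of f2 (key in field 1) and has Windows line endings.
printf '2L,%s\n' 1 2 3 4 5 > "$tmp/a.txt"
printf '%s,1\r\n' 2 5 > "$tmp/b.txt"

# The small file is read first and held in memory; the big file is streamed.
awk -F, -v OFS=, '
    { sub(/\r$/, "") }                           # drop a trailing carriage return, if any
    NR == FNR { tag[$1] = $2; next }             # first file: remember key -> tag
    ($2 in tag) { print $1, $2, tag[$2]; next }  # match: append the tag
    { print }                                    # no match: pass the line through
' "$tmp/b.txt" "$tmp/a.txt" > "$tmp/joined.txt"

# Count how many lines picked up a tag (three fields instead of two).
matched=$(awk -F, 'NF == 3' "$tmp/joined.txt" | wc -l)
echo "matched lines: $matched"
```

Because only the smaller file is held in memory, this approach should stay practical even when the larger file has tens of millions of lines. It also happens to emit matched lines in the 2L,key,1 layout described earlier in the post.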
 

JOIN(1)                          User Commands                         JOIN(1)

NAME
       join - join lines of two files on a common field

SYNOPSIS
       join [OPTION]... FILE1 FILE2

DESCRIPTION
       For each pair of input lines with identical join fields, write a line
       to standard output. The default join field is the first, delimited by
       whitespace. When FILE1 or FILE2 (not both) is -, read standard input.

       -a FILENUM
              print unpairable lines coming from file FILENUM, where FILENUM
              is 1 or 2, corresponding to FILE1 or FILE2

       -e EMPTY
              replace missing input fields with EMPTY

       -i, --ignore-case
              ignore differences in case when comparing fields

       -j FIELD
              equivalent to `-1 FIELD -2 FIELD'

       -o FORMAT
              obey FORMAT while constructing output line

       -t CHAR
              use CHAR as input and output field separator

       -v FILENUM
              like -a FILENUM, but suppress joined output lines

       -1 FIELD
              join on this FIELD of file 1

       -2 FIELD
              join on this FIELD of file 2

       --check-order
              check that the input is correctly sorted, even if all input
              lines are pairable

       --nocheck-order
              do not check that the input is correctly sorted

       --help display this help and exit

       --version
              output version information and exit

       Unless -t CHAR is given, leading blanks separate fields and are
       ignored, else fields are separated by CHAR. Any FIELD is a field
       number counted from 1. FORMAT is one or more comma or blank separated
       specifications, each being `FILENUM.FIELD' or `0'. Default FORMAT
       outputs the join field, the remaining fields from FILE1, the remaining
       fields from FILE2, all separated by CHAR.

       Important: FILE1 and FILE2 must be sorted on the join fields. E.g.,
       use `sort -k 1b,1' if `join' has no options. If the input is not
       sorted and some lines cannot be joined, a warning message will be
       given.

AUTHOR
       Written by Mike Haertel.

REPORTING BUGS
       Report join bugs to bug-coreutils@gnu.org
       GNU coreutils home page: <http://www.gnu.org/software/coreutils/>
       General help using GNU software: <http://www.gnu.org/gethelp/>

COPYRIGHT
       Copyright (C) 2009 Free Software Foundation, Inc. License GPLv3+: GNU
       GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
       This is free software: you are free to change and redistribute it.
       There is NO WARRANTY, to the extent permitted by law.

SEE ALSO
       The full documentation for join is maintained as a Texinfo manual. If
       the info and join programs are properly installed at your site, the
       command

              info coreutils 'join invocation'

       should give you access to the complete manual.

GNU coreutils 7.1                  July 2010                           JOIN(1)