Top Forums Shell Programming and Scripting Split files by pairwise combination Post 302927899 by RudiC on Sunday 7th of December 2014 04:51:01 AM
You are mixing awk and shell syntax, which cannot work. Even if you got your approach running, for example with
Code:
awk    '       {print $1 FS $2 FS "A""\n"$3 FS $4 FS "B" >> "tmp_"$1"_"$3 
                 system ("grep "$1"/"$3 " file2 >> tmp_"$1"_"$3) }' file1

you would still end up with out_* files that lack field 3. Overall you would have to run several programs in many processes, touching files several times, which is not very efficient.


Try
Code:
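awk     'NR==FNR        {# file1: start each pair buffer with its two single-key lines
                         OUT[$1,$3]=$1 FS $2 FS substr($1,1,1) "\n"
                         OUT[$1,$3]=OUT[$1,$3] $3 FS $4 FS substr($3,1,1) "\n"
                         next}
                        {# file2: append each "X/Y" record to the buffer of its pair
                         split ($1, T, "/")
                         OUT[T[1],T[2]]=OUT[T[1],T[2]] $1 FS $2 FS substr(T[1],1,1) "/" substr(T[2],1,1) "\n"
                        }
         END            {# SUBSEP="_" joins the two subscripts, so i looks like A1_B1
                         for (i in OUT) {printf "%s", OUT[i] > ("out_" i); close ("out_" i)}}
        ' SUBSEP="_" file1 file2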
cf out*
out_A1_B1:
A1 File1a A
B1 File1b B
A1/B1 File3 A/B
A1/B1 File4 A/B
A1/B1 File5 A/B
A1/B1 File6 A/B
A1/B1 File7 A/B
out_A1_B3:
A1 File1a A
B3 File3b B
A1/B3 File14 A/B
A1/B3 File15 A/B
A1/B3 File16 A/B
A1/B3 File17 A/B
out_A2_B2:
A2 File2a A
B2 File2b B
A2/B2 File8 A/B
A2/B2 File9 A/B
A2/B2 File10 A/B
A2/B2 File11 A/B
A2/B2 File12 A/B
A2/B2 File13 A/B
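
The out_A1_B1 style names come from the SUBSEP="_" assignment: awk joins the subscripts of OUT[$1,$3] with SUBSEP to form a single string key, so each key can be used directly as a file name suffix. A minimal sketch of just that mechanism, using a throwaway array A and an illustrative key that are not part of the original script:

```shell
# Build one two-subscript key with SUBSEP set to "_" and print the
# file name the END block would derive from it.
name=$(awk 'BEGIN { SUBSEP = "_"; A["A1", "B1"] = 1; for (k in A) print "out_" k }')
echo "$name"
```

With the default SUBSEP (the non-printing character "\034") the key would still work as an array index, but it would make an awkward file name.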

That would run awk only once and also touch each file involved only once. Because all output is held in the OUT array until the END block, a fair amount of memory might have to be allocated for large inputs, though.
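
If memory becomes a concern, one possible variation is to write each line as soon as it is read instead of buffering it. The sketch below is only an illustration under assumptions: the layout of file1 and file2 is inferred from the out_* listing above, and the sample rows are made up. It also differs from the buffered version in two ways: a pair that appears twice in file1 would get its two leading lines written twice, and awk keeps every out_* file open until it exits, which can hit the open-file limit in some awk implementations when there are many pairs.

```shell
# Work in a scratch directory so no existing files are overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample inputs (layout inferred from the listing above).
cat > file1 <<'EOF'
A1 File1a B1 File1b
A1 File1a B3 File3b
EOF

cat > file2 <<'EOF'
A1/B1 File3
A1/B3 File14
A1/B1 File4
EOF

awk '
NR == FNR {                       # file1: one line per pair
    out = "out_" $1 "_" $3
    print $1, $2, substr($1, 1, 1) > out
    print $3, $4, substr($3, 1, 1) > out
    next
}
{                                 # file2: records keyed by "X/Y"
    split($1, T, "/")
    out = "out_" T[1] "_" T[2]
    print $1, $2, substr(T[1], 1, 1) "/" substr(T[2], 1, 1) > out
}
' file1 file2

cat out_A1_B1
```

Within a single awk run, `>` truncates a file only when it is first opened, so later prints to the same name append to it.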

Last edited by RudiC; 12-07-2014 at 06:03 AM..