Split files by pairwise combination
Post 302927899 by RudiC on Sunday 7th of December 2014 04:51:01 AM
You are mixing awk and shell code, which cannot work. Even if you got your approach running, e.g. like
Code:
awk    '       {f = "tmp_" $1 "_" $3
                 print $1 FS $2 FS "A" "\n" $3 FS $4 FS "B" >> f
                 close (f)      # flush before grep appends to the same file
                 system ("grep " $1 "/" $3 " file2 >> " f) }' file1

, the out_* files would still be missing field 3. Overall you would have to run several programs in several processes and touch the files several times, which is not very efficient.


Try
Code:
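awk     'NR==FNR        {# file1: start each pair with its two single-file lines
                         OUT[$1,$3]=$1 FS $2 FS substr($1,1,1) "\n"
                         OUT[$1,$3]=OUT[$1,$3] $3 FS $4 FS substr($3,1,1) "\n"
                         next}
                        {# file2: append each "X/Y" line to the matching pair
                         split ($1, T, "/")
                         OUT[T[1],T[2]]=OUT[T[1],T[2]] $1 FS $2 FS substr(T[1],1,1) "/" substr(T[2],1,1) "\n"
                        }
         END            {# SUBSEP="_" turns each array index into e.g. A1_B1
                         for (i in OUT) {fn = "out_" i; printf "%s", OUT[i] > fn; close (fn)}
                        }
        ' SUBSEP="_" file1 file2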
cf out*
out_A1_B1:
A1 File1a A
B1 File1b B
A1/B1 File3 A/B
A1/B1 File4 A/B
A1/B1 File5 A/B
A1/B1 File6 A/B
A1/B1 File7 A/B
out_A1_B3:
A1 File1a A
B3 File3b B
A1/B3 File14 A/B
A1/B3 File15 A/B
A1/B3 File16 A/B
A1/B3 File17 A/B
out_A2_B2:
A2 File2a A
B2 File2b B
A2/B2 File8 A/B
A2/B2 File9 A/B
A2/B2 File10 A/B
A2/B2 File11 A/B
A2/B2 File12 A/B
A2/B2 File13 A/B

That would run awk only once and also touch every file involved only once. Depending on data size, a fair amount of memory may be needed, though, because all output is collected in the OUT array until the END block runs.
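
If memory becomes a problem, a streaming variant might be worth a try. This is only a sketch and has not been run against your real data: the file1 and file2 contents below are hypothetical samples reconstructed from the output listing above, and the scratch directory is there just to keep the demo from clobbering anything. Instead of collecting everything in an array, it prints each line to its output file as soon as it is read.

```shell
# Work in a scratch directory so the demo does not overwrite existing files.
cd "$(mktemp -d)" || exit 1

# Hypothetical sample input (assumed layout, not the original data).
cat > file1 <<'EOF'
A1 File1a B1 File1b
A2 File2a B2 File2b
EOF

cat > file2 <<'EOF'
A1/B1 File3
A1/B1 File4
A2/B2 File8
EOF

awk '
NR == FNR {                     # file1: write the two single-file lines right away
    out = "out_" $1 "_" $3
    print $1, $2, substr($1, 1, 1) > out
    print $3, $4, substr($3, 1, 1) > out
    next
}
{                               # file2: send each line to the file of its pair
    split($1, T, "/")
    out = "out_" T[1] "_" T[2]
    print $1, $2, substr(T[1], 1, 1) "/" substr(T[2], 1, 1) > out
}
' file1 file2

cat out_A1_B1
```

Within one awk run, ">" truncates a file only when it is first opened, so the later prints append. The price is one open output file per pair for the whole run, and some awk versions limit the number of simultaneously open files, so with very many pairs the array approach (or an explicit close() plus ">>") may still be the better choice.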

Last edited by RudiC; 12-07-2014 at 06:03 AM.