Outputting 1 file per row if pattern exists between files


 
# 1  
Old 05-22-2014

I have many files, each with a varying number of rows. I essentially want to write each row to a new file if a pattern is matched between two files.

I have some code that does something similar, but I want it to write every input row from every file to a separate output file; that is, one output file per row. The output names can be something like bat.1.output2_1, bat.1.output2_2, etc.

The input file can look like this for example:

Code:
 name,start,end
bat.1,231, 234
bat.1,230, 232


The file I want to match looks like this:

Code:
 V1 V2  V3 V4 V5    V6 V7 V8
  0 230 -77 -1  D     xx  0  0
  1 231 -77  0  R      tt  0  0
  2 232 -77  1  T     yy  0  0
  3 233 -76 -1  Y     uu  0  0
  4 234 -76  0  U     re  0  0
  5 235 -76  1  I      dd  0  0


The code I have so far is:

Code:
[[ ! -d odir_30for ]] && mkdir odir_30for
for f in enriched_chunks/*.output
do
   infile=${f##*/}
   infile="${infile%[.]output}"
   awk '
   NR==2 { b=$2; e=$3+30;}
   NR>FNR {$0=" " $0; if (($2>=b) && ($2<=e)) o=o $6;}
   END {of=FILENAME; sub(".*/", "", of) ;  print of; print o}
   ' FS=, "$f" FS="[ \t]*" output_transcripts3/output2/"$infile" > odir_30for/"$infile".output2
done


How can I generate one output file for each row of an input file, for many files? Right now the code above only handles a single row per file.

The output should look like this:

bat.1_1
Code:
bat.1_1
RTYU

bat.1_2
Code:
bat.1_2
DRT

# 2  
Old 05-22-2014
Maybe it's me, but I don't get the logic of the example you gave.
Providing a code sample is no reason to skip explaining the logic you follow.
It would help readers if you fully explained the logic behind the output you want (without making people reverse engineer your whole code).
Thanks for your understanding.
This User Gave Thanks to ctsgnb For This Post:
# 3  
Old 05-23-2014
Apologies for being unclear. I'm trying to compare values between files and, where they match, extract some characters between those values, for many files. The files are in two directories and have the same filename, except that one ends in .enr. They look like this:

Code:
cat bat.1.enr

Code:
name,start,end
bat.1,233, 235
bat.1,230, 232

Code:
head bat.1

Code:
 V1 V2  V3 V4 V5    V6 V7 V8
  0 230 -77 -1  D     xx  0  0
  1 231 -77  0  R      tt  0  0
  2 232 -77  1  T     yy  0  0
  3 233 -76 -1  Y     uu  0  0
  4 234 -76  0  U     re  0  0
  5 235 -76  1  I      dd  0  0

I'm essentially trying to extract the characters from column V5 in bat.1 for the rows whose V2 value falls between the start and end values on each line of bat.1.enr. For this example, the output should look like this:

Code:
cat bat.1.out_1

Code:
bat.1
YUI

Code:
cat bat.1.out_2

Code:
bat.1
DRT

The number of data rows in a .enr file varies: there can be one, two, or more.
# 4  
Old 05-23-2014
Try (based on my proposal to your recent similar problem):
Code:
awk -F, 'NR>1   {FN=$1;LL=$2;UL=$3; FNO=++X[$1]
                 FS=" "
                 print $1 > FN".out_" FNO
                 while (getline < FN)
                    if ($2>=LL && $2<=UL) printf "%s", $5 > FN".out_" FNO
                 printf "\n" > FN".out_" FNO
                 close (FN)
                 FS=","
                }
        ' *.enr
cf bat.1.out*
bat.1.out_1:
bat.1
YUI
bat.1.out_2:
bat.1
DRT

This is not necessarily the most efficient solution, as it reads that bat.1 file for every single input line, but in case that file changes often it seems pointless to read it and assign to an array...
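For comparison, here is a minimal sketch of the array-based alternative mentioned above: each data file is read once and its V2 and V5 columns are cached, so later rows for the same name do not reopen it. The directory name enr_cache_demo and the variable names (rows, pos, chr, seen) are made up for illustration, and the sample data is copied from post #3. Treat it as one possible arrangement under those assumptions, not as a replacement for the code above.

```shell
#!/bin/sh
# Hypothetical demo directory; everything is created fresh inside it.
mkdir -p enr_cache_demo
(
cd enr_cache_demo || exit 1

# Sample input copied from post #3.
printf '%s\n' 'name,start,end' 'bat.1,233, 235' 'bat.1,230, 232' > bat.1.enr
printf '%s\n' \
  ' V1 V2  V3 V4 V5    V6 V7 V8' \
  '  0 230 -77 -1  D     xx  0  0' \
  '  1 231 -77  0  R      tt  0  0' \
  '  2 232 -77  1  T     yy  0  0' \
  '  3 233 -76 -1  Y     uu  0  0' \
  '  4 234 -76  0  U     re  0  0' \
  '  5 235 -76  1  I      dd  0  0' > bat.1

awk -F, '
FNR > 1 {                                # skip the header of every .enr file
    name = $1; lo = $2 + 0; hi = $3 + 0  # "+ 0" forces numeric values
    if (!(name in rows)) {               # first row for this data file: cache it
        rows[name] = 0
        while ((getline line < name) > 0) {
            split(line, f, " ")          # " " splits on runs of blanks
            n = ++rows[name]
            pos[name, n] = f[2]          # V2 (the header row caches "V2", which is 0 numerically)
            chr[name, n] = f[5]          # V5
        }
        close(name)
    }
    out = name ".out_" (++seen[name])    # bat.1.out_1, bat.1.out_2, ...
    s = ""
    for (i = 1; i <= rows[name]; i++)
        if (pos[name, i] + 0 >= lo && pos[name, i] + 0 <= hi)
            s = s chr[name, i]
    print name > out
    print s > out
    close(out)
}' *.enr

cat bat.1.out_1 bat.1.out_2
)
```

The trade-off is memory for I/O: the cache helps when a .enr file has many rows for the same name, and it would return stale results if the data file changed while the script was running.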
# 5  
Old 05-23-2014
Thanks for the response. I have a question. How are you reading in the files that do not have the .enr extension?
# 6  
Old 05-24-2014
Code:
awk -F, 'NR>1   {FN=$1;LL=$2;UL=$3; FNO=++X[$1]
                 FS=" "
                 print $1 > (FN".out_" FNO)
                 while (getline < FN)
                    if ($2>=LL && $2<=UL) printf "%s", $5 > (FN".out_" FNO)
                 printf "\n" > (FN".out_" FNO)
                 close (FN)
                 FS=","
                }
        ' *.enr

The FN=$1 assignment, the while (getline < FN) loop, and the close (FN) call determine the name of the file to be read, read lines from it, and close it when it hits EOF. As originally written in post #4, without parentheses around the output file name, the code depends on a feature that is left unspecified by the standards. To make it more portable, add the parentheses shown above around the components of the output file name in all three print and printf statements.
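Two further details matter when this approach is run over several .enr files. NR>1 skips only the header of the first file, because NR keeps counting across files while FNR restarts for each one. And getline returns -1 on error, which a bare while (getline < FN) treats as true, so a missing data file makes it loop forever. The following is a hedged sketch of a more defensive variant, not the code from the posts above. The directory enr_getline_demo, the second file pair bat.2 and bat.2.enr, and the variable names are invented for the demonstration.

```shell
#!/bin/sh
# Hypothetical demo directory with two .enr files and their data files.
mkdir -p enr_getline_demo
(
cd enr_getline_demo || exit 1

printf '%s\n' 'name,start,end' 'bat.1,233, 235' 'bat.1,230, 232' > bat.1.enr
printf '%s\n' 'name,start,end' 'bat.2,231, 232' > bat.2.enr   # invented second input
printf '%s\n' \
  ' V1 V2  V3 V4 V5    V6 V7 V8' \
  '  0 230 -77 -1  D     xx  0  0' \
  '  1 231 -77  0  R      tt  0  0' \
  '  2 232 -77  1  T     yy  0  0' \
  '  3 233 -76 -1  Y     uu  0  0' \
  '  4 234 -76  0  U     re  0  0' \
  '  5 235 -76  1  I      dd  0  0' > bat.1
cp bat.1 bat.2

awk -F, '
FNR > 1 {                                  # FNR restarts for every .enr file
    name = $1; lo = $2 + 0; hi = $3 + 0
    out = name ".out_" (++count[name])     # name built once, stored in a variable
    print name > out
    # "getline line < name" fills only the variable line, so $0 and FS stay as they are.
    while ((getline line < name) > 0) {    # stops on EOF (0) and on error (-1)
        split(line, f, " ")
        if (f[2] + 0 >= lo && f[2] + 0 <= hi)
            printf "%s", f[5] > out
    }
    printf "\n" > out
    close(name)                            # so the next row rereads from the top
    close(out)
}' *.enr

cat bat.1.out_1 bat.1.out_2 bat.2.out_1
)
```

Storing the output name in a variable also sidesteps the parenthesization question, since the redirection target is then a single expression.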
 
