awk to match multiple regex and create separate output files


 
# 1  
Old 08-04-2011

Howdy Folks,

I have a list that looks like this:
(file2.txt)

AAA
BBB
CCC
DDD

and there are 24 of these short words.

I am matching these patterns to another file with 755795 lines (file1.txt).

I have this code for matching:

Code:
awk -v f2=file2.txt '
    BEGIN {
        while( (getline<f2) > 0 )   # read and collect records from f2
        {
            key = $1;
            ki = kidx[key]++;        # track number of duplicate keys (0 based)
            k2rec[key,ki] = $0;      # save unique record by key and dup count
        }
        close( f2 );
    }

    {
        key = $1;
        for( i = 0; i < kidx[key]; i++ )          # for each duplicate of key
            printf( "%s\t%s\n", k2rec[key,i], $0 );   # print f2 record, followed by current f1 record
    }
' <file1.txt > output

In this form every line in file1 that is matched goes into the same output file. What I would like to do is change the code such that each of the patterns in file2.txt gets its own file. For example:

all of the lines in file1.txt that match AAA would go in AAA.txt while all of the lines matching BBB would go in BBB.txt.

I'm not exactly sure how to handle controlling the output.

Thanks guys, I appreciate your help.
Robert
# 2  
Old 08-04-2011
Not tested:
Code:
printf( "%s\t%s\n", k2rec[key,i], $0 ) > (key ".txt")   # one output file per key, e.g. AAA.txt
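
For context, here is a minimal sketch of how a per-key redirect might fit into the original script. It is also untested against the real 755795-line file. The two printf lines only create small made-up sample files so the sketch can be run as-is, and the variable name out is an assumption rather than anything from the original post.

```shell
# Hypothetical sample data, only so the sketch runs on its own.
printf 'AAA\nBBB\n' > file2.txt
printf 'AAA one\nBBB two\nCCC three\nAAA four\n' > file1.txt

awk -v f2=file2.txt '
    BEGIN {
        while( (getline < f2) > 0 )   # read and collect records from f2
        {
            key = $1;
            ki = kidx[key]++;         # track number of duplicate keys (0 based)
            k2rec[key,ki] = $0;       # save record by key and dup count
        }
        close( f2 );
    }

    {
        key = $1;
        out = key ".txt";             # hypothetical name: one output file per key
        for( i = 0; i < kidx[key]; i++ )                  # loop body never runs for unknown keys
            printf( "%s\t%s\n", k2rec[key,i], $0 ) > out; # f2 record, tab, current f1 record
    }
' file1.txt
```

With the sample data, the two AAA lines end up in AAA.txt, the BBB line in BBB.txt, and no file is created for CCC because it is not in file2.txt. Building the name in a variable first avoids the ambiguity some awks have with an unparenthesized expression after ">". Inside a single awk run, ">" truncates each file the first time it is opened and appends after that, so output files left over from an earlier run are only overwritten for keys that match again. Some awk implementations limit how many output files can be open at once; if 24 turns out to be too many, switching to ">>" and calling close( out ) after each write is a possible (slower) fallback.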

This User Gave Thanks to yazu For This Post:
# 3  
Old 08-04-2011
It will at least get me started, thanks very much for your time.
 