awk in multiple files


 
# 1  
Old 01-24-2011

Hello all,
I have two files:
one that looks like this
 
string1
string2
string3
string6
and another that looks like this
 
Wstring1
apple
Wstring2
vegetable
Wstring3
mouse
Wstring4
globe
Wstring5
night
Wstring6
letter
cup
table
In the second file, each string name is always prefixed with a capital W.

I want to find, for every string that is written in the first file, the corresponding lines under this string in the second file.

For this example the output would be:
 
Wstring1
apple
Wstring2
vegetable
Wstring3
mouse
Wstring6
letter
cup
table

I suppose it is an awk command...
# 2  
Old 01-24-2011
This is a filtering join (not an outer join, which keeps everything, and not a Cartesian product, where 3 lines and 4 lines yield 12 possible lines). The first question is whether it is many-to-many. The second question is whether both files are sorted the same way, or can be. A many-to-many join requires multiple passes or storage. Sorting is necessary for a merge join, which is very popular and robust.

Awk, or one of the more advanced shells with associative arrays, can do it for modest amounts of data and modest speed requirements, even if the files are not sorted and the relationship is 0/1 to 1/many: put one file into an associative array, then look each key up using the data from the other file.
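A minimal sketch of the associative-array approach, using the sample data from the first post. The names file1, file2, and filter_blocks are made up for illustration, and it assumes that any line in the second file starting with a capital W opens a new block (so no data line may start with W).

```shell
# Hypothetical scratch files holding the sample data from post #1.
dir=$(mktemp -d)
printf '%s\n' string1 string2 string3 string6 > "$dir/file1"
printf '%s\n' Wstring1 apple Wstring2 vegetable Wstring3 mouse \
  Wstring4 globe Wstring5 night Wstring6 letter cup table > "$dir/file2"

# filter_blocks NAMES BLOCKS (hypothetical helper name)
filter_blocks() {
  awk '
    # First file: remember each name with the W prefix added.
    NR == FNR { want["W" $0]; next }
    # Second file: a line starting with W is assumed to open a new block.
    /^W/ { show = ($0 in want) }
    # Print the header and every line under it while the block is wanted.
    show
  ' "$1" "$2"
}

filter_blocks "$dir/file1" "$dir/file2"
```

Only the first file is held in memory, and the second file is read once, so neither file needs to be sorted.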
# 3  
Old 01-25-2011
I did not understand your answer...
# 4  
Old 01-25-2011
Join is an RDBMS / set-logic term.

If you start by reading one line from file 1 into one variable and two lines from file 2 into two other variables, you have started a merge. If the values match, you print, but then which file do you read next?

It will not work unless the technique, and sometimes the data, accommodate any data challenges. You know the data, so pick one:
  • You store all of file 1 in an associative array and then look it up as you go through file 2.
  • Both files are sorted by string, and the values in one file are unique, so whichever file has the lower string is the one you read from next. We could do the sort, but the two-line records in file 2 first need to be merged into one line. The join command can then do the matching for you.
  • You test every line of one file against every record in the other file, M x N comparisons, which is OK for small files.
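The sorted-merge option could look roughly like the sketch below. It is only one way to do it, under these assumptions: file1, file2, and join_blocks are hypothetical names, every line in file 2 that starts with W opens a new record, names in file 1 are unique, and no data line contains a tab (tab is used as the field separator).

```shell
# Hypothetical scratch files holding the sample data from post #1.
dir=$(mktemp -d)
printf '%s\n' string1 string2 string3 string6 > "$dir/file1"
printf '%s\n' Wstring1 apple Wstring2 vegetable Wstring3 mouse \
  Wstring4 globe Wstring5 night Wstring6 letter cup table > "$dir/file2"

# join_blocks NAMES BLOCKS (hypothetical helper name)
join_blocks() {
  work=$(mktemp -d) || return 1
  tab=$(printf '\t')
  (
    # Byte-order collation so that sort and join agree on the ordering.
    export LC_ALL=C
    sort "$1" > "$work/keys"
    # Flatten each block to one line: key, header, then the data lines.
    awk '
      # A line starting with W is assumed to open a new record.
      /^W/ { if (rec != "") print rec; rec = substr($0, 2) "\t" $0; next }
      # Any other line is appended to the current record as a tab field.
      { rec = rec "\t" $0 }
      END { if (rec != "") print rec }
    ' "$2" | sort -t "$tab" -k1,1 > "$work/flat"
    # Merge join on the first field, then drop the key and unflatten.
    join -t "$tab" "$work/keys" "$work/flat" | cut -f2- | tr '\t' '\n'
  )
  rm -rf "$work"
}

join_blocks "$dir/file1" "$dir/file2"
```

Note that the blocks come out in sorted key order rather than in the original order of file 2, which happens to be the same for this sample.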

Last edited by DGPickett; 01-25-2011 at 10:11 AM..
 