AWK looping over 2 variables


 
# 8  
Old 09-12-2012
Quote:
Originally Posted by Don Cragun
The script Chubler_XL provided in the message before this should work fine as long as you don't care about the order in which lines appear in the output files and don't want to append to existing files.
My script keeps lines in the same order as the original file. To append to existing files, change ">" to ">>" on the two print lines.
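For anyone unsure what that change does, here is a minimal sketch of ">" versus ">>" in awk. The file names and sample data are made up for illustration and are not from the original script.

```shell
# Hypothetical demo: directory, file names and data are invented for illustration.
demo_dir=$(mktemp -d)

# Both files start with one pre-existing line.
printf 'old line\n' > "$demo_dir/out_replace"
printf 'old line\n' > "$demo_dir/out_append"

# ">" truncates the file the first time awk opens it in a run,
# then keeps writing to the same open file for later records.
printf 'a\nb\n' | awk -v out="$demo_dir/out_replace" '{ print > out }'

# ">>" keeps whatever was already in the file and adds to the end.
printf 'a\nb\n' | awk -v out="$demo_dir/out_append" '{ print >> out }'

cat "$demo_dir/out_replace"   # a, b
cat "$demo_dir/out_append"    # old line, a, b
```

Note that ">" only truncates once per run, so a plain `print > out` inside a loop does not lose earlier lines from the same run.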
# 9  
Old 09-13-2012
Quote:
Originally Posted by Don Cragun
Unfortunately, the script above is functionally equivalent to:
Code:
END {print $2 > "junk" }

which only writes the 2nd field from the last line in your input file into a file named junk. But I think I understand what you're trying to do now.

To be sure that I do understand what you want, please confirm or correct the following statements:
  1. You want files named file_x for 1 <= x <= 173, each containing copies of the lines from the file named file whose first field falls in the corresponding range.
  2. Then for each file named file_x you want a file named file_x_f that contains the same number of lines as the file_x file, but only contains the contents of the 2nd field of each line instead of the entire line.
  3. In your description above you sometimes talk about files named file_x and at other times talk about files named output_x. Am I correct in assuming that "output_" was a typo and you meant "file_"?
Is this correct?

Note that since you're creating up to 346 output files from this script, the script is going to have to open and close files while it is running rather than opening everything and letting awk automatically close them when the script terminates.

Please also answer the following questions:
  1. Do you want empty files created for ranges that have no matching lines?
  2. Do existing file_x and file_x_f files need to be removed when this script starts?
  3. If not, should lines to be written by this script replace the contents of existing files or append lines to them?
I'm hoping that you want all existing files to be either removed or overwritten by the script rather than appended to. The file handling logic is much more difficult in an awk script if you want to portably append to existing files. Given the statement print >> file_x, some systems will create file_x if it doesn't already exist. Others will only create a file when using print > file_x and will give an error if you try print >> file_x when file_x doesn't already exist.

The script Chubler_XL provided in the message before this should work fine as long as you don't care about the order in which lines appear in the output files and don't want to append to existing files. If you want to append rather than replace, or if you want all entries in the output files to be in the same order that they appeared in the input file, the script will be more complex.

================
I apologize. Chubler_XL's script does indeed maintain order, and (as he said) you can just replace > with >> if you want to append rather than overwrite. (It is w >> file in ex that may fail if file doesn't already exist. In awk >> file is guaranteed to create the file if it didn't exist and append to it if it did exist.)


The script Chubler_XL wrote works great. I am not in need of specific ordering of lines, so it's all sorted! I was impressed by your use of arrays for the problem.

Do I understand correctly that this is a 2D array comprised of the nearest integer defined by the bucket function and v[bucket]++?
Code:
l[bucket,v[bucket]]=$0


Also, could you explain what the purpose of
Code:
 v[bucket]++

is?
# 10  
Old 09-13-2012
Bucket is the integer file number you requested, for example:
Code:
-6.650 --> 1
-1.203 --> 55
 0.000 --> 68
 1.293 --> 80
 6.650 --> 134

v[bucket]++ counts how many lines have been found for each bucket; each line is then stored as l[bucket, bucket_line], where bucket_line is that running count.

This allows two things:
  1. Each line gets a unique array index.
  2. Output order can be made to match input order.
The array solution is much more efficient than writing output files as each line is processed, because awk isn't constantly opening and closing output files. It works well for small to mid-sized files (say less than 2GB).
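A rough sketch of the array technique described above might look like the following. This is not the script posted earlier in the thread, only a reconstruction of the idea. The bucket_of function is a hypothetical stand-in (the real bucket function is not reproduced here), the sample data is invented, and only the file_x / file_x_f naming comes from the thread. Strictly speaking awk has no true 2D arrays: l[b, n] is one associative array whose key is the string b SUBSEP n.

```shell
array_dir=$(mktemp -d)

printf '5 alpha\n12 beta\n7 gamma\n15 delta\n' |
awk -v prefix="$array_dir/file_" '
# Hypothetical stand-in, NOT the bucket function from the thread:
# values 0-9 go to bucket 1, 10-19 to bucket 2, and so on.
function bucket_of(x) { return int(x / 10) + 1 }

BEGIN { max = 0 }

{
    b = bucket_of($1)
    v[b]++              # how many lines this bucket has received so far
    l[b, v[b]] = $0     # unique key per line: bucket plus position in bucket
    if (b > max) max = b
}

END {
    for (b = 1; b <= max; b++) {
        if (!(b in v)) continue          # no lines landed in this bucket
        out = prefix b
        for (n = 1; n <= v[b]; n++) {    # replay lines in input order
            print l[b, n] > out
            split(l[b, n], fld)          # re-split the stored line into fields
            print fld[2] > (out "_f")    # second field only
        }
        # Close both files before moving on, so only two are open at a time.
        close(out)
        close(out "_f")
    }
}'

cat "$array_dir/file_1"     # 5 alpha, 7 gamma
cat "$array_dir/file_1_f"   # alpha, gamma
```

Because every input line is held in memory until END, memory use grows with the size of the input.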
If you have a huge input file, this solution is likely to fail (by running out of memory or exceeding awk's internal array size limits). The slower technique of opening each output file as a line is processed and closing it again would then become necessary.
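For comparison, the slower open-and-close technique could be sketched roughly like this. As before, bucket_of is a hypothetical stand-in for the real bucket function and the sample data is invented. Since each file is reopened for every line, ">>" is needed here so that a reopen does not truncate what was already written.

```shell
stream_dir=$(mktemp -d)

printf '5 alpha\n12 beta\n7 gamma\n15 delta\n' |
awk -v prefix="$stream_dir/file_" '
# Hypothetical stand-in, NOT the bucket function from the thread.
function bucket_of(x) { return int(x / 10) + 1 }

{
    out = prefix bucket_of($1)
    print    >> out            # whole line, appended
    print $2 >> (out "_f")     # second field only, appended
    # Close straight away so awk never holds more than two files open.
    close(out)
    close(out "_f")
}'

cat "$stream_dir/file_2"     # 12 beta, 15 delta
cat "$stream_dir/file_2_f"   # beta, delta
```

Nothing is stored in memory, so input size is not a concern, but leftover output files from an earlier run would be appended to and should be removed first.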

Last edited by Chubler_XL; 09-13-2012 at 05:48 PM..