Post 303033571 by LMHmedchem on Monday 8th of April 2019 01:21:22 PM
Pass an array to awk to sequentially look for a list of items in a file

Hello,

I need to collect some statistical results from a series of files that are generated by other software. The files are tab delimited. Each file contains 4 different sets of statistics; each set begins with a line naming the set, followed by 5 lines of values. It looks like this:

Code:
train statistics
r2	0.7834
MeAE	0.36
MdAE	0.33
SE	0.34
n	400
...
...
...
test statistics
r2	0.7042
MeAE	0.39
MdAE	0.32
SE	0.41
n	400

There is more data on each line, but that is not an issue. Up to 4 sets may need to be retrieved from each file.

What I would generally do here is something like this:
Code:
#!/bin/sh

# stat file being processed
input_file='inputfile.txt'
# where to write the output
logfile='logfile.txt'
# statistic set we are looking for
current_stat='train statistics'

# F is the save flag; it is set on the label line, so saving starts at the next line
awk -v st_label="$current_stat" '          F == 1 { line_array[++a_count] = $0; line_count++ }
                                  line_count == 5 { for(i=1; i<=a_count; i++) print line_array[i];
                                                    delete line_array;
                                                    a_count = 0;
                                                    F = 0;
                                                    line_count = 0 }
                                    $0 ~ st_label { F = 1; line_count = 0 }
                                 ' "$input_file" > "$logfile"

This finds the line containing whatever was passed in as $current_stat and starts saving lines at the next line. After the 5th line has been saved, the saved array is printed and the array, save flag, and counters are reset. Of course, if we are only looking for one set of data to print, the reinitialization is not necessary and we could exit there instead.
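The single-set "exit there instead" variant could look roughly like the sketch below. It prints each value line as it is read rather than buffering it, and stops reading after the 5th. The temporary file and its sample contents are made up for illustration, and the `result` variable is only there so the output can be inspected.

```shell
#!/bin/sh
# Sketch only: made-up sample data written to a temporary file.
input_file=$(mktemp)
printf 'train statistics\nr2\t0.7834\nMeAE\t0.36\nMdAE\t0.33\nSE\t0.34\nn\t400\n...\ntest statistics\nr2\t0.7042\nMeAE\t0.39\nMdAE\t0.32\nSE\t0.41\nn\t400\n' > "$input_file"

# statistic set we are looking for
current_stat='train statistics'

result=$(awk -v st_label="$current_stat" '
    # Once the flag is set, print each value line and quit after the 5th one.
    F == 1        { print; if (++line_count == 5) exit }
    # Label line found: start printing at the next line.
    $0 ~ st_label { F = 1 }
' "$input_file")

printf '%s\n' "$result"
rm -f "$input_file"
```

Because awk exits as soon as the 5th line is printed, the rest of the file is never read, which may matter for large inputs.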

My question is about the best way to capture several sets in one pass through the file. My thought was to put the labels for what I wanted to find in an array in bash and then call awk with the array instead of a single variable. I would then look for each array element in succession until all had been found. I thought that would look like this:

Code:
#!/bin/bash

# stat file being processed
input_file='inputfile.txt'
# where to write the output
logfile='logfile.txt'
# labels for the 4 sets we are looking for
LABELS=("train statistics" "test statistics" "validate statistics" "ival statistics")

awk -v st_arr="${LABELS[*]}" '              BEGIN { a_pos = 0 }
                                           F == 1 { line_array[++a_count] = $0; line_count++ }
                                  line_count == 5 { for(i=1; i<=a_count; i++) print line_array[i];
                                                    delete line_array;
                                                    a_count = 0;
                                                    F = 0;
                                                    line_count = 0;
                                                    a_pos = 0 }
                               $0 ~ st_arr[a_pos] { F = 1; line_count = 0 }
                             ' "$input_file" > "$logfile"

This was intended to start st_arr at index 0 and look for whatever value was there. It doesn't work and gives the error "attempt to use scalar `st_arr' as an array". I thought I had the syntax correct for passing an array to awk, but it doesn't seem to have worked. Do I need to translate the bash array into an awk array on the BEGIN line? Is this just not the right way to do this?
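On the error itself: `-v` assigns one string to one awk variable, so st_arr arrives as a scalar, and `"${LABELS[*]}"` joins the elements with a space, which makes labels that contain spaces impossible to separate again. One possible workaround, sketched here on the assumption that the character `|` never appears in a label, is to join the bash array with `|` and rebuild it with split() in a BEGIN block. The names `joined` and `st_list` are made up for this sketch.

```shell
#!/bin/bash
LABELS=("train statistics" "test statistics" "validate statistics" "ival statistics")

# "${LABELS[*]}" joins the elements on the first character of IFS.
# Setting IFS inside the command substitution keeps the change local to the subshell.
# Assumption: "|" never occurs inside a label.
joined=$(IFS='|'; printf '%s' "${LABELS[*]}")

result=$(awk -v st_list="$joined" '
    BEGIN {
        # split() fills st_arr[1..n] and returns n; the indices start at 1, not 0.
        n = split(st_list, st_arr, "|")
        for (i = 1; i <= n; i++) print i ": " st_arr[i]
    }
')

printf '%s\n' "$result"
```

Note that split() numbers the elements from 1, so a position counter such as a_pos would start at 1 rather than 0.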

I would probably just save everything I captured in a single array and print it at the end instead of printing after each set is recovered. Even if the above works, I'm not sure how to avoid an array boundary error with st_arr[] since I think that the above would increment it past its size.
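A rough sketch of that idea follows: look for the labels one after another, save every captured line in a single array, and print everything in END. The guard `a_pos <= n` keeps the label rule from reading past the end of st_arr. It assumes the labels occur in the file in the same order as in LABELS, that `|` never appears in a label, and it uses made-up sample data containing only two of the four sets. The names `joined`, `st_list`, `saved`, and `s_count` are invented for the sketch.

```shell
#!/bin/bash
# Sketch only: made-up sample data with two of the four sets present.
input_file=$(mktemp)
printf 'train statistics\nr2\t0.7834\nMeAE\t0.36\nMdAE\t0.33\nSE\t0.34\nn\t400\n...\ntest statistics\nr2\t0.7042\nMeAE\t0.39\nMdAE\t0.32\nSE\t0.41\nn\t400\n' > "$input_file"

LABELS=("train statistics" "test statistics" "validate statistics" "ival statistics")
joined=$(IFS='|'; printf '%s' "${LABELS[*]}")   # assumes "|" is not part of any label

result=$(awk -v st_list="$joined" '
    BEGIN { n = split(st_list, st_arr, "|"); a_pos = 1 }

    # Saving is on: keep the line; after the 5th line move on to the next label.
    F == 1 {
        saved[++s_count] = $0
        if (++line_count == 5) {
            F = 0
            line_count = 0
            a_pos++
            if (a_pos > n) exit      # every label handled; exit still runs END
        }
        next
    }

    # Only test a label while a_pos is inside the array bounds.
    a_pos <= n && $0 ~ st_arr[a_pos] { F = 1 }

    # Print everything that was captured, in the order it was found.
    END { for (i = 1; i <= s_count; i++) print saved[i] }
' "$input_file")

printf '%s\n' "$result"
rm -f "$input_file"
```

With this sample the output is the 5 train lines followed by the 5 test lines. A label that is missing from the file, or that appears out of order, stops the search at that position, so later sets would not be captured.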

Thanks,

LMHmedchem

Last edited by LMHmedchem; 04-08-2019 at 02:27 PM.
 
