Pass an array to awk to sequentially look for a list of items in a file
Post 303033680 by RudiC on Wednesday 10th of April 2019 08:05:04 AM
Making some assumptions about your data structure (until Don Cragun's questions have been fully answered), and using sample data files of my own, I have come up with:
Code:
awk '
BEGIN           {LAB="train statistics|test statistics|validate statistics|ival statistics"    # regex matching any section label line
                }

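FNR == 1 && NR != 1 {                                   # a new file starts: flush the file just finished
                 if (! HDDONE)  {printf "filename"      # header only once, not for every further file
                                 for (i=1; i<=CNT; i++) printf OFS"%s", HD[i]
                                 printf ORS
                                 HDDONE = 1
                                }
                 printf "%s", FN
                 for (i=1; i<=CNT; i++) printf OFS"%s", VAL[HD[i]]
                 printf ORS
                 split ("", VAL)                        # clear the values for the next file
                }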

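$0 ~ LAB        {PH = $1                                # phase name: train, test, validate, or ival
                 FN = FILENAME
                 for (i=1; i<=5; i++)   {if ((getline) <= 0) break      # stop at end of input or on a read error
                                         IX = PH "_" $1                 # column name, e.g. train_r2
                                         VAL[IX] = $2
                                         if (! HDDONE) HD[++CNT] = IX   # first file defines the column order
                                        }
                }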

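END             {if (! HDDONE) {printf "filename"       # only one input file: header not printed yet
                                 for (i=1; i<=CNT; i++) printf OFS"%s", HD[i]
                                 printf ORS
                                }
                 printf "%s", FN                        # row for the last file
                 for (i=1; i<=CNT; i++) printf OFS"%s", VAL[HD[i]]
                 printf ORS
                }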

' OFS="\t" file[34] | column -t 

filename  train_r2  train_MeAE  train_MdAE  train_SE  train_n  test_r2  test_MeAE  test_MdAE  test_SE  test_n  ival_r2  ival_MeAE  ival_MdAE  ival_SE  ival_n  validate_r2  validate_MeAE  validate_MdAE  validate_SE  validate_n
file3     0.7834    0.36        0.33        0.34      400      0.7042   0.39       0.32       0.41     400     0.7834   0.36       0.33       0.34     400     0.7042       0.39           0.32           0.41         400
file4     0.7834    0.36        0.33        0.34      400      0.7042   0.39       0.32       0.41     400     0.7834   0.36       0.33       0.34     400     0.7042       0.39           0.32           0.41         400
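
The sample files themselves are not shown in the post. For anyone wanting to reproduce the table, a layout that fits the script (a label line followed by five name/value lines per section) could be generated roughly as below. The file contents, the section order, and the `dir` variable are assumptions made for illustration, not the original poster's data.

```shell
#!/bin/sh
# Assumed input layout (not the OP's real data): every "<phase> statistics"
# label line is followed by exactly five "name value" lines.
dir=$(mktemp -d) || exit 1

for f in file3 file4
do      for phase in train test ival validate
        do      printf '%s statistics\n' "$phase"
                case $phase in
                train|ival)     printf '%s\n' 'r2 0.7834' 'MeAE 0.36' 'MdAE 0.33' 'SE 0.34' 'n 400';;
                *)              printf '%s\n' 'r2 0.7042' 'MeAE 0.39' 'MdAE 0.32' 'SE 0.41' 'n 400';;
                esac
        done > "$dir/$f"
done

head -n 7 "$dir/file3"          # first section plus the next label line
```

Running the awk command above from inside that directory should then give a table like the one shown.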

Give it a try and report back.
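
As for the thread title: the label list is hard-coded in BEGIN above. If it has to come from the shell instead, one possible approach (a sketch only, with `labels`, `list`, and the inline test input being made-up names and data) is to hand it over as a plain string with -v and let awk split it into an array, then build the same alternation regex from it:

```shell
#!/bin/sh
# Hypothetical shell-side list of phase names; "labels" and "list" are made-up names.
labels="train test validate ival"

out=$(printf '%s\n' 'some header' 'test statistics' 'r2 0.7042' 'ival statistics' 'r2 0.7834' |
awk -v list="$labels" '
BEGIN           {n = split (list, L, " ")               # shell string becomes awk array L[1..n]
                 for (i=1; i<=n; i++) LAB = LAB (i>1 ? "|" : "") L[i] " statistics"
                }
$0 ~ LAB        {print "found: " $1}                    # same label test as in the script above
')

printf '%s\n' "$out"
```

awk has no way to receive a shell array directly, so the string plus split() detour (or a separate file holding the list) is the usual workaround.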