using awk for setting variable but change the output of this variable within awk

Hi all,

Hope someone can help me out here.
I have this BASH script (see below)

My problem lies with the variable path.
The output of the find command gives me several fields. The 9th field is the path. I want to capture that and then filter it down to a specific level.

The output of the path is somewhat like this: /Volume/Server/department/otherlevels/....

The part I want to capture in the variable is the department.
I can do this in a single line with awk -F/ '{print $5}', but I need it in an array.

Using it like it is now will give me an error.

Any idea how to resolve this ?

Many regards,


find . -iname $1 -exec ls -l {} \; | awk 'BEGIN {

# Initialize all Arrays
        size = "0";
# Assign field names
        path=$( echo $9 awk {-F/ '{print $5}'} )

Something like the below? It stores the $5 value in the array a[].
find . -iname $1 -exec ls -l {} \; | awk -F/ '{a[NR]=$5;next}END{for(i in a) print i,a[i]}'

Not quite.

I do not want to change the find command because I am using multiple fields from that output. In fact I need to capture the 9th ($9) field and manipulate the output so that I only capture the department folder.

ex: the output is like:
-rwx------+ 1 admin staff 71680 Jun 12 2010 /Volumes/Sys_Data/Server One/Department/.../.../.../.../...

From this output I capture the $5 field for size counting and the $3 field for user tracking. Now I also want to take the $9 field with the path and keep only the Department part of it, so I can keep track of the departments as well.

Any idea why this doesn't work?

# Search Command where $1 is the type to find. Search will start at current location.
find . -iname $1 -exec ls -l {} \; | awk 'BEGIN {

# Initialize all Arrays
size = "0";
# Assign field names
{ path=`for (I=9 ; I<=NF ; I++) { printf "%s",$I } print`
The syntax is wrong for awk. Please post the error you get. Alternatively, can you try the sed below, if the output from the find command is as you posted in post #3?
find . -iname $1 -exec ls -l {} \; |  sed 's|[^/]*/[^/]*/[^/]*/[^/]*/\([^/]*\)/.*|\1|'
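
On the syntax point: backticks and $( ) are shell command substitution, so they mean nothing inside the quoted awk program, and $9 is empty inside BEGIN because no line has been read yet. A minimal sketch of doing the extraction inside awk itself, assuming a path without spaces and a hypothetical sample line modeled on the one posted above:

```shell
# Hypothetical sample line in the "ls -l" format shown earlier in the thread
# (the path here deliberately contains no spaces).
line='-rwx------+ 1 admin staff 71680 Jun 12 2010 /Volumes/Sys_Data/Server/Department/a/b.doc'

# Fields exist only in a main rule, not in BEGIN.
# split() takes the place of the nested "awk -F/" call.
dept=$(printf '%s\n' "$line" | awk '{ split($9, parts, "/"); print parts[5] }')
echo "$dept"
```

Because the path starts with a slash, parts[1] is the empty string, which is why the department lands in parts[5] here.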

Thanks for the reply.

But I do not want to change my original find command because of the other fields I am using. Besides that, the sed expression assumes a fixed number of /'s, while the number of /'s in the directory path is not fixed. So this will not work in every case.

I will try to be more specific. Here is the complete code of my original script:
#!/usr/bin/env bash
# Bash Script written by R. Blaas
# This script will find files of a specific type and displays the full path and size
# Also a total of found files and size is displayed per user and an overall total found files and size
# if nothing is passed to the script, show usage and exit
[[ -n "$1" ]] || { echo "Usage: [Variable]"; exit 0 ; }

# Make sure only root can run our script
if [ "$(id -u)" != "0" ]; then
    echo "This script must be run as root" 1>&2
    exit 1
fi

# Variables

DATE=`date +"%d%m%Y"` 
DATIME=`date +"%Y%m%d%H%M"`
DAGNAAM=`date +"%A"`

# Set current directory to variable $CURRENT

# Set the log directory

# Catch search variable
SEARCH=`echo $1 | sed 's/*//' | sed 's/.//'`

# Check if directory exists
if test ! -d "$DIR_LOG"; then
    mkdir "$DIR_LOG"
fi

# Create temporary file for log messages

logger "Script start ("
echo `date` start of script > $POSTMLOG
echo "" >> $POSTMLOG
echo Search variable = $SEARCH >> $POSTMLOG
echo "" >> $POSTMLOG

# Search Command where $1 is the type to find. Search will start at current location.
find . -iname $1 -exec ls -l {} \; | awk 'BEGIN { 

# Initialize all Arrays
    size = "0"; 
# Assign field names
    user= $3

# Count of number of files
    all_count["* *"]++;

# Count disc space used
    all_size["* *"]+=sizes;
    for (I=9 ; I<=NF ; I++) { printf "%s",$I } print " :",$5}{ x++ } { size=size+$5 } 
END { print "\n" x " Files Found""\n" } 
END { hum[1024**4]="Tb"; hum[1024**3]="Gb"; hum[1024**2]="Mb"; hum[1024]="Kb"; 
    for (x=1024**4; x>=1024; x/=1024) { if (size>=x) { { printf "Total Size = ",NR } 
    printf "%.2f %s\n\n",size/x,hum[x];break } 
# Output
    { FS = ":";
    format = "%11s %6s %-16s\n";
    printf "\n"
    printf ( format, "Size","Count","Who" ) }
    for (i in u_count) {
        if (i != "") {
        { hum[1024**4]="Tb"; hum[1024**3]="Gb"; hum[1024**2]="Mb"; hum[1024]="Kb";
        for (x=1024**4; x>=1024; x/=1024) { if (u_size[i]>=x) {
    usersize = sprintf ( "%.2f %s", u_size[i]/x,hum[x] )
        printf ( format,usersize, u_count[i], i);break } } } 
    for (i in all_count) {
        if (i != "") {
        { hum[1024**4]="Tb"; hum[1024**3]="Gb"; hum[1024**2]="Mb"; hum[1024]="Kb";
        for (x=1024**4; x>=1024; x/=1024) { if (all_size[i]>=x) {
    allsize = sprintf ( "%.2f %s", all_size[i]/x,hum[x] )
        printf ( format,allsize, all_count[i], "Total");break } } } 
} ' >> $POSTMLOG

echo "" >> $POSTMLOG
echo `date` end of search >> $POSTMLOG

echo `date` send report to POSTMASTER >> $POSTMLOG

# Mail report to POSTMASTER
mail -s " REPORT `date`" $POSTMASTER < $POSTMLOG
if [ "$?" == "0" ]; then
  echo "Mail sent successfully!" >> $POSTMLOG
else
  echo "Mail sent unsuccessfully!" >> $POSTMLOG
fi

echo "" >> $POSTMLOG
echo `date` end of script >> $POSTMLOG
echo "" >> $POSTMLOG

# Preserving logfile

logger "Script end ("

And here is a possible outcome:
wo 10 aug 2011 15:52:42 CEST start of script

Search variable = doc

/Library/MailDownloads/ObtainingNetVaultLicenseKeysLATEST.doc : 36352
./Library/MailDownloads/Philips.doc : 43008
./Library/MailDownloads/PO-nov2010.doc : 110080
./Library/MailDownloads/UseragreementVPNServeraccess&Authentication(v-1.1).doc : 34304

99 Files Found

Total Size = 42,22 Mb

       Size  Count Who             
   42,22 Mb     99 ronald          
   42,22 Mb     99 Total           

wo 10 aug 2011 15:52:46 CEST end of search
wo 10 aug 2011 15:52:46 CEST send report to POSTMASTER

wo 10 aug 2011 15:52:46 CEST end of script

Now the thing is that instead of the user I want to see the department directory (these are on a different server, but all at one level).

As you can see in the output, I get the complete path. But I want the 5th value (the level of the department dirs) in a variable.

Many thanks
Here is a possible solution. You may have to tweak it to fit your input.

Declare a new variable "dept" as shown below. See how I am using split on $9, which should contain your dept name.
Once you have done that, replace u_count[user]++; with u_count[dept]++; everywhere it appears in your script.

# Assign field names
user= $3
Here is the sample output. Like I said you may have to tweak. I am assuming my dept. names to be Documents and Downloads :).

Search variable = zip

./Documents/ : 4777
./Downloads/ : 1561317
./Downloads/ : 4777
./Downloads/ : 26494027

4 Files Found

Total Size = 26.76 Mb

Size Count Who
26.76 Mb 3 Downloads
4.67 Kb 1 Documents
26.76 Mb 4 Total
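
The dept idea described above can be sketched roughly as follows. This is only an illustration, not the exact code from the post: it assumes relative paths as printed by "find .", no spaces in the path, and the array name a and index [2] that are discussed in the follow-up. The function name summarize_by_dept and the sample lines are made up.

```shell
# Count files and sum sizes per department (second path component).
summarize_by_dept() {
    awk '{
        split($9, a, "/")      # "./Documents/x.zip" -> a[1]=".", a[2]="Documents"
        dept = a[2]
        u_count[dept]++        # number of files per department
        u_size[dept] += $5     # bytes per department
    }
    END {
        for (d in u_count) printf "%s %d %d\n", d, u_count[d], u_size[d]
    }' | sort
}

# Hypothetical "ls -l" lines; real input would come from the find command.
printf '%s\n' \
  '-rw-r--r-- 1 me staff 4777 Jun 12 2010 ./Documents/a.zip' \
  '-rw-r--r-- 1 me staff 1561317 Jun 12 2010 ./Downloads/b.zip' \
  '-rw-r--r-- 1 me staff 4777 Jun 12 2010 ./Downloads/c.zip' | summarize_by_dept
```

With absolute paths such as /Volumes/... the leading slash produces an empty first element, so the index would need to be adjusted to match the directory level.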

This is really good, and it works as well :) Many thanks!

Could you explain a bit more about the variable?
- What does the a mean in the split?
- What does the [2] do in the variable?

Edit: OK, I googled it, and this explains it pretty well:

split(string, array, fieldsep)
    This divides string into pieces separated by fieldsep, and stores the pieces in array.
    The first piece is stored in array[1], the second piece in array[2], and so forth. 
    The string value of the third argument, fieldsep, is a regexp describing where to split string (much as FS can be a regexp describing where to split input records).
    If the fieldsep is omitted, the value of FS is used. split returns the number of elements created. The split function, then,
    splits strings into pieces in a manner similar to the way input lines are split into fields. For example:

    split("auto-da-fe", a, "-")

    splits the string `auto-da-fe' into three fields using `-' as the separator. It sets the contents of the array a as follows:

    a[1] = "auto"
    a[2] = "da"
    a[3] = "fe"
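
The quoted example can be tried directly from the shell. A small sketch, using only the string and separator from the quote; the variable names n and result are just for illustration:

```shell
# Run split() in a BEGIN block, since no input lines are needed here.
result=$(awk 'BEGIN {
    n = split("auto-da-fe", a, "-")   # n is the number of pieces created
    for (i = 1; i <= n; i++) print i, a[i]
}')
echo "$result"
```

Looping from 1 to n, rather than using "for (i in a)", keeps the pieces in their original order.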

again many thanks!

Still have to do some tweaking (as you said) because some Dirs have spaces in them..
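
For directories with spaces, one possible approach is to rebuild the path from field 9 through NF before splitting on "/". A rough sketch under some assumptions: the function name dept_of and the level parameter are made up, the sample line is modeled on the one posted earlier, and runs of several spaces in a name would collapse to a single space with this method.

```shell
# Print one directory level of the path in an "ls -l" line.
# $1 = index after splitting on "/" (default 5, hypothetical parameter).
dept_of() {
    awk -v level="${1:-5}" '{
        path = $9
        # awk split the path on spaces, so glue the pieces back together
        for (i = 10; i <= NF; i++) path = path " " $i
        split(path, a, "/")
        print a[level]
    }'
}

line='-rwx------+ 1 admin staff 71680 Jun 12 2010 /Volumes/Sys_Data/Server One/Department/report.doc'
printf '%s\n' "$line" | dept_of 5
```

Here level 4 would give "Server One" and level 5 gives the department directory.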

