awktidy


 
# 1  
Old 10-30-2007
awktidy

Hi.

A post by ghostdog74 in the LinuxQuestions.org thread "split files by specifying a string (bash shell)" made me think that many awk scripts could benefit from being tidied.

I know that {aigles awk bakunin cfaj futurelet ghostdog74 radoulov summer_cherry vgersh99 Ygor} among others post awk code. Often the code is clever, but it takes me a while to understand it because my thought processes are not exactly like those of the author. If the code were in a standard format, the intent and method might be clearer.

I'm posting this at unix.com because it seems to be the place that has the most awk action.

There is an awkpretty at "awk tools", but it is not trivial to install: I tried it and failed, modified a few items, and got it installed, but it will not run past a certain point. You're probably better at recognizing the problem(s) with it than I am, so you're welcome to try your hand at it if you wish. It has the advantage of being very general. As I understand it, it uses the awk interpreter's parser; a potential drawback is that it might not recognize all the {g, n, m, other}awk extensions.

Would an awktidy be useful for others?

Do you know of any such codes?

What would be some useful features for you in such a code? ... cheers, drl

Last edited by drl; 10-30-2007 at 12:40 PM..
# 2  
Old 10-30-2007
Just to add that pretty-printing and profiling are already included in GNU Awk (pgawk).
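For anyone who wants to try this, here is a minimal sketch of driving the pretty-printer from a shell function. It assumes a gawk recent enough to accept --pretty-print=FILE; the gawk discussed in this thread used --profile (or the separate pgawk binary) instead. The function name pretty_print_awk and the temporary files are made up for illustration and are not part of gawk.

```shell
#!/bin/sh
# pretty_print_awk FILE: print a tidied copy of the awk program stored in FILE.
# Hypothetical helper; assumes a gawk that supports --pretty-print=FILE.
pretty_print_awk() {
    pp_out=$(mktemp) || return 1
    # stdin is /dev/null in case an older gawk also runs the program.
    gawk --pretty-print="$pp_out" -f "$1" </dev/null >/dev/null && cat "$pp_out"
    pp_status=$?
    rm -f "$pp_out"
    return "$pp_status"
}

# Demo, skipped when gawk is not installed.
if command -v gawk >/dev/null 2>&1; then
    demo=$(mktemp) || exit 1
    printf '%s\n' '{x=$1$3;y=$0}' >"$demo"
    pretty_print_awk "$demo"
    rm -f "$demo"
fi
```

The exact layout of the output (TABs, parentheses, header comments) depends on the gawk version, so treat the result as a starting point rather than a fixed format.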
# 3  
Old 10-30-2007
Quote:
Originally Posted by drl
[...]
I know that {aigles awk bakunin cfaj futurelet ghostdog74 radoulov summer_cherry Ygor} among others post awk code.
[...]
And why is vgersh99 missing? :)
# 4  
Old 10-30-2007
Hi.
Quote:
Originally Posted by radoulov
And why is vgersh99 missing? :)
Because I'm not perfect :) -- added, thanks ... cheers, drl
# 5  
Old 11-07-2007
# 6  
Old 11-07-2007
Hi.

Thanks for the pointers and comments.

I looked at gawk's --profile option suggested by radoulov. Although it is well hidden, and a profiler is a non-intuitive place in which to put a tidy option, it did seem to work quite well, especially at separating operators from operands. Mainly I want the output to be code that can be used as it stands. However, the HTML conversion mentioned by Ygor could be useful for display purposes.

That being said, I found the following drawbacks to gawk's pretty-printing:

1) the output is not really ready to be put back into a script because of the TABs that are inserted;

2) most awk scripts (at least those that I write and use) are embedded in shell scripts, and the gawk processor doesn't take kindly to such situations;

3) the resulting layout is very loose; too loose even for me.
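As an illustration of drawback 2, here is a minimal sketch of pulling the awk program out of a shell script before handing it to gawk. It is not the prep pass used in this post. It assumes the program is the first single-quoted string that directly follows the text "awk " (so no options such as -F between them) and that the program contains no embedded single quotes. The name extract_awk is hypothetical.

```shell
#!/bin/sh
# extract_awk [FILE...]: print the text between the quotes of the first
# "awk '...'" invocation found in the input. Hypothetical helper.
extract_awk() {
    awk -v q="'" '
        !inprog {
            start = index($0, "awk " q)
            if (!start) next             # not inside the program yet
            $0 = substr($0, start + 5)   # drop everything through the opening quote
            inprog = 1
        }
        {
            stop = index($0, q)
            if (stop) {                  # closing quote: print what precedes it, then stop
                print substr($0, 1, stop - 1)
                exit
            }
            print
        }
    ' "$@"
}

# Demo on the sample script from this post.
sample=$(mktemp) || exit 1
cat >"$sample" <<'EOF'
#!/usr/bin/env sh

FILE=${1?'usage -'"$0"', must supply filename.'}

awk '/^CNL/&&x=="NEW"$3&&$0=y RS$0
{x=$1$3;y=$0}' filename
EOF
extract_awk "$sample"
rm -f "$sample"
```

On that sample the function prints only the two lines of awk code. A real tool would also have to cope with options before the program, several awk calls in one script, and quoting tricks such as '\''.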

So I wrote some code to deal with these issues. Here is a sample of code flowing through this pipeline of processes:
Code:
 ( pass prep, lines read 8, lines written 2 )
 ( pass gawk, lines read 2, lines written 12 )
 ( pass    3, lines read 12, lines written 7 )
 ( pass post, lines read 7, lines written: 5 )

from beginning to end:
Code:
-- Input file (marked invisibles):
#!/usr/bin/env sh$
$
# @(#) s1^IDemonstrate awktidy.$
$
FILE=${1?'usage -'"$0"', must supply filename.'}$
$
awk '/^CNL/&&x=="NEW"$3&&$0=y RS$0$
{x=$1$3;y=$0}' filename$

-- Prepared input, removed shell, awk:
/^CNL/&&x=="NEW"$3&&$0=y RS$0
{x=$1$3;y=$0}

-- Processed by gawk:
        # gawk profile, created Wed Nov  7 08:43:59 2007

        # Rule(s)

        /^CNL/ && x == ("NEW" $3) && $0 = ((y RS) $0)   {
                print $0
        }

        {
                x = ($1 $3)
                y = $0
        }

-- Adjusted, collapsed, aligned:
# gawk profile, created Wed Nov  7 08:43:59 2007
# Rule(s)
/^CNL/ && x == ("NEW" $3) && $0 = ((y RS) $0){ print $0 }
        {
          x = ($1 $3)
          y = $0
        }

-- Final process, removed gratuitous comments:
/^CNL/ && x == ("NEW" $3) && $0 = ((y RS) $0){ print $0 }
        {
          x = ($1 $3)
          y = $0
        }

-- End of sample run.

I consider this to be a good start, but the extra passes have a number of drawbacks. This works in most of the cases I have thrown at it, but it's still alpha-level code. I don't know if I will put in the time to make it a public tool or not. The extra passes are written in Perl :) ... cheers, drl
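For readers who want to experiment with the clean-up step, here is a minimal sketch in awk instead of Perl. It is not the code used above. It assumes the gawk output is indented with leading TABs and carries the two header comments seen in the sample run ("# gawk profile, created ..." and "# Rule(s)"). The name post_pass and the choice of two spaces per TAB are illustrative only, and the sketch does not collapse short rules onto one line.

```shell
#!/bin/sh
# post_pass: read gawk profile output on stdin, replace each leading TAB
# with two spaces, and drop blank lines and the profile header comments.
# Hypothetical helper.
post_pass() {
    awk '
        {
            indent = ""
            while (substr($0, 1, 1) == "\t") {   # one leading TAB -> two spaces
                indent = indent "  "
                $0 = substr($0, 2)
            }
            $0 = indent $0
        }
        /^ *$/ { next }                          # blank line
        /^ *# gawk profile, created / { next }   # profile timestamp comment
        /^ *# Rule\(s\)$/ { next }               # profile section comment
        { print }
    '
}

# Demo on TAB-indented text shaped like the "Processed by gawk" stage.
printf '\t# gawk profile, created Wed Nov  7 08:43:59 2007\n\n\t# Rule(s)\n\n\t{\n\t\tx = ($1 $3)\n\t\ty = $0\n\t}\n' | post_pass
```

Only the two known header comments are removed, so comments written by the script's author survive the pass.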