Thanks Corona688! This is what I tried, combining it with Don's reply,
but it did not work. What did I miss?
Following your attempt a little closer than the way Corona688 did it...
This is an awk script, not a shell script. So: a $ before a variable name references the field whose number is the contents of the variable, not the contents of the variable itself; awk variables are not expanded inside quoted strings; and ${#var} and $((expr)) are shell expressions that are not valid awk expressions.
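A small sketch of those three differences may help. The input line and the variable name n are made up for illustration; only standard awk features are used.

```shell
awk_vs_shell_demo() {
    echo "alpha beta gamma" | awk '{
        n = 2
        print n                 # the value of n
        print $n                # field number n, i.e. the second field
        print "n is $n"         # nothing is expanded inside an awk string
        print "n is " n         # concatenation is how awk builds strings
        print length($1)        # awk counterpart of the shell ${#var}
        print 8 - length($1)    # plain arithmetic, no $(( )) wrapper
    }'
}
awk_vs_shell_demo
```

The six lines printed are 2, beta, n is $n, n is 2, 5 and 3.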
If what you are trying to do is produce 8 character output strings (assuming the length of the input string is never more than 8 characters) with varying length alphabetic and numeric parts in the input, try:
Assuming test.file contains:
as shown in post #18 in this thread, it produces the output:
Remember the input file format I specified in post #10 in this thread. Your sample input file has trailing spaces (violating item #2: Each input string is an alphanumeric string ending in one or more decimal digits.)
If we remove all of the trailing spaces from file.test or change all occurrences of $0 in the above script to $1 (so we just look at the first field instead of the entire line), the output produced is:
which I assume is closer to what you were trying to do.
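To see why switching from $0 to $1 matters, here is a minimal sketch using the "S2<space><space>" line discussed in this thread (the function name is only for illustration). With the default field separator, $0 keeps the trailing spaces while $1 does not:

```shell
show_lengths() {
    # The input line is "S2" followed by two spaces.
    printf 'S2  \n' | awk '{ print length($0), length($1) }'
}
show_lengths   # prints: 4 2
```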
As a learning exercise, can you explain why the awk script Corona688 suggested didn't have a problem with trailing spaces while my script above does have a problem with trailing spaces?
And, no, I'm not a lawyer. But I did like the Perry Mason, Matlock, and Boston Legal TV series. And, in my last job, my boss referred to me as his standards lawyer because he could get me to answer any questions about why we were failing POSIX/UNIX standards conformance tests and directions on how to fix our code (when we had a bug) or how to file a bug report against the test suite (when there was a bug in the test suite or it was assuming behavior above and beyond what the standards require).
Thanks Don!
Quote: can you explain why the awk script Corona688 suggested didn't have a problem with trailing spaces while my script above does...?
Not quite sure, and I had never thought about it, but my guess is:
In Corona688's reply, sub(/^[^0-9]*/,"",A) removed all the trailing spaces. (Edit: no, this only removes the leading characters!)
Do the printf format modifiers make any difference?
Yours is
vs
This is related to my original confusion, which I want to clarify.
By the way, thanks for your legal story. It is interesting.
Thanks Don!
Quote: As a learning exercise, can you explain why the awk script Corona688 suggested didn't have a problem with trailing spaces while my script above does have a problem with trailing spaces?
Not quite sure, but my guess is:
In Corona688's reply, sub(/^[^0-9]*/,"",A) removed all the trailing spaces, but yours, substr($0, match($0, /[[:digit:]]*$/)), did not. Correct?
Do the printf format modifiers make any difference?
Yours is
vs
This is my original confusion, which I want to clarify.
No.
deletes everything from the start of the string that is not a decimal digit.
then computes the length of the alphabetic part of your input as the original line length minus the line length of the input with the characters that are not decimal digits at the start of your input removed. With the input line "S2<space><space>", N is set to 1, i.e., 4 (4 input characters) - 3 (the length of "2<space><space>" after removing the leading "S").
and this works because the substr() extracts the 1st character from the original input and prints it using the format %s and prints "2<space><space>" using the format %07d which prints the 7 (or more) digit leading zero filled number specified by A (and the %d format specifier ignores anything in the string it evaluates starting with the 1st character that is not part of a valid numeric value).
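Corona688's script itself is not reproduced in this copy of the thread, so the following is only a sketch assembled from the description above. The sub() call and the names A and N come from that description; the function name pad_corona and the way the format string is built by concatenation are assumptions.

```shell
pad_corona() {
    awk '{
        A = $0
        sub(/^[^0-9]*/, "", A)        # delete leading non-digits from the copy
        N = length($0) - length(A)    # length of the alphabetic part
        # Build a format such as "%s%07d\n"; %d ignores the trailing spaces in A.
        printf("%s%0" (8 - N) "d\n", substr($0, 1, N), A)
    }'
}
printf 'S2  \n' | pad_corona   # prints: S0000002
```

With "S2<space><space>", N is 4 - 3 = 1, so the format becomes %s%07d and the trailing spaces never reach the output.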
My code looks for trailing digits to determine the numeric part of your input (allowing other digits to appear elsewhere in the prefix). When there aren't any trailing decimal digits:
saves the trailing decimal digits in the variable NUMBER, and the call to match has the side effect of setting RSTART to the offset in $0 where the first decimal digit was found (zero if no match was found) and setting RLENGTH to the number of decimal digits found at the end of $0 (-1 if no match is found).
sets PREFIX to the prefix (your alphabetic part, but this will use the longest string at the start of the line that does not end in a decimal digit). And it sets PRELEN to the length of that string.
sets DIGITS to the length of the string you want (8) minus the length of PREFIX (PRELEN).
and here we print the PREFIX saved above and use the same %0xd format to print the decimal digits found at the end of your input.
And, with the input "S2<space><space>", what we find if we look closely is that the awk match() function on Mac OS X (from BSD) does not conform to the standards. When no match is found RSTART should be set to zero, but instead it is being set to the length of the input string plus one. So, the almost reasonable output I showed you in post #22 for the input with trailing spaces is not what a standards-conforming awk should do. (I just love it when I find conformance bugs in UNIX implementations when I'm trying to explain how things should work! ) So, now I need to check to see what other implementations do to determine if this is a bug in the standards or a bug in BSD/Apple awk. Are we having fun yet...
Update: Please ignore the grayed out paragraph above... I obviously hadn't had enough sleep when I wrote it. I'll post an update later today explaining correctly how Apple/BSD awk is doing exactly what it is supposed to be doing with the input string "S2<space><space>"... I apologize for any confusion this may have caused.
Last edited by Don Cragun; 09-15-2015 at 04:09 PM..
Thanks Don!
My understanding of the code by Corona688 is the same as your explanation. The part you called a bug is too complicated for me at the moment. I have run into this problem before, but I never thought about it your way because I was not able to. It is a really good point when dealing with lines with trailing space(s).
Here is the correction to the last paragraph I wrote in post #18...
My code looks for trailing digits to determine the numeric part of your input (allowing other digits to appear elsewhere in the prefix). When there aren't any trailing decimal digits (as with an input line containing "S2<space><space>"), you don't get what you wanted because:
successfully matches zero decimal digits at the end of the string and saves an empty string in the variable NUMBER, and the call to match has the side effect of setting RSTART to the offset in $0 where the first of zero decimal digits was found (5 being the address of the null byte terminating the string) and setting RLENGTH to the number of decimal digits found at the end of $0 (0 in this case). (If the ERE had been [[:digit:]]+$, which matches one or more decimal digits at the end of the string, instead of [[:digit:]]*$, which matches zero or more decimal digits at the end of the string, different problems would arise.) This code was designed to work under several assumptions, including:
Quote:
Each input string is an alphanumeric string ending in one or more decimal digits.
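The difference between the two EREs can be checked directly. This is a small sketch with a made-up function name; it writes [0-9] instead of [[:digit:]] only because some older awk builds lack POSIX character classes.

```shell
show_match() {
    awk '{
        match($0, /[0-9]*$/)    # zero or more digits at the end of the line
        printf("star: RSTART=%d RLENGTH=%d\n", RSTART, RLENGTH)
        match($0, /[0-9]+$/)    # one or more digits at the end of the line
        printf("plus: RSTART=%d RLENGTH=%d\n", RSTART, RLENGTH)
    }'
}
printf 'S2  \n' | show_match
```

For "S2<space><space>" the first call should report RSTART=5 and RLENGTH=0 (an empty match at the end of the string), and the second RSTART=0 and RLENGTH=-1 (no match at all).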
sets PREFIX to the prefix (your alphabetic part, but this will use the longest string at the start of the line that does not end in a decimal digit). And it sets PRELEN to the length of that string (4 = 5 - 1 in this case).
sets DIGITS to the length of the string you want (8) minus the length of PREFIX (PRELEN), which in this case is 4 (8 - 4).
and here we print the PREFIX saved above and use the same %04d format to print the decimal digits found at the end of your input. And, since there weren't any digits at the end of the string and the awk printf %d specifier treats an empty string as the numeric value zero, it prints the entire input string as the prefix followed by four zeros:
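The script being explained is not reproduced in this copy of the thread. The sketch below is a reconstruction from the four steps described above, not the original code: the variable names NUMBER, PREFIX, PRELEN and DIGITS and the substr/match expression come from the thread, while the function name, the use of [0-9] for [[:digit:]], and building the format by concatenation are assumptions.

```shell
pad_don() {
    awk '{
        NUMBER = substr($0, match($0, /[0-9]*$/))    # trailing digits, possibly empty
        PREFIX = substr($0, 1, RSTART - 1)           # everything before them
        PRELEN = length(PREFIX)
        DIGITS = 8 - PRELEN                          # width left for the number
        printf("%s%0" DIGITS "d\n", PREFIX, NUMBER)  # e.g. the format "%s%07d\n"
    }'
}
printf 'S2\nAB12\nS2  \n' | pad_don
```

The first two lines come out as S0000002 and AB000012. The third, with its trailing spaces, comes out as "S2<space><space>0000", which is the behaviour described above.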
If I was writing production code, I would test that the input meets the stated assumptions and produce a diagnostic message for that input line instead of producing garbage output from garbage input. If anyone wants to turn sample code provided by the volunteers here at The UNIX & Linux Forums into production code, verifying that input meets the stated input requirements is part of that task. (Note that I stated the input and output assumptions used by my script in post #10 in this thread. When the input meets all of the input assumptions, it produces the desired output. If one or more of those assumptions are not met, I make no claims about the output produced by my sample code as a result.)
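As a rough illustration of that kind of check, here is a sketch that rejects lines violating two of the assumptions mentioned in this thread (an alphanumeric string ending in at least one decimal digit, and no more than 8 characters). The function name, the wording of the diagnostic, and the choice to skip bad lines are all assumptions; the other assumptions from post #10 are not visible here and are not checked.

```shell
pad_checked() {
    awk '{
        # Diagnose and skip lines that break either assumption.
        if ($0 !~ /^[A-Za-z0-9]*[0-9]$/ || length($0) > 8) {
            printf("line %d: bad input \"%s\"\n", NR, $0) | "cat 1>&2"
            next
        }
        NUMBER = substr($0, match($0, /[0-9]+$/))
        PREFIX = substr($0, 1, RSTART - 1)
        printf("%s%0" (8 - length(PREFIX)) "d\n", PREFIX, NUMBER)
    }'
}
printf 'S2  \nAB12\n' | pad_checked
```

Here the line with trailing spaces produces a diagnostic on standard error, and only AB000012 reaches standard output.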
Hello,
I am looking for a method to use in my bash script which allows me to use long strings with all special characters.
I have found that printf method could be helpful for me but unfortunately, when I trying
root@machine:~# tevar=`printf "%s%c"... (2 Replies)
Hi All
I am working to process txt file into csv commo separated.
Input.txt
1,2,asdf,34sdsd,120,haahha2
2,2,wewedf,45sdsd,130,haahha
.....
....
Errorcode.txt
120
130
140
myawk.awk code:
{
BEGIN{
HEADER="f1,f2,f3,f4,f5,f6" (4 Replies)
Hi Experts,
Quick question:
I am trying to get the output with decimal and floating point but not working:
echo "20.03" | awk '{printf "%03d.2f\n" , $0 }'
020.2f
How to get the output as :
020.03
Thank you. (4 Replies)
I want to print a string say "str1 str2 str3 str4" using printf.
If I try printing it using printf it is printing as follows.
output
-------
str1
str2
str3
str4
btw I'm working in AIX.
This is my first post in this forum :)
regards,
rakesh (4 Replies)
Hi Friends,
I am trying to insert lines of the below format in a file:
# x3a4914 Joe 2010/04/07
# seh Lane 2010/04/07
# IN01379 Larry 2010/04/07
I am formatting the strings as follows using awk printf:
awk 'printf "# %s %9s %18s\n", $2,$3,$4}'
... (2 Replies)
hi all
can any one help me to understand this
bdf -t vfxs | awk '/\//{printf("%-30s%-10s%-10s%-10s%-5s%-10s\n",$1,$2,$3,$4,$5,$6)}'
i want to understand the numbers %-30S% (4 Replies)
Hi I'm having a problem with converting a file:
ID X
1 7
1 8
1 3
2 5
2 7
2 2
To something like this:
ID X1 X2 X3
1 7 8 3
2 5 7 2
I've tried the following loop:
for i in `cat tst.csv| awk -F "," '{print $1}'| uniq`;do grep -h $i... (4 Replies)
I am trying to use printf with a character string that is used within a do loop. The problem is that while in the loop, the printf prints the variable name instead of the value. The do loop calls the variable name from a text file (called device.txt):
while read device
do
cat $device.clean... (2 Replies)
Hi all,
My simple AWK code does C = A - B
If C can be a negative number, how awk printf formating handles it using string format specifier.
Thanks in advance
Kanu
:confused: (9 Replies)
Hi Folks!
Can you help me with this find -printf command. I seem to be unable to execute the printf-command from my shell script. I'm confused: :confused:
My shell script snippet looks like this:
#!/bin/sh
..
COMMAND="find ./* -printf '%p %m %s %u %g \n'"
echo "Command: ${COMMAND}"... (1 Reply)