BTW, awk's default FS is a bracket expression regular expression (/[ \t\n]+/) by itself.
... ... ...
This is a common misconception. With the input we have been discussing in this thread:
if that were the default ERE used for separating fields, the default first field would be the empty string before the space at the start of the line. But the actual default first field is PS028,006 (with no leading or trailing <space>s).
The actual default FS value is a single <space> character, which has a special meaning in awk when used as a field separator (it does not have this special meaning in most other utilities). awk is the only utility in the standards where <space> has this special meaning in an ERE used as a field separator. In awk, when the entire field separator ERE is a single <space> character, awk is required to skip leading and trailing <blank> and <newline> characters (where a <blank> character is any character in the current locale's blank character class), and fields shall then be delimited by sets of one or more <blank> or <newline> characters. In the C and POSIX locales, a <blank> is either a <space> character or a <tab> character; in other locales, additional characters may also be included in the blank character class (and are therefore ignored at the start and end of a record and treated as additional field separator characters elsewhere).
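A minimal sketch of the difference described above. The input line here is a made-up example with one leading space, not the data from this thread, and the variable names are only for illustration:

```shell
# Hypothetical input line with a single leading space (not the thread's data).
line=' alpha beta'

# Default FS (a single space): leading blanks are skipped,
# so the first field is the first word.
default_first=$(printf '%s\n' "$line" | awk '{ print "[" $1 "]" }')

# Explicit ERE as FS: the leading space is a separator,
# so the first field is the empty string before it.
regex_first=$(printf '%s\n' "$line" | awk -F '[ \t\n]+' '{ print "[" $1 "]" }')

echo "default FS: $default_first"
echo "regex FS:   $regex_first"
```

With the default FS the brackets should enclose alpha; with the explicit ERE they should enclose nothing.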
Thanks, Don Cragun, for this clarification.
Indeed, man gawk is way more explicit:
Quote:
FS The input field separator, a space by default. See Fields, above. .
.
.
In the special case that FS is a single space, fields are separated by runs of spaces and/or tabs and/or newlines.
than is my man mawk:
Quote:
mawk defines <SPACE> as the regular expression /[ \t\n]+/.
which I used in my above post. man gawk does not have this statement.
Note that the default FS=" " behavior of skipping and delimiting on both blanks and newlines used to be different in older POSIX implementations, where blanks were used but not newlines. mawk and gawk still support this older POSIX-defined behavior through special compatibility command-line options.
compare:
to
Likewise for gawk with the --posix option.
Last edited by Scrutinizer; 02-02-2019 at 10:13 AM..
It does. Please apply what has been said to the respective line:
Is that clearer now? If you want to remove the leading space from field 1, additional measures must be taken.
OK, thank you so much. I was under the impression that the field separator value was set to the *string* "][" rather than "]" or "[", thus I thought that $1 in the code would have been PS028,006 [KJ <Cj>, rather than PS028,006. This was very helpful. Thank you for taking the time to explain this.
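To illustrate the point about "]" or "[" versus the string "][", here is a small sketch. It assumes the field separator in question was the bracket expression [][] and uses a made-up input line modeled on the fragment quoted above; neither is confirmed by the posts shown here:

```shell
# Hypothetical input line (only the leading "PS028,006 [KJ <Cj>" part
# appears in the thread; the rest is invented for illustration).
line='PS028,006 [KJ <Cj>] rest'

# Assumed FS: a bracket expression matching either "]" or "[",
# not the two-character string "][".
first=$(printf '%s\n' "$line" | awk -F '[][]' '{ print "<" $1 ">" }')
second=$(printf '%s\n' "$line" | awk -F '[][]' '{ print "<" $2 ">" }')

echo "field 1: $first"
echo "field 2: $second"
```

Because each bracket is a separator on its own, field 1 ends at the first "[" and keeps the space that precedes it; an explicit FS does not trim surrounding blanks the way the default FS does.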