One could still try:
without needing to use split() (unless I misunderstood and you changed your input file format to remove the <space> before the <Ob>]).
I'm so sorry Scrutinizer, but as my input is many thousands of lines long I did not notice a potential complicating issue that I was hoping you could help me address. There are times when the desired string between an initial "[" and "<Ob>]" contains a space.
So for example, given:
Which I would pare down with INPUT | awk '$1 ~/^ PS/' to get:
In this case, the desired output would be:
or
The code you helped me with only gives:
Again, I apologize that I did not see the possibility of the space within the desired string until I double-checked the output against INPUT | sed -e 's/.* \[\(.*\) <Ob>\].*/\1/' which gives me the desired string but not the $1 when $1 ~/^ PS/.
Would you be able to help me iron this out?
--- Post updated at 10:02 PM ---
Quote:
Originally Posted by Don Cragun
One could still try:
without needing to use split() (unless I misunderstood and you changed your input file format to remove the <space> before the <Ob>]).
This works well, Don, except that I represented the desired output strings as "ABC" and "XYZ", which it seems you took as being three-character strings. I should have been more specific and said that "ABC" and "XYZ" represent strings of any length, so something like ["some amount of text" <Ob>].
That did it RudiC! Such a simple and elegant way to accomplish it! Thanks so much also to Scrutinizer and Don Cragun for your help!
If I may, could I please ask a question about the field separator value? The awk man page seems only to imply, rather than state explicitly, that using square brackets when setting the field separator from the command line tells awk to interpret what is between them as a regex rather than as a fixed string, which would otherwise be indicated by "...". Is this correct? Thanks again!
--- Post updated at 04:30 PM ---
Quote:
Originally Posted by RudiC
Try also
That did it RudiC! Such a simple and elegant way to accomplish it! Thanks so much also to Scrutinizer and Don Cragun for your help!
If I may, could I please ask two questions about how this code is working? The first is about the field separator value. The awk man page seems only to imply, rather than state explicitly, that using square brackets when setting the field separator from the command line tells awk to interpret what is between them as a regex rather than as a fixed string, which would otherwise be indicated by "...". Is this correct?
Secondly, since the value for FS has been set to "][", how come {print $1} does not print from the beginning of the line to the first instance of "][" but rather prints what would be $1 when FS is set to whitespace? In other words, given:
Why does RudiC's code not give: PS028,006 [KJ <Cj> for {print $1} if FS is set to "]["?
Instead it gives the (desired) first field, PS028,006, as if FS were at its default?
Thanks again!
Hi jvoot,
The standards clearly state that the value of the awk FS variable is an extended regular expression (ERE), and it doesn't matter whether it is set using the -F option, using the -v option, using an assignment statement between pathname operands, or using an assignment statement in the awk script itself. When the ERE is set to [][], that is a bracket expression specifying that the <open-square-bracket> character ([) and the <close-square-bracket> character (]) are each to be treated as separate field separators.
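As a quick check of the point above (a sketch added for illustration, not code from the thread), the four ways of setting FS can be tried on a made-up record. The input `a[b]c` and the variable names `r1` to `r4` are assumptions chosen only for this demonstration.

```shell
# Made-up one-line input; field 2 is the text between the brackets.
line='a[b]c'

r1=$(printf '%s\n' "$line" | awk -F'[][]' '{ print $2 }')                # -F option
r2=$(printf '%s\n' "$line" | awk -v 'FS=[][]' '{ print $2 }')            # -v option
r3=$(printf '%s\n' "$line" | awk '{ print $2 }' 'FS=[][]' -)             # assignment operand before the file operand "-"
r4=$(printf '%s\n' "$line" | awk 'BEGIN { FS = "[][]" } { print $2 }')   # assignment inside the script

echo "$r1 $r2 $r3 $r4"   # b b b b
```

In each case the value [][] is taken as a bracket expression, so the record is split at either bracket.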
With the FS value RudiC used, field 1 is everything that appears in the record before the 1st open or close square bracket character (including the leading and trailing <space>). I chose to use the default FS value because I didn't think you wanted the leading and trailing <space> characters at the start of lines in your input data to be included in your output.
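The difference in field 1 can be seen with a small sketch. The sample record below is made up, loosely modeled on the fragments quoted in this thread, and is not the poster's real input; the `|` marks are added only to make the blanks visible.

```shell
# Hypothetical sample record; the leading <space> is deliberate.
line=' PS028,006 [KJ <Cj>] [some text <Ob>]'

# FS is the bracket expression [][]: field 1 is everything before the first bracket.
with_brackets=$(printf '%s\n' "$line" | awk -F'[][]' '{ print "|" $1 "|" }')

# Default FS: leading blanks are skipped and fields are split on runs of blanks.
with_default=$(printf '%s\n' "$line" | awk '{ print "|" $1 "|" }')

echo "$with_brackets"   # | PS028,006 |
echo "$with_default"    # |PS028,006|
```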
Hope this helps,
Don
...
The first is about the field separator value. The man AWK page seems to only imply rather than being explicit that the use of the square brackets when setting the field separator from the command line tells AWK to interpret what is between them as a regex rather than simply a fixed string which would otherwise be indicated by "..."? Is this correct?
You are partly right: the field separator string will always be interpreted as a regex. In Scrutinizer's proposal (from which I stole shamelessly), he uses the bracket expression [][]. man regex:
Quote:
A bracket expression is a list of characters enclosed in "[]". It normally matches any single character from the list.
So awk splits the input line at any occurrence of either [ or ] .
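To make the splitting visible, here is a minimal sketch that lists every field awk produces with this FS. The record is a made-up example in the style of the data discussed in this thread (an assumption, not the real input), and each field is wrapped in `|` so that blanks and empty fields show up.

```shell
# Hypothetical sample record with a leading <space>.
line=' PS028,006 [KJ <Cj>] [some text <Ob>]'

# Print each field number and its content, delimited by "|".
fields=$(printf '%s\n' "$line" |
    awk -F'[][]' '{ for (i = 1; i <= NF; i++) printf "$%d=|%s|\n", i, $i }')

echo "$fields"
# $1=| PS028,006 |
# $2=|KJ <Cj>|
# $3=| |
# $4=|some text <Ob>|
# $5=||
```

Note the empty last field: the record ends with a ], so awk sees one more (empty) field after it.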
BTW, awk's default FS (a single space) behaves much like the bracket-expression regex /[ \t\n]+/, except that leading and trailing whitespace in the record is ignored as well.
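A rough comparison of the default FS with an explicit whitespace regex, on made-up input (the sample text and variable names are assumptions for illustration only). It suggests the two split the same way in the middle of a record but differ at its edges.

```shell
# Made-up input with leading and trailing blanks and a tab between two words.
default_n=$(printf '  alpha   beta\tgamma  \n' | awk '{ print NF }')
regex_n=$(printf '  alpha   beta\tgamma  \n' | awk -F'[ \t\n]+' '{ print NF }')

echo "$default_n"   # 3: leading and trailing blanks are ignored
echo "$regex_n"     # 5: the blanks at each end delimit an extra empty field
```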
Quote:
Secondly, since the value for FS has been set to "][" how come when the print statement calls for {print $1} is does not print from the beginning of the line to the first instance of "][" but rather prints what would be $1 when FS is set to whitespace?
It does. Please apply what has been said to the respective line:
Is that clearer now? If you want to remove the leading space from field 1, additional measures must be taken.
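One possible "additional measure" is sketched below. This is only an illustration under assumptions: the sample record is made up, the variable name `id` is hypothetical, and the output format (trimmed field 1 followed by the text before " <Ob>") is a guess at what is wanted, not the code posted in this thread.

```shell
# Hypothetical sample record; the wanted string contains a space.
line=' PS028,006 [KJ <Cj>] [some text <Ob>]'

result=$(printf '%s\n' "$line" | awk -F'[][]' '
{
    id = $1
    gsub(/^ +| +$/, "", id)            # trim blanks around field 1
    for (i = 2; i <= NF; i++)
        if (sub(/ <Ob>$/, "", $i))     # true only for a field ending in " <Ob>"
            print id, $i
}')

echo "$result"   # PS028,006 some text
```

Because the split is on brackets rather than blanks, a space inside the wanted string does not break it apart.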