I wanted to use GAWK's 'word boundary' feature but can't get it to work. Doesn't GAWK support \<word\>?
Sample record:
Code:
Title Bats in the fifth act of Chushingura (top);
the world of the bell - the story of Anchin and Kiyohime (bottom)
Series Title Sketches by Yoshitoshi
Title-Alternative Yoshitoshi ryakuga: Komori no godanme (top); Kane no sekai (bottom)
$1 ~ /^Title$/
Bubnoff
Last edited by Bubnoff; 06-11-2009 at 10:55 PM. Reason: formatting issue
Update on GAWK boundaries.
Thanks for answering, ghostdog74; however, I'm still a bit unclear on what you mean. I am aware that I do not have the gt/lt characters in the data; I was trying to use GAWK's word boundary operators.
According to the documentation ( GAWK: Effective ...etc. ), the regex operators \< and \> can be used to indicate word boundaries. They do, but they use a space as the delimiter (if I had RTFMed a bit closer I would have saved myself this confusion). E.g. "Title-Alternative" will be true but "TitleAlternative" will be false. This still makes no sense. How is this working?
I originally thought I could remove "Title-Alternative" by using the word boundary operators like: \<Title\>. But since "Title-Alternative" has a hyphen, it's still legal (why exactly I can't say). This regex will remove "TitleAlternative", which is closer to the example in the docs, but won't remove "Title-Alternative". So I think my problem was not fully understanding the way GAWK's word boundary operators work (I still don't).
I am new to AWK and am wondering how others would pull "Title" from a record that looks similar to what is in my above post.
Code:
gawk '$1 ~ /^Title$/{print}'
Any help with GAWK would be much appreciated.
Thanks - Bub
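For anyone following along, here is a minimal sketch of the behaviour described above, assuming gawk is installed. In gawk a word is a run of letters, digits, and underscores, so a hyphen ends a word while a following letter does not. The three test strings are made up for illustration and are not data from the thread.

```shell
# Made-up test strings, not data from the thread.
samples='Title
Title-Alternative
TitleAlternative'

if command -v gawk >/dev/null 2>&1; then
    # \< and \> are GNU extensions; a hyphen is not a word character,
    # so the word "Title" ends right before it.
    boundary_hits=$(printf '%s\n' "$samples" | gawk '/\<Title\>/ { print }')
    # Anchoring the whole first field accepts only the exact string.
    exact_hits=$(printf '%s\n' "$samples" | gawk '$1 ~ /^Title$/ { print }')
    printf 'word boundary:\n%s\n' "$boundary_hits"
    printf 'anchored field:\n%s\n' "$exact_hits"
else
    echo 'gawk not found; skipping'
fi
```

The word boundary pattern keeps both "Title" and "Title-Alternative", while the anchored field test keeps only "Title", which is why the anchored form is the safer choice for this data.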
Quote:
Code:
awk '{
    for (i = 1; i <= NF; i++) {
        if ($i == "Title") {    # or: $i ~ /^Title$/
            ........
        }
    }
}
'
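A runnable sketch of the same loop, with a print action filled in purely for illustration (the action in the post above is left elided, so the one here is an assumption). The sample lines are shortened, made-up versions of the record from the first post.

```shell
# Hypothetical action: report the record number and field position of each
# field that is exactly "Title". The input lines are invented samples.
loop_hits=$(printf '%s\n' \
    'Title Bats in the fifth act' \
    'Series Title Sketches by Yoshitoshi' \
    'Title-Alternative Yoshitoshi ryakuga' |
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i == "Title") {        # exact string comparison on one field
                print NR ": field " i
            }
        }
    }')
printf '%s\n' "$loop_hits"
```

Note that the loop also reports the "Title" inside "Series Title", because it checks every field. If only the element name in the first field matters, testing $1 alone avoids that.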
| Bits Awarded / Charged to ghostdog74 for this Post | |||
| Date | User | Comment | Amount |
| 06-12-2009 | Anonymous | Handy loop example. | 74 |
Forgot to mention the case possibilities -
Each test has to take into account possible capitalization (or lack thereof). So actually, I've been using:
Code:
/^[Tt]itle$/
In the records I have to analyze, the elements I'm testing for are always in field $1, with the values in fields $2 or $3. "Title" is one of around 15 DC elements I'm testing for, plus or minus the screwy ones people insist on adding. Some catalogers capitalize and others do not.
I could use another Gnu Awk feature though:
Code:
gawk -v IGNORECASE=1 '$1 == "title"{print $1}' test.notes
- as you suggest, instead of with regex -
gawk '$1 ~ /^[Tt]itle$/{print}' test.notes
To distinguish title from "Title Alternative" or "title alternative", I am using:
Code:
gawk '$1 ~ /^[Tt]itle$/ && $2 !~ /[Aa]lternative/{print $1}' test.notes
Bub
Last edited by Bubnoff; 06-12-2009 at 03:16 AM. Reason: Forgot case.
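As a possible portable alternative to IGNORECASE (which is gawk-only), the same test can be sketched with tolower(), which is available in POSIX awk. This is a swapped-in technique, not what the post above uses, and the sample lines are invented. Unlike /^[Tt]itle$/, it also accepts spellings such as "TITLE".

```shell
# Invented sample lines; the element name is assumed to be in field $1.
title_hits=$(printf '%s\n' \
    'Title Bats in the fifth act' \
    'title sketches' \
    'Title Alternative Komori' \
    'title alternative kane' \
    'Series Title Sketches' |
    awk 'tolower($1) == "title" && tolower($2) != "alternative" {
        print NR ": " $0    # keep plain title records, skip the alternatives
    }')
printf '%s\n' "$title_hits"
```

Comparing lowercased fields keeps the test an exact string match, so "Title-Alternative" in $1 is rejected without any extra pattern.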
|
|||||
|
From GNU Regexp Operators - The GNU Awk User's Guide
Quote:
| Bits Awarded / Charged to Ygor for this Post | |||
| Date | User | Comment | Amount |
| 06-12-2009 | Anonymous | helpful | 1 |
GAWK boundaries
Thanks Ygor!
I'm embarrassed to say I read this section at least twice, today alone, and didn't catch that. It's times like these when a person should just step away from the screen, grab a cup o' joe and go for a walk.
Bub