Complex string operation (awk, sed, other?) | Unix Linux Forums | Shell Programming and Scripting

Tags
awk, manipulation, operation, sed, string

    #1  
03-22-2013
usshadowop
Registered User

I have a file that contains RewriteRules for 200 countries (2 examples for 1 country below):


Code:
RewriteRule ^/at(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=de_AT [R=301,L]

#&

RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=en_AT [R=301,L]

I have another list of redirects for the mobile versions of these sites in the following format:

Code:
RewriteRule ^/at_engilsh(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_engilsh [R=301,L]

Bear in mind the at_english is just 1 of the country codes, there are many more.

So my goal is to go from


Code:
RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=en_AT [R=301,L]

#to

RewriteRule ^/at_engilsh(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_engilsh [R=301,L]

I'm supplying the awk / pseudo code for one way I've thought to do it.


Code:
awk '
{
newurl="m.website.com/www.website.com/"
one=substr($0,1,14)
two=substr($1,13,37)
rest=substr($4,1)

# The line below this comment is the section I'm having difficulty with because 
#I have country codes in multiple formats at / at_engilsh / at_french
#I want to select all characters between ^/ ---> (  
code=substr($2,1) 
     

printf ("%s%s%s%s%s %s\n", one,code,two,newurl,code, rest)
}' input

So I either need help converting the pseudo code into actual code, or suggestions on a better way to do this operation.

Thank you for any help
    #2  
03-22-2013
DGPickett, Forum Advisor
Registered User
The term rewrite rules, to me, says sendmail.cf. They have a special syntax and nature, and their placement and exact construction depend on the version of sendmail you are writing for. Rewrite rules keep being applied until they no longer change the entity, so sometimes you have to change a+ to b and then b to a, because a is in a+.

Google helps me see you are more likely talking about Apache URL rewriting. I wonder if there is an Apache forum?

Most of us write in our head in pseudo-code, not awk, and then translate it into the desired language.

It looks like you are short a slash in the example. If your object is to use m., then you need to not use HTTP_HOST, or just prefix it with 'm.' if it is a domain name.
    #3  
03-22-2013
hanson44
Registered User
What is "at_engilsh"?
    #4  
03-22-2013
Don Cragun, Forum Staff
Moderator
I'm not sure I understand what you're trying to do either, but I think the following awk script does what your examples seem to request.

Code:
awk 'BEGIN { newurl = "m.website.com/www.website.com/" }
{       match($2, /[^(]*[(]/)
        code = substr($2, 3, RLENGTH - 3)
        match($3, /[^}]*}/)
        printf("%s %s %s%s%s %s\n",
                $1, $2, substr($3, 1, RLENGTH), newurl, code, $4)
}' input

With the following in the file named input:

Code:
RewriteRule ^/at(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=de_AT [R=301,L] 
RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=en_AT [R=301,L]
RewriteRule ^/at_engilsh(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_engilsh [R=301,L]
RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=en_AT [R=301,L]
RewriteRule ^/at_french(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=fr_AT [R=301,L]

the output produced is:

Code:
RewriteRule ^/at(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at [R=301,L]
RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_english [R=301,L]
RewriteRule ^/at_engilsh(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_engilsh [R=301,L]
RewriteRule ^/at_english(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_english [R=301,L]
RewriteRule ^/at_french(/|/index.html|)$ http://%{HTTP_HOST}m.website.com/www.website.com/at_french [R=301,L]

As always, if you're running on a Solaris/SunOS system, use /usr/xpg4/bin/awk or nawk instead of awk.
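
For the step the original post was stuck on (selecting the characters between ^/ and the first parenthesis), index() can stand in for match() and RLENGTH. The following is only a sketch, not the script above: the function name extract_code is made up for illustration, and it assumes the second field always starts with ^/ as in the sample input.

```shell
# extract_code: hypothetical helper name, not part of the thread.
# Takes one RewriteRule line as its argument and prints the country code.
extract_code() {
    printf '%s\n' "$1" | awk '
    {
        # $2 looks like ^/at_french(/|/index.html|)$
        paren = index($2, "(")           # position of the first "(" in $2
        if (paren < 4) next              # no room for a code between "^/" and "("
        code = substr($2, 3, paren - 3)  # skip the leading "^/", stop before "("
        print code
    }'
}

extract_code 'RewriteRule ^/at_french(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=fr_AT [R=301,L]'
```

With the sample line shown, this should print at_french. Codes such as at or at_english come out the same way, because only the position of the first ( matters.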

Last edited by Don Cragun; 03-22-2013 at 06:12 PM..
    #5  
03-22-2013
usshadowop
Registered User
Sorry, the at_engilsh was a typo. And it is an Apache rewrite rule, though my question really doesn't pertain to the rewrite rule at all. The rules function just fine; the question is just about taking one form of the string and using awk to transform it into the other form.
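
Since the thread title also asks about sed, the same one-form-to-the-other transformation can be sketched as a single substitution. This is an illustration under assumptions rather than anything posted in the thread: to_mobile is a hypothetical function name, the local sed is assumed to accept -E (extended regular expressions), and every input line is assumed to have the layout of the examples above.

```shell
# to_mobile: hypothetical helper; reads RewriteRule lines on stdin.
to_mobile() {
    # \1    = everything up to and including %{HTTP_HOST}
    # \2    = the country code between "^/" and the first "("
    # [^ ]* = the old target path, which is dropped and replaced
    sed -E 's#^(RewriteRule \^/([^(]*)\(.*\$ http://%\{HTTP_HOST\})[^ ]*#\1m.website.com/www.website.com/\2#'
}

printf '%s\n' 'RewriteRule ^/at_french(/|/index.html|)$ http://%{HTTP_HOST}/locate/index.html?locale=fr_AT [R=301,L]' | to_mobile
```

Lines that do not match the pattern pass through unchanged, so a stray comment or blank line in the file is left alone.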

---------- Post updated at 04:46 PM ---------- Previous update was at 04:43 PM ----------

Quote:
Originally Posted by Don Cragun

Code:
awk 'BEGIN { newurl = "m.website.com/www.website.com/" }
{       match($2, /[^(]*[(]/)
        code = substr($2, 3, RLENGTH - 3)
        match($3, /[^}]*}/)
        printf("%s %s %s%s%s %s\n",
                $1, $2, substr($3, 1, RLENGTH), newurl, code, $4)
}' input

Yes this is exactly what I needed, thank you for your assistance and apologies for my less than stellar description of what I was looking for!
    #6  
03-25-2013
DGPickett, Forum Advisor
Registered User
We have our ways of extracting requirements from the reticent!