Awk command to replace specific position characters. | Unix Linux Forums | Shell Programming and Scripting

#1  01-31-2012  pinnacle

Hi,
I have a fixed-width file.
For example, it might have 30 columns, each with a different width (10, 5, 2, and so on).
If the data in a field is shorter than the field width, the rest of the field is filled with spaces.
I would like an awk command that would replace:
Quote:
position 96-98 with "ABC"
position 99-113 with "DEF", padding the rest of the field with spaces
position 190-198 with "XYZ", padding the rest of the field with spaces
I tried the following, but it doesn't work:

Code:
 
awk '{printf "%-"ABC"s\n",$0}' File> File1

Please let me know the correct command.

---------- Post updated at 07:34 PM ---------- Previous update was at 06:35 PM ----------

I now tried this:


Code:
 
awk -F "" '{for (i=1;i<=NF;i++) if (i==96||i==97||i==98) $i="ABC"}1' OFS="" urfile

I need help with a little more tweaking.
#2  01-31-2012  Chubler_XL (Forum Advisor)
How about this:


Code:
awk 'function repl(s,f,t,v)
{ return substr(s,1,f-1) sprintf("%-*s", t-f+1, v) substr(s,t+1) }
{ a=repl($0,96,98,"ABC")
  a=repl(a,99,113,"DEF")
  a=repl(a,190,198,"XYZ")
  print a
}' infile
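
A scaled-down run makes the effect easier to see. This is only a sketch: the 20-character record and the positions 3-5 and 8-12 are made up for illustration (the real file uses 96-98, 99-113 and 190-198), while repl() is copied unchanged from the code above.

```shell
# Hypothetical 20-character record; positions are 1-based, as in the thread.
line='0123456789ABCDEFGHIJ'

result=$(printf '%s\n' "$line" | awk '
function repl(s, f, t, v) {
    return substr(s, 1, f-1) sprintf("%-*s", t-f+1, v) substr(s, t+1)
}
{
    a = repl($0, 3, 5, "abc")    # exact fit: 3 characters into a 3-wide field
    a = repl(a, 8, 12, "de")     # short value: padded with 3 trailing spaces
    print a
}')

printf '[%s]\n' "$result"        # prints [01abc56de   CDEFGHIJ]
```

The second call shows the padding: "de" is followed by three spaces, so the field stays 5 characters wide and the record stays 20 characters long.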

#3  01-31-2012  pinnacle
Thanks, Chubler_XL. This works.
Could you please explain what it is doing?
I understand that we define a function called "repl" that takes four parameters and then call it three times.
I don't understand the following:
Quote:

sprintf("%-*s", t-f+1, v) substr(s,t+1) ---> What does this part do?

We store every return value in "a" and then print it. Won't the earlier values get overwritten?
a=repl($0,96,98,"ABC")
a=repl(a,99,113,"DEF")
a=repl(a,190,198,"XYZ")
print a
Also, we replace characters 99 to 113 with "DEF" without supplying any trailing spaces. How does that not break the format?

I would really appreciate it if you could explain this to me.
#4  02-01-2012  mirni
That's an awesome solution. I will try to analyze it; Chubler, please correct me if I mess up along the way.
Let's look at the sprintf call:

Code:
sprintf("%-*s", t-f+1, v)

The format flags of sprintf mean the following:
Quote:
- Left-justify within the given field width
* The width is not specified in the format string, but as an additional integer value argument preceding the argument that has to be formatted.
So (t-f+1) is the field width in which the string v is printed. Now the repl() function:

Code:
function repl(s,f,t,v)
{ return substr(s,1,f-1) sprintf("%-*s", t-f+1, v) substr(s,t+1) }

It takes a string s, a start position f ("from"), an end position t ("to"), and the value v to insert. It returns a copy of s in which only the characters from f to t are changed: they are replaced by v, padded to fit.
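
To see the "-" and "*" parts of the format on their own, here is a small sketch (the width 6 and the variable name w are just picked for the demo). The last line splices the width into the format string by concatenation instead of using "*". That may be a useful fallback if a particular awk does not accept "*" in a format, which is an assumption about portability rather than something this thread confirms.

```shell
out=$(awk 'BEGIN {
    w = 6                                          # hypothetical field width
    printf "[%s]\n", sprintf("%*s",  w, "DEF")     # no "-": right-justified
    printf "[%s]\n", sprintf("%-*s", w, "DEF")     # "-": left-justified, spaces on the right
    printf "[%s]\n", sprintf("%-" w "s", "DEF")    # same result, format built as "%-6s"
}')
printf '%s\n' "$out"
```

The brackets are only there to make the spaces visible.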

Code:
a=repl($0,96,98,"ABC")

will take the whole line ($0) and replace the characters 96-98 with "ABC".

Code:
a=repl(a,99,113,"DEF")

will take a and replace chars 99-113 with "DEF" left aligned and padded.

Code:
a=repl(a,190,198,"XYZ")

will replace chars 190-198 of a with "XYZ" left-aligned and padded as needed.
So it is modifying the variable a, one step at a time, and taking the output of previous call as input for next one. The same 3 calls could be written nested as:

Code:
 a=repl(repl(repl($0,96,98,"ABC"),99,113,"DEF"),190,198,"XYZ")

But we can all agree that the former is much more readable, easier to debug, and easier to comment.
#5  02-01-2012  Chubler_XL (Forum Advisor)
I'll try to explain this first:


Code:
sprintf("%-*s", t-f+1, v) substr(s,t+1)
s=string
f=from
t=to
v=New value
 
t-f+1 = length of new string

sprintf will left justify New value to width of (t-f+1), padded with spaces.

So repl function returns:

(chars 1 => (FROM-1)) +
padded New value +
(chars (TO + 1) => end)

Note that function originally takes $0 (input line) and stores result in 'a'.
All other calls work on 'a' and store back into 'a' again.

1. a = original string replaced at chars 96-98
2. a = a (result from 1) replaced at chars 99-113
3. a = a (result from 2) replaced at chars 190-198
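
The three pieces listed above can be printed separately to check that they add up. A minimal sketch, assuming a made-up 20-character record and a made-up range of 8-12; head, mid and tail are illustrative names, not part of the original function:

```shell
pieces=$(printf '%s\n' '0123456789ABCDEFGHIJ' | awk '{
    f = 8; t = 12; v = "de"                 # made-up range and value
    head = substr($0, 1, f-1)               # chars 1 => (FROM-1)
    mid  = sprintf("%-*s", t-f+1, v)        # new value padded to t-f+1 = 5 chars
    tail = substr($0, t+1)                  # chars (TO+1) => end
    printf "head=[%s] mid=[%s] tail=[%s]\n", head, mid, tail
    printf "length before=%d after=%d\n", length($0), length(head mid tail)
}')
printf '%s\n' "$pieces"
```

Because mid is always exactly t-f+1 characters wide here, the line length does not change, which is why the later columns stay where they were.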

-------
Edit: Nice description Mirni - I was desk checking my description while you posted, but I think you explained it more clearly.
#6  02-01-2012  pinnacle
Thanks, Mirni and Chubler.

I got most of it except the following.

substr(s,1,f-1) --> the part of the line before the string that needs to be replaced

sprintf("%-*s", t-f+1, v) --> here "s" represents the entire line, t-f+1 is the length of the string being replaced, and v is the new string to be inserted. But where are we telling sprintf the start and end positions for the new string? We are only passing the desired length of the new string.

This is where I need a little more help. I looked at the sprintf man page, but it was not much help.


substr(s,t+1) --> the part of the line after the substring to be replaced


Also, we are not using any concatenate function to join the substrings, whereas in SQL and other data warehouse tools we use a concatenate function.

I would appreciate your help with this.
#7  02-01-2012  mirni
Quote:
sprintf("%-*s", t-f+1, v) --> here "s" represents the entire line
No, 's' in the format stands for "string". It tells sprintf that the argument to be printed is a string (as opposed to %d, which expects an integer, or %f, which expects a floating-point number).

Code:
var="MyString"
sprintf("%s", var)  # returns "MyString"; use print or printf to display it

So

Code:
 sprintf("%-*s", t-f+1, v)

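is formatting the string stored in variable v with the format "%-*s": the "-" left-justifies it, and the "*" takes the field width from the next argument, t-f+1. The result is v followed by as many spaces as it takes to reach a total width of (t-f+1) characters. sprintf is never told a start or end position; the padded string ends up in the right place because repl() concatenates it between substr(s,1,f-1) and substr(s,t+1).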
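
One thing worth noting: the width is a minimum, so a value longer than t-f+1 is not shortened and would push the rest of the line to the right. That cannot happen with the values in this thread, but if it could in your data, adding a precision ("%-*.*s") caps the length as well. A sketch, with a made-up width of 3:

```shell
out=$(awk 'BEGIN {
    w = 3                                                # hypothetical field width
    printf "[%s]\n", sprintf("%-*s",   w,    "ABCDE")    # width only: nothing is cut off
    printf "[%s]\n", sprintf("%-*.*s", w, w, "ABCDE")    # width + precision: at most 3 chars
    printf "[%s]\n", sprintf("%-*.*s", w, w, "A")        # short values are still padded
}')
printf '%s\n' "$out"
```

Whether silently cutting a value is acceptable depends on the data, so treat this as an option rather than a fix to the original function.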

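awk has no concatenation operator or function: writing two expressions next to each other (usually separated by a space) concatenates them. So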

Code:
a="one" "two"

will produce "onetwo"

Code:
a=b c  #concat of variables
a = $1 $2 #concats first and second field
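
Put together as something you can run (the input "foo bar" is just a sample line, and a, b, c are throwaway variable names):

```shell
out=$(printf 'foo bar\n' | awk '{
    a = "one" "two"      # string constants side by side
    b = $1 $2            # fields: nothing is inserted between them
    c = $1 "-" $2        # a separator has to be written out explicitly
    print a; print b; print c
}')
printf '%s\n' "$out"
```

This is the same mechanism repl() relies on when it returns substr(...) sprintf(...) substr(...) with nothing but spaces between the three calls.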
