Replacing character "|" in given character range


 
# 1  
Old 05-27-2014

Hi

I have this file:
Code:
1|2443094                  |FUNG SIU TO |CLEMENT
2|2443095                  |FUNG KIL FO |REMENT

This file should contain only 3 fields delimited by "|". The last field is a description field, and it can itself contain the character "|". Because of this, my output is breaking into 4 fields. I need to replace the "|" inside the description field, so that "FUNG SIU TO |CLEMENT" becomes "FUNG SIU TO _CLEMENT".

Can someone guide me on how to do this using awk or sed?

---------- Post updated at 02:07 AM ---------- Previous update was at 02:01 AM ----------

The position and length of the last field are also known and fixed: it starts at character 29 and is 20 characters long.
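
Since the range is fixed, one option is to cut each line by character position rather than by delimiter. The sketch below assumes the description really does start at column 29 and span 20 characters, as stated above, and that "_" is the wanted replacement; the function name `fix_range` is made up for illustration.

```shell
# Replace every "|" inside a fixed character range of each line.
# start=29 and len=20 are taken from the post; adjust them if the layout differs.
fix_range() {
  awk -v start=29 -v len=20 '{
    head = substr($0, 1, start - 1)   # everything before the range
    body = substr($0, start, len)     # the description field
    tail = substr($0, start + len)    # anything after the range (may be empty)
    gsub(/[|]/, "_", body)            # only the range itself is modified
    print head body tail
  }' "$@"
}

# Build a sample line padded so that field 3 starts at column 29.
printf '%s|%-25s|%s\n' 1 2443094 'FUNG SIU TO |CLEMENT' | fix_range
# the last field is printed as: FUNG SIU TO _CLEMENT
```

Lines shorter than 29 characters pass through unchanged, because the range is then empty.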
# 2  
Old 05-27-2014
Code:
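awk 'BEGIN { FS = OFS = "|" }
# More than 3 fields: glue fields 4..NF onto field 3 with "_"
NF > 3 {
  for (i = 4; i <= NF; i++)
    $3 = $3 "_" $i
  NF = 3   # drop the trailing fields that were just merged
}
1' file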

---------- Post updated at 03:48 AM ---------- Previous update was at 03:21 AM ----------

perl solution
Code:
perl -lne '@A = split(/\|/, $_, 3);
  $A[2] =~ s/\|/_/g;
  print join("|", @A)' file

---------- Post updated at 03:48 AM ---------- Previous update was at 03:48 AM ----------

sed (GNU sed only: combining a number with the g flag is a GNU extension)
Code:
sed 's/|/_/3g' file
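
Where GNU sed is not available, the same result can probably be had with a small loop. This is only a sketch: it keeps turning the third "|" on a line into "_" until at most two remain. The function name `fix_third_onward` is invented for the example.

```shell
# Portable variant of "replace the 3rd and all later | with _".
# Each pass rewrites the third "|" on the line; "ta" loops while a pass succeeded.
fix_third_onward() {
  sed -e ':a' -e 's/^\([^|]*|[^|]*|[^|]*\)|/\1_/' -e 'ta' "$@"
}

printf '%s\n' '1|2443094|FUNG SIU TO |CLEMENT' | fix_third_onward
# prints: 1|2443094|FUNG SIU TO _CLEMENT
```

Lines with two or fewer "|" are left untouched, since the pattern needs a third one to match.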

# 3  
Old 05-27-2014
Hi Srini

Awesome, but my actual case is a little different.

The actual file may look like this:
Code:
1|24xx|x96 |wewewewewe|Aps (ueasTng) Ltd(00101|2500000)|001012561|558 |NYL|GB |G179300844|1012561038 |Orriva P|LC|GB |O718483442|Y

This is one record. Here field 2 (24xx|x96 ), field 4 (Aps (ueasTng) Ltd(00101|2500000)), and field 11 (Orriva P|LC) contain an embedded '|' (these fields were shown in red in the original post). Each field has a fixed length.

# 4  
Old 05-27-2014
Do you think this matches your initial requirement?

Quote:
Awesome. But My actual case is little different.
Why couldn't you post this actual case previously?
# 5  
Old 05-27-2014
This is not a little different. There is a HUGE difference between changing all "|" characters after the first 2 on a line to "_" characters and changing an unknown number of "|" characters in the middle of a line to some other unspecified character(s).

What are the exact field widths for this new file format (or what is the format of the file that specifies the file format for the file(s) you want to process)? Are embedded "|" characters all supposed to be changed to "_", or is a different character used in some fields? Do all fields need to be checked? If not, how will your script know which fields should be checked?

What have you tried to solve this problem?
# 6  
Old 05-27-2014
Replacing character "|" in given character range

Hi

Here is the field description:
Code:
1|24xx|x96 |wewe|Aps (ueasTng) Ltd(00101|2500000)|001012561|558 |NYL|GB |G179300844|1012561038 |Orriva P|LC|GB |O718483442|Y

Field  Length  Value
1      1       1
2      4       24xx
3      3       x96
4      4       wewe
5      33      Aps (ueasTng) Ltd(00101|2500000)
6      9       001012561
7      3       558
8      3       NYL
9      10      G179300844
10     10      1012561038
11     14      Orriva P|LC|GB
12     15      O718483442
13     1       Y

The field separator is '|'. I need to replace all '|' in the 5th and 11th fields.

I can do this with the substr function, like this:

Code:
str1 = substr()   # fields 1 to 4
str2 = substr()   # field 5 (using gsub to replace | with _)
str3 = substr()   # fields 6 to 10
str4 = substr()   # field 11 (using gsub to replace | with _)
str5 = substr()   # fields 12 to 13

and finally joining all of these. However, I am looking for a better approach.
Please let me know if I am clear now.
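
The substr plan above can be generalized so the offsets are not hand-coded. The following is a rough sketch, not a tested solution for the real file: it walks each record by a list of field widths instead of splitting on "|", and cleans only the fields named. The function name `fix_fixed_width`, its two arguments, and the sample layout (widths 1,4,8,3) are all invented for illustration, because the widths posted in this thread do not add up.

```shell
# $1: comma-separated field widths, $2: comma-separated field numbers to clean.
# Assumes exactly one separator character between consecutive fields.
fix_fixed_width() {
  awk -v widths="$1" -v fix="$2" '
    BEGIN {
      n = split(widths, w, ",")
      m = split(fix, f, ",")
      for (i = 1; i <= m; i++) clean[f[i]] = 1
    }
    {
      out = ""; pos = 1
      for (i = 1; i <= n; i++) {
        field = substr($0, pos, w[i])        # cut by width, not by "|"
        if (i in clean) gsub(/[|]/, "_", field)
        out = out (i > 1 ? "|" : "") field
        pos += w[i] + 1                      # skip the field and its separator
      }
      print out
    }'
}

# Hypothetical 4-field record; field 3 is 8 characters wide and holds a "|".
printf '%s\n' '1|24xx|Aps|Ltd |NYL' | fix_fixed_width 1,4,8,3 3
# prints: 1|24xx|Aps_Ltd |NYL
```

Note that the sketch trusts the width list completely: it writes a "|" between fields no matter what character the input has at that position, so a wrong width silently shifts every later field.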
# 7  
Old 05-27-2014
Sorry, that description doesn't help either. You say the third field is 3 chars long, but your line holds 4 chars: "x96 ". The same goes for field 7. And between fields 8 and 9 an entire field is missing: the "GB " in your line is not reflected in the table. So which one should we rely on?
BTW, can't you address the problem at the root and persuade the generating application to use a different field separator?
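
A mismatch like this can be caught mechanically before any replacing is attempted. The sketch below compares each record against a declared width list: the line length must equal the sum of the widths plus one separator per gap, and a "|" must sit at every expected separator column. The function name `check_layout` and the sample widths are made up for the example.

```shell
# $1: comma-separated field widths. Prints "<line number>: ok" or the first problem found.
check_layout() {
  awk -v widths="$1" '
    BEGIN {
      n = split(widths, w, ",")
      for (i = 1; i <= n; i++) want += w[i]
      want += n - 1                          # one separator between fields
    }
    {
      bad = ""
      if (length($0) != want)
        bad = "length " length($0) ", expected " want
      pos = 0
      for (i = 1; i < n && bad == ""; i++) {
        pos += w[i] + 1                      # column of the i-th separator
        if (substr($0, pos, 1) != "|")
          bad = "no separator at column " pos
      }
      print NR ": " (bad == "" ? "ok" : bad)
    }'
}

# Field 3 is declared as 3 chars, but the record holds "x96 " (4 chars).
printf '%s\n' '1|24xx|x96 |NYL' | check_layout 1,4,3,3
# prints: 1: length 15, expected 14
```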