SED: Can't Repeat Search Character in SED Output
Post 302353315 by skmdu on Tuesday 15th of September 2009 05:52:19 AM
Hope this works for you... Slight modification from your solution:
Code:
Input:
<MY_BIG_TAG>This_is_a_test</MY_BIG_TAG>

Code:
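sed '
# Step 1: put a \n marker between the opening tag and the text
s/\(<[^>]*>\)\([^>]*\)\(<[^>]*>\)/\1\n\2\3/g
:loop
# Steps 2-3: turn the first _ after the marker into _Q, moving the marker past it
s/\n\([^<_]*\)_/\1_Q\n/g
# Step 4: loop again while another _ remains before the next <
/\n[^<_]*_/b loop
# Step 5: remove the marker
s/\n//g' a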

Output:
<MY_BIG_TAG>This_Qis_Qa_Qtest</MY_BIG_TAG>
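
A note on portability: `\n` in the replacement text is accepted by GNU sed, but POSIX leaves its meaning unspecified, so other seds may print a literal `n` instead. Below is a minimal sketch of a more portable spelling, assuming a POSIX sed, which writes the marker as a backslash followed by a real line break. The function name `add_q` is made up for illustration, and the input is read from stdin rather than from the file `a`.

```shell
#!/bin/sh
# add_q is a hypothetical helper name, not part of the original post.
# The sed script lines start in column 0 on purpose: any indentation on the
# line after an escaped line break would become part of the replacement.
add_q() {
  sed '
s/\(<[^>]*>\)\([^>]*\)\(<[^>]*>\)/\1\
\2\3/g
:loop
s/\n\([^<_]*\)_/\1_Q\
/g
/\n[^<_]*_/b loop
s/\n//g'
}

# Feed the sample line from the post through the helper.
printf '%s\n' '<MY_BIG_TAG>This_is_a_test</MY_BIG_TAG>' | add_q
```

With the sample input this should print the same line as the Output above. Only the replacement side changes; `\n` inside the search pattern is already standard.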

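Explanation:

1. s/\(<[^>]*>\)\([^>]*\)\(<[^>]*>\)/\1\n\2\3/g
This inserts a \n marker right after the opening tag, giving <MY_BIG_TAG>\nThis_is_a_test</MY_BIG_TAG> (it relies on sed accepting \n in the replacement, as GNU sed does).
2. :loop marks the start of the loop.
3. s/\n\([^<_]*\)_/\1_Q\n/g replaces the first underscore after the \n marker (and before any <) with _Q, and moves the marker to just after that _Q.
4. /\n[^<_]*_/b loop checks whether another underscore remains between the marker and the next <; if so, it branches back to :loop.
5. Finally, s/\n//g removes the \n marker that was inserted in step 1.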
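
To watch the marker move, the steps above can be traced with sed's `l` command, which prints the pattern space with the embedded newline shown as `\n` and a `$` at the end of the line. This is only a sketch for experimenting, assuming GNU sed (for `\n` in the replacement); the function name `trace` is made up for illustration.

```shell
#!/bin/sh
# trace is a hypothetical helper name. It runs the same script with -n and
# adds an l after step 1 and after each pass through the loop, then prints
# the final result with p.
trace() {
  printf '%s\n' "$1" | sed -n '
s/\(<[^>]*>\)\([^>]*\)\(<[^>]*>\)/\1\n\2\3/g
l
:loop
s/\n\([^<_]*\)_/\1_Q\n/g
l
/\n[^<_]*_/b loop
s/\n//g
p'
}

trace '<MY_BIG_TAG>This_is_a_test</MY_BIG_TAG>'
```

With GNU sed this should print four traced lines followed by the result:

```
<MY_BIG_TAG>\nThis_is_a_test</MY_BIG_TAG>$
<MY_BIG_TAG>This_Q\nis_a_test</MY_BIG_TAG>$
<MY_BIG_TAG>This_Qis_Q\na_test</MY_BIG_TAG>$
<MY_BIG_TAG>This_Qis_Qa_Q\ntest</MY_BIG_TAG>$
<MY_BIG_TAG>This_Qis_Qa_Qtest</MY_BIG_TAG>
```

The underscores inside the closing tag are never touched because [^<_]* cannot cross the < that starts it.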
 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.