Replace Special Character With Next Present Byte


 
# 22  
Old 08-20-2014
Quote:
Originally Posted by dineshnak
Hi Robin,

The file was transferred by FTP from a Windows system without any compression; only the data itself is compressed. We need to expand the data by finding the special character sequences in the file (e.g. "ú07 " and "ú1A?") and repeating the byte that follows the two-digit value (a space or ?), using that hexadecimal value converted to decimal as the repeat count.

Sample Data:
Input Data:
Code:
ABCD172 2 B10001ú0E Fú08 DineshGú14 KumarNakkaú0E ú3C?INú14?EFGH340

Output Data:
Code:
ABCD172 2 B10001   (14 spaces) DineshG   (14 spaces) KumarNakka   (14 spaces)    ??????(60-?symbol)IN   ??????????????EFGH340

Thanks,
Dines

Moderator's Comments:
Please use CODE tags as required by the forum rules. You already have two warnings.
Using CODE tags means that the output is shown in fixed-width characters and multiple spaces are preserved.
Your sample output seems to assume that the two characters after the ú are hex digits if at least one of them is a hex digit greater than 9, but decimal digits if both of them are decimal digits (i.e., 14 is treated as decimal 14 instead of decimal 20). Assuming your original description is what you want (always treat those two characters as hex), the following straightforward awk script seems to do what you want:
Code:
/usr/xpg4/bin/awk -F 'ú' '
#{	printf(" Input:%s\nOutput:", $0) }
{	# Print the 1st field unchanged.
	printf("%s", $1)

	# Loop through remaining fields.
	for(i = 2; i <= NF; i++) {
		# Print 3 spaces.
		printf("   ")

		# Get repetition count
		cnt = sprintf("%d", "0x" substr($i, 1, 2)) + 0

		# Print the character to be repeated cnt times.
		chr = substr($i, 3, 1)
#		printf("|cnt=%d,chr=%s|", cnt, chr)
		for(j = 1; j <= cnt; j++)
			printf("%s", chr)

		# Print the remainder of the field.
		printf("%s", substr($i, 4))
	}
	# Add trailing <newline>.
	print ""
}' Input

With the input file Input containing:
Code:
302619ú1A? 
ABCD172 2 B10001ú0E Fú08 DineshGú14 KumarNakkaú0E ú3C?INú14?EFGH340

produces the output:
Code:
302619   ?????????????????????????? 
ABCD172 2 B10001                 F           DineshG                       KumarNakka                    ????????????????????????????????????????????????????????????IN   ????????????????????EFGH340

or, if you remove the octothorpes (#) at the start of the two commented-out tracing printf statements, you can see how it processes each four-character string starting with ú:
Code:
 Input:302619ú1A? 
Output:302619   |cnt=26,chr=?|?????????????????????????? 
 Input:ABCD172 2 B10001ú0E Fú08 DineshGú14 KumarNakkaú0E ú3C?INú14?EFGH340
Output:ABCD172 2 B10001   |cnt=14,chr= |              F   |cnt=8,chr= |        DineshG   |cnt=20,chr= |                    KumarNakka   |cnt=14,chr= |                 |cnt=60,chr=?|????????????????????????????????????????????????????????????IN   |cnt=20,chr=?|????????????????????EFGH340
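The ambiguity described above (is 14 a decimal 14 or a hexadecimal 20?) can be made concrete with a small sketch. It assumes a POSIX awk; `compare_counts` and `hex2` are hypothetical names that are not part of the script in this thread, and the conversion is done with index() rather than by relying on how awk treats a "0x" string.

```shell
#!/bin/sh
# compare_counts: hypothetical helper that shows both readings of a
# two-character count:
#   hex   = always hexadecimal (the original description)
#   mixed = decimal when both characters are 0-9, otherwise hexadecimal
compare_counts() {
	printf '%s\n' "$@" | awk '
	function hex2(s,    digits) {
		digits = "0123456789ABCDEF"
		s = toupper(s)
		return (index(digits, substr(s, 1, 1)) - 1) * 16 + index(digits, substr(s, 2, 1)) - 1
	}
	{
		hex = hex2($1)
		mixed = ($1 ~ /^[0-9][0-9]$/) ? $1 + 0 : hex
		printf("%s hex=%d mixed=%d\n", $1, hex, mixed)
	}'
}

compare_counts 0E 08 14 1A 3C
```

Only counts made up of two decimal digits with a value of 10 or more (such as 14) differ between the two readings, which is why the sample data alone could not settle which rule was intended.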

# 23  
Old 08-21-2014
Hi,

Thanks for your input. When we run the script we do not get any output, and I also need to write the output to a file. Could you please help with this?

Script
Code:
#!/bin/bash
Input=`cat sample.dat`
echo "$Input"
#while read Input
#do
/usr/xpg4/bin/awk -F 'ú' '
#{      printf(" Input:%s\nOutput:", $0) }
{       # Print the 1st field unchanged.
        printf("%s", $1)
        # Loop through remaining fields.
        for(i = 2; i <= NF; i++) {
                # Print 3 spaces.
                printf("   ")
                # Get repetition count
                cnt = sprintf("%d", "0x" substr($i, 1, 2)) + 0
                # Print the character to be repeated cnt times.
                chr = substr($i, 3, 1)
#               printf("|cnt=%d,chr=%s|", cnt, chr)
                for(j = 1; j <= cnt; j++)
                        printf("%s", chr)
                # Print the remainder of the field.
                printf("%s", substr($i, 4))
        }
        # Add trailing <newline>.
        print ""
}' Input
#done <output.dat

Sample.dat
Code:
ABC196 7 MIB513AMMODú07 MIByx66mcp00ú06 302619ú1A 00005014072605331600ú0A 980ú32 201407260533160ú14 2TRG212 7 98ú09 3P PAUTO FMGBN  MIB513AMMOAMERICAN MODER
N INSURANCE302619ú40 97871407259787    140724 NBSú19 A2014072520140724

Thanks,
Dines

---------- Post updated at 01:38 AM ---------- Previous update was at 12:38 AM ----------

One more point: if the two characters are both decimal digits, use the value as it is; otherwise, convert it from hexadecimal to decimal.

Thanks,
Dines

---------- Post updated at 05:13 AM ---------- Previous update was at 01:38 AM ----------

Hi Don Cragun,

We ran the script, but the output is not as expected. Please find the code and output below. We are facing a problem while converting from hexadecimal:
Code:
cnt = sprintf("%d", 0x substr($i, 1, 2)) + 0

We changed the input file name in the code, but it is not working.
Script
Code:
#!/bin/bash
#Input=`cat /home/dgovk/sample.dat`
#echo "$Input"
#while read Input
#do
/usr/xpg4/bin/awk -F 'ú' '
#{      printf(" Input:%s\nOutput:", $0) }
{       # Print the 1st field unchanged.
        printf("%s", $1)
        # Loop through remaining fields.
        for(i = 2; i <= NF; i++) {
                # Print 3 spaces.
                printf("   ")
               printf(substr($i, 1, 2))
#               printf("%d", substr($i, 1, 2))
                # Get repetition count
                cnt = sprintf("%d", 0x substr($i, 1, 2)) + 0
                # Print the character to be repeated cnt times.
                chr = substr($i, 3, 1)
#               printf("|cnt=%d,chr=%s|", cnt, chr)
                for(j = 1; j <= cnt; j++)
                        printf("%s", chr)
                # Print the remainder of the field.
                printf("%s", substr($i, 4))
        }
        # Add trailing <newline>.
        print ""
}' Input

Input:
Code:
302619ú1A?
ABCD172 2 B10001ú0E Fú08 DineshGú14 KumarNakkaú0E ú3C?INú14?EFGH340

Output:
Code:
302619   ?
ABCD172 2 B10001   F           DineshG                 KumarNakka      ???IN   ??????????????EFGH340

Thanks,
Dinesh

Last edited by rbatte1; 08-21-2014 at 06:11 AM..
# 24  
Old 08-21-2014
This does what you requested:
Code:
#!/bin/bash
# Code supplied by Donald W. Cragun in response to post on www.unix.com:
# https://www.unix.com/shell-programming-and-scripting/250310-replace-special-character-next-present-byte-4.html
/usr/xpg4/bin/awk -F 'ú' '
#{	printf(" Input:%s\nOutput:", $0) }
{	printf("%s", $1)
	for(i = 2; i <= NF; i++) {
		# Print 3 spaces.
		printf("   ")

		# Get repetition count.  (This seems likely to be error prone,
		# and therefore dangerous, but this is what was specified by
		# Dineshkumar Nakka.)
		if((cnt = substr($i, 1, 2)) ~ "[0-9][0-9]") {
			# Use decimal if both characters are decimal digits.
			base = cnt
			cnt = cnt + 0
		} else {# Otherwise, use hexadecimal.
			base = "0x" cnt
			cnt = sprintf("%d", "0x" cnt) + 0
		}

		# Print the character to be repeated cnt times.
		chr = substr($i, 3, 1)
#		printf("|cnt=%d(from %s),chr=%s|", cnt, base, chr)
		for(j = 1; j <= cnt; j++)
			printf("%s", chr)

		# Print the remainder of the field.
		printf("%s", substr($i, 4))
	}
	# Add trailing <newline>.
	print ""
}' sample.dat > output.dat

If sample.dat contains:
Code:
302619ú1A? 
ABCD172 2 B10001ú0E Fú08 DineshGú14 KumarNakkaú0E ú3C?INú14?EFGH340
ABC196 7 MIB513AMMODú07 MIByx66mcp00ú06 302619ú1A 00005014072605331600ú0A 980ú32 201407260533160ú14 2TRG212 7 98ú09 3P PAUTO FMGBN  MIB513AMMOAMERICAN MODER
N INSURANCE302619ú40 97871407259787    140724 NBSú19 A2014072520140724

it will produce the output:
Code:
302619   ?????????????????????????? 
ABCD172 2 B10001                 F           DineshG                 KumarNakka                    ????????????????????????????????????????????????????????????IN   ??????????????EFGH340
ABC196 7 MIB513AMMOD          MIByx66mcp00         302619                             00005014072605331600             980                                   201407260533160                 2TRG212 7 98            3P PAUTO FMGBN  MIB513AMMOAMERICAN MODER
N INSURANCE302619                                           97871407259787    140724 NBS                      A2014072520140724

in the file output.dat.
# 25  
Old 08-21-2014
It is no surprise that the conversion:
Code:
cnt = sprintf("%d", 0x substr($i, 1, 2)) + 0

failed. Removing the quotes from the code I suggested:
Code:
cnt = sprintf("%d", "0x" substr($i, 1, 2)) + 0

will not work. I'm surprised that you didn't get a syntax error from this.
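A likely explanation for the counts seen in the output posted above (1 for 1A, 0 for 0E, 3 for 3C) is sketched below. This is an assumption about how that awk tokenized the line, not something confirmed in the thread: without the quotes, `0x` may be read as the number 0 followed by an uninitialized variable x, so the expression concatenates "0", an empty string, and the two count characters. The hypothetical helper `unquoted_count` writes that reading out explicitly as `0 x $1`, so it behaves the same way in any awk.

```shell
#!/bin/sh
# unquoted_count: hypothetical helper that mimics one plausible reading of
#   sprintf("%d", 0x substr($i, 1, 2))
# namely the number 0, an unset variable x, and the two count characters
# concatenated into one string, which %d then converts from its leading digits.
unquoted_count() {
	printf '%s\n' "$@" | awk '{
		s = 0 x $1                    # x is never assigned, so s is "0" followed by $1
		printf("%s -> %d\n", $1, sprintf("%d", s) + 0)
	}'
}

unquoted_count 1A 0E 3C 14 08
```

Under this reading, counts made only of decimal digits still come out right (14 and 08), while any count containing A-F is cut off at the first letter.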
# 26  
Old 08-21-2014
Hi,

The hexadecimal conversion is not happening in the code. We are getting the output below; we are using SunOS.
Output:
Code:
302619   |cnt=0(from 0x1A),chr=?|
ABCD172 2 B10001   |cnt=0(from 0x0E),chr= |F   |cnt=8(from 08),chr= |        DineshG   |cnt=14(from 14),chr= |              KumarNakka   |cnt=0(from 0x0E),chr= |   |cnt=0(from 0x3C),chr=?|IN   |cnt=14(from 14),chr=?|??????????????EFGH340

Thanks,
Dines

---------- Post updated at 06:49 AM ---------- Previous update was at 06:19 AM ----------

Hi Don Cragun,

Could you please guide us on the conversion issue?

Thanks,
Dines
# 27  
Old 08-21-2014
Yes, I see the problem. The code I provided works fine on OS X, and I don't have a Solaris system available for testing. I'll have something that should work on any system in less than two hours.
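Whether a given awk converts a string such as "0x1A" to a number is implementation dependent, which would explain why the same script behaves differently on the two systems. A quick check, as a sketch: `hex_string_support` is a hypothetical helper name, and the optional argument is the path of the awk to test.

```shell
#!/bin/sh
# hex_string_support: hypothetical helper. Prints what the given awk makes of
# the string "0x1A" when it is used as a number: 26 if the awk honors the 0x
# prefix, 0 if it stops converting at the first non-decimal character.
hex_string_support() {
	"${1:-awk}" 'BEGIN { printf("%d\n", sprintf("%d", "0x1A") + 0) }'
}

hex_string_support            # tests whichever awk is first in PATH
```

On the system from this thread the call would be `hex_string_support /usr/xpg4/bin/awk`. A result of 0 means the hexadecimal digits have to be converted by hand.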
# 28  
Old 08-21-2014
Let's try once more... The following awk script will handle decimal and hexadecimal digits following the ú introducing character; and will print an error message in the output file if non-decimal, non-hexadecimal characters are found (and, in this last case, exit with a non-zero exit status):
Code:
#!/bin/bash
# Code supplied by Donald W. Cragun in response to post on www.unix.com:
# https://www.unix.com/shell-programming-and-scripting/250310-replace-special-character-next-present-byte-4.html
/usr/xpg4/bin/awk -F 'ú' '
BEGIN {	x="0123456789ABCDEF" }
#{	printf(" Input:%s\nOutput:", $0) }
{	printf("%s", $1)
	for(i = 2; i <= NF; i++) {
		# Print 3 spaces.
		printf("   ")

		# Get repetition count.  (This seems likely to be error prone,
		# and therefore dangerous, but this is what was specified by
		# Dineshkumar Nakka.)
		if((cnt = substr($i, 1, 2)) ~ "^[0-9][0-9]$") {
			# Use decimal if both characters are decimal digits.
			base = cnt
			cnt = cnt + 0
		} else if (cnt ~ "^[0-9A-Fa-f][0-9A-Fa-f]$") {
			# Use hexadecimal.
			base = "0x" cnt
			cnt = (index(x, toupper(substr(cnt, 1, 1))) - 1) * 16 +\
				index(x, toupper(substr(cnt, 2))) - 1
		} else {
			# Bad input: report it and repeat nothing.
			printf("\n\n\n*** Invalid count \"%s\" on line %d\n\n",
				cnt, NR)
			base = "invalid"
			cnt = 0
			ec = 1
		}

		# Print the character to be repeated cnt times.
		chr = substr($i, 3, 1)
#		printf("|cnt=%d(from %s),chr=%s|", cnt, base, chr)
		for(j = 1; j <= cnt; j++)
			printf("%s", chr)

		# Print the remainder of the field.
		printf("%s", substr($i, 4))
	}
	# Add trailing <newline>.
	print ""
}
END {	exit ec }' sample.dat > output.dat

when sample.dat contains:
Code:
302619ú1A? 
ABCD172 2 B10001ú0E Fú08 DineshGú14 KumarNakkaú0E ú3C?INú14?EFGH340
ABC196 7 MIB513AMMODú07 MIByx66mcp00ú06 302619ú1A 00005014072605331600ú0A 980ú32 201407260533160ú14 2TRG212 7 98ú09 3P PAUTO FMGBN  MIB513AMMOAMERICAN MODER
N INSURANCE302619ú40 97871407259787    140724 NBSú19 A2014072520140724

writes the following in output.dat:
Code:
302619   ?????????????????????????? 
ABCD172 2 B10001                 F           DineshG                 KumarNakka                    ????????????????????????????????????????????????????????????IN   ??????????????EFGH340
ABC196 7 MIB513AMMOD          MIByx66mcp00         302619                             00005014072605331600             980                                   201407260533160                 2TRG212 7 98            3P PAUTO FMGBN  MIB513AMMOAMERICAN MODER
N INSURANCE302619                                           97871407259787    140724 NBS                      A2014072520140724

or, with the debugging printf() calls enabled:
Code:
 Input:302619ú1A? 
Output:302619   |cnt=26(from 0x1A),chr=?|?????????????????????????? 
 Input:ABCD172 2 B10001ú0E Fú08 DineshGú14 KumarNakkaú0E ú3C?INú14?EFGH340
Output:ABCD172 2 B10001   |cnt=14(from 0x0E),chr= |              F   |cnt=8(from 08),chr= |        DineshG   |cnt=14(from 14),chr= |              KumarNakka   |cnt=14(from 0x0E),chr= |                 |cnt=60(from 0x3C),chr=?|????????????????????????????????????????????????????????????IN   |cnt=14(from 14),chr=?|??????????????EFGH340
 Input:ABC196 7 MIB513AMMODú07 MIByx66mcp00ú06 302619ú1A 00005014072605331600ú0A 980ú32 201407260533160ú14 2TRG212 7 98ú09 3P PAUTO FMGBN  MIB513AMMOAMERICAN MODER
Output:ABC196 7 MIB513AMMOD   |cnt=7(from 07),chr= |       MIByx66mcp00   |cnt=6(from 06),chr= |      302619   |cnt=26(from 0x1A),chr= |                          00005014072605331600   |cnt=10(from 0x0A),chr= |          980   |cnt=32(from 32),chr= |                                201407260533160   |cnt=14(from 14),chr= |              2TRG212 7 98   |cnt=9(from 09),chr= |         3P PAUTO FMGBN  MIB513AMMOAMERICAN MODER
 Input:N INSURANCE302619ú40 97871407259787    140724 NBSú19 A2014072520140724
Output:N INSURANCE302619   |cnt=40(from 40),chr= |                                        97871407259787    140724 NBS   |cnt=19(from 19),chr= |                   A2014072520140724
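The index() technique used for the two-digit count extends to hexadecimal strings of any length. The following is a minimal sketch, assuming a POSIX awk; `hex2dec` and `h2d` are hypothetical names that do not appear in the script above, and the convention of returning -1 for invalid input is chosen only for illustration.

```shell
#!/bin/sh
# hex2dec: hypothetical helper, not from the thread. Converts each argument
# from hexadecimal to decimal using only index(), substr(), and toupper(),
# so it does not depend on how awk treats "0x" strings.
hex2dec() {
	printf '%s\n' "$@" | awk '
	function h2d(s,    i, d, n) {
		s = toupper(s)
		if (s == "") return -1
		n = 0
		for (i = 1; i <= length(s); i++) {
			d = index("0123456789ABCDEF", substr(s, i, 1))
			if (d == 0) return -1     # not a hexadecimal digit
			n = n * 16 + d - 1
		}
		return n
	}
	{ printf("%s = %d\n", $1, h2d($1)) }'
}

hex2dec 1A 0e 3C 7F1 1G
```

The last argument is deliberately invalid and is reported as -1 instead of being silently converted.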
