Remove characters and replace with space
Post 302985546 by RudiC on Friday 11th of November 2016 05:37:19 AM
I'm afraid it's not that easy. In UTF-8 (and other multibyte) encoded files, characters outside the ASCII set are represented by more than one byte, and the above command replaces every single one of those bytes with a space, so one non-ASCII character turns into two or more spaces. The -s option, on the other hand, squeezes any number of adjacent non-ASCII characters into one single space.
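
A minimal sketch of the difference, assuming the command in question is a byte-wise tr that maps everything outside the ASCII range to a space. The exact command is not quoted in this post, so the tr invocations and the function names replace_non_ascii and squeeze_non_ascii are illustrative only. LC_ALL=C is set so that tr treats each byte as one character regardless of the caller's locale.

```shell
#!/bin/sh
# Illustrative helpers (hypothetical names, not taken from the thread).

# Replace every byte outside 0x00-0x7F with a space.
replace_non_ascii() {
    LC_ALL=C tr -c '\000-\177' ' '
}

# Same translation, but -s squeezes each run of resulting spaces into one.
squeeze_non_ascii() {
    LC_ALL=C tr -cs '\000-\177' ' '
}

# "naive" with a diaeresis on the i, written as its two UTF-8 bytes (0xC3 0xAF).
printf 'na\303\257ve\n' | replace_non_ascii   # two spaces between "na" and "ve"
printf 'na\303\257ve\n' | squeeze_non_ascii   # one space between "na" and "ve"
```

Keep in mind that -s acts on the output character, so runs of ordinary spaces that were already in the input get squeezed as well.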

Last edited by RudiC; 11-11-2016 at 07:15 AM..
 

GB18030(5)                BSD File Formats Manual                GB18030(5)

NAME
     gb18030 -- GB 18030 encoding method for Chinese text

SYNOPSIS
     ENCODING "GB18030"

DESCRIPTION
     The GB18030 encoding implements GB 18030-2000, a PRC national standard for the encoding of Chinese characters. It is a superset of the older GB 2312-1980 and GBK encodings, and incorporates Unicode's Unihan Extension A completely. It also provides code space for all Unicode 3.0 code points.

     Multibyte characters in the GB18030 encoding can be one byte, two bytes, or four bytes long. There are a total of over 1.5 million code positions.

     GB 11383-1981 (ASCII) characters are represented by single bytes in the range 0x00 to 0x7F.

     Chinese characters are represented as either two bytes or four bytes. Characters that are represented by two bytes begin with a byte in the range 0x81-0xFE and end with a byte either in the range 0x40-0x7E or 0x80-0xFE.

     Characters that are represented by four bytes begin with a byte in the range 0x81-0xFE, have a second byte in the range 0x30-0x39, a third byte in the range 0x81-0xFE and a fourth byte in the range 0x30-0x39.

SEE ALSO
     euc(5), gb2312(5), gbk(5), utf8(5)

     Chinese National Standard GB 18030-2000: Information Technology -- Chinese ideograms coded character set for information interchange -- Extension for the basic set, March 2000.

     The Unicode Standard, Version 3.0, The Unicode Consortium, 2000.

STANDARDS
     The GB18030 encoding is believed to be compatible with GB 18030-2000.

BSD                            August 10, 2003                            BSD
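
The byte ranges in the DESCRIPTION above can be turned into a small length check. This is only a sketch: the function name gb18030_char_len is made up for illustration, it inspects just the first two bytes (it does not validate the third and fourth bytes of a four-byte character), and it expects the byte values as numeric arguments such as 0x81.

```shell
#!/bin/sh
# Hypothetical helper: report how many bytes a GB18030 character occupies,
# judging only from its first byte ($1) and second byte ($2, optional).
# Prints 1, 2, 4, or "invalid".
gb18030_char_len() {
    gb_b1=$(( $1 ))
    gb_b2=$(( ${2:-0} ))
    if [ "$gb_b1" -le $((0x7F)) ]; then
        # Single byte in 0x00-0x7F (ASCII).
        echo 1
    elif [ "$gb_b1" -ge $((0x81)) ] && [ "$gb_b1" -le $((0xFE)) ]; then
        if [ "$gb_b2" -ge $((0x30)) ] && [ "$gb_b2" -le $((0x39)) ]; then
            # Second byte 0x30-0x39 marks a four-byte character.
            echo 4
        elif [ "$gb_b2" -ge $((0x40)) ] && [ "$gb_b2" -le $((0x7E)) ]; then
            echo 2
        elif [ "$gb_b2" -ge $((0x80)) ] && [ "$gb_b2" -le $((0xFE)) ]; then
            echo 2
        else
            echo invalid
        fi
    else
        # 0x80 and 0xFF are not valid lead bytes per the ranges above.
        echo invalid
    fi
}

gb18030_char_len 0x41          # ASCII byte
gb18030_char_len 0xB0 0xA1     # two-byte character
gb18030_char_len 0x81 0x30     # start of a four-byte character
```

As with UTF-8, a byte-wise tr would hit each of those two or four bytes separately.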