Need help deleting special characters that exist only at the end of each record in a UNIX file


 
# 8  
Old 01-22-2019
Thanks for pointing out the remaining syntax error - I didn't test the command before posting. Corrected in my above post. As for your other questions - I don't know what exactly the requestor is after, as the spec is unclear, so I just corrected (partly, haha) the obvious, taking the rest of the original regex for granted. Let's wait for a statement from the requestor, or, even better, a data sample.
# 9  
Old 01-22-2019
You have specified pure ASCII characters but have not said whether your file(s) contain, for example, UTF-8 characters, so...
...to expand on everyone's posts so far: 'sed' can be caught out by some Unicode characters.
If you encounter Unicode characters in ANY part of your file, this might work, using Don's simplest version. Note that iconv -c silently drops every character it cannot convert to ASCII, wherever it appears in the record, not only at the end.
Longhand, on OSX 10.14.1, default bash terminal; BSD sed version unknown.
Note there are 6 Unicode characters in the test string.
Code:
Last login: Tue Jan 22 21:16:39 on ttys000
AMIGA:amiga~> echo "abc\πœ†/o*&&^%&^?*s123HGFi(*&*Åå*&â" | iconv -c -f utf-8 -t ascii | sed 's/[^[:alnum:]|]*$//'
abc\/o*&&^%&^?*s123HGFi
AMIGA:amiga~> _

# 10  
Old 01-29-2019
Hi Don Cragun/RudiC/nezabudka/All,

Now I have a different requirement. In total I have 20 columns in each record. I need to remove the ^M characters at the end of each record and the special characters in a specific column, say the 15th column, of each record.

I have tried this
Code:
sed 's/[^\w]$//g' | awk 'BEGIN{FS="|";OFS="|"}{gsub("[^[:alnum:]]","",$15);print }'

But this command is not working for me. Please let me know how we can write the command for this.

Thanks
Rakesh
# 11  
Old 01-29-2019
Bad idea to use sed and awk together. Try this
Code:
awk 'BEGIN {FS="|"; OFS="|"} {sub("[^[:alnum:]]$", ""); gsub("[^[:alnum:]]", "", $15)} 1'
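
To see what the command above does, here is a minimal sketch that runs it on one made-up 20-field record. The function name clean_records and the sample values are assumptions for illustration only; the awk statements are the ones from the post.

```shell
# clean_records: the one-liner above wrapped in a (hypothetical) function.
# Reads "|"-separated records on stdin, writes cleaned records to stdout.
clean_records() {
    awk 'BEGIN {FS="|"; OFS="|"}
         {
             sub("[^[:alnum:]]$", "")          # drop ONE trailing non-alphanumeric character
             gsub("[^[:alnum:]]", "", $15)     # strip non-alphanumerics from field 15
         }
         1'
}

# Made-up record: field 15 is "a#b$c" and the record ends in a carriage return.
printf 'f1|f2|f3|f4|f5|f6|f7|f8|f9|f10|f11|f12|f13|f14|a#b$c|f16|f17|f18|f19|f20\r\n' |
    clean_records
```

With this input the carriage return should be removed and field 15 should come out as abc. Note that the first sub() removes whatever single non-alphanumeric character ends the record, so a record without a carriage return loses a different character.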

# 12  
Old 01-29-2019
Quote:
Originally Posted by rakeshp
Hi Don Cragun/RudiC/nezabudka/All,

Now I have a different requirement. In total I have 20 columns in each record. I need to remove the ^M characters at the end of each record and the special characters in a specific column, say the 15th column, of each record.

I have tried this
Code:
sed 's/[^\w]$//g' | awk 'BEGIN{FS="|";OFS="|"}{gsub("[^[:alnum:]]","",$15);print }'

But this command is not working for me. Please let me know how we can write the command for this.

Thanks
Rakesh
I'm disappointed that you ignored my request for clarity in the description of your problem again and still won't show us a sample of what you're trying to do.

If we assume that when you say you need to "remove ^M characters at the end of each record", you really mean that you want to remove a single <carriage-return> character at the end of the record and that a <carriage-return> character exists at the end of every record you will process, then the first awk sub() call nezabudka suggested may meet that requirement. However, it might be safer to change:
Code:
sub("[^[:alnum:]]$", "");

in her suggestion to:
Code:
sub(/\r$/, "");

which will only remove a <carriage-return> character at the end of a record if there is one and won't take the chance of removing some other character if there is no <carriage-return>. If you mean that you want to remove all adjacent <circumflex> and <capital-latin-M> characters from the end of every record no matter how many of those characters appear at the end of each record, you need a completely different substitution:
Code:
sub(/[M^]*$/, "");
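
The difference between the three substitutions can be checked with a small sketch. The sample records and the helper names (sample, show, any_nonalnum, only_cr, caret_or_m) are made up for this illustration; only the sub() calls come from the posts.

```shell
# Hypothetical sample: a record ending in <space><CR>, one ending in a space
# with no <CR>, and one made only of literal circumflex and M characters.
sample() { printf 'xy \r\nxy \n^M^M\n'; }

# Make the result visible: bracket each record and spell out any remaining CR.
show() { awk '{ gsub(/\r/, "<CR>"); print "[" $0 "]" }'; }

any_nonalnum() { awk '{ sub("[^[:alnum:]]$", ""); print }'; }   # original suggestion
only_cr()      { awk '{ sub(/\r$/, ""); print }'; }             # safer variant
caret_or_m()   { awk '{ sub(/[M^]*$/, ""); print }'; }          # literal ^ and M

sample | any_nonalnum | show
sample | only_cr      | show
sample | caret_or_m   | show
```

On this input the first variant should also eat the trailing space of the record that has no carriage return, the second should leave that space alone, and the third should leave a real carriage return untouched while emptying the record made of circumflex and M characters.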

If we assume that by "special characters in specific column like say 15th column" you mean characters in the 15th field that are not alpha-numeric in the current locale, then the awk gsub() call nezabudka suggested will do what you want. But, of course, we really don't have any way to guess what characters you consider special since you haven't given us any definition for what you want that term to mean. And, we don't know if you really want to change the 15th field or some other field that contains data that is in some unspecified way similar to the data in the 15th field. If you mean that you want to specify an arbitrary field number to be processed by your awk or sed script, that requirement would, of course, be met by using $15 only about 5 percent of the time in records containing twenty fields.
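
If the field number really is meant to be arbitrary, one possible sketch is to pass it to awk as a variable instead of hard-coding $15. The function name clean_field, the variable name fld, and the four-field sample record are assumptions made for this example, not something the requestor has confirmed.

```shell
# clean_field N: strip non-alphanumerics from field N only.
# N is a hypothetical parameter; it is handed to awk with -v.
clean_field() {
    awk -v fld="$1" 'BEGIN {FS="|"; OFS="|"}
         {
             sub(/\r$/, "")                    # remove a trailing carriage return, if any
             gsub("[^[:alnum:]]", "", $fld)    # clean only the requested field
         }
         1'
}

# Assumed sample: field 3 holds the "special" characters, the record ends in <space><CR>.
printf 'aa|bb|##$$abch^^$$|xy \r\n' | clean_field 3
```

Here field 3 should be reduced to abch, and the space before the carriage return should survive because only the carriage return is targeted.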

If nezabudka didn't correctly guess at what your requirements are, then PLEASE give us a clear English definition of what you are trying to do, show us a small representative sample input file, and show us the corresponding sample output file that contains the output you want to produce from that sample input file.
# 13  
Old 01-30-2019
Quote:
Originally Posted by nezabudka
Bad idea to use sed and awk together. Try this
Code:
awk 'BEGIN {FS="|"; OFS="|"} {sub("[^[:alnum:]]$", ""); gsub("[^[:alnum:]]", "", $15)} 1'


Thanks a lot, nezabudka. This code is working fine as of now. I will get back to you if any more clarifications are needed!

--- Post updated 01-30-19 at 03:04 AM ---

nezabudka/Don Cragun,

This code is working fine, but it is removing the space at the end of the record before the ^M, and I do not want that space to be removed. Please let me know if we can modify this code.

sample data

input
Code:
aa|bb|##$$abch^^$$|xy ^M

output
Code:
aa|bb|##$$abch^^$$|xy ^M

Thanks
Rakesh




Moderator's Comments:
Please use CODE tags as required by forum rules!

Last edited by RudiC; 01-30-2019 at 04:19 AM.. Reason: Added CODE tags.
# 14  
Old 01-30-2019
Quote:
Originally Posted by rakeshp
...
output
aa|bb|##$$abch^^$$|xy ^M
...
Sure that output is what you're after (with the ^M)?


Quote:
This code is working fine, but it is removing the space at the end of the record before the ^M, and I do not want that space to be removed.
Difficult to believe. The code removes ONE single non-alphanumeric character at EOL. ^M if it is there, space if it is not.
Add "space" to the bracket expression so that it is excluded from removal, or try

Code:
awk 'BEGIN {FS="|"; OFS="|"} {sub("[[:cntrl:]]$", ""); gsub("[^[:alnum:]]", "", $15)} 1' file
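
Both options can be tried side by side. This is a rough sketch on the four-field sample from the previous post, so field 3 is used instead of 15. The function names and the show helper are made up for illustration, and ^M is assumed to mean a real carriage return.

```shell
# Variant 1: keep the original idea but exempt the space character
# (one reading of adding "space" to the set).
keep_space() {
    awk 'BEGIN {FS="|"; OFS="|"} {sub("[^[:alnum:] ]$", ""); gsub("[^[:alnum:]]", "", $3)} 1'
}

# Variant 2: the [[:cntrl:]] command above, with field 3 to fit the sample.
strip_cntrl() {
    awk 'BEGIN {FS="|"; OFS="|"} {sub("[[:cntrl:]]$", ""); gsub("[^[:alnum:]]", "", $3)} 1'
}

# Bracket each record and spell out any remaining CR so trailing blanks are visible.
show() { awk '{ gsub(/\r/, "<CR>"); print "[" $0 "]" }'; }

# The sample record, once with and once without a trailing carriage return.
sample() { printf 'aa|bb|##$$abch^^$$|xy \r\naa|bb|##$$abch^^$$|xy \n'; }

sample | keep_space  | show
sample | strip_cntrl | show
```

In both variants the trailing space should be kept whether or not the record ends in a carriage return, since a space is neither matched by the first bracket expression nor a control character.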


Last edited by RudiC; 01-30-2019 at 04:32 AM..