Sed or trim to remove non alphanumeric and alpha characters?


 
# 1  
Old 12-19-2011

Hi All,

I am new to Unix and am trying to do some scripting on a Linux box. I want to remove the tags (the alphabetic characters and the non-alphanumeric characters around them) from the following line, keeping only the numbers.

<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>

Desired output is:
883250 869.898 86432.4 809875.22 804609 60023 59715

I don't know much about sed, so I used the following code:
a="<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>"
b=${a//[^0-9]/ }
set -- $b
echo $1 $2 $3 $4 $5 $6 $7.....

It returns a result, but it splits each number at the decimal point and breaks it into two separate values:
883250 869 898 86432 4 809875 22 804609 60023 59715
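
The split happens because the pattern `[^0-9]` treats the decimal point like any other non-digit and replaces it with a space. A minimal sketch of one possible fix, assuming bash (the `${var//pattern/replacement}` expansion is not part of POSIX sh): add the dot to the bracket expression so it is kept. The variable name `result` is only for illustration.

```shell
#!/usr/bin/env bash
a="<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>"

# Replace every character that is neither a digit nor a dot with a space.
b=${a//[^0-9.]/ }

# Left unquoted on purpose: word splitting discards the runs of spaces
# that now stand where the tags used to be.
set -- $b

# "$*" joins the positional parameters with single spaces.
result="$*"
echo "$result"
```

Using `"$*"` avoids having to list `$1 $2 $3` by hand, so the same lines work however many values the tag contains.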


Next, I tried using tr.
a="<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>"
echo $a | tr -d '[:alpha:]'

It returns the following result, but how do I get rid of the <> and </>?
<>883250 869.898 86432.4 809875.22 804609 60023 59715 </>

My mate told me I can easily use sed to remove the words and get my desired output, but I have no clue about sed. I spent some time looking at tutorials but couldn't get the syntax right. There are too many /\ \/ \/ /\ /\ in sed, which looks very confusing.

Any help would be appreciated.
Cheers
jack
# 2  
Old 12-19-2011
Code:
# echo $x | perl -e '$y=<>; $y=~s/<\/?.*?>//g; print $y'
883250 869.898 86432.4 809875.22 804609 60023 59715

# 3  
Old 12-19-2011
This may do the trick:

Code:
sed 's/<[^>]*>//g'  input-file >output

It deletes each tag: the <, every character up to the next >, and the > itself.
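
To see the pattern at work, here is a small sketch run against the sample line from the question instead of a file. The variable names `line` and `result` are only for illustration, and the second `-e` expression is an addition that is not part of the command above: it trims the space left in front of the closing tag.

```shell
line='<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>'

# First expression:  <      a literal opening bracket
#                    [^>]*  any run of characters that are not ">"
#                    >      the closing bracket
#                    g      repeat for every tag on the line
# Second expression: remove trailing whitespace (an extra, optional step).
result=$(printf '%s\n' "$line" | sed -e 's/<[^>]*>//g' -e 's/[[:space:]]*$//')

printf '%s\n' "$result"
```

Because `[^>]*` stops at the first `>`, each tag is matched separately and the numbers between the two tags are left alone.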
# 4  
Old 12-19-2011
@jackma: Also, you were pretty close with tr. Just a small extension and you would've got what you wanted:
Code:
echo $x | tr -d '[:alpha:]' | sed 's/[<>]//g'

# 5  
Old 12-19-2011
Thanks all. Got it...

I will use the tr one and pipe it to sed, as it looks easier for me to understand. :-p

a="<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>"
echo $a | tr -d '[:alpha:]' | sed 's/[</>]//g'

883250 869.898 86432.4 809875.22 804609 60023 59715

Cheers,
Thanks again, all.

---------- Post updated at 05:27 PM ---------- Previous update was at 05:23 PM ----------

Quote:
Originally Posted by agama
This may do the trick:

Code:
sed 's/<[^>]*>//g'  input-file >output

It deletes each tag: the <, every character up to the next >, and the > itself.

Thanks agama. This works...with 1 line...lol thanks mate
# 6  
Old 12-19-2011
Or with just "tr". List the characters you want to keep and use the "complement" function.

Code:
cat filename | tr -cd '[0-9]. \n'

883250 869.898 86432.4 809875.22 804609 60023 59715
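
A short sketch of how the two options combine, run on the sample line from the question rather than a file (the names `line` and `result` are only for illustration): `-c` complements the set, so it stands for every character that is not listed, and `-d` deletes whatever the set matches. Together they delete everything except digits, dots, spaces and newlines. The brackets are left out here on the assumption of a POSIX-style `tr` such as GNU tr, where `[` and `]` are ordinary characters; some older System V versions need them for ranges.

```shell
line='<measResults>883250 869.898 86432.4 809875.22 804609 60023 59715 </measResults>'

# Keep only digits, dots, spaces and newlines; delete everything else.
result=$(printf '%s\n' "$line" | tr -cd '0-9. \n')

# The square brackets make the trailing space visible: the space that
# preceded </measResults> is in the "keep" set, so it survives.
printf '[%s]\n' "$result"
```

If the trailing space matters, the output can be passed through one more filter or read with word splitting, as in the first post.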

# 7  
Old 12-19-2011
Quote:
Originally Posted by methyl
Or with just "tr". List the characters you want to keep and use the "complement" function.

Code:
cat filename | tr -cd '[0-9]. \n'
 
883250 869.898 86432.4 809875.22 804609 60023 59715


you guys are awesome!