Need help to extract a string delimited by any special character


 
# 15  
Old 06-01-2005
Instead of searching for the delimiters, do it the other way round: search for what you are interested in; everything else is then a delimiter.

With this in mind, suppose, for example, that the text you are looking for always consists of letters (or letters and digits, or only uppercase letters, etc.). In this case you could use the following line:

# sed 's/^\([A-Za-z][A-Za-z]*\).*$/\1/' <source> > <target>

This collects every letter (uppercase or lowercase), starting at the beginning of the line, into the sed variable "\1" until a non-letter is reached. The remainder of the line is then snipped off and the variable's content is printed. A line that does not start with a letter does not match and passes through unchanged.

If the text you are looking for can be limited further, change the char classes in the regular expression above. For example, if you only look for uppercase letters, you could write "[A-Z][A-Z]*" above.
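
A minimal sketch of the idea, using made-up sample lines whose delimiters differ on purpose; "first_letters" is only a hypothetical helper name, not part of sed:

```shell
# first_letters (hypothetical name): keep the leading run of letters on
# each line and drop everything from the first non-letter onwards.
first_letters() {
  sed 's/^\([A-Za-z][A-Za-z]*\).*$/\1/'
}

# Sample data invented for illustration: three different delimiters.
printf '%s\n' 'ISA*123' 'GS~456' 'ST 789' | first_letters
# prints ISA, GS and ST, one per line
```

Because the expression describes the content and not the delimiter, the same command handles "*", "~" and a blank alike.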

bakunin
# 16  
Old 06-01-2005
Thanks, bakunin. It worked.

Anyway, can anyone suggest an awk version that works on SunOS?
# 17  
Old 06-01-2005
Very interesting solution... I never thought one could collect fields into a variable using sed. Can someone explain it further? Will the above solution work if the delimiter is a space?
Also, which solution is the better one to follow, awk or sed?
Thanks,
srikanth
# 18  
Old 06-01-2005
I know you got a working solution.

I just got access to a SunOS machine and, continuing with the first suggested solution, this will work:

Code:
/usr/xpg4/bin/awk -F"[*~]" '{ print $1 }' kum.txt
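
The same bracket-expression separator can be tried on inline data. This is only a sketch: it assumes a POSIX awk is reachable as plain "awk" (on SunOS that would be /usr/xpg4/bin/awk, as above), and "first_field" plus the sample lines are made up for illustration:

```shell
# first_field (hypothetical name): split on either "*" or "~" and
# print the first field of every line.
first_field() {
  awk -F'[*~]' '{ print $1 }'
}

# Sample data invented for illustration.
printf '%s\n' 'ISA*123' 'GS~456' | first_field
# prints ISA and GS, one per line
```

Using "print" rather than "printf" with the field as the format string avoids surprises if the data ever contains a "%".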

Vino
# 19  
Old 06-01-2005
1. The preferable solution is sed, of course, since it typically uses fewer system resources than awk.

2. Of course it will work if the delimiters are spaces, since the sed script is not relying on the delimiters but the content. Delimiter is anything not part of the content, so to say.

3. In case of a single, specified delimiter char (including blanks), the most efficient solution would be to use "cut", as in "cut -d'<delimiterchar>' -f1 file".

4. The given awk solution is correct AFAICS; I don't know why it doesn't work on kumariak's system.

5. Last point: How do sed variables work:

sed variables are named \1...\9 and are defined by enclosing parts of the search string in \(...\). They can even be nested. To fully understand the effect, the best thing is to try it with a short file and a few different regexps. For instance, for a script exchanging the first and the second word in every line ("word" meaning something delimited by a space), the syntax would be:

s/\([^ ][^ ]*\) \([^ ][^ ]*\)/\2 \1/

Everything matched by the expression between the first \(...\) pair goes to \1 in the result, and everything matched by the expression enclosed in the second \(...\) pair goes to \2. In the replacement expression they are simply exchanged.
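
To try this out, here is a small sketch on made-up input; "swap_words" is a hypothetical helper name. Each group is closed with \) and the replacement keeps a space between the two variables so the words stay separated:

```shell
# swap_words (hypothetical name): exchange the first two
# space-delimited words on each line, leaving the rest untouched.
swap_words() {
  sed 's/\([^ ][^ ]*\) \([^ ][^ ]*\)/\2 \1/'
}

# Sample data invented for illustration.
printf '%s\n' 'hello world' 'one two three' | swap_words
# prints "world hello" and "two one three"
```

Only the first pair of words on a line is exchanged, because the substitution has no "g" flag.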

bakunin

Last edited by bakunin; 06-01-2005 at 08:45 AM..
# 20  
Old 06-01-2005
kumariak

If you are dealing with EDI data, the sed solution is not complete and will not work. If I may ask, what EDI software are you using?
# 21  
Old 06-01-2005
Try this:

Code:
awk -F"$(head -1 test.txt | cut -c4)" '{print $1}' test.txt

where test.txt is the name of the input file. (Better to supply it in a variable, but I just did this quickly for the purpose of showing it.) The problem isn't knowing ALL of the field delimiters you're dealing with, it's knowing which one you're dealing with...

Input file:
ISA*123
GS*456
ST*789

Output:
ISA
GS
ST

I tried it with "~", "*", and "@".
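
The same detection also combines with "cut", since only one delimiter character is involved. A sketch, assuming the delimiter really sits at position 4 of the first line, as in the sample input above; "first_field_auto" is a hypothetical helper name and the temporary file only stands in for test.txt:

```shell
# first_field_auto FILE (hypothetical name): read the delimiter from
# character 4 of the first line, then print field 1 of every line.
first_field_auto() {
  delim=$(head -n 1 "$1" | cut -c4)
  cut -d"$delim" -f1 "$1"
}

# Recreate the sample input in a temporary file.
tmp=$(mktemp)
printf '%s\n' 'ISA*123' 'GS*456' 'ST*789' > "$tmp"
first_field_auto "$tmp"
# prints ISA, GS and ST, one per line
rm -f "$tmp"
```

If the first line is shorter than four characters, the delimiter comes back empty and cut will complain, so real input would need a check first.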

djp