Extracting a substring between two tokens within a string


 
# 1  
Old 01-03-2020

Hello.


First, best wishes to everybody.


Here are the contents of the input file ("$INPUT1"):
Code:
BASH_FUNC_message_begin_script%%=() {  local -a L_ARRAY;
BASH_FUNC_message_debug%%=() {  local -a L_ARRAY;
BASH_FUNC_message_end_script%%=() {  local -a L_ARRAY;
BASH_FUNC_message_error%%=() {  local -a L_ARRAY;

This simple sed command works well for some kinds of tokens.
These tokens, for example, work well:

Code:
TOKEN1="^BASH_FUNC_"
TOKEN2="\%\%\=\(\)"
#
# # between but excluding TOKEN1 and TOKEN2
#
sed -e 's/'$TOKEN1'\(.*\)'$TOKEN2'.*/\1/'  "$INPUT1"

Which returns:
Code:
message_begin_script
message_debug
message_end_script
message_error

Another working example:
Code:
#
# # between but including TOKEN1 and excluding TOKEN2
#
sed -e 's/\('$TOKEN1'.*\)'$TOKEN2'.*/\1/'  "$INPUT1"

Which returns:
Code:
BASH_FUNC_message_begin_script
BASH_FUNC_message_debug
BASH_FUNC_message_end_script
BASH_FUNC_message_error

Now I have another TOKEN2 of this kind: TOKEN2="local -a L_ARRAY"
and I get an error:
Code:
TOKEN1="^BASH_FUNC_"
TOKEN2="local -a L_ARRAY"
#
# # between but excluding TOKEN1 and TOKEN2
#
sed -e 's/'$TOKEN1'\(.*\)'$TOKEN2'.*/\1/'  "$INPUT1"

Code:
sed -e 's/'^BASH_FUNC_'\(.*\)'local -a L_ARRAY'.*/\1/'  "/tmp/MY_INPUT1.txt" 
sed: -e expression #1, char 24: unterminated `s' command

I have tried escaping the spaces and the '-', without success.
Any help is welcome.
# 2  
Old 01-03-2020
Hi, try double quotes:
Code:
sed "s/${TOKEN1}\(.*\)${TOKEN2}.*/\1/" "$INPUT1"

Curly braces around variable names are good practice inside strings: they improve readability, and they prevent expansion errors when the variable is immediately followed by characters that could be read as part of its name.
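
A small illustration of the brace point, as a sketch only. The variable names TOKEN and TOKEN_message are made up for this example and do not come from the thread:

```shell
TOKEN="BASH_FUNC"
unset TOKEN_message   # make sure the hypothetical longer name is not set

# Without braces the shell looks up a variable called TOKEN_message,
# which is unset, so the result is an empty string.
without_braces="$TOKEN_message"

# With braces the variable name ends at the closing brace.
with_braces="${TOKEN}_message"

echo "without braces: '$without_braces'"
echo "with braces:    '$with_braces'"
```

In the sed command above the character after each variable is a backslash or a dot, so the braces are not strictly required there; they just make the boundaries obvious.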

--
What happens with the single-quote construct is that the variable expansions are unprotected, so the shell splits the value of TOKEN2 into three fields: local, -a and L_ARRAY. These fields are joined to the adjacent quoted text and passed to sed as separate arguments, effectively like so:
Code:
sed -e 's/^BASH_FUNC_\(.*\)local' -a 'L_ARRAY.*/\1/'    # wrong

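This is why sed complains of an unterminated s command: the script it receives, s/^BASH_FUNC_\(.*\)local, ends after 24 characters without a closing delimiter, which matches the "char 24" in the error message.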

With double quotes this does not happen: the variables are still expanded, but the result is protected from field splitting by the shell. Double quotes also make for simpler, easier-to-read code.
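
If you want to see the splitting for yourself, something like this sketch should show it. It uses the tokens from post # 1; count_args is a hypothetical helper written for this example that only reports how many arguments it was given:

```shell
TOKEN1="^BASH_FUNC_"
TOKEN2="local -a L_ARRAY"

# Hypothetical helper: print the number of arguments received.
count_args() { echo "$#"; }

# Unquoted expansions: the shell splits TOKEN2 at its spaces,
# so the intended single sed script arrives as several arguments.
unquoted=$(count_args 's/'$TOKEN1'\(.*\)'$TOKEN2'.*/\1/')

# Double-quoted: the whole sed script stays one argument.
quoted=$(count_args "s/${TOKEN1}\(.*\)${TOKEN2}.*/\1/")

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
# prints: unquoted: 3 argument(s), quoted: 1 argument(s)
```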

Last edited by Scrutinizer; 01-03-2020 at 02:39 PM..
# 3  
Old 01-03-2020
Or
Code:
sed 's/'"$TOKEN1"'\(.*\)'"$TOKEN2"'.*/\1/'  "$INPUT1"

The point is that each $var must be within "double quotes", so that the shell does not perform word splitting and filename generation on it.
Then the first TOKEN2 must be:
Code:
TOKEN2="%%=()"

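An unescaped ( ) is not special in a BRE (basic regular expression) and matches literal parentheses, whereas \( \) delimits a capture group.
The -e option (script argument follows) is allowed but not needed here: when no -e or -f option is given, sed takes the first non-option argument as the script.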
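
As a rough check of the BRE point, here is a sketch that runs the quoted command on one sample line from post # 1. The variable names line and name are only for this example:

```shell
line='BASH_FUNC_message_debug%%=() {  local -a L_ARRAY;'
TOKEN1="^BASH_FUNC_"
TOKEN2="%%=()"

# \( \) captures the function name; the plain ( ) coming from TOKEN2
# match the literal parentheses in the input line.
name=$(printf '%s\n' "$line" | sed "s/${TOKEN1}\(.*\)${TOKEN2}.*/\1/")

echo "$name"
# prints: message_debug
```

Note that this only holds for tokens without other regex-special characters such as . * [ or /; those would still need escaping before being placed into the sed script.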
# 4  
Old 01-03-2020
Quote:
Originally Posted by Scrutinizer

Thank you very much

--- Post updated at 21:14 ---

Quote:
Originally Posted by MadeInGermany
Thank you.