All strings within two special chars


 
# 8  
Old 01-13-2013
Quote:
Originally Posted by Viernes
Yes you have a point. I apologise about this.

1. No, it shouldn't be skipped. So for
Code:
Al/DET+>ax/NOUN+

I want to get
Code:
Al + >ax

2. A. If there is no '+' nor '/', the line should be skipped. Although the file doesn't contain such a line.
3. Use the '+' as the delimiter in this case.
So for
Code:
Al+>ax/NOUN+
Al+>a+x/NOUN+

I want to get
Code:
Al + >ax
Al + >a + x

4. A
OK. This set of requirements is MUCH different than I expected from your original problem statement: "I want to get all strings that starts with '+' and ends with '/'. Then I want the strings to be separated by ' + '."

Let me see if I understand what you want by stating rules that will outline the logic of an algorithm to process your input data:

If an input line does not contain a '/' character, skip to the next line. In this case no output will be produced for that line.

Otherwise the following steps shall be performed in sequence:
  1. For each '/' character found in a line, that character and everything following it up to (but not including) the next '+' or the '\n' at the end of the line (whichever comes first) shall be deleted.
  2. If the first character on a line is '+', that character shall be deleted.
  3. If the last character on a line is '+', that character shall be deleted.
  4. Every remaining '+' in the line shall be changed to ' + '.
  5. Print the revised contents of the line.
Is this what you want? If not, please tell me what I have wrong.
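As a rough illustration only, the rules above can be sketched as one sed expression per step. The function name strip_tags is a hypothetical helper, not something from the original request, and the sketch assumes input lines carry no trailing blanks.

```shell
# strip_tags: hypothetical helper; reads lines on stdin, writes results to stdout.
strip_tags() {
    # 1st expression: skip any line that contains no "/".
    # 2nd: delete each "/" and everything up to (not including) the next "+" or end of line.
    # 3rd: delete a leading "+".   4th: delete a trailing "+".
    # 5th: change every remaining "+" to " + " (sed then prints the line).
    sed -e '/\//!d' \
        -e 's|/[^+]*||g' \
        -e 's/^+//' \
        -e 's/+$//' \
        -e 's/+/ + /g'
}

# The two examples from the quoted post, plus a line with no "/" that is skipped.
printf '%s\n' 'Al/DET+>ax/NOUN+' 'Al+>a+x/NOUN+' 'no tags here' | strip_tags
# prints:
# Al + >ax
# Al + >a + x
```

Keeping each rule in its own -e expression makes it easy to compare the script against the numbered steps and to change a single rule if the requirements shift again.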
# 9  
Old 01-13-2013
Yes, precisely.

For this:
Code:
+$/ABBREV+ 
+$A$/NOUN
$A$/NOUN+At/NSUFF_FEM_PL+K/CASE_INDEF_ACC+
$A$/NOUN+At/NSUFF_FEM_PL+K/CASE_INDEF_GEN

Output shall be this:
Code:
$
$A$
$A$ + At + K 
$A$ + At + K

Thanks a lot!
# 10  
Old 01-13-2013
Quote:
Originally Posted by Viernes
Yes, precisely.

For this:
Code:
+$/ABBREV+ 
+$A$/NOUN
$A$/NOUN+At/NSUFF_FEM_PL+K/CASE_INDEF_ACC+
$A$/NOUN+At/NSUFF_FEM_PL+K/CASE_INDEF_GEN

Output shall be this:
Code:
$
$A$
$A$ + At + K 
$A$ + At + K

Thanks a lot!
OK, good. Here are a couple of ways to do it:
Code:
sed '/^[^\/]*$/d;s|/[^+]*+|+|g;s|/.*$||g;s/^+//;s/+$//;s/+/ + /g' input_file

and
Code:
awk '
index($0, "/") {                # for any line containing a "/"
        gsub("/([^+]*|$)", "")  # Remove all "/*" up to next "+" or EOL
        gsub(/(^[+]|[+]$)/, "") # Remove leading and trailing "+"
        gsub(/[+]/, " + ")      # Replace all "+" with " + "
        print                   # Print the modified line
}' input_file

If you are using a Solaris system, use /usr/xpg4/bin/awk or nawk instead of awk.
# 11  
Old 01-13-2013
Another sed approach (not sure if it's standard sed!):
Code:
$ sed '/\//!d; s:/[^+]*+*: + :g; s:^+\|+ *$::g;' file
$ 
$A$ 
$A$ + At + K 
$A$ + At + K

# 12  
Old 01-13-2013
Quote:
Originally Posted by RudiC
Another sed approach (not sure if it's standard sed!):
Code:
$ sed '/\//!d; s:/[^+]*+*: + :g; s:^+\|+ *$::g;' file
$ 
$A$ 
$A$ + At + K 
$A$ + At + K

It's almost standard sed. The last sed command here, s:^+\|+ *$::g, relies on an extension supplied by some implementations of sed that makes \| in a BRE behave as | does in an ERE. The standards do not define this behavior, so a script that depends on it is not portable. In a sed that treats \| as a literal <vertical_bar>, this BRE would only match a line that consisted solely of a <plus_sign>, a <vertical_bar>, and a <plus_sign> followed by zero or more <space>s. On OS X (and, I expect, on several other sed implementations) this script, therefore, produces the output:
Code:
+$ + 
+$A$ + 
$A$ + At + K + 
$A$ + At + K +

instead of the desired output.

It also handles strings containing adjacent '+'s in input lines differently, but I'm not sure that that would make any difference for the input that would be expected here.

Your use of /\//!d is much easier to read than the /^[^\/]*$/d I used and serves exactly the same purpose (delete any line that does not contain a '/').
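One way to avoid the alternation, sketched here under the assumption that only the sample data shown in this thread matters, is to split the last command into two substitutions. The function name portable_strip is hypothetical, and the final expression is widened slightly (an illustrative change, not part of the quoted command) so that it also removes the blanks around the trailing '+'.

```shell
# portable_strip: hypothetical helper; same idea as the quoted command,
# but without the \| alternation, so only plain BRE features are used.
portable_strip() {
    sed -e '/\//!d' \
        -e 's:/[^+]*+*: + :g' \
        -e 's:^+::' \
        -e 's: *+ *$::'
}

printf '%s\n' '+$/ABBREV+ ' '+$A$/NOUN' \
    '$A$/NOUN+At/NSUFF_FEM_PL+K/CASE_INDEF_ACC+' \
    '$A$/NOUN+At/NSUFF_FEM_PL+K/CASE_INDEF_GEN' | portable_strip
# prints:
# $
# $A$
# $A$ + At + K
# $A$ + At + K
```

Because every "/" tag is turned into " + ", each surviving line ends with a " + " token, and the last expression removes exactly that token.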
# 13  
Old 01-14-2013
The +* at the end of the search pattern matches zero '+' characters (at end of line) or any number of them, which simplifies the code. Should undesired side effects show up, we would need to resort to a standard solution as you proposed. In fact, that might be easier to read than a complex ERE...
# 14  
Old 01-15-2013
In fact now I got
Code:
$ +  
$A$ +  
$A$ + At + K
$A$ + At + i

Is there a way to get rid of the + as the last token? And get something like:
Code:
$
$A$
$A$ + At + K
$A$ + At + i
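One possible cause, stated as an assumption because the real input file is not shown: a trailing " + " followed by blanks is what the earlier scripts would produce if some input lines end in a blank (or a carriage return) after the final '+', since the "trailing '+'" rule then no longer matches. A minimal sketch that trims such characters first, using a hypothetical function name strip_tags_awk:

```shell
# strip_tags_awk: hypothetical helper; trims trailing blanks/CR before
# applying the same steps as the awk script earlier in the thread.
strip_tags_awk() {
    awk '
    { sub(/[ \t\r]+$/, "") }        # drop trailing spaces, tabs and CRs
    index($0, "/") {                # only lines containing a "/"
        gsub(/\/[^+]*/, "")         # remove each "/" tag up to next "+" or EOL
        gsub(/^[+]|[+]$/, "")       # remove leading and trailing "+"
        gsub(/[+]/, " + ")          # replace remaining "+" with " + "
        print
    }'
}

# Sample lines with trailing blanks, as assumed above.
printf '%s\n' '+$/ABBREV+ ' '+$A$/NOUN+  ' | strip_tags_awk
# prints:
# $
# $A$
```

If the input turns out to have no trailing blanks, the first rule is a no-op and the rest behaves like the earlier awk script.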
