Help - Search for string, then do string operation on line


 
# 1  
Old 08-17-2009

Hi,

I wish to find all lines that contain a specific search word, and then do a few string operations on each such line. The idea is to "fix" a file which has been moved from Windows to Unix.

Using unix - Sun Solaris

Test input ("t2.sas")

Code:
 
statement1
statement2
libname  yahoo          '/analytics/CODE';
statement 3
libname google  "/analytics/india/CODE/test_DIR" ;
libname msn '/analytics/month end docs for sas/Wcode';
statement 4

Required Actions:
1. Find the line which contains "libname". This is not necessarily at the start of the line (e.g. there may be leading spaces), but it is always the first word on the line.
2. Convert the entire libname line to lowercase
3. Remove space ONLY in the path (defined by quotes, can be single or double) and replace with underscore.

Ideal output will be:

Code:
 
statement1
statement2
libname  yahoo          '/analytics/code';
statement 3
libname google  "/analytics/india/code/test_dir" ;
libname msn '/analytics/month_end_docs_for_sas/wcode';
statement 4

I can use sed and awk (no Perl). Also, I don't have the GNU versions.

Thanks.
# 2  
Old 08-17-2009
This seems to be homework. There is a separate forum with specific rules for homework questions, so it won't be (and shouldn't be) answered here.

-closed-

/edit: deepaksinbox explained that it was NOT homework at all and I was wrong. My apologies, the thread is open again.

-reopened-

bakunin

Last edited by bakunin; 08-17-2009 at 08:16 AM..
# 3  
Old 08-17-2009
Try this:

Code:
awk -F"'" -v OFS="'" '/libname/ {if (NF > 2) gsub(/ /, "_", $2); $0 = tolower($0)} 1' file |
awk -F'"' -v OFS='"' '/libname/ && NF > 2 {gsub(/ /, "_", $2)} 1'

Use nawk or /usr/xpg4/bin/awk on Solaris.
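
The approach above relies on using the quote character as the field separator, so that the path ends up in field 2. A minimal sketch of just that step, using one line from the test data (the variable names line and path are only for illustration):

```shell
# One line from the thread's test input.
line="libname msn '/analytics/month end docs for sas/Wcode';"

# With ' as the field separator, $1 is the text before the opening quote,
# $2 is the path itself, and $3 is whatever follows the closing quote.
path=$(printf '%s\n' "$line" | awk -F"'" '{ print $2 }')

printf '%s\n' "$path"
```

Because only $2 is handed to gsub, spaces outside the quotes are left alone.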
# 4  
Old 08-17-2009
The trick is to use a "range" construct with sed (strictly speaking, a group of commands in curly braces attached to an address). It looks like this:

Code:
/expression/ {
            command1
            command2
            ....
            }

All the commands will be executed only on the lines which match "/expression/". Think of it as a sort-of "if....endif" construct: if the expression is matched, all the commands inside the curly braces are applied.
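
As a small sketch of this construct in action (the input lines and the two substitutions are made up for illustration and have nothing to do with the SAS file):

```shell
# Made-up sample input, three lines.
input='keep me
fix THIS line
keep me too'

# Only lines matching /fix/ enter the braces; both commands run on them.
# All other lines pass through untouched.
result=$(printf '%s\n' "$input" | sed '/fix/ {
    s/THIS/this/
    s/line/LINE/
}')

printf '%s\n' "$result"
```

Only the middle line is changed; the first and last lines are printed as they came in.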

Applying this to your problem:

Quote:
1. Find the line which contains "libname". This is not necessarily at the start of the line (e.g. there may be leading spaces), but it is always the first word on the line.
2. Convert the entire libname line to lowercase
3. Remove space ONLY in the path (defined by quotes, can be single or double) and replace with underscore.

My first question would be whether "libname" itself could be mixed case too. Your (otherwise well stated) requirements are not completely clear on this. I suppose for now that "libname" itself is always lower case. Let's start with your point 1 (in the following scripts non-printing characters are written out: replace "<spc>" with a literal space, "<tab>" with a literal tab character):

Code:
sed -n '/^[<spc><tab>]*libname/ {
              p
              }' t2.sas

This prints only the matched lines and gives us a hint whether our regexp is correct so far. Analyze the output and ask yourself:

1. Are all the lines I want to match matched?
2. Are lines matched that I do not want to be matched?

If the answer to the first question is no, or to the second is yes, the regexp has to be refined. Suppose the test succeeded. On to your requirement 2: sed has a special command, y, for replacing a list of characters with another list of characters. We use it with the whole alphabet as the list:

Code:
sed -n '/^[<spc><tab>]*libname/ {
              y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
              p
              }' t2.sas

Again: let this run, analyze the output, and check whether it is still what you really want. Don't worry that only the lines we work on are printed for now; this is just to make the checks easier.

On to requirement 3: I take it that by "space" you mean only space characters, not "white space" in general (which would include tab characters too). This is a tricky one, because we have to set up a sort-of loop to reapply the same regexp over and over until all space characters are replaced. Basically we again use a range construct (see above) and inside this range have a "branch" command which jumps back to the beginning of the loop after replacing the next space character. Once all the space characters are replaced, the range expression won't match any more and execution continues:

Code:
sed -n '/^[<spc><tab>]*libname/ {
              y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
              :loopstart
              /["'][^"']* [^"']*["']/ {
                     s/\(["'][^"']*\) /\1_/
                     b loopstart
                     }
              p
              }' t2.sas
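
To watch the loop on its own, here is a sketch on a made-up line that uses < and > as delimiters instead of quotes (the sample text and the label name "again" are assumptions for illustration only):

```shell
# Made-up sample line: spaces inside <...> should become underscores,
# spaces outside must stay as they are.
line='copy this <my long file name> now'

# Each pass replaces one space inside the delimiters (the last one, because
# the regexp is greedy), then branches back to the label. When no space is
# left between < and >, the address no longer matches and the loop ends.
result=$(printf '%s\n' "$line" | sed '
:again
/<[^<>]* [^<>]*>/ {
    s/\(<[^<>]*\) /\1_/
    b again
}')

printf '%s\n' "$result"
```

Three passes are needed here, one per space between the delimiters.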

Again, test extensively. This is a complicated regexp, and errors are easy to make and sometimes difficult to spot. Once you are satisfied with the results, we make one last modification: drop the -n option and the p command, so that all the other lines, which we have filtered out so far for clarity, pass through as well:

Code:
sed '/^[<spc><tab>]*libname/ {
              y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
              :loopstart
              /["'][^"']* [^"']*["']/ {
                     s/\(["'][^"']*\) /\1_/
                     b loopstart
                     }
              }' t2.sas

I hope this helps.

bakunin
# 5  
Old 08-18-2009

Hi Bakunin,

First up, thanks a TON for the excellent solution and time taken to provide the explanation.

I have tested your solution, it works fine - in principle right now. I am facing a problem with one of the regex below, not sure whether it is related to the awk/platform I use:

This is not working:

Code:
 
/["'][^"']* [^"']*["']/

Here is the full output:

Code:
 
$ uname -a
SunOS sasuat1 5.10 Generic_138888-08 sun4u sparc SUNW,Sun-Fire-V890
$
$ cat t2.sas
statement1
statement2
libname  yahoo          '/analytics/CODE';
statement 3
libname google  "/analytics/india/CODE/test_DIR" ;
libname msn '/analytics/month end docs for sas/Wcode';
statement 4
$
$ sed '/^[<spc><tab>]*libname/ {
> y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
> :lpstrt
> /["'][^"']* [^"']*["']/ {
']* [^]*["]/: not found
$ sed: command garbled: /["][

Something unmatched ?

Thanks

---------- Post updated at 06:53 PM ---------- Previous update was at 03:14 PM ----------

Additional info:

Code:
 
$ which awk
/usr/xpg4/bin/awk
$

# 6  
Old 08-18-2009
Quote:
Originally Posted by deepaksinbox
I have tested your solution, it works fine - in principle right now. I am facing a problem with one of the regex below, not sure whether it is related to the awk/platform I use:

This is not working:

Code:
 
/["'][^"']* [^"']*["']/

No, it is not platform-related, at least not in a simple manner. The problem is that the whole script is single-quoted, so the single quotes inside the regexp end the quoted string early and confuse the shell. sed complains because the shell never passes it the script as it was written. The problem can be reduced to this:

Code:
sed 's/something/other/' -> works
sed 's/'/something/' -> will not work

because for the shell parsing this, there is one quoted string "'s/'", then an unquoted string "/something/" and then the beginning of another quoted string whose closing quote is missing. This will lead to an error.
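
For comparison, a commonly used shell idiom for getting a literal single quote into a single-quoted word is to close the quoted string, add one backslash-escaped quote, and open a new quoted string. A minimal sketch (the placeholder <q> is made up, it only makes the replaced quotes visible):

```shell
# One line from the thread's test input.
line="libname msn '/analytics/month end docs for sas/Wcode';"

# The sequence  '\''  ends the quoted string, adds one escaped quote and
# starts a new quoted string, so sed receives the script:  s/'/<q>/g
result=$(printf '%s\n' "$line" | sed 's/'\''/<q>/g')

printf '%s\n' "$result"
```

This works, but it quickly becomes hard to read in a regexp that needs several quotes.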

Probably there is a more elegant solution to this, but this is the simplest one that came to me: replace all the problematic characters with something unlikely to occur in the text, and only at the end swap the quotation marks back in. As you can see, a single quote can be used in a double-quoted string without problems, but inside a single-quoted string escaping won't work.

Here is the new version, which runs perfectly against your test data on my Linux (Ubuntu) system (" is temporarily replaced by @@@, and ' by @@):

Code:
sed -e "s/'/@@/g;s/\"/@@@/g" \
    -e '/^[<spc><tab>]*libname/ {
              y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
              :loopstart
              /@@[^@]* [^@]*@@/ {
                     s/\(@@[^@]*\) /\1_/
                     b loopstart
                     }
              }' \
    -e "s/@@@/\"/g;s/@@/'/g" t2.sas
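
Another possible way around the quoting problem, sketched here under some assumptions, is to keep the sed script out of the shell's quoting altogether by writing it to a file and running it with sed -f. The file name fix.sed, the temporary directory, the use of mktemp and the shortened label "loop" are choices made for this illustration, not part of the thread:

```shell
# Recreate the thread's test file in a temporary directory (assumes mktemp -d).
tmpdir=$(mktemp -d)
cat > "$tmpdir/t2.sas" <<'EOF'
statement1
statement2
libname  yahoo          '/analytics/CODE';
statement 3
libname google  "/analytics/india/CODE/test_DIR" ;
libname msn '/analytics/month end docs for sas/Wcode';
statement 4
EOF

# A literal tab for the bracket expression (stands in for <tab>).
tab=$(printf '\t')

# Inside a here-document, ' and " are ordinary characters, so the regexps
# from post #4 can be written as they are. Only ${tab} is expanded.
cat > "$tmpdir/fix.sed" <<EOF
/^[ ${tab}]*libname/ {
    y/ABCDEFGHIJKLMNOPQRSTUVWXYZ/abcdefghijklmnopqrstuvwxyz/
    :loop
    /["'][^"']* [^"']*["']/ {
        s/\(["'][^"']*\) /\1_/
        b loop
    }
}
EOF

result=$(sed -f "$tmpdir/fix.sed" "$tmpdir/t2.sas")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With this variant no placeholder characters are needed, so lines that happen to contain "@@" would not be affected.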

I hope this helps.

bakunin
# 7  
Old 08-19-2009

Bakunin,

Works perfectly fine now, and solves my problem. Brilliant stuff, thanks a ton again.

Cheers