Grabbing strings with awk
# 1  
Old 01-04-2013

Hello everyone,
I am analyzing some data on organic solvents and am having trouble writing a command for it.

Here's a sample of my file:

Code:
 
1     ethanol
2     methanol
3     methanol/ethanol
4     ethanol/methanol
5     ethanol/DMF
6     ethyl ether/methanol
7     ethanol methanol
8     methanol ethanol
9     ethanol-Methanol
10   methanol-Ethanol
11   methanol-DMF
12   ethyl ether/Methanol
13   DMF
14   chloroform

This is a tab-separated file. I am trying to generate a new file that contains the lines with ethanol, alone or combined with anything else, using "awk".
I am also trying to generate a second file with exactly the opposite: everything that does not contain ethanol.

Please note that the word "methanol" contains "ethanol" as a substring.
Also, the separator between solvents is not consistent: sometimes a slash, sometimes a space, and sometimes a hyphen.
Do you know what command I should use to get around this problem?

I am currently using this command:
Code:
 
awk 'BEGIN{IGNORECASE=1} $2 ~ /ethanol/' filepath

But this returns the methanol lines as well. I want the lines containing ethanol in one file and everything else in another file.

Cheers,

# 2  
Old 01-04-2013
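Try the following. [[:blank:]] matches a space or a tab; a \t inside a grep bracket expression would instead match a literal backslash or the letter t, so it cannot be used for the tab separator here.
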
Code:
grep -i '[[:blank:]/-]ethanol' infile
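
Since the question asks for two files, here is a minimal sketch of how the same pattern could produce both, using grep's -v option for the complement. The file names solvents.txt, with_ethanol.txt and without_ethanol.txt are made up for the example, and the sample is rebuilt with real tabs so the snippet runs on its own.

```shell
# Rebuild the sample from the first post as a tab-separated file
# (solvents.txt is a hypothetical name, not from the thread).
printf '%s\t%s\n' 1 ethanol 2 methanol 3 methanol/ethanol 4 ethanol/methanol \
  5 ethanol/DMF 6 'ethyl ether/methanol' 7 'ethanol methanol' 8 'methanol ethanol' \
  9 ethanol-Methanol 10 methanol-Ethanol 11 methanol-DMF 12 'ethyl ether/Methanol' \
  13 DMF 14 chloroform > solvents.txt

# Lines where "ethanol" directly follows a blank, a slash or a hyphen
grep -i  '[[:blank:]/-]ethanol' solvents.txt > with_ethanol.txt

# -v inverts the match, so this file gets every other line
grep -iv '[[:blank:]/-]ethanol' solvents.txt > without_ethanol.txt
```

On this sample, lines 1, 3, 4, 5, 7, 8, 9 and 10 should end up in with_ethanol.txt and the rest in without_ethanol.txt. One caveat: the pattern only checks the character before "ethanol", so a longer name such as "ethanolamine" would also match.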


# 3  
Old 01-04-2013
Try:
Code:
perl -ne 'print if /\bethanol\b/i' file

# 4  
Old 01-04-2013
There are probably other solutions, but here is one.

Now, this solution does eliminate the / characters...

Code:
$ cat sample40.txt
1     ethanol
2     methanol
3     methanol/ethanol
4     ethanol/methanol
5     ethanol/DMF
6     ethyl ether/methanol
7     ethanol methanol
8     methanol ethanol
9     ethanol-Methanol
10   methanol-Ethanol
11   methanol-DMF
12   ethyl ether/Methanol
13   DMF
14   chloroform

$ tr "/" "\t" <sample40.txt | awk 'BEGIN{IGNORECASE=1} ($2 ~/ethanol/ && $2 !~/methanol/) || ($3 ~/ethanol/ && $3 !~/methanol/)'
1     ethanol
3     methanol  ethanol
4     ethanol   methanol
5     ethanol   DMF
7     ethanol methanol
8     methanol ethanol

# 5  
Old 01-04-2013
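Try also (this anchors the match to the start of the second field, so it only catches lines where ethanol comes first):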
Code:
awk 'BEGIN{IGNORECASE=1} $2 ~ /^ethanol/' filepath
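
For what it's worth, IGNORECASE is specific to gawk, and the commands so far either skip the hyphen-separated lines (9 and 10) or only look at the first solvent. A rough sketch of one way around both, assuming the file really is tab-separated: split the second column on slash, space or hyphen, compare each piece to "ethanol" in lower case, and write each line to one of two files. The names solvents.txt, ethanol.txt and other.txt are invented for the example.

```shell
# Rebuild the sample from the first post as a tab-separated file
# (solvents.txt is a hypothetical name, not from the thread).
printf '%s\t%s\n' 1 ethanol 2 methanol 3 methanol/ethanol 4 ethanol/methanol \
  5 ethanol/DMF 6 'ethyl ether/methanol' 7 'ethanol methanol' 8 'methanol ethanol' \
  9 ethanol-Methanol 10 methanol-Ethanol 11 methanol-DMF 12 'ethyl ether/Methanol' \
  13 DMF 14 chloroform > solvents.txt

awk -F'\t' '
{
    found = 0
    # Break the solvent column into words on slash, space or hyphen;
    # tolower() makes the comparison case-insensitive in any awk
    n = split(tolower($2), part, /[\/ -]/)
    for (i = 1; i <= n; i++)
        if (part[i] == "ethanol") found = 1
    # Send the unmodified line to one of the two output files
    print > (found ? "ethanol.txt" : "other.txt")
}' solvents.txt
```

Because each piece is compared as a whole word, "methanol" never counts as "ethanol", and the original separators are left untouched in the output.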

# 6  
Old 01-04-2013
Quote:
Originally Posted by joeyg
Now, this solution does eliminate the / characters...


Thanks for the command, dude. I honestly don't understand what it means, but it worked for me! (I mean the first part: tr "/" "\t".)
It would be very kind of you to explain what you did there!

Thank you!
# 7  
Old 01-04-2013
tr "/" "\t" translates every forward slash (/) into a horizontal tab (\t).

For further reference on the tr command, check its manual page (man tr).
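
A quick way to see the effect is to run tr on a single line. This is only an illustrative sketch; the variable name converted is made up for the example.

```shell
# One sample line with a slash between the two solvents (tab after the number)
converted=$(printf '3\tmethanol/ethanol\n' | tr '/' '\t')
printf '%s\n' "$converted"

# awk now sees three whitespace-separated fields instead of two,
# which is why the awk command can test $2 and $3 separately
printf '%s\n' "$converted" | awk '{ print NF, $3 }'
```

The last command should print "3 ethanol": three fields, with ethanol as the third.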