Extract lines of text based on a specific keyword

# 1  
Old 08-14-2009

I regularly extract lines of text from files based on the presence of a particular keyword; I place the extracted lines into another text file. This takes about 2 hours to complete using the "sort" command then Kate's find & highlight facility.

I've been reading the forum & googling and can find scripts and shell commands which extract a particular string from a file but nothing that extracts a complete line based on a keyword/string within a line.

Here's an example of the lines of data I'm using:

Code:
<li><a href="http://some-website1.com/"><b>CategoryOne: </b>Description defgh</a></li>
<li><a href="http://some-website2.com/"><b>CategoryThree: </b>Description cdefg</a></li>
<li><a href="http://some-website3.com/"><b>CategoryTwo: </b>Description bcdef</a></li>
<li><a href="http://some-website3.com/"><b>CategoryOne: </b>Description abcde</a></li>
<li><a href="http://some-website2.com/"><b>CategoryOne: </b>Description zabcd</a></li>

The data is always a list item.

I need something that will find the line containing a specified category which will then extract the complete line and move it to a new text file (preferably named after that category). For example:

If I search for "<b>CategoryOne</b>" then I need it to move every line containing "<b>CategoryOne</b>" to the text file categoryone.txt

Please help...
# 2  
Old 08-14-2009
Something like this?
Code:
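# Split each line on <, : or >; field 8 is then the category name, used (lowercased) as the output file name.
awk -F'[<:>]' '{f=tolower($8)".txt";print >> f;close(f)}' file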

# 3  
Old 08-14-2009
Quote:
Originally Posted by DionDeVille
I've been reading the forum & googling and can find scripts and shell commands which extract a particular string from a file but nothing that extracts a complete line based on a keyword/string within a line.
Ahem... I think you are overcomplicating this. This is what "grep" was built for! Your solution is a one-liner:

Code:
grep "your-criteria" /path/to/source > your-criteria.html

Replace the value of the criteria with a variable, add some error handling, and you are done. You could also refine the search criteria to include the list tag in the line, etc., but that is all just bells and whistles.
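
A minimal sketch of the "variable plus error handling" idea, wrapped in a shell function. The function name extract_category, the lowercase ".txt" output name, and the use of grep -F (fixed-string matching, so characters like < and > are taken literally) are assumptions made for illustration, not part of the one-liner above. It also assumes the keyword contains no slashes, since it doubles as a file name.

```shell
#!/bin/sh
# Hypothetical helper: extract_category KEYWORD SOURCE_FILE
# Copies every line of SOURCE_FILE containing KEYWORD into <keyword>.txt
# (keyword lowercased). Name and output convention are assumptions.
extract_category() {
    keyword=$1
    source_file=$2

    # Basic argument checks before touching any files.
    if [ -z "$keyword" ] || [ ! -r "$source_file" ]; then
        echo "usage: extract_category KEYWORD SOURCE_FILE" >&2
        return 2
    fi

    out=$(printf '%s' "$keyword" | tr '[:upper:]' '[:lower:]').txt

    # -F: treat the keyword as a fixed string, not a regular expression.
    # --: end of options, in case the keyword starts with a dash.
    grep -F -- "$keyword" "$source_file" > "$out"
    status=$?

    case $status in
        0) echo "matching lines written to $out" ;;
        1) rm -f "$out"; echo "no lines matched '$keyword'" >&2 ;;
        *) rm -f "$out"; echo "grep failed on $source_file" >&2 ;;
    esac
    return $status
}
```

Called as extract_category CategoryOne /path/to/source, this would leave the matching lines in categoryone.txt. Note that grep copies the lines; the source file itself is left unchanged.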

I hope this helps.

bakunin
# 4  
Old 08-14-2009
Thanks for your replies.

Danmero, to be honest, I haven't a clue what you've posted means. I know it's an Awk script and I believe it uses regular expressions but I know little about either of them. I'll try and work it out.

Bakunin, that worked like a dream, thank you so, so much.
# 5  
Old 08-14-2009
Quote:
Originally Posted by DionDeVille
This takes about 2 hours to complete using the "sort" command then Kate's find & highlight facility.
....
I need something that will find the line containing a specified category which will then extract the complete line and move it to a new text file (preferably named after that category).
It looks like you are trying to split a large file into multiple files named after each category, and that is what the awk one-liner will do.

bakunin's grep solution will work if you want to split out one category at a time.
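
For anyone unsure what the awk one-liner is doing, here is the same idea spread over several lines with comments. This is a sketch, assuming every line has the same layout as the sample data in the first post; the function name split_by_category and the guard against lines with no category field are additions for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper: split_by_category SOURCE_FILE
# Writes each line to a file named after its category, e.g. categoryone.txt.
split_by_category() {
    awk -F'[<:>]' '
    # -F sets the field separator to any one of <, : or >.
    # For the sample lines, that makes field 8 the category name
    # (for example CategoryOne). Skip lines where field 8 is missing.
    NF >= 8 && $8 != "" {
        out = tolower($8) ".txt"   # output file named after the category
        print >> out               # append the complete, unmodified line
        close(out)                 # avoid running out of open file handles
    }' "$1"
}
```

Because the lines are appended with >>, running it twice on the same input would duplicate the lines in each category file, so clear out old output files first.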