How to remove all text except pattern


 
# 1  
Old 01-04-2010

I have a nasty HTML file with 2000+ symbols on a single line. I need to remove all of the code except title="Some title..." and store those titles in a file (the whole text is in the variable text).
I've tried something like this:


Code:
echo $text | sed 's/.*\(title=\".+\"\).*/\1/' > titles.html

But it does not work: the script runs fine, yet nothing changes. The file "titles.html" looks just like the text variable.

Can someone help me, please?

Last edited by Lukasito; 01-04-2010 at 05:45 PM.. Reason: little error
# 2  
Old 01-04-2010
Try:

Code:
echo "$text" | sed 's/^.*\(title="[^"]*"\).*$/\1/' > titles.html

In sed's default basic regular expression grammar, + is not special; it is a literal plus sign.

Also, the quotes around $text that I added preserve any runs of IFS characters (most likely spaces, if any) that may occur in the title.
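
A minimal sketch of both points, assuming a made-up sample line for $text (the real data was never posted in this thread). The variable names bre_plus, bre_star and unquoted are only for illustration.

```shell
# Hypothetical sample line; note the runs of spaces.
text='<a href="x.html"   title="Some  title">link</a>'

# In a basic regular expression "+" is a literal plus sign, so the
# pattern never matches and sed prints the line unchanged.
bre_plus=$(printf '%s\n' "$text" | sed 's/.*\(title=".+"\).*/\1/')

# A bracket expression followed by "*" works in plain BRE.
bre_star=$(printf '%s\n' "$text" | sed 's/^.*\(title="[^"]*"\).*$/\1/')

# Unquoted expansion: the shell splits on IFS characters and echo
# rejoins the words with single spaces, so runs of spaces collapse.
unquoted=$(echo $text)

printf '%s\n' "$bre_plus" "$bre_star" "$unquoted"
```

With this sample, the first result is identical to the input, the second is just the title attribute with both spaces intact, and the third has every run of spaces squeezed to one.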

Regards,
alister

Last edited by alister; 01-04-2010 at 06:02 PM.. Reason: added code tags
# 3  
Old 01-04-2010
Is this what you are looking for?

Code:
>echo "title=123abc and more ^&@%#$%&@"
"title=123abc and more ^&@%#$%&@"

>echo "title=123abc and more ^&@%#$%&@" | tr -cd '[:alpha:][:space:][:digit:]'
title123abc and more

# 4  
Old 01-04-2010
Still everything ends up in the output file.
# 5  
Old 01-04-2010
Code:
echo "$text" | sed -ne '/^.*\(title="[^"]*"\).*$/s//\1/p' > titles.html

If that doesn't work, then provide some sample data and desired output.
# 6  
Old 01-04-2010
Can you post the output of the following command here?

Code:
echo $text

tyler_durden

Also, are the "2000+ symbols" -
(a) special, but printable, characters like "@", "$", "%" etc. or
(b) non-printable characters like those for ASCII 0, 1, 2, etc.
# 7  
Old 01-04-2010
Code:
echo "$text" | grep -o 'title="[^"]*"'
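
A sketch of how grep -o behaves when one long line holds several title attributes, assuming hypothetical sample markup (the original data was never posted). Note that -o is an extension found in GNU and BSD grep, not part of POSIX. The names greedy and each are only for illustration.

```shell
# Hypothetical one-line markup with two title attributes.
text='<a title="First">x</a><img title="Second one" alt="pic">'

# Greedy: ".*" runs on to the last double quote on the line,
# so a single oversized match comes back.
greedy=$(printf '%s\n' "$text" | grep -o 'title=".*"')

# [^"]* cannot cross a double quote, so each match ends at its own
# closing quote and grep -o prints one title per output line.
each=$(printf '%s\n' "$text" | grep -o 'title="[^"]*"')

printf '%s\n' "$greedy" "$each"
```

The sed commands earlier in the thread print at most one title per input line, because their leading .* is greedy and consumes everything up to the last title= on the line. For a file that keeps all of its markup on one row, the grep -o form with [^"]* is the one that lists every title.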
