02-11-2009
Please help modify solution
I am trying to extract .co.uk domains from HTML, using the command:
cat $DIR/oldfile.txt | tr " " "\n" | grep [A-Za-z0-9_\.-].co.uk > $DIR/newfile.txt
The problem is that this command matches:
/>domain.co.uk<br
/>domain.co.uk<br
/>domain.co.uk<br
etc
How do I modify my regexp to match alphanumeric chars only? (apart from the dots and possible hyphens)
Many Thanks,
Hal
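One possible approach, offered as a sketch rather than a definitive answer: the bracket expression in the command above matches only a single character, the dots in `.co.uk` are unescaped (so they match any character), and grep prints the whole line, markup included. Assuming a grep that supports `-o` (GNU and BSD grep do; it is not required by POSIX), you can print only the matched text. The function name `extract_couk_domains` is hypothetical, chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read text on stdin, print each .co.uk host name on its own line.
extract_couk_domains() {
    # -o  print only the part of the line that matches, so "/>" and "<br" are dropped
    # -E  use extended regular expressions
    # The name must start with a letter or digit, may continue with letters,
    # digits, dots and hyphens, and must end in a literal ".co.uk".
    grep -oE '[A-Za-z0-9][A-Za-z0-9.-]*\.co\.uk'
}

# Small demonstration on inline sample input.
printf '%s\n' '/>domain.co.uk<br' '<a href="http://www.my-site.co.uk/">x</a>' |
    extract_couk_domains
```

With real data this would be run as `extract_couk_domains < "$DIR/oldfile.txt" > "$DIR/newfile.txt"`; the `tr " " "\n"` step is no longer needed because `-o` already emits one match per line. Appending `| sort -u` would remove duplicates if that is wanted.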
LEARN ABOUT CENTOS
mail::spamassassin::plugin::uridetail
Mail::SpamAssassin::Plugin::URIDetail(3) User Contributed Perl Documentation Mail::SpamAssassin::Plugin::URIDetail(3)
NAME
URIDetail - test URIs using detailed URI information
SYNOPSIS
This plugin creates a new rule test type, known as "uri_detail". These rules apply to all URIs found in the message.
loadplugin Mail::SpamAssassin::Plugin::URIDetail
RULE DEFINITIONS AND PRIVILEGED SETTINGS
The format for defining a rule is as follows:
uri_detail SYMBOLIC_TEST_NAME key1 =~ /value1/ key2 !~ /value2/ ...
Supported keys are:
"raw" is the raw URI prior to any cleaning (e.g. "http://spamassassin.apache%2Eorg/").
"type" is the tag(s) which referenced the raw_uri. "parsed" is a faked type which specifies that the raw_uri was parsed from the rendered text.
"cleaned" is a list including the raw URI and various cleaned versions of the raw URI (http://spamassassin.apache%2Eorg/, http://spamassassin.apache.org/).
"text" is the anchor text(s) (text between <a> and </a>) that linked to the raw URI.
"domain" is the domain(s) found in the cleaned URIs.
Example rule for matching a URI where the raw URI matches "%2Ebar", the domain "bar.com" is found, and the type is "a" (an anchor tag).
uri_detail TEST1 raw =~ /%2Ebar/ domain =~ /^bar\.com$/ type =~ /^a$/
Example rule to look for suspicious "https" links:
uri_detail FAKE_HTTPS text =~ /https:/ cleaned !~ /https:/
Regular expressions should be delimited by slashes.
perl v5.16.3 2011-06-06 Mail::SpamAssassin::Plugin::URIDetail(3)