Grep in regex

 
# 1  
Old 12-05-2016

Hello guys,
I am writing a bash script that uses a regex to check a file for valid URLs.
This is my input file:
Code:
http://www.yahoo.commmmmm
http://www.google.com
https://www.gooogle.co
www.test6.co.in
www.gmail.com
www.google.co
htt://www.money.com
http://eeeess.google.com
https:/ww.test.c.in

My script:
Code:
URL=$(grep -E -o   "^(http(s)?://)?+(w{3}\.)+([a-z0-9]{1,64}\.)+\w{2,3}" $path )

My output is:
Code:
http://www.yahoo.com
http://www.google.com
https://www.gooogle.co
www.test6.co.in
www.gmail.com
www.google.co

Here it is trimming http://www.yahoo.commmmmm down to http://www.yahoo.com.
Please help me out with this.

Last edited by rbatte1; 12-05-2016 at 08:00 AM..
# 2  
Old 12-05-2016
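You asked it to do that: \w{2,3} matches at most three characters, and -o prints only the matched part of each line, so the rest of commmmmm is cut off.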
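If the goal is to reject such a line rather than trim it, one option is to make grep match the whole line. The following is only a minimal sketch: the function name valid_urls is made up for illustration, the pattern is a simplified variant of the one in the question (it still assumes a two or three letter ending), and it assumes that dropping the over-long line is what is wanted.

```shell
#!/bin/bash
# Sketch only: valid_urls is a hypothetical helper name, not from the thread.
# -x makes grep accept a line only if the pattern matches all of it,
# so an over-long ending is rejected instead of being trimmed by -o.
valid_urls() {
    grep -E -x '(https?://)?www\.([a-z0-9]{1,64}\.)+[a-z]{2,3}' "$1"
}

# Demo on a few lines taken from the input file in the question.
sample=$(mktemp)
printf '%s\n' 'http://www.yahoo.commmmmm' 'http://www.google.com' \
              'www.test6.co.in' 'htt://www.money.com' > "$sample"
valid_urls "$sample"
rm -f "$sample"
```

On those four lines this prints only http://www.google.com and www.test6.co.in. Adding $ to the end of the original pattern should have a similar effect, because the match then has to reach the end of the line.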
# 3  
Old 12-05-2016
I think you are over-simplifying the issue. I don't think there is any way to know for certain whether a name exists using a regular expression alone. You cannot just assume that the last part of a domain name (the Top Level Domain) is only two or three characters long.

List of Internet top-level domains - Wikipedia


You might have to extract the domain name from the full URL and perform a GET against the real site to see if you connect. That might be the only way.
  • There may be a formal list of names of the TLDs
    • Each of those may have a list of valid names below them
      • Each of those may have a list of valid names below them
        • Each of those may have a list of valid names below them
          • Each of those may have a list of valid names below them ..............

You can see the problem. The list (if you could even build one) would be huge and would change frequently. Perhaps a DNS query would give you enough though.

Code:
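# Quote the variable so an empty or odd value cannot break the command,
# and test the exit status of host directly instead of checking $?.
if host "$extracted_domain_name" >/dev/null 2>&1
then
   echo "DNS entry exists"
else
   echo "It is an invalid domain"
fi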
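The check above needs $extracted_domain_name to be filled in first. As a rough sketch, assuming the URLs look like the ones in the input file, the host part could be cut out with shell parameter expansion. The function name extract_domain is made up for this example, and it does not handle every URL form (for instance user@host).

```shell
#!/bin/bash
# Sketch only: extract_domain is a hypothetical helper name.
extract_domain() {
    local url=$1
    url=${url#*://}     # drop a leading scheme such as http:// or https://
    url=${url%%/*}      # drop any path after the host
    url=${url%%:*}      # drop a port number, if present
    printf '%s\n' "$url"
}

extracted_domain_name=$(extract_domain "http://www.google.com/search")
echo "$extracted_domain_name"
```

This prints www.google.com. A line without a scheme, such as www.gmail.com, passes through unchanged, so the result can be handed straight to the host test.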

Does that help?




Robin