Why did you use [0-9][0-9]* at the end of digitSequenceTooLongNotIP instead of using [0-9]+?
Quote:
Originally Posted by gencon
I'm far from being a regex pro; my thought process went like this: I need to match a minimum of 4 digits in a row, so [0-9][0-9][0-9][0-9], then optionally a 5th digit or more, so I need [0-9][0-9][0-9][0-9][0-9]*.
[0-9][0-9][0-9][0-9]+ is more concise, but is it any more efficient?
I've had a look into the efficiency question at the end of the quote above...
I created a dataset of 5 million numbers, each with a random number of digits (between 1 and 10), with 10 numbers per line, separated by spaces.
Then I used time to time 10 runs of an awk program which used [0-9][0-9][0-9][0-9]+ and then 10 runs with [0-9][0-9][0-9][0-9][0-9]*.
Since it was being run on my Linux desktop PC, I used chrt to set the scheduling policy to SCHED_FIFO with a priority of 99, which as far as I know gives the process the highest priority possible. The commands were:
I don't think the results can be considered particularly scientific, but they were fairly consistent. BTW, as expected, each run had 0 context switches and 1 wait.
In fact, the results were so close that I think the Awk interpreter was probably running the same code in both cases. After all, the two regexes [0-9][0-9][0-9][0-9]+ and [0-9][0-9][0-9][0-9][0-9]* are logically equivalent: both match a run of 4 or more digits.
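The equivalence of the two patterns can be spot-checked with a few lines of Python. This is only a sketch: it uses Python's re module rather than Awk, so it says nothing about how Awk compiles the patterns or how fast they run, and the sample line and the helper name spans are made up for illustration.

```python
import re

# The two patterns from the post, written exactly as quoted.
PLUS = re.compile(r"[0-9][0-9][0-9][0-9]+")
STAR = re.compile(r"[0-9][0-9][0-9][0-9][0-9]*")


def spans(pattern, text):
    """Return the (start, end) span of every non-overlapping match."""
    return [m.span() for m in pattern.finditer(text)]


# Hypothetical sample line in the same shape as the test data:
# space-separated numbers of varying length.
sample = "7 42 123 1234 98765 1234567890"

print(spans(PLUS, sample))
print(spans(STAR, sample))
print(spans(PLUS, sample) == spans(STAR, sample))
```

Both patterns skip the numbers shorter than 4 digits and match the longer ones in full, so the two lists of spans come out identical.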
I sorted the times and discarded the 3 fastest and 3 slowest times of the 10 runs, leaving me with:
The C code to create the data file of 5 million numbers, each 1-10 digits in length, and with 10 numbers on each line is here: http://pastebin.com/6vG9WQwj
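For anyone who can't reach the pastebin link, here is a rough Python sketch of a generator producing data of the same shape. It is not a port of the C code: the function names, the seed parameter and the choice to allow leading zeros are all assumptions of mine.

```python
import random


def make_number(rng):
    """Return a decimal string with a random length of 1 to 10 digits."""
    length = rng.randint(1, 10)
    # Assumption: any digit may appear in any position, including a leading 0.
    return "".join(rng.choice("0123456789") for _ in range(length))


def make_lines(total, per_line=10, seed=None):
    """Yield lines of space-separated numbers, per_line numbers on each."""
    rng = random.Random(seed)
    for start in range(0, total, per_line):
        count = min(per_line, total - start)
        yield " ".join(make_number(rng) for _ in range(count))


# Small demonstration: 30 numbers, i.e. 3 lines of 10.
for line in make_lines(30, seed=1):
    print(line)
```

Writing the output of make_lines(5_000_000) to a file would give a dataset of the size described above.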