I'm working on learning regular expressions and what I can do with them. I'm using Unix and its programs to experiment and learn what my limitations are with them.
I'm working on duplicating the regular expression:
This is supposed to delete duplicate lines from a file.
The full command line that I'm using is:
I added \ in front of the ( and ) and ' in front of the ^ and after the $.
I also removed the \r? which was for windows support.
ps -e: Outputs a list of all running processes. This is simply to generate output to work with.
Quote:
29513 ? 00:00:00 bash
31212 ? 00:00:00 man
31215 ? 00:00:00 sh
31216 ? 00:00:00 sh
31221 ? 00:00:00 less
32464 ? 00:00:00 cat
cut -c 25-: Prints only the characters of each line from the 25th character onward. This is to get only the process names printed.
Quote:
man
sh
sh
less
cat
sort: Will sort the list alphabetically. This is because I think the regular expression requires the list to be sorted.
Quote:
cat
less
man
sh
sh
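The first three stages can be tried on fixed input. The sketch below is only a stand-in: it fakes a few `ps -e` rows with printf, padded so that the command name starts in column 25 as it does in the output above. The `sample` variable and its column widths are assumptions; on a system whose ps uses wider columns, `ps -e -o comm=` is a more robust way to print only the names.

```shell
# Fake `ps -e` rows: PID (5 wide), TTY (8 wide), TIME (8 wide), then the name.
# The sample rows and their column widths are assumptions for illustration.
sample=$(printf '%5s %-8s %8s %s\n' \
    31212 '?' 00:00:00 man \
    31215 '?' 00:00:00 sh \
    31216 '?' 00:00:00 sh \
    31221 '?' 00:00:00 less \
    32464 '?' 00:00:00 cat)

# Keep everything from character 25 onward, then sort so duplicates are adjacent.
names=$(printf '%s\n' "$sample" | cut -c 25- | sort)
printf '%s\n' "$names"
```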
grep '^\(.*\)\(\n\1\)+$': Print lines matching the following pattern.
': Strong quotation; allows the enclosed characters to be passed "as is" to the grep program.
^: Match all lines that start with "\(.*\)"
\(.*\): Back reference "()" all lines of any length that contain zero or more characters ".*". Basically, store each line, the entire line. Each line back referenced is replaced by the next line.
This is where I get kinda lost. Of course I could already be lost without knowing it.
\(\n\1\): Back reference "()" a new line "\n", then call the previously stored back reference "\1" -> "\(.*\)". Make a new line exactly the same as the first back reference.
+: Is there one or more of the preceding line? Does \(.*\) contain \(\n\1\)?
$: Match the end of line position to the +
': Closing strong quotation.
Even with the step by step from the below link, I still have a hard time understanding the replacement and repetition.
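To watch the back reference and the repetition on their own, here is a small sketch that keeps everything on one line. As an assumption for illustration, the duplicates are separated by a space instead of a newline, and the repetition is written `\{1,\}` because in POSIX basic regular expressions a bare `+` is a literal plus sign (GNU grep accepts `\+` as an extension).

```shell
# Hypothetical sample: words repeated on the same line, separated by spaces.
lines=$(printf '%s\n' 'sh sh' 'sh sh sh' 'cat' 'man less')

# \(.*\)     capture some text as group 1
# \( \1\)    a space followed by the same text again
# \{1,\}     that second group one or more times
repeated=$(printf '%s\n' "$lines" | grep '^\(.*\)\( \1\)\{1,\}$')
printf '%s\n' "$repeated"
```

Nothing is replaced here: `\1` only demands that the text captured by the first group appears again.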
Ok, so if I'm trying to use grep to execute the regular expression:
To remove duplicate lines that are in sequence, then it's simply a regular expression that goes beyond what grep is able to execute/interpret, because grep can't deal with more than one line at a time. OK, I read a bit about the concept of line boundaries and it seems this is one of the things sed is used for.
Is what I used to get the output I was trying to achieve, but this is more of an experiment with grep rather than getting the output. I guess I'm going to have to familiarize myself with the concept of line boundaries and any tricks I can use to work with them in odd ways.
Finally, I'm going to play around with this a bit, but if grep can't remove duplicate lines like uniq, it might be able to remove duplicate characters or patterns.
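For reference, a sketch of how adjacent duplicate lines can be collapsed outside grep. uniq does it directly; the sed line is a commonly used idiom that joins each line with the next one so that `\n` has something to match. Neither is necessarily the command used in this post. The last command shows what grep can do within a single line, which is to find (not remove) doubled characters with a back reference.

```shell
# Sorted sample input, so the duplicate "sh" lines are adjacent.
sorted=$(printf '%s\n' cat less man sh sh)

# uniq drops a line when it equals the line before it.
with_uniq=$(printf '%s\n' "$sorted" | uniq)

# sed: append the next line (N), print the first line unless both halves are
# equal (P), then delete the first line and restart with the remainder (D).
with_sed=$(printf '%s\n' "$sorted" | sed '$!N; /^\(.*\)\n\1$/!P; D')

# grep works per line: select lines containing the same character twice in a row.
doubled=$(printf '%s\n' "$sorted" | grep '\(.\)\1')

printf '%s\n' "$with_uniq" "$with_sed" "$doubled"
```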
Hi,
I have a few lines like
A20120101.ANU.ZIP
A20120401.ABC.ZIP
A20120105.KJK.ZIP
A20120809.JUG.ZIP
A20120101.MAT.ZIP
B20120301.ANU.XIP
I want to filter by
1. Files starting with A and Ending With Z ( ^A.*.ZIP$)
2. And either ANU, or KJK or MAT in the file name.
Hope my... (6 Replies)
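A minimal sketch of one way to express both conditions in a single extended regular expression, using the sample names listed above. Requiring the three-letter code to sit between two dots is an assumption about the file name layout.

```shell
files=$(printf '%s\n' A20120101.ANU.ZIP A20120401.ABC.ZIP A20120105.KJK.ZIP \
    A20120809.JUG.ZIP A20120101.MAT.ZIP B20120301.ANU.XIP)

# ^A and \.ZIP$ pin both ends of the name; (ANU|KJK|MAT) is an alternation,
# which needs grep -E. The dots are escaped so they match a literal ".".
matches=$(printf '%s\n' "$files" | grep -E '^A.*\.(ANU|KJK|MAT)\.ZIP$')
printf '%s\n' "$matches"
```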
I've found this script which seems very promising to solve my issue:
To search and replace many different database passwords in many different (.php, .pl, .cgi, etc.) files across my filesystem.
The passwords may or may not be contained within quotes, single quotes, etc.
#!/bin/bash... (4 Replies)
Hi all,
How can I read a file, find the lines matching a regular expression, and write the result back to the same file?
open DESTINATION_FILE, "<tmptravl.dat" or die "tmptravl.dat";
open NEW_DESTINATION_FILE, ">new_tmptravl.dat" or die "new_tmptravl.dat";
while (<DESTINATION_FILE>)
{
# print... (1 Reply)
I have the following code:
ls -al /bin | tr -s ' ' | grep 'x'
ls -al: Lists all the files in a given directory such as /bin
tr -s ' ': removes additional spaces between characters so that there is only one space
grep 'x': match all "x" characters that are followed by a whitespace.
I was... (3 Replies)
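A small sketch of the two filters on made-up input (the `listing` lines are hypothetical, not real /bin output). Note that `grep 'x'` as written selects every line containing an "x" anywhere, not only an "x" followed by whitespace.

```shell
# Two made-up listing lines with runs of spaces between the fields.
listing=$(printf '%s\n' \
    '-rwxr-xr-x   1 root  root   1024 bash' \
    '-rw-r--r--   1 root  root     12 notes')

# tr -s ' ' squeezes each run of spaces down to a single space.
squeezed=$(printf '%s\n' "$listing" | tr -s ' ')

# grep 'x' keeps the lines that contain at least one "x".
with_x=$(printf '%s\n' "$squeezed" | grep 'x')
printf '%s\n' "$with_x"
```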
Hello,
This is my first post so, Hello World! Anyway, I'm learning how to use Unix and it's quickly become apparent that a strong foundation in regular expressions will make things easier. I'm not sure if my syntax is messing things up or my logic is messing things up.
ps -e | grep... (4 Replies)
Please can someone tell me what the following regex means:
grep "^aa*$" <file>
I thought this would match any word beginning with aa and ending with $, but it doesn't.
Thanks in advance
Calypso (7 Replies)
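A quick sketch on made-up input may help here. `a*` means zero or more "a", and `$` is the end-of-line anchor rather than a literal dollar sign, so the pattern matches lines that consist of nothing but one or more "a".

```shell
# Hypothetical sample words, one per line.
words=$(printf '%s\n' a aa aaa aab baa)

# ^a   the line starts with an "a"
# a*   followed by zero or more further "a"
# $    and then the line ends
only_a=$(printf '%s\n' "$words" | grep '^aa*$')
printf '%s\n' "$only_a"
```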
Hi, guys. I have one question; I hope somebody can give me a hand.
I have a file called passwd, the contents of which are below:
***********************
...
goldsimj:x:5008:200:
goldsij2:x:5009:200:
whitej:x:5010:201:
brownj:x:5011:202:
goldsij3:x:5012:204:
greyp:x:5013:203:
...... (6 Replies)
When I do ls -ld RT_BP* I get the following list.
drwxrwx--- 2 user group 256 Oct 17 10:09 RT_BP809
drwxrwx--- 2user group 256 Oct 17 10:09 RT_BP809.O
drwxrwx--- 2 user group 256 Oct 17 10:09 RT_BP810
drwxrwx--- 2user group 256 Oct... (2 Replies)
guys,
my requirement goes like this:
I have a file, and wish to filter out records where
1. The first letter is o or O
and
2. The next 4 following letters should not be there
I do not wish to use pipe and wish to do it in one shot.
The best expression I came up with is:
grep ^*... (10 Replies)