text processing and filtering scripting


 
# 1  
Old 09-02-2010

I'm still new to bash. I'm using Debian 5 (lenny) with bash version 3.2.39. I'm working on three scripts and need help completing them.

The first script reads a plain text file, echoes it, and then chops it up into separate whitespace-delimited strings as output. I'm not sure how to do this...

For example, the content could be this:
Code:
550e8400-e29b-41d4-a716-446655440000John doe 543    192.168.1.1
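
One possible way to approach the first script, as a minimal sketch only: awk already splits each input line on runs of whitespace, so printing every field on its own line yields the separate strings. The function name split_words is purely illustrative and does not come from this thread.

```shell
# Print every whitespace-delimited string of the input on its own line.
# split_words is a hypothetical helper name; it reads the files given as
# arguments, or standard input when there are none.
split_words() {
    awk '{ for (i = 1; i <= NF; i++) print $i }' "$@"
}

# Echo the sample line, then chop it up.
line='550e8400-e29b-41d4-a716-446655440000John doe 543    192.168.1.1'
printf '%s\n' "$line"
printf '%s\n' "$line" | split_words
```

Because the sample has no space between the UUID and "John", those two stay together as one string; the run of spaces before the IP address does not produce empty strings.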


The second script reads a plain text file and outputs it stripped of UUIDs (all versions) and IP addresses, for example:
Code:
550e8400-e29b-41d4-a716-446655440000John doe 543    192.168.1.1 
to 
John doe 543

I would guess this could be done with awk, excluding [0-9A-Z]? I'm also not sure how to do this.

The third script reads a plain text file and outputs the unique lines, sorted according to their frequency.
Code so far:

Code:
# count identical lines and list the most frequent first (infile stands for the input file)
sort infile | uniq -c | sort -nr >> output.txt
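
A small sketch of how that pipeline behaves on sample data. The first sort is needed because uniq -c only counts adjacent identical lines. The helper name by_frequency and the sample letters are made up for illustration.

```shell
# by_frequency is a hypothetical helper name; it reads lines on stdin,
# groups identical lines, counts them, and sorts by count, highest first.
by_frequency() {
    sort | uniq -c | sort -nr
}

# "b" occurs three times, "a" twice, "c" once, so the output lists
# b first, then a, then c, each prefixed with its count.
printf '%s\n' b a b c b a | by_frequency
```

The count column is padded with leading spaces, and the exact width can differ between uniq implementations.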



---------- Post updated at 11:39 PM ---------- Previous update was at 04:02 PM ----------

Ok so I got two scripts done. I actually just need help on the second script.

Last edited by l20N1N; 09-01-2010 at 10:40 PM.. Reason: correction
# 2  
Old 09-02-2010
Try:
Code:
sed -e 's/\([0-9a-f]\{4,\}-\)\{4\}[0-9a-f]\{12\}//g' -e 's/\([0-9]*\.\)\{3\}[0-9]*//g' infile

or
Code:
sed -r 's/([0-9a-f]{4,}-){4}[0-9a-f]{12}//g;s/([0-9]*\.){3}[0-9]*//g' infile

if you have GNU sed
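
As a quick check of the first command against the sample line from post #1, here is a sketch wrapped in a function. The name strip_ids is hypothetical, and the third expression, which trims the whitespace left behind at the end of the line, is an addition that is not part of the answer above.

```shell
# strip_ids is a hypothetical wrapper around the sed command from this post.
# It reads the files given as arguments, or standard input when there are none.
strip_ids() {
    sed -e 's/\([0-9a-f]\{4,\}-\)\{4\}[0-9a-f]\{12\}//g' \
        -e 's/\([0-9]*\.\)\{3\}[0-9]*//g' \
        -e 's/[[:space:]]*$//' "$@"   # added: trim trailing whitespace
}

# Prints: John doe 543
printf '%s\n' '550e8400-e29b-41d4-a716-446655440000John doe 543    192.168.1.1' | strip_ids
```

Note that [0-9]* also matches an empty run of digits, so the IP expression would remove a bare "..." as well. If that matters for your data, [0-9]\{1,3\} in place of [0-9]* is a stricter choice.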
# 3  
Old 09-02-2010
Thanks! It worked! If you don't mind, I have one more question. I need to use a similar sed command to filter out this kind of data:

Code:
 1246346269-d1989834dbd34170e1a29310191e9dcf63538368202ee72eae62d

I can modify the one you provided right? For instance

I know
Code:
's/\([0-9a-f]\

is the first 8 digits including letters,
Code:
 {4,\}-\)\{4\}

is the next 8 digits with the dashes, and
Code:
[0-9a-f]\{12\}//g'

is the last 12 characters; correct? so the code would be something like
Code:
 sed -e 's/\([0-9]\{10,\}-\)[0-9a-f]\{53\}//g'

?

Last edited by l20N1N; 09-02-2010 at 07:26 PM..
# 4  
Old 09-02-2010
Quote:
Originally Posted by l20N1N
Thanks! It worked! If you don't mind, I have one more question. I need to use a similar sed command to filter out this kind of data:

Code:
 1246346269-d1989834dbd34170e1a29310191e9dcf63538368202ee72eae62d

I can modify the one you provided right? For instance

I know
Code:
's/\([0-9a-f]\

is the first 8 digits including letters,
Code:
 {4,\}-\)\{4\}

is the next 8 digits with the dashes,
Hi, not exactly.
Code:
[0-9a-f]\{4,\}-

means a sequence of 4 or more hexadecimal digits (a-f being lowercase) followed by a dash.
Code:
\(...\)\{4\}

means the sequence in escaped parentheses ("\(" and "\)") is repeated 4 times in its entirety.
So combining these two: 4 groups, each consisting of at least 4 hexadecimal digits followed by a dash...
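
To see the two pieces at work, something like the following could be tried on the UUID from post #1. This is only a sketch: it assumes a grep that supports the -o option (GNU and BSD grep do; it is not required by POSIX).

```shell
uuid='550e8400-e29b-41d4-a716-446655440000'

# Each "4 or more hex digits followed by a dash" group, one match per line.
# This prints four lines: 550e8400-  e29b-  41d4-  a716-
printf '%s\n' "$uuid" | grep -o '[0-9a-f]\{4,\}-'

# The group repeated 4 times, then the final 12 hex digits:
# this prints the whole UUID as a single match.
printf '%s\n' "$uuid" | grep -o '\([0-9a-f]\{4,\}-\)\{4\}[0-9a-f]\{12\}'
```

The final 12 digits are not part of the first output because no dash follows them.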
Quote:
and
Code:
[0-9a-f]\{12\}//g'

is the last 12 characters; correct?
Correct. The g, of course, means that every occurrence of the match on a line is replaced, not only the first one.
Quote:
so the code would be something like
Code:
 sed -e 's/\([0-9]\{10,\}-\)[0-9a-f]\{53\}//g'

?
Yes, that would match it. You could lose the comma if the first part is always exactly 10 digits, and also you do not need the escaped parentheses.
Code:
sed -e 's/[0-9]\{10\}-[0-9a-f]\{53\}//g'
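
A short sketch of that command applied to the sample value. The wrapper name strip_token and the surrounding words "before" and "after" are made up for illustration.

```shell
# strip_token is a hypothetical wrapper around the sed command above.
# It reads the files given as arguments, or standard input when there are none.
strip_token() {
    sed -e 's/[0-9]\{10\}-[0-9a-f]\{53\}//g' "$@"
}

# The sample value has exactly 10 digits, a dash, and 53 hex characters,
# so it is removed; the two spaces that surrounded it remain.
printf '%s\n' 'before 1246346269-d1989834dbd34170e1a29310191e9dcf63538368202ee72eae62d after' | strip_token
```

Since the counts are exact, a value with a different number of digits or hex characters is not matched and the line passes through unchanged.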


Last edited by Scrutinizer; 09-02-2010 at 08:43 PM..
# 5  
Old 09-02-2010
Thanks again for your help!