Parsing chunks of text and finding data


 
# 1  
Old 05-29-2011

Hi, I need a script that parses a text file and extracts data from it.
The text file has this structure:

File1
host1.localdomain

text random text

Found errors

this text is random (41123) --- random random
at.5165 ---- random random
at.5165 ---- random random
at.5165 ---- random random
at.5123 ---- random random
at.5155 ---- random random
at.5333 ---- random random
200
200 hostname.localdomain

extra this text is random (41239) --- random random
at.5123 ---- random random:87654
at.5123 ---- random random:232
at.5123 ---- random random:23
at.5123 ---- random random
at.5123 ---- random random
at.5123 ---- random random
4565
4565 hostname.localdomain.end

this text is random (41123) --- random random
at.5165 ---- random random
at.5165 ---- random random:53
at.5165 ---- random random:5523323
at.5123 ---- random random:322
at.5155 ---- random random
at.5333 ---- random random
200
200 hostname.localdomain


I want a script that recognizes these chunks of data. A chunk doesn't necessarily start with the word "this" (it could be anything), but it always ends with a number on a line by itself, followed by a line with the same number plus extra data.

The script should pick out the chunks whose standalone number is larger than 1000 (in this case "4565"). Here that chunk begins at "extra" and ends at "end". I then want to create an md5sum of each such chunk of text.

Please note that the number to compare against 1000 should only be matched on lines that contain nothing but a number, not on regular lines that also contain text.

Please help, preferably in Python or bash.

Thanks in advance
# 2  
Old 05-29-2011
Code:
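# Paragraph mode (RS=""): each blank-line-separated chunk is one record; FS="\n" makes each line a field.
# Write every chunk whose second-to-last line is a bare number above 1000 to its own numbered file.
awk 'BEGIN{FS="\n";RS=""} $(NF-1)~/^[0-9]+$/ && $(NF-1) >1000 {print > (++i ".new.txt") }' File1

md5sum *.new.txt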
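
Since the question also asked for Python, here is a minimal sketch of the same idea, not a drop-in replacement. It assumes chunks are separated by empty lines (as awk's paragraph mode does) and that the file is UTF-8 text; the function name `md5_of_large_chunks` and the shortened sample are purely illustrative. It hashes each chunk in memory instead of writing temporary files.

```python
import hashlib
import re


def md5_of_large_chunks(text, threshold=1000):
    """Return (number, md5 hex digest) for each qualifying chunk."""
    results = []
    # Chunks are separated by one or more empty lines, like awk's RS="".
    for chunk in re.split(r"\n{2,}", text.strip("\n")):
        lines = chunk.split("\n")
        # The bare number is expected on the second-to-last line of the chunk.
        if len(lines) < 2 or not re.fullmatch(r"[0-9]+", lines[-2]):
            continue
        number = int(lines[-2])
        if number > threshold:
            # Assumption: hash the chunk plus one trailing newline (UTF-8 bytes).
            digest = hashlib.md5((chunk + "\n").encode("utf-8")).hexdigest()
            results.append((number, digest))
    return results


# Shortened, illustrative sample in the same shape as File1.
sample = """\
this text is random (41123) --- random random
at.5165 ---- random random
200
200 hostname.localdomain

extra this text is random (41239) --- random random
at.5123 ---- random random:87654
4565
4565 hostname.localdomain.end
"""

for number, digest in md5_of_large_chunks(sample):
    print(number, digest)
```

The trailing newline is appended so that the digest should equal what md5sum reports for a file written by awk's print, which ends each record with a newline.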

If the number (4565 in the sample) is unique across chunks, you can use it directly as the file name.
Code:
awk 'BEGIN{FS="\n";RS=""} $(NF-1)~/^[0-9]+$/ && $(NF-1) >1000 {print > ($(NF-1) ".new.txt") }' File1
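
The uniqueness caveat matters because awk keeps an output file open for the whole run, so two chunks that end in the same number would be written into the same file one after the other and hashed together. A small Python sketch for checking the input before relying on number-based file names follows; the helper name `duplicate_numbers` and the sample string are hypothetical, and it assumes the same empty-line chunk separation as above.

```python
import re
from collections import Counter


def duplicate_numbers(text, threshold=1000):
    """Return the bare numbers (as written) above threshold that end more than one chunk."""
    counts = Counter()
    for chunk in re.split(r"\n{2,}", text.strip("\n")):
        lines = chunk.split("\n")
        # Only a second-to-last line made up entirely of digits counts.
        if len(lines) >= 2 and re.fullmatch(r"[0-9]+", lines[-2]):
            if int(lines[-2]) > threshold:
                # Key on the literal text, since that is what the file name is built from.
                counts[lines[-2]] += 1
    return sorted(s for s, c in counts.items() if c > 1)


# Illustrative input: two chunks end in 4565, one ends in 7000.
sample = "a\n4565\n4565 host.end\n\nb\n4565\n4565 host.end\n\nc\n7000\n7000 host\n"
print(duplicate_numbers(sample))
```

If the returned list is not empty, the counter-based file names from the first command are the safer choice.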

This User Gave Thanks to rdcwayx For This Post:
# 3  
Old 05-30-2011
Wow! That was really quick and helpful, thanks a lot!

---------- Post updated at 10:04 PM ---------- Previous update was at 09:59 PM ----------

Genius! It works :)