Parsing chunks of text and finding data


 
# 1  
Old 05-29-2011

Hi, I need a script that parses a text file and greps data out of it.
The text file has this structure:

File1
host1.localdomain

text random text

Found errors

this text is random (41123) --- random random
at.5165 ---- random random
at.5165 ---- random random
at.5165 ---- random random
at.5123 ---- random random
at.5155 ---- random random
at.5333 ---- random random
200
200 hostname.localdomain

extra this text is random (41239) --- random random
at.5123 ---- random random:87654
at.5123 ---- random random:232
at.5123 ---- random random:23
at.5123 ---- random random
at.5123 ---- random random
at.5123 ---- random random
4565
4565 hostname.localdomain.end

this text is random (41123) --- random random
at.5165 ---- random random
at.5165 ---- random random:53
at.5165 ---- random random:5523323
at.5123 ---- random random:322
at.5155 ---- random random
at.5333 ---- random random
200
200 hostname.localdomain


I want a script that recognizes these chunks of data (a chunk doesn't necessarily start with the word "this", it could be anything), but a chunk always ends with a number on its own line, followed by a line with the same number plus extra data.

I want the script to pick the chunks whose standalone number is larger than 1000 (I am talking about the number that is on a line by itself, in this case "4565"). In this case the chunk begins at "extra" and ends at "end". I then want to create an md5sum of each chunk of text where a number larger than 1000 was found.

Please note that the number that has to be larger than 1000 should be matched only on lines that contain nothing but a number, not on regular lines of text.

Please help, preferably in Python or bash.

Thanks in advance
# 2  
Old 05-29-2011
Code:
awk 'BEGIN{FS="\n";RS=""} $(NF-1)~/^[0-9]+$/ && $(NF-1) >1000 {print > (++i ".new.txt") }' File1

md5sum *.new.txt

If the number (here "4565") is unique, you can use it directly as the file name.
Code:
awk 'BEGIN{FS="\n";RS=""} $(NF-1)~/^[0-9]+$/ && $(NF-1) >1000 {print > ($(NF-1) ".new.txt") }' File1
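
Since Python was also mentioned, here is a rough sketch of the same idea, not a tested drop-in. It assumes that chunks are separated by blank lines (like awk's RS="" paragraph mode) and that the number sits alone on the second-to-last line of a chunk. The names chunks_over_threshold, md5_of and SAMPLE are made up for this example. The hash is taken over the chunk plus one trailing newline, which is meant to match what awk's print writes to the file, but treat that as an assumption and compare against md5sum on your own data.

```python
import hashlib
import re


def chunks_over_threshold(text, threshold=1000):
    """Yield (number, chunk) for each blank-line-separated chunk whose
    second-to-last line is a bare number greater than threshold."""
    # Split on runs of blank lines, similar to awk's RS="" paragraph mode.
    for chunk in re.split(r"\n\s*\n", text.strip()):
        lines = chunk.split("\n")
        if len(lines) < 2:
            continue  # need a number line plus the line that follows it
        candidate = lines[-2].strip()  # tolerate stray whitespace
        if re.fullmatch(r"[0-9]+", candidate) and int(candidate) > threshold:
            yield int(candidate), chunk


def md5_of(chunk):
    # Assumption: hash the chunk plus one trailing newline.
    return hashlib.md5((chunk + "\n").encode("utf-8")).hexdigest()


# Shortened, made-up sample in the shape of the data from post #1.
SAMPLE = """this text is random (41123) --- random random
at.5165 ---- random random
200
200 hostname.localdomain

extra this text is random (41239) --- random random
at.5123 ---- random random:87654
4565
4565 hostname.localdomain.end
"""

for number, chunk in chunks_over_threshold(SAMPLE):
    print(number, md5_of(chunk))
```

For a real file, read it with open("File1").read() and pass the result to chunks_over_threshold instead of SAMPLE.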

This User Gave Thanks to rdcwayx For This Post:
# 3  
Old 05-30-2011
Wow! That was really quick and helpful, thanks a lot!

---------- Post updated at 10:04 PM ---------- Previous update was at 09:59 PM ----------

Genius! It works :)
