Need a Python Script to filter huge files


 
# 1  
Old 11-27-2012

I work with various messages received from a server and want to write a Python script that sorts the messages by unique flag value and writes the output to a text file.

I receive these messages from the server as a .zcap file and use an internal tool to filter them:

Step 1) Filter the .zcap file to get one file per security type.
Step 2) Filter each security type file, which generates one file per exchange.
Step 3) Lastly, filter each exchange file to get the message type files.

For example, I filtered a .zcap file and got these message types:
BOND-----CVE----- MTR_BOND, MTFD_BOND, MTQ_BQUOTE, MTQ_MBBOQUOTE
BOND-----NYSE---- MTR_BOND, MTFD_BOND, MTQ_BQUOTE
BOND-----TSE--- MTR_BOND, MTFD_BOND, MTA_RECAP, MTT_TRADE, MTT_STATUS, MTA_CLOSE

Step 4) At present, I run a UNIX command on each message type file (e.g. MTR_BOND). It generates multiple text files, one per unique flag value, each holding the messages with that flag:
$ awk '/MTQ/,/Quote Condition/{a[i++]=$0;if($0~/Flags:/){sub(":","",$2);fname=$2}if($0~/Quote Condition/){for(j=0;j<i;j++)print a[j] > fname;i=0}}' MTA.txt
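
For anyone reading along, here is a rough Python sketch of what the awk one-liner above does, in case it helps to see the logic spelled out. It is only my reading of the command, not a tested replacement: the function name `group_by_flag` and its parameters are made up for illustration, and I assume the flag line starts with `Flags:` as in the messages the command is run on.

```python
from collections import defaultdict

def group_by_flag(lines, start="MTQ", end="Quote Condition"):
    """Collect start..end blocks and group them by their 'Flags:' value.

    Hypothetical helper mirroring the awk range pattern /MTQ/,/Quote Condition/.
    """
    groups = defaultdict(list)  # flag value -> list of messages (lists of lines)
    block, flag = None, None    # block is None while we are outside a message
    for raw in lines:
        line = raw.rstrip("\n")
        if block is None:
            if start not in line:
                continue        # skip anything between two messages
            block, flag = [], None
        block.append(line)
        if line.startswith("Flags:"):
            # "Flags: x00000000" -> "x00000000"
            flag = line.split(":", 1)[1].strip()
        if end in line:
            # Assumption: a block that never showed a Flags line is dropped.
            if flag is not None:
                groups[flag].append(block)
            block = None
    return groups
```

Each key of the returned dictionary corresponds to one of the output files the awk command writes, and each value holds every message that carried that flag.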

Running this command for every message type file (e.g. MTR_BOND) is very tedious. I want a Python script that I can run once at the security type level (e.g. BOND).
I have attached a text file for BOND with its exchanges and message types. Please write a script (preferably in Python) that generates one text file per message type, containing only one message for each unique flag value.
# 2  
Old 11-27-2012
I have attached the files with all the data, but you can only open them on UNIX with the 'cat' or 'less' command, or in Notepad++.
A sample message looks like this:
MTQ_BQUOTE, Length: 49, Timestamp: 8:03:28.350
MsgKey: symbol: DD13B
Symbol: DD13B, hash 33314407
QS Symbol: DD13B.CB, market 12
Security Type: BOND (5)
Symbol Type: Bond.Share.Single.None
Session: US_Day (3)
Ticker Exchange: NYSE (7) => 7
Flags: x00000000
Bid: 10341, frac: 2, pure7 val: 103.41
Bid Size: 10
Ask: 10356, frac: 2, pure7 val: 103.56
Ask Size: 15
Quote Condition: x00

The attached files contain many messages like this one, with the flag value on the 9th line of the message. Some message types (such as MTR_BOND and MTFD_BOND) have no flag value at all.

Last edited by Vijeta Laad; 11-27-2012 at 03:10 PM..