FASTEN count line of dat file and compare with the CTRL file


 
# 1  
Old 11-12-2013

Hi All,
I am looking for a way to speed up checking the line count of each DAT file against the record count given in its CTL file.

There are about 300 to 400 directories, each containing both a DAT and a CTL file.
The DAT file contains the flat-file records.
The CTL file is the reference file used to verify the DAT file.
It contains the DAT file's total number of records, the process date and so on.

For your information, some of the DAT files are quite large and may contain more than 10 million records (they are full files, which is part of the application requirement).

Sometimes it takes more than 30 minutes to run wc -l on all the DAT files (the script I wrote runs serially).

One way to speed this up is to spawn multiple processes to do the counting, since the CPU is otherwise idle during this verification.

Can anyone share some ideas?


Thanks.

---------- Post updated at 10:21 PM ---------- Previous update was at 10:10 PM ----------

Code:
  cat LNS_DSLNT01/DSLNT01.CTL
  DSLNT01.DAT§2013-10-21§13636§......

  :: The 3rd field in the CTL file gives the number of records the DAT file should contain.

  datdss@root:/dat3/data/UAT_TEST/wc -l ./LNS_DSLNT01/DSLNT01.DAT
  13636

# 2  
Old 11-13-2013
You can go parallel, but running wc -l on the DAT files in parallel might saturate the disk channels pretty quickly. Maybe running the verification as the files are created would get it started sooner and spread the load? GNU parallel can help run it at maximum speed.

Extracting the CTL lines should be pretty easy, though: find | xargs grep .... Collect the expected counts (from the CTL files) and the actual counts (from the DAT files), one line per file, in two files, then sort them and run them through comm -3 to find out what is out of whack. Process substitution, <(...), can help make this pipeline parallel.
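The two-file comparison described above could be sketched roughly as follows. This is only a sketch under assumptions: the function names (`expected_counts`, `actual_counts`, `verify_counts`) are made up for illustration, the `§` separator and the count in the third field are taken from the CTL sample in the first post, DAT base names are assumed to be unique across directories, and `find -print0` / `xargs -0 -P` are GNU/BSD extensions rather than POSIX.

```shell
# Print "NAME.DAT count" for every CTL file under directory $1.
# Assumes CTL lines look like: NAME.DAT§date§count§...
expected_counts() {
    find "$1" -type f -name '*.CTL' -exec cat {} + |
        awk -F'§' 'NF >= 3 { print $1, $3 }' |
        LC_ALL=C sort
}

# Print "NAME.DAT count" for every DAT file under $1,
# counting lines with up to $2 parallel wc processes (default 4).
actual_counts() {
    find "$1" -type f -name '*.DAT' -print0 |
        xargs -0 -n 1 -P "${2:-4}" sh -c '
            [ -f "$1" ] || exit 0
            n=$(wc -l < "$1" | tr -d "[:space:]")
            printf "%s %s\n" "${1##*/}" "$n"
        ' sh |
        LC_ALL=C sort
}

# Show only the lines that differ between expected and actual counts.
# Column 1: expected (CTL) only; tab-indented column 2: actual (DAT) only.
verify_counts() {
    exp=$(mktemp) || return 1
    act=$(mktemp) || return 1
    expected_counts "$1" > "$exp"
    actual_counts "$1" "${2:-4}" > "$act"
    LC_ALL=C comm -3 "$exp" "$act"
    rm -f "$exp" "$act"
}
```

Something like `verify_counts /dat3/data/UAT_TEST 8` would then print nothing when every DAT file matches its CTL entry. In bash, `comm -3 <(expected_counts DIR) <(actual_counts DIR)` avoids the temporary files and lets both sides run at the same time. Whether more than a few parallel wc processes help depends on the disks, as noted above.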
# 3  
Old 11-14-2013
Yes, this is the part of the script that runs multiple processes at once.

Code:
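# i, MAX_PROCESS and ttl_src_ru are assumed to be set earlier in the script.
while [[ $i -lt $(( MAX_PROCESS - 1 )) ]]; do
    i=$(( i + 1 ))
    if [ "$i" -eq 1 ]; then
        # The first worker starts at item 1.
        ostr=1
        oend=$ttl_src_ru
    else
        # Later workers continue after the previous range. Note that these
        # ranges span ttl_src_ru + 1 items; use ostr + ttl_src_ru - 1 for equal sizes.
        ostr=$(( oend + 1 ))
        oend=$(( ostr + ttl_src_ru ))
    fi
    echo "$i,$ostr,$oend"
    # Run each range in the background, with one output file per range.
    nohup ./verify "$ostr" "$oend" > "verify_${ostr}_${oend}.lst" &
done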
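For comparison, here is one way the range splitting could be written so that every worker gets an equal share and the script waits for all of them to finish. It is a sketch, not the poster's actual script: `chunk_ranges` and `run_chunks` are hypothetical names, and `./verify START END` is assumed to behave as in the snippet above.

```shell
# Print "start end" pairs that split items 1..$1 into at most $2 ranges.
chunk_ranges() {
    total=$1
    parts=$2
    size=$(( (total + parts - 1) / parts ))   # ceiling division
    start=1
    while [ "$start" -le "$total" ]; do
        end=$(( start + size - 1 ))
        if [ "$end" -gt "$total" ]; then
            end=$total
        fi
        echo "$start $end"
        start=$(( end + 1 ))
    done
}

# Launch one background ./verify per range, then wait for all of them.
# The wait sits inside the braces because the loop runs in a subshell.
run_chunks() {
    chunk_ranges "$1" "$2" | {
        while read -r s e; do
            ./verify "$s" "$e" > "verify_${s}_${e}.lst" 2>&1 &
        done
        wait
    }
}
```

With 10 items and 3 workers, `chunk_ranges 10 3` yields the ranges 1-4, 5-8 and 9-10, so the last worker simply gets the remainder.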

# 4  
Old 11-26-2013
GNU parallel (http://www.gnu.org/software/parallel/) already does this; there is no sense reinventing a lesser wheel.

The real trick is to poll for modified files and maintain a registry file of file names, line counts and mod times, so the slow part is done ahead of time. Use fuser to ensure the file is fully written (not open for write) before counting lines.
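A polling pass along those lines might look like the sketch below. The assumptions are mine: the registry is a tab-separated file of path, modification time and line count; `file_mtime`, `is_busy` and `update_registry` are hypothetical names; paths contain no tabs, newlines or backslashes; and `stat -c %Y` (GNU) or `stat -f %m` (BSD) is available. Plain `fuser` reports any process that has the file open, not only writers, so this version skips a file whenever it is open at all.

```shell
# Modification time in epoch seconds (GNU stat first, BSD stat as fallback).
file_mtime() {
    stat -c %Y "$1" 2>/dev/null || stat -f %m "$1" 2>/dev/null
}

# Succeeds if some process currently has the file open.
# If fuser is not installed we cannot tell, so the file is treated as idle.
is_busy() {
    command -v fuser >/dev/null 2>&1 || return 1
    fuser "$1" >/dev/null 2>&1
}

# Refresh registry file $2 for every DAT file under directory $1.
update_registry() {
    root=$1
    registry=$2
    touch "$registry"
    find "$root" -type f -name '*.DAT' | while IFS= read -r f; do
        mtime=$(file_mtime "$f") || continue
        # Skip files already recorded with the same modification time.
        if awk -F'\t' -v f="$f" -v m="$mtime" \
            '$1 == f && $2 == m { found = 1 } END { exit !found }' "$registry"
        then
            continue
        fi
        # Skip files that are still open; a later poll will pick them up.
        if is_busy "$f"; then
            continue
        fi
        count=$(wc -l < "$f" | tr -d '[:space:]')
        # Replace any old entry for this file with the fresh one.
        awk -F'\t' -v f="$f" '$1 != f' "$registry" > "$registry.tmp"
        printf '%s\t%s\t%s\n' "$f" "$mtime" "$count" >> "$registry.tmp"
        mv "$registry.tmp" "$registry"
    done
}
```

Run from cron or a sleep loop, for example `update_registry /dat3/data/UAT_TEST /tmp/dat_registry.tsv`, this leaves only new or changed files to count at verification time, and the final check becomes a lookup in the registry.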

Last edited by DGPickett; 11-26-2013 at 02:57 PM..