Extracting part of data from files


 
# 1  
Old 07-25-2016

Hi All,

I have log files as below.

Code:
log1.txt

<table name="content_analyzer" primary-key="id">
  <type="global" />
</table>
<table name="content_analyzer2" primary-key="id">
  <type="global" />
</table>
Time taken: 1.008 seconds
ID = gd54321bbvbvbcvb

<table name="content_analyzer" unique-key="code">
  <type="global" />
</table>
<table name="content_analyzer2" primary-key="sid">
  <type="global" />
</table>
Table astcms.dfile1 stats: [numberoffiles=2, blocks=5, numberofrows=2000, totalSize=589, rawDataSize=6779]
OK
Time taken: 10.23 seconds
ID = fffgs5667uyt

<table name="content_analyzer" primary-key="did">
  <type="global" />
</table>
<table name="content_analyzer2" unique-key="scod">
  <type="global" />
</table>
Table astcms.dfile2 stats: [numberoffiles=5, numberofrows=100, totalSize=68, rawDataSize=89]
OK
Time taken: 15.12 seconds
FAILED:Exception [Error 56700]

log2.txt

Time taken: 2.005 seconds
ID = gd54321bbvbvbcvb

<table name="content_analyzer" primary-key="did">
  <type="global" />
</table>
<table name="content_analyzer2" unique-key="scod">
  <type="global" />
</table>
Table astcms.dfile4 stats: [numberoffiles=3, numberofrows=60, totalSize=10, rawDataSize=15]
OK
Time taken: 12.123 seconds

This is my code.

Code:
for filename in /home/usr/*.log
do
    echo "$filename"
    sed 's/[[]//g;s/[]]//g' < "$filename" |
        awk '
        {
            if (index($0, "stats") > 0) {
                for (i = 1; i <= NF; i++) {
                    if ($(i) ~ /numberoffiles/ || $(i) ~ /numberofrows/ || $(i) "Table") {
                        printf("%s ", $(i))
                    }
                }
            }
            if (index($0, "Time") > 0) {
                print $0
            }
            if (index($0, "FAILED:") > 0) {
                print $0
            }
        }'
done

I am getting output as below.

Code:
/home/usr/log1.txt
Time taken: 1.008
Table astcms.dfile1 stats: numberoffiles=2, blocks=5, numberofrows=2000, totalSize=589, rawDataSize=6779 Time taken: 10.23 seconds
Table astcms.dfile2 stats: numberoffiles=5, numberofrows=100, totalSize=68, rawDataSize=89 Time taken: 15.12 seconds
FAILED:Exception [Error 56700]
/home/usr/log2.txt
Time taken: 2.005 seconds
Table astcms.dfile4 stats: numberoffiles=3, numberofrows=60, totalSize=10, rawDataSize=15 Time taken: 12.123 seconds

I want the output as below.

Code:
astcms,dfile1,2,2000,10.23,PASS,/home/usr/log1.txt
astcms,dfile2,5,100,15.12,PASS,/home/usr/log1.txt
NULL,NULL,NULL,NULL,NULL,FAILED:Exception [Error 56700],/home/usr/log1.txt
astcms,dfile4,3,60,12.123,PASS,/home/usr/log2.txt

Please help me.

Thanks in Advance.

Last edited by ROCK_PLSQL; 07-25-2016 at 09:25 AM..
# 2  
Old 07-25-2016
A bit surprising that there's any output, as your code loops across *.log whilst your files are named log*.txt.
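To see why that matters, note that a POSIX shell leaves an unmatched glob unexpanded by default, so the loop body runs once with the literal pattern. A minimal sketch (the directory and file names here are invented for illustration):

```shell
# Set up two files matching the actual naming scheme
mkdir -p /tmp/globdemo && touch /tmp/globdemo/log1.txt /tmp/globdemo/log2.txt

# *.log matches neither file, so the loop runs once with the literal pattern
for filename in /tmp/globdemo/*.log; do
    echo "$filename"          # prints: /tmp/globdemo/*.log
done

# log*.txt matches both files
for filename in /tmp/globdemo/log*.txt; do
    echo "$filename"
done
```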
# 3  
Old 07-25-2016
Adapting this post:
Code:
awk '
BEGIN                   {SRCH1 = "numberoffiles[^,]*"        # regex for the file-count field
                         SRCH2 = "numberofrows[^,]*"         # regex for the row-count field
                        }
/^Table/                {TBL = $2                            # e.g. astcms.dfile1
                         sub (/\./, ",", TBL)                # split schema and table with a comma
                         match ($0, SRCH1)
                         NoF = substr ($0, RSTART+14, RLENGTH-14)   # value after "numberoffiles="
                         match ($0, SRCH2)
                         NoR = substr ($0, RSTART+13, RLENGTH-13)   # value after "numberofrows="
                         FND = 1                             # remember we saw a stats line
                        }
/Time/ && FND           {print TBL, NoF, NoR, $3, "PASS", FILENAME   # $3 is the elapsed seconds
                         FND = 0
                        }
/^FAILED/               {print "NULL,NULL,NULL,NULL,NULL", $0, FILENAME
                        }
' OFS="," log*.txt
astcms,dfile1,2,2000,10.23,PASS,log1.txt
astcms,dfile2,5,100,15.12,PASS,log1.txt
NULL,NULL,NULL,NULL,NULL,FAILED:Exception [Error 56700],log1.txt
astcms,dfile4,3,60,12.123,PASS,log2.txt
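For readers unfamiliar with awk's match() builtin: it sets RSTART and RLENGTH to the position and length of the regex hit, and substr() then skips past the label to keep only the value. A standalone sketch (the sample line is made up, but mirrors the log format):

```shell
echo 'stats: [numberoffiles=2, numberofrows=2000]' |
awk '{
    # match() sets RSTART/RLENGTH for the leftmost longest hit
    match($0, /numberoffiles[^,]*/)
    # skip the 14 characters of "numberoffiles=" to keep only the value
    print substr($0, RSTART + 14, RLENGTH - 14)
}'
# prints: 2
```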

# 4  
Old 07-26-2016
Hi Rudic,

Thanks a lot.

The script is working fine.

I need the date as well. For that I made some changes to the script, but it's not working.

Code:
awk '
BEGIN                   {SRCH1 = "numberoffiles[^,]*"
                         SRCH2 = "numberofrows[^,]*"
                        }
			now="$(date +'%d-%m-%Y')"
/^Table/                {TBL = $2
                         sub (/\./, ",", TBL)
                         match ($0, SRCH1)
                         NoF = substr ($0, RSTART+14, RLENGTH-14)
                         match ($0, SRCH2)
                         NoR = substr ($0, RSTART+13, RLENGTH-13)
                         FND = 1
                        }
/Time/ && FND           {print TBL, NoF, NoR, $3, "PASS", FILENAME,now
                         FND = 0
                        }
/^FAILED/               {print "NULL,NULL,NULL,NULL,NULL", $0, FILENAME,now
                        }
' OFS="," log*.txt

Please help me.

Thanks
# 5  
Old 07-26-2016
Hello ROCK_PLSQL,

In awk we can't reference shell variables the way the shell does; they have to be passed in with the -v option. Could you please try the following and let us know (I haven't tested it, though).
Code:
awk -vnow="$(date +'%d-%m-%Y')" '
BEGIN                   {SRCH1 = "numberoffiles[^,]*"
                         SRCH2 = "numberofrows[^,]*"
                        }
   
/^Table/                {TBL = $2
                         sub (/\./, ",", TBL)
                         match ($0, SRCH1)
                         NoF = substr ($0, RSTART+14, RLENGTH-14)
                         match ($0, SRCH2)
                         NoR = substr ($0, RSTART+13, RLENGTH-13)
                         FND = 1
                        }
/Time/ && FND           {print TBL, NoF, NoR, $3, "PASS", FILENAME,now
                         FND = 0
                        }
/^FAILED/               {print "NULL,NULL,NULL,NULL,NULL", $0, FILENAME,now
                        }
' OFS="," log*.txt

Thanks,
R. Singh
# 6  
Old 07-26-2016
Hi Singh,

Thanks for your script.
However, it's not working.

When I run the script, nothing happens.

I think something is missing.

Please check.

Thanks
# 7  
Old 07-26-2016
Works for me as given. How about checking yourself what could be wrong with your setup/environment/orthography?
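One possible culprit, offered only as a guess: some awk implementations are stricter about option parsing and want a space between -v and the assignment. The more portable spelling would be:

```shell
# Portable form: a space between -v and the variable assignment
awk -v now="$(date +'%d-%m-%Y')" 'BEGIN { print now }'
```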