Extract paragraphs and count them


 
# 8  
Old 03-13-2017
Hi,

Sure, no problem. One last question then. Would output like this:

Code:
Station / User...
SDate / Time / PDate...
Institution Number...

Warning|Error

<message>

be what you're after? Or is there a particular kind of summary you'd like as output?
# 9  
Old 03-13-2017
The original output is fine. I just want it sorted so that I don't have to scroll up and down in Notepad++ to find out whether a similar warning/error message occurs again, if you know what I mean.

I have a particular type of output in mind, but I'll try to do some research on my own and script it. At the moment I just want to see how you write your logic so that I can learn from it. I don't want to bug you with stupid questions again and again. Once I come up with some output, I'll post the script, and if you have time, please comment on it.
# 10  
Old 03-13-2017
Hi,

Sorting the output would be a bit trickier than you might imagine. On the face of it that would be easy to do, but you'd end up with a mixture of lines all run together, with no way to tie them back to the block they came from, if you see what I mean. Parsing the blocks to print out some kind of neatly summarised information on a single line is possible, though, if there are certain key parts that you'd want included in the summary.

If you think you've got enough to go on for now, that's great. If you would like anything further, provide the details and we can take things from there.
# 11  
Old 03-13-2017
What kind of sorting do you have in mind? What are the sorting criteria?
I foresee building up an awk hash indexed by "criteria", with the actual block as the hashed value, and sorting the hash once built...
Just my $.02
# 12  
Old 03-13-2017
Hi drysdalk,

Yes, I do understand; sorting would get tricky. But summarised information would also do the trick, I guess. I tried to find a pattern in the 'Original presentment Not Found !' block, and the only useful things were the 'Institution number' (6th line from 'BEGIN MESSAGE') and 'Acquirer Reference:' (16th line from 'BEGIN MESSAGE'). The other warnings/errors don't carry this information, so I need to analyse the logs further to find a common pattern.

For the moment, printing out the 'Institution number' and 'Acquirer Reference:' would do the trick, at least for the 'Original presentment Not Found !' block.
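
A minimal sketch of that summary, for what it's worth. The function name summarise_blocks is made up for illustration, and it assumes that the institution number and the acquirer reference are each the last whitespace-separated field on their line. Only the institution part is backed by anything seen in this thread, so check the 'Acquirer Reference:' line against the real log before relying on it.

```shell
# summarise_blocks FILE [MESSAGE]   (hypothetical helper, not part of any tool)
# Prints "institution,acquirer_reference" for every BEGIN/END MESSAGE block
# whose text contains MESSAGE.
summarise_blocks() {
    local default_msg='Original presentment Not Found !'
    awk -v msg="${2:-$default_msg}" '
        { sub(/\r$/, "") }                             # tolerate DOS line endings
        /BEGIN MESSAGE/ { inst = ref = ""; hit = 0 }   # new block: forget the previous one
        /Institution/ && inst == "" { inst = $NF }     # assumption: value is the last field
        /Acquirer Reference:/       { ref = $NF }      # assumption: value is the last field
        index($0, msg)              { hit = 1 }        # literal (non-regex) match on the message
        /END MESSAGE/ && hit        { print inst "," ref }
    ' "$1"
}
```

Called as summarise_blocks EXTRN071_copy.txt it should print one line per matching block. A different message can be passed as the second argument.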

Thanks for your help again.

---------- Post updated at 04:25 PM ---------- Previous update was at 04:06 PM ----------

Quote:
Originally Posted by vgersh99
What kind of sorting do you have in mind? What are the sorting criteria?
I foresee building up an awk hash indexed by "criteria", with the actual block as the hashed value, and sorting the hash once built...
Just my $.02
I was looking for some sort of block sorting. For example, a single block runs from a BEGIN MESSAGE to an END MESSAGE (the attachment at the start of the thread has the blocks). Each block contains an error/warning message. If my blocks were sorted by that message, it would be a bit easier for me to figure out how many blocks of each error/warning message there are in the original file.

Hope that makes sense. Let me know if it's not clear.

Last edited by dsid; 03-13-2017 at 01:30 PM..
# 13  
Old 03-13-2017
Hi,

This solution is a bit less efficient since it now relies on external binaries rather than shell built-ins, but for every block that has a Warning or Error, this will print out the Institution ID and the text of the error or warning.

Code:
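#!/bin/bash

input=EXTRN071_copy.txt
tmp=$(mktemp) || exit 1          # per-run temp file holding the current block
trap 'rm -f "$tmp"' EXIT

echo institution,errormessage
# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line
do
        case "$line" in
                *BEGIN\ MESSAGE*)
                        # start of a block: clear the flag and start the temp file afresh
                        unset print
                        printf '%s\n' "$line" > "$tmp"
                        ;;
                *END\ MESSAGE*)
                        printf '%s\n' "$line" >> "$tmp"

                        # only summarise blocks in which a Warning or Error was seen
                        if [ "$print" = "1" ]
                        then
                                # institution ID is the last field of the Institution line
                                institution=$(awk '$0 ~ /   Institution/ {sub(/\r$/,""); print $NF}' "$tmp")
                                # message text sits two lines below the Warning/Error line
                                errormessage=$(grep -E -A2 "^Warning|^Error" "$tmp" | tail -1 | tr -d '\r')
                                printf '%s,%s\n' "$institution" "$errormessage"
                        fi
                        ;;
                Warning*|Error*)
                        print=1
                        printf '%s\n' "$line" >> "$tmp"
                        ;;
                *)
                        printf '%s\n' "$line" >> "$tmp"
                        ;;
        esac
done < "$input"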

Sample output:
Code:
$ ./script.sh 
institution,errormessage
00000029,Original presentment Not Found !
00000029,Non-financial original Slip Not Found !
00000029,Processing Failed For Transaction!
00000046,Transaction type of chargeback is not the same as that of original presentment.
00000046,Transaction type of chargeback is not the same as that of original presentment.
00000041,Original presentment Not Found !
00000041,Non-financial original Slip Not Found !
00000041,Processing Failed For Transaction!
00000041,Original presentment Not Found !
00000041,Non-financial original Slip Not Found !
00000041,Processing Failed For Transaction!
00000050,Original presentment Not Found !
00000050,Non-financial original Slip Not Found !
00000050,Processing Failed For Transaction!
00000050,Original presentment Not Found !
00000050,Non-financial original Slip Not Found !
00000050,Processing Failed For Transaction!
00000007,Original Transaction Not Found !
00000007,Processing Failed For Transaction!
00000007,No transactions processed!
00000007,PROCESSING ERROR! - check log for error messages.
$

Hope this helps in the meantime.

EDIT: If you want the output sorted, change the last line to:

done < "$input" | /usr/bin/sort
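
Since the underlying goal is to know how many times each message turns up, the same CSV output can also be tallied rather than just sorted. This is only a sketch: tally_messages is a made-up name, and it assumes the input is exactly the "institution,errormessage" format printed by the script above.

```shell
# tally_messages   (hypothetical helper)
# Reads "institution,errormessage" lines on stdin and prints
# "<count><TAB><message>", most frequent message first.
tally_messages() {
    awk '
        $0 == "institution,errormessage" { next }    # skip the header, wherever it ends up
        {
            msg = substr($0, index($0, ",") + 1)     # everything after the first comma
            count[msg]++
        }
        END { for (m in count) printf "%d\t%s\n", count[m], m }
    ' | sort -k1,1nr -k2
}
```

With the function defined in the current shell, ./script.sh | tally_messages gives one line per distinct message, so there is no need to scroll through the sorted list and count by eye.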
# 14  
Old 03-13-2017
@drysdalk, @vgersh99 I googled and found a simple Perl script, and replaced the separator and pattern with my own text:

Code:
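#!/bin/perl
use strict;
use warnings;

# Use the BEGIN MESSAGE banner as the record separator, so each read returns one block
$/ = '******* BEGIN MESSAGE *******';
my $pattern = 'Original presentment Not Found !';
while ( <> )
{
    chomp;                      # strip the trailing banner from the record
    /\Q$pattern\E/ or next;     # skip blocks without the message (\Q..\E matches it literally)
    print $/;                   # put the banner back in front of the block
    print $_;
}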

But then again, it just gave me those blocks, which drysdalk's script already does. For each different type of error/warning message, I would need to enter the pattern manually, by first searching for it in the original file and then replacing it in the above script.

What if I did not need to enter the pattern manually, and the script automatically gave me the output for all of these patterns?

Actually, would it not be possible to first search for all blocks with a similar pattern, and move on to the next pattern once a block no longer matches? That way there would be no need for an explicit sort.

Please let me know if I am not clear.
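
One way this could look, along the lines of the awk hash vgersh99 suggested: key each block by its message text, so that no pattern has to be typed in and no explicit sort is needed. Treat it as a sketch under assumptions rather than a finished answer. The name group_blocks is invented, and it assumes (as drysdalk's grep -A2 | tail -1 does) that the message text sits two lines below the line starting with Warning or Error.

```shell
# group_blocks FILE   (hypothetical helper)
# Prints every BEGIN/END MESSAGE block that has a Warning or Error, grouped by
# message text in order of first appearance, with a count header per group.
group_blocks() {
    awk '
        { sub(/\r$/, "") }                               # tolerate DOS line endings
        /BEGIN MESSAGE/ { block = ""; key = ""; warn = 0 }
        { block = block $0 "\n" }                        # accumulate the current block
        /^(Warning|Error)/ { warn = NR }
        warn && NR == warn + 2 { key = $0 }              # assumption: message is 2 lines below
        /END MESSAGE/ && key != "" {
            if (!(key in count)) order[++n] = key        # remember first-seen order
            count[key]++
            blocks[key] = blocks[key] block
        }
        END {
            for (i = 1; i <= n; i++) {
                k = order[i]
                printf "=== %d x %s ===\n", count[k], k
                printf "%s", blocks[k]
            }
        }
    ' "$1"
}
```

Running group_blocks EXTRN071_copy.txt | grep '^===' would show just the per-message counts. Without the grep, the full blocks are printed under each header. Blocks are held in memory until the end, which should be fine for a log of this size but may not suit very large files.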

---------- Post updated at 04:43 PM ---------- Previous update was at 04:42 PM ----------


@drysdalk Let me try to understand your script and I'll get back to you.

Thanks a lot again.