Multi line extraction based on condition


 
# 1  
Old 07-21-2014

Hi,

I have some data in a file, as shown below:

Code:
******************************
Class 1A
Students absent are :
1. ABC
2. CDE
3. CPE

******************************
Class 2A
Students absent are :

******************************
Class 3A
Students absent are :

******************************
Class 17ACF
Students absent are :
1. ABCD
2. XYZ

From this file I just need to extract the blocks that actually list some absent students.
The class name is dynamic, and so is the number of absent students.
E.g. the output should look like this:

Code:
******************************
Class 1A
Students absent are :
1. ABC
2. CDE
3. CPE

******************************
Class 17ACF
Students absent are :
1. ABCD
2. XYZ

Please help: how could I do this with a simple command or a script?

Thanks in advance
rel

Last edited by reldb; 07-21-2014 at 04:37 AM.. Reason: adding the code tag
# 2  
Old 07-21-2014
Moderator's Comments:
Mod Comment Please note that in the normal boards homework and coursework questions are forbidden.

You may use the special board for this kind of question (additional rules apply, see there). If this is not homework and the thread should be reopened, please ask me (or any other moderator) to do so.

bakunin


Moderator's Comments:
Mod Comment Addendum: reldb avouched that this is not homework and his data is just for example purpose. Thread reopened.

Last edited by bakunin; 07-21-2014 at 12:05 PM..
# 3  
Old 07-21-2014
I suggest you read the man page of "grep" and find out what this utility can do for you. The man page is - like any other man page - accessible via

Code:
man grep

If you need to count lines you might want to give the "-c" option some special attention.
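As a small illustration of "-c", the sketch below counts the numbered lines in a sample file. The file name absent.txt and the search pattern are assumptions made up for this example. Note that "-c" counts matches across the whole file, so on its own it does not tell you which class has absentees.

```shell
# Build a small sample file (hypothetical name) in a temporary directory.
tmpdir=$(mktemp -d)
sample="$tmpdir/absent.txt"
cat > "$sample" <<'EOF'
******************************
Class 1A
Students absent are :
1. ABC
2. CDE
3. CPE

******************************
Class 2A
Students absent are :

EOF

# -c prints the number of matching lines instead of the lines themselves.
# The pattern matches lines that start with one or more digits and a dot.
absent_count=$(grep -c '^[0-9][0-9]*\.' "$sample")
echo "$absent_count"

rm -rf "$tmpdir"
```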

I hope this helps.

bakunin
# 4  
Old 07-21-2014
Hi reldb,
I can quickly give you an algorithm for this. Just convert it into Unix code and use the grep command for pattern matching.
Create two temporary files, file1.txt and file2.txt:
Code:
scount=0
while read k
do

    if [ line starts with Class ]
    then
        put the line into a file1.txt
    
    elif [ line starts with Students ]
    then
        append the line into file1.txt
    
    elif [ line starts with a number ]
    then
        scount+=1
        append the line into file1.txt
    elif [ line is empty and scount >=1 ]
    then
        insert an empty line into file1.txt
        insert ****** into file1.txt
        append file1.txt data to file2.txt
        empty file1.txt
        scount=0
    elif [ line is empty and scount=0 ]
    then
        empty the file1.txt
    fi
done <"Sourcefile.txt"

The above could be used assuming the structure of your source file remains the same as you have provided.
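
One possible way to turn the pseudocode above into a runnable shell loop is sketched below. It is an interpretation, not the only translation: it buffers each block in a shell variable instead of file1.txt and file2.txt, treats the row of asterisks as the start of a block, and flushes once more after the loop because the last block may not be followed by an empty line. The function name extract_absent is made up for this example.

```shell
# extract_absent FILE: print only the blocks that list at least one student.
# Assumes each block starts with a row of asterisks, as in the sample data,
# and that only student lines start with a digit.
extract_absent() {
    buf=''      # lines of the block currently being read
    scount=0    # number of student lines seen in that block
    while IFS= read -r k || [ -n "$k" ]
    do
        case "$k" in
            '***'*)
                # A new block starts: print the previous one if it had students.
                if [ "$scount" -ge 1 ]; then printf '%s' "$buf"; fi
                buf=''
                scount=0
                ;;
            [0-9]*)
                scount=$((scount + 1))
                ;;
        esac
        # Append the current line, followed by a newline, to the buffer.
        buf="$buf$k
"
    done < "$1"
    # The last block is not followed by another row of asterisks.
    if [ "$scount" -ge 1 ]; then printf '%s' "$buf"; fi
}
```

It could be called as `extract_absent Sourcefile.txt > file2.txt`.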

Moderator's Comments:
Mod Comment edit by bakunin: please use CODE-tags - even for pseudocode. Thank you!

Last edited by bakunin; 07-21-2014 at 01:41 PM..
# 5  
Old 07-21-2014
You should have shown us what your attempts were. Anyhow, try
Code:
awk     '/\*\*\*/               {if (CNT>4) for (i=1;i<=CNT;i++) print T[i]; CNT=0}
                                {T[++CNT]=$0}
         END                    {if (CNT>4) for (i=1;i<=CNT;i++) print T[i]}
        '  file
******************************
Class 1A
Students absent are :
1. ABC
2. CDE
3. CPE

******************************
Class 17ACF
Students absent are :
1. ABCD
2. XYZ

EDIT: This was nice but it didn't quite satisfy your spec:
Code:
awk '(A=gsub (/\n/, "&"))>4||A==0' RS="*" ORS="*" file
******************************
Class 1A
Students absent are :
1. ABC
2. CDE
3. CPE

****************************************************************************************
Class 17ACF
Students absent are :
1. ABCD
2. XYZ
*

# 6  
Old 07-26-2014
DHeisenberg - Thanks for the suggestion. I wrote a program along similar lines in Java and it is working perfectly fine.


RudiC - Thanks for your suggestion. The one below worked fine (I got some errors from the 2nd suggestion with live data).
Code:
awk     '/\*\*\*/               {if (CNT>4) for (i=1;i<=CNT;i++) print T[i]; CNT=0}
                                {T[++CNT]=$0}
         END                    {if (CNT>4) for (i=1;i<=CNT;i++) print T[i]}
        '  file

I have a couple of questions, to understand it better and to use it for future requirements as well.

1. /\*\*\*/ splits the input into paragraphs at the *** pattern, and a paragraph is then printed depending on its number of lines.
Instead of counting lines, how can I check whether any line in the paragraph starts with a number followed by a dot, and print the paragraph only then (a kind of true/false logic)?

2. I couldn't understand why the print logic appears twice (once before END and once in END, with similar logic) even though each paragraph appears only once in the final output.

Thanks
Moderator's Comments:
Mod Comment Please use CODE tags (not ICODE tags) for multi-line samples of input, output, and code segments.

Last edited by Don Cragun; 07-26-2014 at 06:14 PM.. Reason: Change ICODE tags to CODE tags.
# 7  
Old 07-26-2014
You need another condition plus another variable.
This one uses a string to store the line (simpler than an array).
Code:
awk '
$1~/\*\*\*/ {if (c>0) print buf; c=0; buf=$0; next}
{buf=buf RS $0}
$1~/[0-9]+\./ {c++}
END {if (c>0) print buf}
' file

Because it prints only at the *** lines, and your example does not end with one, you need another print at the end; otherwise your last section is never printed.
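
To see why the END block matters, the sketch below runs the script above twice on a two-section sample, once as posted and once with the END rule removed. The sample file name and the variable names with_end and without_end are made up for this illustration.

```shell
tmpdir=$(mktemp -d)
sample="$tmpdir/classes.txt"
cat > "$sample" <<'EOF'
******************************
Class 1A
Students absent are :
1. ABC

******************************
Class 17ACF
Students absent are :
1. ABCD
2. XYZ
EOF

# As posted: the END rule flushes the section still held in buf.
with_end=$(awk '
$1~/\*\*\*/ {if (c>0) print buf; c=0; buf=$0; next}
{buf=buf RS $0}
$1~/[0-9]+\./ {c++}
END {if (c>0) print buf}
' "$sample")

# Same script without END: a section is printed only when the next row of
# asterisks arrives, so the last section never reaches the output.
without_end=$(awk '
$1~/\*\*\*/ {if (c>0) print buf; c=0; buf=$0; next}
{buf=buf RS $0}
$1~/[0-9]+\./ {c++}
' "$sample")

printf '%s\n' "$with_end" | grep -c '^Class'      # counts both classes
printf '%s\n' "$without_end" | grep -c '^Class'   # counts Class 1A only

rm -rf "$tmpdir"
```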

Last edited by MadeInGermany; 07-27-2014 at 01:05 PM.. Reason: buf="" replaced by buf=$0; next