ksh: how to extract strings from each line based on a condition


 
# 1  
Old 03-14-2012

Hi,
I'm a newbie and have never worked on Unix before. I want a shell script to do the following:
I want to extract strings from each line, based on the type of line (Nameline, Subline, Markline), and write them to another file. Below is a sample of the format.

Code:
2010-12-21 14:00"1"Nameline"Midterm"First Name:Jane     Last Name: Doe      "
2010-12-21 14:00"2"Subline"Midterm"Subject:Mathematics                     "
2010-12-21 14:00"3"Markline"Midterm"Int.Marks:48   Ext.Marks:92             "
2010-12-21 14:00"4"Subline"Midterm"Subject:Physics                         "
2010-12-21 14:00"5"Markline"Midterm"Int.Marks:30   Ext.Marks:86             "
2010-12-21 14:00"6"Subline"Midterm"Subject:Chemistry                       "
2010-12-21 14:00"7"Markline"Midterm"Int.Marks:45   Ext.Marks:98             "
2011-04-22 09:00"1"Nameline"Final"First Name:John     Last Name: Smith    "
2010-04-22 09:00"2"Subline"Final"Subject:Mathematics                     "
2010-04-22 09:00"3"Markline"Final"Int.Marks:30   Ext.Marks:76             "
2010-04-22 09:00"4"Subline"Final"Subject:Physics                         "
2010-04-22 09:00"5"Markline"Final"Int.Marks:32   Ext.Marks:70             "
2010-04-22 09:00"6"Subline"Final"Subject:Chemistry                       "
2010-04-22 09:00"7"Markline"Final"Int.Marks:41   Ext.Marks:88             "

From the Nameline, strings such as Jane and Doe should be extracted and written to a new file in the format shown below. Fields that do not apply to a line are to be left blank. The structure and length of the last column are the same for a given type of line; for example, every first name starts at position 12 of the last column and has length 8, and every last name starts at position 31 and has length 9. Similarly, for the other line types the positions of the values to be extracted are fixed.

Code:
<<Exam_Date,Exam_Time,Line_Type,Exam_Type,Student_FName,Student_LName,Subject,Int_Marks,Ext_Marks>>
2010-12-21,14:00,Nameline,Midterm,Jane,Doe,   ,   ,   "
2010-12-21,14:00,Subline,Midterm,   ,   ,Mathematics,   ,   "
2010-12-21,14:00,Markline,Midterm,   ,   ,   ,48,92"
2010-12-21,14:00,Subline,Midterm,   ,   ,Physics,   ,   "
2010-12-21,14:00,Markline,Midterm,   ,   ,   ,30,86"
2010-12-21,14:00,Subline,Midterm,   ,   ,Chemistry,   ,   "
2010-12-21,14:00,Markline,Midterm,   ,   ,   ,45,98"

Can anyone please help me out?
Thanks!
# 2  
Old 03-14-2012
This looks very much like a homework/schoolwork assignment.

Please follow our rules and repost in the homework forums, which require some special information.

Thank you.
# 3  
Old 03-14-2012
Provisionally reopened on request, since it may not actually be homework: the data is intentionally simplified and obscured compared to the real data.
# 4  
Old 03-14-2012
Code:
$ cat unknown.awk

BEGIN { FS="\""
        OFS="," }

# Don't repeat the same tail if we don't recognize something.
# Substitute "2010-12-21 14:00" into "2010-12-21,14:00"
{       TAIL="";        sub(/ /, ",", $1)       }

# Whenever we see a type we recognize, assemble a 'tail' to append to the line
$3 == "Subline" {
        VAL=substr($5, 9, 20);
        sub(/ *$/, "", VAL); # Strip off extra spaces
        TAIL="   " OFS "   " OFS VAL OFS "   " OFS "   \"";
}


$3 == "Nameline" {
        FNAME=substr($5, 12, 8);
        LNAME=substr($5, 32, 9);
        sub(/ *$/, "", FNAME);
        sub(/ *$/, "", LNAME);

        # First name before last name, matching the requested column order
        TAIL=FNAME OFS LNAME OFS "   " OFS "   " OFS "   \""
}

$3 == "Markline" {
        TAIL="   ,   ,   ," substr($5, 11, 2) "," substr($5, 26, 2) "\""
}

# Print the line with fields in the new order and tail appended.
{
        print $1 OFS $3 OFS $4 OFS TAIL
}

$ awk -f unknown.awk data

2010-12-21,14:00,Nameline,Midterm,Jane,Doe,   ,   ,   "
2010-12-21,14:00,Subline,Midterm,   ,   ,Mathematics,   ,   "
2010-12-21,14:00,Markline,Midterm,   ,   ,   ,48,92"
2010-12-21,14:00,Subline,Midterm,   ,   ,Physics,   ,   "
2010-12-21,14:00,Markline,Midterm,   ,   ,   ,30,86"
2010-12-21,14:00,Subline,Midterm,   ,   ,Chemistry,   ,   "
2010-12-21,14:00,Markline,Midterm,   ,   ,   ,45,98"
2011-04-22,09:00,Nameline,Final,John,Smith,   ,   ,   "
2010-04-22,09:00,Subline,Final,   ,   ,Mathematics,   ,   "
2010-04-22,09:00,Markline,Final,   ,   ,   ,30,76"
2010-04-22,09:00,Subline,Final,   ,   ,Physics,   ,   "
2010-04-22,09:00,Markline,Final,   ,   ,   ,32,70"
2010-04-22,09:00,Subline,Final,   ,   ,Chemistry,   ,   "
2010-04-22,09:00,Markline,Final,   ,   ,   ,41,88"

$


Last edited by Corona688; 03-14-2012 at 02:18 PM.. Reason: mispaste
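
For anyone wondering why the script tests $3 and cuts from $5: with the double quote as the field separator, each input line splits into the timestamp, the sequence number, the line type, the exam type and the fixed-width payload. A minimal sketch to inspect that split is below; show_fields is a made-up helper name for illustration, not part of the script above.

```shell
# show_fields is a hypothetical helper: print every quote-delimited field
# of one input line, labelled with its awk field number.
show_fields() {
    printf '%s\n' "$1" |
        awk -F'"' '{ for (i = 1; i <= NF; i++) printf "$%d=[%s]\n", i, $i }'
}

# Example with one line from the sample data
show_fields '2010-12-21 14:00"2"Subline"Midterm"Subject:Mathematics                     "'
```

The line type shows up as $3 and the payload, trailing spaces included, as $5, which is what the substr() offsets are counted against.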
# 5  
Old 03-14-2012
Code:
tr -s ":\"" "  " < infile | awk '
   $5 == "Nameline" { printf "%s,%s:%s,%s,%s,%s,%s,   ,   ,   \"\n", $1, $2, $3, $5, $6, $9, $12 }
   $5 == "Subline"  { printf "%s,%s:%s,%s,%s,   ,   ,%s,   ,   \"\n" , $1, $2, $3, $5, $6, $8 }
   $5 == "Markline" { printf "%s,%s:%s,%s,%s,   ,   ,   ,%s,%s\"\n" , $1, $2, $3, $5, $6, $8, $10 }
' > outfile
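
Since the thread title asks about ksh, here is a rough shell-only sketch of the same idea, assuming only POSIX features (parameter expansion, read with IFS, cut), so it should also run under ksh. The function names rtrim, field and convert_line are invented for this example, and the column positions are the ones given in the first post, except that the last name is read from position 32 because position 31 is a space in the sample data.

```shell
# Hypothetical helper names; column positions come from the first post.

# Remove trailing spaces from $1 using only parameter expansion.
rtrim() {
    printf '%s' "${1%"${1##*[! ]}"}"
}

# Print characters $2..$3 (1-based, inclusive) of string $1, right-trimmed.
field() {
    rtrim "$(printf '%s\n' "$1" | cut -c"$2-$3")"
}

# Convert one input line to the requested comma-separated layout.
# Note: the variables set here are global, as plain POSIX sh has no "local".
convert_line() {
    # Split on double quotes: timestamp, sequence number, line type,
    # exam type, fixed-width payload.
    IFS='"' read -r stamp seq type exam payload rest <<EOF
$1
EOF
    # Fields that do not apply stay as three blanks, as in the requested output.
    fname='   ' lname='   ' subject='   ' imarks='   ' emarks='   '
    case $type in
        Nameline) fname=$(field "$payload" 12 19)   # first name: start 12, length 8
                  lname=$(field "$payload" 32 40) ;; # last name: 32 skips the space at 31
        Subline)  subject=$(field "$payload" 9 28) ;;
        Markline) imarks=$(field "$payload" 11 12)
                  emarks=$(field "$payload" 26 27) ;;
        *)        return 0 ;;   # unknown line types are skipped in this sketch
    esac
    printf '%s,%s,%s,%s,%s,%s,%s,%s,%s"\n' "${stamp% *}" "${stamp#* }" \
        "$type" "$exam" "$fname" "$lname" "$subject" "$imarks" "$emarks"
}

# Usage (infile and outfile are placeholder names):
# while IFS= read -r line; do convert_line "$line"; done < infile > outfile
```

Because it cuts by position rather than splitting on whitespace, this keeps working if a name or subject contains a space. It starts a cut process for every extracted value, so for large files the awk versions above will be much faster.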
