Sed/awk command to convert number occurrences into date format and club a set of lines


 
# 1  
Old 10-13-2015

Hi,

I'm stuck on a requirement where my file has the format below.

Code:
20150812170500846959990854-25383-8.0.0
"ABC Report" hp96880
"4952"
20150812170501846959990854-25383-8.0.0 End of run
20150812060132846959990854-20495-8.0.0
"XYZ Report" vg76452
"1006962188"
20150812060141846959990854-20495-8.0.0
"ZZY Report" fu59172
20150812060147846959990854-20495-8.0.0 End of run

It follows the below pattern.
Line 1: Start Time
Line 2: Report Name and User
Line 3: Identifier
Line 4: End Time
In the sample above, the 2nd block is missing the End Time and the 3rd block is missing the Identifier.

The requirement is to
1. Convert all lines starting with "20" into date format, i.e. YYYY/MM/DD:HH:MM:SS.
2. Merge each block from Start Time to End Time into one line, with the fields separated by commas.
3. Ignore blocks that don't have an End Time.
4. Add a blank space in any block that doesn't contain an Identifier.
5. If possible, separate Report Name and User Name with a comma.

The output should basically look like the below.
Code:
2015/08/12:17:05:00,"ABC Report",hp96880,"4952",2015/08/12:17:05:01
2015/08/12:06:01:41,"ZZY Report",fu59172,"",2015/08/12:06:01:47


I used if/else tests inside a while loop to address the requirements, but the script slows down on large files, so I'm looking for a faster solution using sed or awk.
Can anyone please help me out here?

Last edited by Scrutinizer; 10-13-2015 at 04:23 AM.. Reason: code tags
# 2  
Old 10-13-2015
Please use code tags as required by forum rules!

And, post your attempts so far.
# 3  
Old 10-13-2015
Hi Rudi,

Sorry.

I had used the below

Code:
joinstr=""
HDate=""
userrpt=""
while read line
do
printf "."
EndHeader=`echo $line | grep -c "End of run"`;
if [ "$EndHeader" -eq 1 ]
then
HDate=`echo $line | awk 'BEGIN { FS=OFS="," } {$1=substr($1,0,4)"/"substr($1,5,2)"/"substr($1,7,2)":"substr($1,9,2)":"substr($1,11,2)":"substr($1,13,2);print}'`
joinstr=$joinstr","$HDate
echo $joinstr >> $OUT_PATH/$OUT_FILE
joinstr="" 
else
BegHeader=`echo $line | grep -c "^20"`;
if [ "$BegHeader" -eq 1 ]
then
HDate=`echo $line | awk 'BEGIN { FS=OFS="," } {$1=substr($1,0,4)"/"substr($1,5,2)"/"substr($1,7,2)":"substr($1,9,2)":"substr($1,11,2)":"substr($1,13,2);print}'`
joinstr=$HDate
else
userrpt=`echo $line | sed 's/" /",/g' | sed 's/ "/,"/g'`
joinstr=$joinstr","$userrpt
fi
fi
done<tempreportuserfile

But the script slows down and hence would like a faster solution.

Last edited by Chinmaya Kabi; 10-13-2015 at 04:32 AM.. Reason: Used proper Code tags
# 4  
Old 10-13-2015
No surprise, you're creating 12 processes per line read. Try
Code:
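awk '
# Turn the leading digits YYYYMMDDhhmmssff into YYYY/MM/DD:hh:mm:ss:ff.
function TMCVT(TStr)    {return substr(TStr,  1, 4) "/" substr(TStr,  5, 2) "/" substr(TStr,  7, 2) ":" \
                                substr(TStr,  9, 2) ":" substr(TStr, 11, 2) ":" substr(TStr, 13, 2) ":" \
                                substr(TStr, 15, 2)
                        }
# Every line: if the text before the first "-" in field 1 is 26 characters long,
# it is a timestamp, so remember its converted form in TVAR.
                {CNT = split ($1, T, "-")
                 if (length (T[1]) == 26) TVAR = TMCVT(T[1])
                }

# End time reached: print the collected block.
/End of run/    {print STRT, RPT, USR, ID, TVAR
                }
# Any timestamp line (re)starts a block; ID defaults to a quoted blank.
/^20/           {RPT = USR = ""
                 ID = "\" \""
                 STRT = TVAR
                 next
                }
# A single field is the identifier.
NF == 1         {ID = $1
                 next
                }
# Anything else: all fields but the last form the report name, the last is the user.
# "_" is an unset variable, i.e. the empty string.
                {for (i=1; i<NF; i++) RPT = RPT (RPT?FS:_) $i
                 USR = $NF
                }
' OFS="," file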
2015/08/12:17:05:00:84,"ABC Report",hp96880,"4952",2015/08/12:17:05:01:84
2015/08/12:06:01:41:84,"ZZY Report",fu59172," ",2015/08/12:06:01:47:84
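
To make the process-count remark concrete: in the loop from post #3, every `echo $line | grep -c ...` forks a subshell plus an external command for each input line. The following is a minimal sketch, not part of the original reply, of doing the same line classification with the shell's built-in `case`, which starts no extra process. The names `classify_line` and `LINE_KIND` are hypothetical and used only for illustration.

```shell
# Classify a line using shell pattern matching only (no grep, no subshell).
# The result is stored in the global variable LINE_KIND (hypothetical name).
classify_line() {
    case $1 in
        *"End of run"*) LINE_KIND=end ;;    # end-time line
        20*)            LINE_KIND=start ;;  # start-time line
        *)              LINE_KIND=data ;;   # report/user or identifier line
    esac
}

classify_line '20150812170501846959990854-25383-8.0.0 End of run'
echo "$LINE_KIND"
classify_line '"ABC Report" hp96880'
echo "$LINE_KIND"
```

The result is returned through a variable rather than through `$(...)`, because a command substitution would itself fork a subshell on every line.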

# 5  
Old 10-13-2015
Thanks Rudi,

I'm however getting the below errors.
Code:
awk: syntax error near line 2
awk: bailing out near line 2

Also, could you explain the script a bit, as I'm quite new to these commands?

Regards,
Chinmaya
# 6  
Old 10-13-2015
Citing Don Cragun: "If you are using a Solaris/SunOS system, use /usr/xpg4/bin/awk or nawk instead of awk ."
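
The syntax error near line 2 most likely comes from the `function` definition, which the legacy Solaris `/usr/bin/awk` does not understand. As a hedged sketch (the variable name `AWK` is an assumption for illustration, not something from this thread), a script can pick a suitable awk once and reuse it:

```shell
# Prefer the Solaris XPG4 awk, then nawk, and fall back to plain awk elsewhere.
# AWK is a hypothetical variable name used only in this sketch.
if [ -x /usr/xpg4/bin/awk ]; then
    AWK=/usr/xpg4/bin/awk
elif command -v nawk >/dev/null 2>&1; then
    AWK=nawk
else
    AWK=awk
fi

# Quick check that the chosen awk accepts user-defined functions.
"$AWK" 'function twice(x) { return 2 * x } BEGIN { print twice(21) }'
```

On a system whose default awk is already POSIX-conforming, the fallback branch is taken and nothing changes.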

Try a pure shell solution as well:
Code:
while read V1 V2 V3 V4 REST
    do  TIM="${V1:0:4}/${V1:4:2}/${V1:6:2}:${V1:8:2}:${V1:10:2}:${V1:12:2}"
        [[ "$V1" == 20* ]] \
          &&    { [[ "$V2 $V3 $V4" == "End of run" ]] \
                  &&    { printf "%s,%s,%s,%s,%s\n" "$BEG" "$RPT" "$USR" "$ID" "$TIM"; } \
                  ||    { BEG="$TIM"; ID='" "'; continue; }   
                } 
        [[ "$V2" ]] || { ID=$V1; continue; }
        RPT="$V1 $V2"; USR="$V3"
    done < file
2015/08/12:17:05:00,"ABC Report",hp96880,"4952",2015/08/12:17:05:01
2015/08/12:06:01:41,"ZZY Report",fu59172," ",2015/08/12:06:01:47

It's quite difficult to sync on records that have missing elements and fields consisting of several words. So the above is far from elegant and may benefit from some polishing...
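
One possible polishing, offered as a sketch under stated assumptions rather than as the poster's method: a single awk process that keeps the pending start time in a variable and prints a block only when an "End of run" line arrives while a start time is pending. It assumes the first 14 digits of a timestamp line are YYYYMMDDhhmmss, that the user name is the last space-separated field of the report line, and that a missing identifier is written as `""` as in the expected output of post #1. The function name `merge_blocks` is hypothetical, and a POSIX awk is required because `sub()` is used.

```shell
# Sketch: one awk process for the whole file, no awk user-defined functions.
# merge_blocks is a hypothetical name; it reads the named files or stdin.
merge_blocks() {
    awk '
    # Timestamp line: the first 14 digits are YYYYMMDDhhmmss.
    /^20[0-9]+/ {
        ts = substr($1, 1, 4) "/" substr($1, 5, 2) "/" substr($1, 7, 2) ":" \
             substr($1, 9, 2) ":" substr($1, 11, 2) ":" substr($1, 13, 2)
        if ($0 ~ /End of run/) {
            # End time: print the block, but only if a start time is pending.
            if (start != "") print start "," rpt "," usr "," id "," ts
            start = ""
        } else {
            # Start time: open a new block and drop any unfinished one.
            start = ts; rpt = ""; usr = ""; id = "\"\""
        }
        next
    }
    # A single field is the identifier.
    NF == 1 { id = $1; next }
    # Otherwise the last field is the user and the rest is the report name.
    NF > 1  { usr = $NF; rpt = $0; sub(/ +[^ ]+ *$/, "", rpt) }
    ' "$@"
}

# Demo with the sample data from post #1.
merge_blocks <<'EOF'
20150812170500846959990854-25383-8.0.0
"ABC Report" hp96880
"4952"
20150812170501846959990854-25383-8.0.0 End of run
20150812060132846959990854-20495-8.0.0
"XYZ Report" vg76452
"1006962188"
20150812060141846959990854-20495-8.0.0
"ZZY Report" fu59172
20150812060147846959990854-20495-8.0.0 End of run
EOF
```

Because a new start line resets all collected fields, the second block (which has no end time) is dropped without any special handling.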
# 7  
Old 10-13-2015
Thanks Rudi.

That helped a lot.
I'm looking into additional cases where there are multiple identifiers in the file and will attempt to tweak the code. I'll ask for your help if I fail.