Sorting a text file with respect to Function/Keyword


 
# 1  
Old 12-29-2014

Hello Experts,

I am truly a beginner in shell and Perl and urgently need help sorting a file. Please help; I don't mind whether the solution is in Perl or a shell script.

Here are the details.
Input text file example
------------------------------------------------------

Each function call on a thread has one ENTRY and one EXIT line, but the two may not be adjacent. The input file is in time order.

Code:
|TIME|THREAD_ID:3086054296|dtcp_Init|ENTRY|13:16:825897777:635006|
|TIME|THREAD_ID:3086054296|dtcp_Init|EXIT|13:16:825897777:695502|

|TIME|THREAD_ID:3086063296|cc_SockRecvBuffer|ENTRY|13:16:825897777:863804|
|TIME|THREAD_ID:3086063296|cc_SockRecvBuffer|EXIT|13:16:825897777:864584|

|TIME|THREAD_ID:3086067592|CC_SendAndRecieveMessage|ENTRY|13:16:825897777:159307|
|TIME|THREAD_ID:3086067592|http_SendMessage|ENTRY|13:16:825897777:159499|

|TIME|THREAD_ID:3086067592|http_TimeOut|ENTRY|13:16:825897777:160185|
|TIME|THREAD_ID:3086067592|http_TimeOut|EXIT|13:16:825897777:160379|

|TIME|THREAD_ID:3086067592|http_SendMessage|EXIT|13:16:825897777:161849|
|TIME|THREAD_ID:3086067592|CC_SendAndRecieveMessage|EXIT|13:16:825897777:191158|

Output
-------------------------
The output file should group each function's ENTRY with its EXIT, irrespective of time sequence.
For example, the input lines above should be output as follows:

Code:
|TIME|THREAD_ID:3086054296|dtcp_Init|ENTRY|13:16:825897777:635006|
|TIME|THREAD_ID:3086054296|dtcp_Init|EXIT|13:16:825897777:695502|

|TIME|THREAD_ID:3086063296|cc_SockRecvBuffer|ENTRY|13:16:825897777:863804|
|TIME|THREAD_ID:3086063296|cc_SockRecvBuffer|EXIT|13:16:825897777:864584|

|TIME|THREAD_ID:3086067592|CC_SendAndRecieveMessage|ENTRY|13:16:825897777:159307|
|TIME|THREAD_ID:3086067592|CC_SendAndRecieveMessage|EXIT|13:16:825897777:191158|

|TIME|THREAD_ID:3086067592|http_TimeOut|ENTRY|13:16:825897777:160185|
|TIME|THREAD_ID:3086067592|http_TimeOut|EXIT|13:16:825897777:160379|

|TIME|THREAD_ID:3086067592|http_SendMessage|ENTRY|13:16:825897777:159499|
|TIME|THREAD_ID:3086067592|http_SendMessage|EXIT|13:16:825897777:161849|


Last edited by vbe; 12-29-2014 at 06:06 AM.. Reason: code tags for your code and data please, not Bold char
# 2  
Old 12-29-2014
Maybe a simple sort will do:
Code:
sort -t"|" -k4,4 -k6,6 /tmp/timeprofile_log.txt

? But, be aware that there are unmatched ENTRY lines in your file, e.g.:
Code:
|TIME|THREAD_ID:3086067592|http_RecvMessage|ENTRY|13:16:825897777:162055|
|TIME|THREAD_ID:3086067592|http_RecvMessage|ENTRY|13:16:825897777:213348|
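
A quick way to check which columns those -k options refer to, offered as a sketch rather than a tested recipe: every line starts with the delimiter, so field 1 is empty, field 4 is the function name and field 6 is the time stamp. The name sort_by_func is a hypothetical wrapper, and LC_ALL=C is an added assumption that makes the comparison byte-wise and repeatable across locales.

```shell
# sort_by_func is a hypothetical wrapper name, not something from the thread.
# With -t'|' and a leading '|', the fields are:
#   1 = (empty)  2 = TIME  3 = thread id  4 = function name
#   5 = ENTRY/EXIT  6 = time stamp  7 = (empty)
sort_by_func() {
    # Assumption: LC_ALL=C forces a plain byte-wise comparison.
    LC_ALL=C sort -t'|' -k4,4 -k6,6 "$@"
}

# Sample lines copied from the first post, in their original order.
printf '%s\n' \
  '|TIME|THREAD_ID:3086067592|http_SendMessage|ENTRY|13:16:825897777:159499|' \
  '|TIME|THREAD_ID:3086067592|http_TimeOut|ENTRY|13:16:825897777:160185|' \
  '|TIME|THREAD_ID:3086067592|http_TimeOut|EXIT|13:16:825897777:160379|' \
  '|TIME|THREAD_ID:3086067592|http_SendMessage|EXIT|13:16:825897777:161849|' |
  sort_by_func
```

With these four lines, both http_SendMessage lines come out first, followed by both http_TimeOut lines.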

# 3  
Old 12-29-2014
Hi Rudic,

Many many thanks for the reply.

Yes, you are correct: I trimmed the file because of its size. I will follow your suggestion and let you know.

Best Regards,
Pradyumna
# 4  
Old 12-29-2014
Hi Rudic,

You are a genius.
It is working close to my expectation, but I am facing an issue like the one below.

Before sorting:
Code:
|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|ENTRY|13:16:825897777:1|
|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|EXIT|13:16:825897777:415|

|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|ENTRY|13:16:825897777:5025|
|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|EXIT|13:16:825897777:5474|

After Sorting
Code:
|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|ENTRY|13:16:825897777:1|
|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|EXIT|13:16:825897777:100085|
|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|EXIT|13:16:825897777:100095|
|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|EXIT|13:16:825897777:100102|
|TIME|

This is not what I expected. Thanks again.

Also, if possible, could you please help me process the file further to merge each ENTRY/EXIT pair of lines into one?
Input
Code:
|TIME|THREAD_ID:3086063296|ake_socketSendBuffer|ENTRY|13:16:825897777:118523|
|TIME|THREAD_ID:3086063296|ake_socketSendBuffer|EXIT|13:16:825897777:119976|

Output
Code:
|TIME|THREAD_ID:3086063296|ake_socketSendBuffer|ENTRY|13:16:825897777:118523|EXIT|13:16:825897777:119976|

Best Regards,
PD

Last edited by rbatte1; 12-29-2014 at 09:04 AM.. Reason: Added CODE tags
# 5  
Old 12-29-2014
Hello pradyumnajpn10,

Could you please try the following? It is a single awk command, which I have not tested. Kindly let us know if this helps.

Code:
awk  'FNR==NR && $0 !~ /^$/{X[$2 FS $3 FS $4]=$5!="ENTRY"?X[$2 FS $3 FS $4] $5 OFS $6:$0;next} {for(i in X){print X[i];delete X[i]}}' FS="|" OFS="|" Input_file Input_file

Thanks,
R. Singh

Last edited by RavinderSingh13; 12-29-2014 at 08:47 AM..
# 6  
Old 12-29-2014
You could try this (although not thoroughly tested):
Code:
sort -t"|" -k3,4 -k6 /tmp/timeprofile_log.txt |
awk     '$5=="ENTRY"            {if (N) printf "\n"
                                 N=1
                                 printf "%s", $0
                                 next
                                }
         $5=="EXIT"             {if (N) sub("^.*"$4"\|", "")
                                 print
                                }   
                                {N=0
                                }
        ' FS="|"

It looks like your input file is NOT in temporal order (e.g. ...:687531 ...:68745 ...:68754) if the last field has the time stamp.
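
A further possible factor, stated as an assumption because the full log is not shown in the thread: sort compares key fields as character strings unless told otherwise, so a time stamp ending in :1 sorts before one ending in :100085, which in turn sorts before one ending in :415. That would be consistent with the "After Sorting" sample in post #4. The sketch below decorates each line with the thread id, the function name and the four colon-separated parts of the time stamp, sorts the time stamp parts numerically, and strips the decoration again. The name sort_numeric_stamp is hypothetical, and the code assumes every time stamp has exactly four numeric parts and that the lines contain no tab characters.

```shell
# sort_numeric_stamp is a hypothetical helper name.
# Assumptions: field 6 looks like a:b:c:d with numeric parts, no tabs in input.
sort_numeric_stamp() {
    # Decorate: thread id, function, four stamp parts, then the original line.
    # Lines with fewer than 7 fields (blank or truncated) are dropped.
    awk -F'|' 'NF >= 7 {
        split($6, t, ":")
        printf "%s\t%s\t%s\t%s\t%s\t%s\t%s\n", $3, $4, t[1], t[2], t[3], t[4], $0
    }' "$@" |
        LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2 -k3,3n -k4,4n -k5,5n -k6,6n |
        cut -f7-    # undecorate: keep only the original line
}

# Sample lines copied from post #4, deliberately shuffled.
sample='|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|EXIT|13:16:825897777:100085|
|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|ENTRY|13:16:825897777:5025|
|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|EXIT|13:16:825897777:415|
|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|ENTRY|13:16:825897777:1|'

printf '%s\n' "$sample" | LC_ALL=C sort -t'|' -k4,4 -k6,6   # string comparison
echo
printf '%s\n' "$sample" | sort_numeric_stamp                 # numeric comparison
```

For these four lines the plain sort orders the stamps as :1, :100085, :415, :5025, whereas the sketch orders them as :1, :415, :5025, :100085.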

Last edited by RudiC; 12-29-2014 at 10:29 AM..
# 7  
Old 12-29-2014
Hi Ravinder,

Many thanks for the reply.
As I mentioned earlier, the ENTRY and EXIT lines in the input file are not sequential, so I need to sort the file first, and RudiC's sort command is working well.

@Rudic: Many many many thanks for the help.
I think the time stamps in my input file are not in order, and that is why it is not sorting as I expect,
but something else is still happening that does not meet my requirements.

If you look at line 84 of the input file, that is where the function "CC_StreamWaitForData" appears for the first time:

|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|ENTRY|13:16:825897777:36717|
|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|EXIT|13:16:825897777:37586|


But after sorting, the first appearance of the same function "CC_StreamWaitForData" is strange.

|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|ENTRY|13:16:825897777:1| /* This one makes sense: it follows the time stamp order. It is line 4653 of the input file. */
|TIME|THREAD_ID:3086067592|CC_StreamWaitForData|EXIT|13:16:825897777:100085| /* But this line has suddenly been picked up from line 7582 of the input file. */

Could you please tell me whether I am doing something wrong in the input file format, or whether this is expected?
Would you suggest that I change the input file format?


Best regards,
PD
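
One sketch that avoids sorting altogether, under the assumption that the order of lines in the file reflects the order of the calls and that calls of the same function on the same thread nest properly: remember each ENTRY on a per-thread, per-function stack and emit one joined line when the matching EXIT arrives. It pairs lines by position in the file, not by time stamp value. The name pair_calls is hypothetical; unmatched ENTRY lines, such as the http_RecvMessage ones noted in post #2, are reported on standard error at the end.

```shell
# pair_calls is a hypothetical helper name.
# Assumption: file order reflects call order, and calls nest properly.
pair_calls() {
    awk -F'|' '
        NF < 7 { next }                      # skip blank or truncated lines
        { key = $3 SUBSEP $4 }               # thread id plus function name
        $5 == "ENTRY" {
            depth[key]++
            pending[key, depth[key]] = $0    # push this ENTRY line
            next
        }
        $5 == "EXIT" && depth[key] > 0 {
            entry = pending[key, depth[key]] # pop the most recent ENTRY
            delete pending[key, depth[key]]
            depth[key]--
            print entry "EXIT|" $6 "|"       # entry already ends with "|"
            next
        }
        $5 == "EXIT" { print "unmatched EXIT: " $0 | "cat 1>&2" }
        END {
            for (k in pending)
                print "unmatched ENTRY: " pending[k] | "cat 1>&2"
        }
    ' "$@"
}

# The two lines from post #4 that should be merged into one.
printf '%s\n' \
  '|TIME|THREAD_ID:3086063296|ake_socketSendBuffer|ENTRY|13:16:825897777:118523|' \
  '|TIME|THREAD_ID:3086063296|ake_socketSendBuffer|EXIT|13:16:825897777:119976|' |
  pair_calls
```

The joined lines come out in the order in which the EXIT records appear in the file, so an inner call such as http_TimeOut is printed before the call that encloses it.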