Multi line sorting in Linux


# 1  

I have log files in the following format:
Code:
YYYY/MM/DD HH:mm:ss.msec|field2|field3|  log message

The message itself can span multiple lines, i.e., it can contain newline characters.
For example:
Code:
2013/02/05 15:33:12.234|abc|xyz| This is first single line message.
2013/02/05 15:33:12.786|abc|xyz| This is a multiple
       line message continued
to many lines.
2013/02/05 15:33:12.413|abc|xyz| This is second single line message.
2013/02/05 15:33:12.945|abc|xyz| This is last single line message.

I would like to sort this file by timestamp in ascending order while keeping each multi-line message together, i.e., to get output like this:
Code:
2013/02/05 15:33:12.234|abc|xyz| This is first single line message.
2013/02/05 15:33:12.413|abc|xyz| This is second single line message.
2013/02/05 15:33:12.786|abc|xyz| This is a multiple
       line message continued
to many lines.
2013/02/05 15:33:12.945|abc|xyz| This is last single line message.

Thanks in advance for looking into it and helping out.
# 2  
Try something like this:

Code:
 
$ awk '/2013/{if(s){print s}s=$0}
!/2013/{s=s"_^_"$0}END{print s}' file3 | sort | sed 's/_\^_/\n/g'

2013/02/05 15:33:12.234|abc|xyz| This is first single line message.
2013/02/05 15:33:12.413|abc|xyz| This is second single line message.
2013/02/05 15:33:12.786|abc|xyz| This is a multiple
       line message continued
to many lines.
2013/02/05 15:33:12.945|abc|xyz| This is last single line message.

# 3  
Thanks, pamu.

I haven't tried your suggestion yet, but just so I understand it, what it does is:
1. Check each line for the pattern 2013.
2. If the pattern is found, print the line as is.
3. If the pattern is not found, prefix each of those lines with _^_, then sort and replace it back.

Please correct me if I'm wrong.

To make it generic, I could use a four-digit year pattern instead of restricting it to 2013, or in fact use my timestamp prefix itself as the pattern. Right?

Also, does
Code:
{s=s"_^_"$0}END{print s}

take care of the newline replacement as well?

Thanks again for your time.
# 4  
Hi gini32,
Reformatting pamu's script and adding line numbers for discussion purposes:
Code:
1 awk '
2 /2013/{if(s){print s}
3         s=$0}
4 !/2013/{s=s"_^_"$0}
5 END{    print s}
6 ' file3 | sort | sed 's/_\^_/\n/g'

Note that the line numbers cannot actually appear in your awk script; they are just to make this discussion easier.
The awk program is made up of the commands on lines 2 through 5.

Line 2 selects any line that contains the string 2013 and assumes that it is the 1st line of an entry. (If 2013 could appear anywhere other than at the start of a line, it would be safer to change /2013/ to /^2013/ so the line will be selected only if 2013 appears as the 1st four characters on the line.) The first time you get here, the variable s will be an empty string and the print command will not be executed.

Line 3 then sets s to the current input line.

Line 4 appends every line that does not contain the string 2013 to the end of the variable s, using the string _^_ (rather than a newline) as the separator between the joined lines. (If you change /2013/ to /^2013/ on line 2, you need to make the same change on line 4.)
Lines 2-4 are then repeated until all lines have been read from the input file.

Line 5 prints the last entry, i.e., the value accumulated in the variable s when the end of the input is reached.

Line 6 specifies that the input file for the awk script is the file named file3, sorts the output from awk, and then uses sed to change the _^_ line separators that were inserted by awk back into newline characters.

Note that this script assumes that the concatenated lines won't be longer than {LINE_MAX} bytes on your system. If this isn't true, the script may fail because awk, sort, and sed are only guaranteed to work if the input and output files being processed are text files (which, by definition, have lines no longer than {LINE_MAX} bytes, including the terminating newline character). You can find the value of {LINE_MAX} on your system by running the command:
Code:
getconf LINE_MAX

On systems that conform to the POSIX or UNIX Standards, {LINE_MAX} must be at least 2048.
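
As a rough sketch of the generic version asked about earlier in the thread, the year test can be replaced by a pattern anchored to the full timestamp prefix. The function name sort_entries is made up for illustration, and the sketch assumes that every entry starts with YYYY/MM/DD HH:MM:SS at the beginning of a line and that the string _^_ never occurs in the log itself. It also uses awk's gsub instead of sed to restore the newlines, which is a substitution for the sed step in pamu's pipeline, not part of it.

```shell
#!/bin/sh
# Sketch only: sort multi-line log entries read from standard input.
# sort_entries is a hypothetical helper name, not an existing command.
sort_entries() {
    awk '
    # A line starting with "YYYY/MM/DD HH:MM:SS" begins a new entry:
    # print the entry collected so far and start collecting a new one.
    /^[0-9][0-9][0-9][0-9]\/[0-9][0-9]\/[0-9][0-9] [0-9][0-9]:[0-9][0-9]:[0-9][0-9]/ {
        if (s != "") print s
        s = $0
        next
    }
    # Any other line continues the current entry; join it with _^_.
    { s = s "_^_" $0 }
    # Print the final entry.
    END { if (s != "") print s }
    ' |
    LC_ALL=C sort |                           # byte-wise sort of the joined entries
    awk '{ gsub(/_\^_/, "\n"); print }'       # turn _^_ back into newlines
}
```

It could be used as sort_entries < file3 > file3.sorted. The {LINE_MAX} caveat above applies to the joined lines here as well.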
# 5  
Thanks a ton to both Pamu and Don Cragun.
That really helped.
# 6  
(GNU) sort on Linux has a -z option that sorts NUL-delimited records, which lets it handle multi-line records:
Code:
sed 's/^2013/\x0&/' log-file |sort -z |tr -d '\0'
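
A minimal sketch of the same idea with the year replaced by the full timestamp prefix, assuming GNU sed and GNU sort (the \x00 escape and the -z option are GNU extensions). The function name sort_entries_z is hypothetical. The NUL is written as the three-digit octal escape '\000' for tr, which is the form POSIX specifies; whether that changes anything on a given system is not confirmed here.

```shell
#!/bin/sh
# Sketch only: sort multi-line log entries using NUL-delimited records.
# sort_entries_z is a hypothetical helper name; requires GNU sed and GNU sort.
sort_entries_z() {
    # Put a NUL byte in front of every line that starts with a timestamp,
    # so that each entry (with its continuation lines) becomes one record.
    sed -E 's/^[0-9]{4}\/[0-9]{2}\/[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}/\x00&/' |
        LC_ALL=C sort -z |   # sort the NUL-delimited records byte-wise
        tr -d '\000'         # drop the NUL delimiters again
}
```

It could be used as sort_entries_z < log-file. Because the newlines are never replaced, no placeholder string such as _^_ is needed.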

# 7  
Thanks, binlib.
sort -z is certainly useful.
But
Code:
tr -d '\0'

is not working for me to remove the NUL characters afterwards. Let me try to figure that out myself.

Thanks again.