Log4j combining lines to single line


 
# 1  
Old 11-20-2018

Hi,
Our log4j file contents look like this:
Code:
2018-11-20T00:06:58,888  INFO [HiveServer2-Background-Pool: Thread-21912] ql.Driver: Executing command(queryId=hive_20181120000656_49af4ad0-1d37-4312-872c-a247ed80c181): CREATE TABLE RESULTS.E7014485_ALL_HMS_CAP1
 AS SELECT name,dept
 from employee
  Where employee='Jeff'
2018-11-20T00:06:58,888  INFO [HiveServer2-Background-Pool: Thread-21912] ql.Driver: Query ID = hive_20181120000656_49af4ad0-1d37-4312-872c-a247ed80c181
2018-11-20T00:06:58,888  INFO [HiveServer2-Background-Pool: Thread-21912] ql.Driver: Executing command(queryId=hive_20181120000656_49af4ad0-1d37-4312-872c-a247ed80c182): CREATE TABLE RESULTS.E7014485_ALL_HMS_CAP2
 AS SELECT name,dept
 from employee
  Where employee='Yung'
2018-11-20T00:06:58,888  INFO [HiveServer2-Background-Pool: Thread-21912] ql.Driver: Query ID = hive_20181120000656_49af4ad0-1d37-4312-872c-a247ed80c182

As you can see, the CREATE statement spans several lines, and the number of lines can vary.
I need to have only one line per entry.
My output should look like this:
Code:
2018-11-20T00:06:58,888  INFO [HiveServer2-Background-Pool: Thread-21912] ql.Driver: Executing command(queryId=hive_20181120000656_49af4ad0-1d37-4312-872c-a247ed80c181): CREATE TABLE RESULTS.E7014485_ALL_HMS_CAP1 AS SELECT name,dept from employee  Where employee='Jeff'
2018-11-20T00:06:58,888  INFO [HiveServer2-Background-Pool: Thread-21912] ql.Driver: Query ID = hive_20181120000656_49af4ad0-1d37-4312-872c-a247ed80c181
2018-11-20T00:06:58,888  INFO [HiveServer2-Background-Pool: Thread-21912] ql.Driver: Executing command(queryId=hive_20181120000656_49af4ad0-1d37-4312-872c-a247ed80c182): CREATE TABLE RESULTS.E7014485_ALL_HMS_CAP2 AS SELECT name,dept from employee  Where employee='Yung'
2018-11-20T00:06:58,888  INFO [HiveServer2-Background-Pool: Thread-21912] ql.Driver: Query ID = hive_20181120000656_49af4ad0-1d37-4312-872c-a247ed80c182

Any idea on how to achieve this?

I tried sed with some regex patterns, but was unable to make it work.

Thanks
# 2  
Old 11-20-2018
Where and how did you get stuck with your "sed and some regex patterns" attempt?
And, what OS, shell, sed versions are you using?
# 3  
Old 11-20-2018
Hi,
The idea is that for any line which does not start with a date, the first character is replaced with a backspace.
So I tried the command below:

Code:
sed '/^[[:digit:]]\{4\}-[[:digit:]]\{2\}-[[:digit:]]\{2\}T[[:digit:]]\{2\}:[[:digit:]]\{2\}:[[:digit:]]\{2\},[[:digit:]]\{3\}\ [[:alpha:]]*/! s/^/^\b/g' logfile.txt

But backspace is not working, maybe character is wrong, or I need to try another way.

OS: AWS AMI
Shell: Bash
# 4  
Old 11-21-2018
Quote:
Originally Posted by wahi80
But backspace is not working, maybe character is wrong, or I need to try another way.
Indeed. When you work with sed, especially when you are about to do rather complex things, it pays to first define as exactly as possible what you are going to do, so the first step is to describe (in as excruciating detail as possible) what we are going to do and when. If in the following my assumptions are wrong don't hesitate to correct them.

We want to rearrange the line endings, so that lines only start with a "clause" of this type:

Code:
2018-11-20T00:06:58,888  INFO [HiveServer2-Background-Pool: Thread-21912] ql.Driver:

Question: might this clause also be spread over several lines? If yes, we need to do more work; for now I assume it isn't.

What do we need to do when we encounter such a clause? We need to start collecting text until we hit another such clause - or the end of file - which is when we need to output everything collected so far in one line. For all the other lines we encounter this means: they must be part of such a previous line and we simply collect them to what we have already. Now let us formalise this into rules what we do when:

Code:
Lines starting with the clause:
          - remove the newlines from the last line if there is one
          - output the last line if there is one
          - clear the collecting buffer
          - put the new line into the collecting buffer
EOF, last line:
          - add it to what we have collected so far
          - remove the newlines from the currently collected line
          - output that line
other lines:
          - put the line into the collecting buffer

This is already the very structure of our sed-script, because sed works rule-based. Furthermore, sed has exactly what we need for this: the "hold space". This is the collecting buffer we will need. I suggest you sit down with the man page and read what it does and how it is manipulated.
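
To get a feeling for the hold space before reading the real script, here is a minimal sketch. The function name join_lines and the three-line sample input are made up for illustration; it only shows how h, H and x move text between pattern space and hold space.

```shell
# join_lines: hypothetical helper that joins all input lines with single blanks.
join_lines() {
    sed -n '
        # line 1 starts the collection: copy it into the hold space
        1h
        # every later line is appended to the hold space after a newline
        1!H
        # on the last line, swap the collection back, flatten it and print it
        ${ x; s/\n/ /g; p; }
    '
}

printf 'a\nb\nc\n' | join_lines    # prints: a b c
```

The -n switch suppresses the automatic printing, so only the explicit p at the end produces output.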

Let us start coding. We need a regexp to express what I called "clause" above. I will do it, but you probably want to refine it because you know your data better than I do. E.g. I coded the month and day as "[0-9][0-9]" because I supposed dates will be written "2018-03-04", but maybe they are not and it would be "2018-3-4", in which case you will have to correct the regexp. Also, you may want to remove the comments: not every sed accepts comments written after a command, and they are only there for you to better understand:

Code:
# run this with "sed -n -f scriptfile logfile.txt", without -n the output lines would appear twice
/^20[0-9][0-9]-[0-9][0-9]-[0-9][0-9]T[0-9][0-9]:[0-9][0-9]:[0-9][0-9],[0-9].*\[.*\] ql\.Driver:/ {
     x                      # exchange hold space and pattern space, this clears the collecting buffer and puts
                            #  the new line there, we work from now on with the last line collected so far
     s/\n/ /g               # replace newlines with blanks
     1!p                    # and print the line finally (not on line 1, nothing is collected yet)
     $!b end                # and go to end of script/start with the next line
     x                      # we get here only if the last line of the file starts a new entry: get it back
     p                      # and print it too
     b end
}
$ {
     H                      # add this line to the hold space
     x                      # exchange hold space and pattern space, we have what we collected in pattern space again
     s/\n/ /g               # replace newlines with blanks
     p                      # and print the line finally
     b end                  # and go to end of script/start with the next line
}
                            # here we land only with all "other" lines not covered by above rules: they would jump over this
H                           # add this line to the hold space
d                           # and delete the line from pattern space, we do not want to print it

:end                        # here we land when we execute the b-commands
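
For trying the idea out quickly on sample data, here is a compact, self-contained sketch of the same hold-space technique wrapped in a shell function. The name join_entries, the shortened date regexp and the sample lines are assumptions made for illustration. It deletes the newlines instead of replacing them with blanks, which reproduces the requested output but relies on every continuation line starting with a blank of its own.

```shell
# join_entries: hypothetical helper; reads a log on stdin or from file arguments.
# Assumes every entry starts with a timestamp like 2018-11-20T00:06:58 and that
# continuation lines carry their own leading blank (newlines are simply deleted).
join_entries() {
    sed -n '
        /^20[0-9][0-9]-[0-9][0-9]-[0-9][0-9]T/ {
            # new entry: park it in the hold space, fetch the previous entry
            x
            # emit the previous entry, except on line 1 where there is none yet
            1!{ s/\n//g; p; }
            # the new entry is also the last line: emit it as well
            ${ x; p; }
            d
        }
        # continuation line: append it to the entry in the hold space
        H
        # end of input: emit whatever has been collected
        ${ x; s/\n//g; p; }
    ' "$@"
}

printf '%s\n' \
    '2018-11-20T00:06:58,888  INFO [T-1] ql.Driver: CREATE TABLE t' \
    ' AS SELECT name,dept' \
    ' from employee' \
    '2018-11-20T00:06:58,888  INFO [T-1] ql.Driver: Query ID = q1' |
    join_entries
```

The sample should come out as two lines, the first one ending in "CREATE TABLE t AS SELECT name,dept from employee". On a real file the call would be something like "join_entries logfile.txt".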

I hope this helps.

bakunin

Last edited by bakunin; 11-21-2018 at 02:44 AM..
# 5  
Old 11-21-2018
Not as sophisticated as bakunin's proposal (esp. the date detection regex), but you could try also
Code:
tac file | sed -n '/^2018/!{G; h; b}; G; s/\n//g; p; s/.*//; h' | tac

It starts from the end and, as long as no date is found, composes the "CREATE TABLE" statement in the hold space. When a date is found, it appends the hold space, prints the result, and empties the hold space.
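
To watch the reversed approach at work, here is a small sketch that runs the same sed command on a made-up three-line sample. The function names sample and join_reversed are hypothetical; the sketch assumes GNU sed and tac from GNU coreutils, and like the command above it only recognises lines starting with "2018".

```shell
# sample: made-up input, one entry with a continuation line, then a one-line entry.
sample() {
    printf '%s\n' \
        '2018-11-20T00:00:01,000 INFO a: CREATE TABLE t' \
        ' AS SELECT 1' \
        '2018-11-20T00:00:02,000 INFO a: done'
}

# join_reversed: reverse the input, then fold each continuation line into the
# dated line that follows it in the reversed stream. The output is still reversed.
join_reversed() {
    tac | sed -n '/^2018/!{G; h; b}; G; s/\n//g; p; s/.*//; h'
}

sample | join_reversed          # the "done" entry comes first here
sample | join_reversed | tac    # the second tac restores the original order
```

Because the file is read backwards, a continuation line is always seen before the dated line it belongs to, so the sed part never has to look ahead.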