Concatenate three lines into one

Tags
awk, beginners, solved

 

 
# 1  
Old 08-13-2014
Concatenate three lines into one

Hi. I'm new to this forum.

I am attempting to parse an Audit Log from Cognos/TM1, selecting only Event IDs of "client" which are found on the "start-tag" record. These Logs are in a pseudo-XML format but not a true XML format. I want to FTP an Audit Log File from the Cognos server to our UNIX server. Then I want to look for "client" at position 28 in a given line, and upon finding that, write the current line and the next two lines after it as one output line. (We have Shell Scripts already that FTP data from Windows to UNIX, so I don't need assistance with that.)

These lines have CRLF (I think) after each one. When I view them in Windows-based editors (like TextPad), they are separate lines.

I'm a Unix script newbie. I've searched the Web for possible solutions but have not found any. I have found examples of merging two lines by patterns using sed and one using sed (I think) and a line range, but the sample had a hard-coded line range.

We run the AIX flavor of UNIX at my company, and we use ksh as our shell most of the time.

Pseudo-code:

IF current-line-number line has "client" at position 28,
    write current-line-number concatenate current-line-number plus 1 concatenate current-line-number plus 2
FI

Thank you for your help.

Last edited by FredAtArrow; 08-13-2014 at 02:27 PM.
# 2  
Old 08-13-2014
Welcome to the UNIX and Linux Forums.
Thank you for telling us what OS and shell you're using. That always makes it easier for the volunteers here to suggest solutions that will work in your environment.

Please provide a short sample input file and a corresponding output file showing what you want to be produced by your script for that sample input. (Please use CODE tags when showing us sample input, sample output, and code fragments.)

- Don
# 3  
Old 08-13-2014
Thank you for your reply, Mr. Cragun.

Input file sample (I put in "(crlf)" to show where crlf's are):

Code:
<Commit ts="20140812153817" client="AAAAAA">(crlf)
"144","AAAAAA","00.00.000.000","User 'AAAAAA' successfully logged in from address '00.00.000.000'."crlf
</Commit>crlf
<Commit ts="20140812160347" client="BBBBBB">(crlf)
"144","BBBBBB","00.0.000.00","User 'BBBBBB' successfully logged in from address '00.0.000.00'."(crlf)
</Commit>(crlf)
<Commit ts="20140812170838" client="CCCCCC">(crlf)
"144","CCCCCC","00.0.000.00","User 'CCCCCC' successfully logged in from address '00.0.000.00'."(crlf)
</Commit>(crlf)

Output file sample (again I put "(crlf)" to show where crlf's should be):

Code:
<Commit ts="20140812153817" client="AAAAAA">"144","AAAAAA","00.00.000.000","User 'AAAAAA' successfully logged in from address '00.00.000.000'."</Commit>(crlf)
<Commit ts="20140812160347" client="BBBBBB">"144","BBBBBB","00.0.000.00","User 'BBBBBB' successfully logged in from address '00.0.000.00'."</Commit>(crlf)
<Commit ts="20140812170838" client="CCCCCC">"144","CCCCCC","00.0.000.00","User 'CCCCCC' successfully logged in from address '00.0.000.00'."</Commit>(crlf)

Once in a single-record-per-event format, I can parse it using our ETL Tools.
# 4  
Old 08-13-2014
Code:
awk '{ gsub(/[\r\n]/, ""); L=L$0 } /<\/Commit>/ { print L; L="" }' inputfile > outputfile

# 5  
Old 08-13-2014
I think Corona688 missed the fact that you want CRLF line terminators in your output too. To do that, this trivial modification to his code:
Code:
awk '{ gsub(/[\r\n]/, ""); L=L$0 } /<\/Commit>/ { print L "\r"; L="" }' inputfile > outputfile

should do what you want as long as the 1st line in every set of 3 adjacent lines in your input file contains "client" starting in column 29 (columns are numbered starting from 1, not 0, in awk). If you want to discard other types of records, this more complicated script should work:
Code:
awk '
cnt {	out = out $0
	if(--cnt == 0) {
		gsub("\r", "", out)
		print out "\r"
	}
	next
}
substr($0, 29, 6) == "client" {
	cnt = 2
	out = $0
}' inputfile > outputfile
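A quick way to try the script above, offered only as a sketch: the sample below is built with printf, using a shortened version of the first record from the sample input plus one made-up <Other ...> line (hypothetical, not from the real log) that should be discarded. index() is used to confirm the column where "client" starts, and tr strips the carriage return from the result so it is easy to compare.

```shell
# Sample input with CRLF line terminators. The <Commit> record is a shortened
# copy of the sample in this thread; the <Other ...> line is made up to show
# that lines without "client" in column 29 are discarded.
sample() {
	printf '<Commit ts="20140812153817" client="AAAAAA">\r\n'
	printf '"144","AAAAAA","00.00.000.000","logged in"\r\n'
	printf '</Commit>\r\n'
	printf '<Other ts="20140812153900">\r\n'
}

# index() reports the 1-based column where "client" starts on the first line.
col=$(sample | awk 'NR == 1 { print index($0, "client") }')

# Same logic as the script above; tr removes the trailing carriage return.
joined=$(sample | awk '
cnt {	out = out $0
	if(--cnt == 0) {
		gsub("\r", "", out)
		print out "\r"
	}
	next
}
substr($0, 29, 6) == "client" {
	cnt = 2
	out = $0
}' | tr -d '\r')

echo "column: $col"
echo "$joined"
```

If the column printed for your real log is not 29, the second argument to substr() would need to change to match.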

# 6  
Old 08-14-2014
Hello Don,

As I read the user's request, crlf is wanted only at the end of each output line.

Code:
awk '/^<Commit ts/ {gsub(/\(crlf\)/,X,$0);a=$0;getline;gsub(/\(crlf\)/,X,$0);gsub(/crlf/,X,$0);a=a $0;getline;a=a $0; print a}'  filename

Output will be as follows.

Code:
<Commit ts="20140812153817" client="AAAAAA">"144","AAAAAA","00.00.000.000","User 'AAAAAA' successfully logged in from address '00.00.000.000'."</Commit>crlf
<Commit ts="20140812160347" client="BBBBBB">"144","BBBBBB","00.0.000.00","User 'BBBBBB' successfully logged in from address '00.0.000.00'."</Commit>(crlf)
<Commit ts="20140812170838" client="CCCCCC">"144","CCCCCC","00.0.000.00","User 'CCCCCC' successfully logged in from address '00.0.000.00'."</Commit>(crlf)


Thanks,
R. Singh
# 7  
Old 08-14-2014
Hi Ravinder,
The (crlf) markers (written in two places as crlf, without the parentheses) are FredAtArrow's notation for showing that the input and output lines have Windows-style <carriage-return><line-feed> line terminators instead of the Linux/UNIX-style <newline> terminator alone (<line-feed> is a synonym for <newline> on Linux/UNIX systems). They are not literal text in the file. The:
Code:
gsub(/[\r\n]/, "")

that processed every input line in Corona688's proposal is overkill: it removes both <carriage-return>s and <newline>s, but awk already strips the <newline> character before it hands us a line, as long as we keep the default value of the awk RS variable.
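This is easy to see with a two-character line. A minimal sketch, assuming the default RS and an awk that leaves carriage returns alone (as the awks on UNIX and Linux systems do):

```shell
# One input line, "ab", with a CRLF terminator.
# awk removes the <newline> (the default RS) but keeps the <carriage-return>,
# so length($0) counts 3 characters rather than 2.
len=$(printf 'ab\r\n' | awk '{ print length($0) }')
echo "$len"
```

So only the <carriage-return> has to be removed by the script itself.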

The code I suggested strips out the carriage returns (the awk escape sequence \r in a string) from the accumulated set of three input lines just before adding back the desired <carriage-return> at the end of the line when printing the results:
Code:
		gsub("\r", "", out)
		print out "\r"

and, as we both know, as long as we are using the default awk ORS variable, the print command adds a <newline> when it prints a string.
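An alternative, sketched here and not something proposed earlier in this thread, is to set ORS to "\r\n" so that print supplies the whole CRLF terminator itself. The three one-character input lines are made up for illustration, and the sketch assumes every record is exactly three lines long.

```shell
# Join every three lines into one and let ORS add the CRLF terminator,
# instead of appending "\r" by hand in the print statement.
bytes=$(printf 'a\r\nb\r\nc\r\n' |
	awk 'BEGIN { ORS = "\r\n" }
	{ sub(/\r$/, ""); rec = rec $0 }	# drop the trailing CR, accumulate
	NR % 3 == 0 { print rec; rec = "" }' |
	wc -c | tr -d ' ')

# The output is "abc" followed by <carriage-return><newline>.
echo "bytes written: $bytes"
```

Piping the awk output through od -c instead of wc -c shows the individual characters if you want to see the terminator directly.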
 
