Format the text using sed or awk


 
# 1  
Old 08-23-2018

I was able to figure out how to format this text, joining each group of five lines into one record.

Code:
Raw Data:
$ cat test
Thu Aug 23 15:43:28 UTC 2018,
hostname01,
232.02,
3,
0.00
Thu Aug 23 15:43:35 UTC 2018,
hostname02,
231.09,
4,
0.31
Thu Aug 23 15:43:37 UTC 2018,
hostname03,
241.67,
4,
0.43




My output:
$ sed 'N;N;N;N; s/\n/ /g' test
Thu Aug 23 15:43:28 UTC 2018, hostname01, 232.02, 3, 0.00
Thu Aug 23 15:43:35 UTC 2018, hostname02, 231.09, 4, 0.31
Thu Aug 23 15:43:37 UTC 2018, hostname03, 241.67, 4, 0.43


This command works for me: sed 'N;N;N;N; s/\n/ /g'
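For well-formed input, where every record is exactly five lines, the same join can probably also be done with paste instead of sed. This is only a sketch; the temporary file and the variable names are made up for illustration.

```shell
# Hypothetical temp file holding the well-formed sample data from above.
input=$(mktemp)
cat > "$input" <<'EOF'
Thu Aug 23 15:43:28 UTC 2018,
hostname01,
232.02,
3,
0.00
Thu Aug 23 15:43:35 UTC 2018,
hostname02,
231.09,
4,
0.31
Thu Aug 23 15:43:37 UTC 2018,
hostname03,
241.67,
4,
0.43
EOF

# Each "-" reads one line from stdin, so five of them join five lines;
# -d' ' puts a single space between the joined lines.
joined=$(paste -d' ' - - - - - < "$input")
printf '%s\n' "$joined"
rm -f "$input"
```

Like N;N;N;N, this counts lines blindly, so it offers no protection when a record is short.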

But what if the data is not perfect?

Code:
$ cat test
Thu Aug 23 15:43:28 UTC 2018,
hostname01,
232.02,
3,
0.00
Thu Aug 23 15:43:35 UTC 2018,
hostname02,
231.09,
0.31
Thu Aug 23 15:43:37 UTC 2018,
hostname03,
241.67,
4,
0.43
$

# A line is missing between lines 8 and 9 (the fourth field of the second record).


$ sed 'N;N;N;N; s/\n/ /g' test
Thu Aug 23 15:43:28 UTC 2018, hostname01, 232.02, 3, 0.00
Thu Aug 23 15:43:35 UTC 2018, hostname02, 231.09, 0.31 Thu Aug 23 15:43:37 UTC 2018,
hostname03,
241.67,
4,
0.43

$

# This ruins the output. I would like the result to stay intact no matter what: where a line is missing, it should just insert an empty field (an extra comma).

Hoping for something like this

Thu Aug 23 15:43:28 UTC 2018, hostname01, 232.02, 3, 0.00
Thu Aug 23 15:43:35 UTC 2018, hostname02, 231.09, 4, 0.31
Thu Aug 23 15:43:37 UTC 2018, hostname03, 241.67, , 0.43

# 2  
Old 08-23-2018
So, if there's no guarantee it's always five lines of data per record, how then would you tell one record from the other? Is there always a time stamp in line 1? A hostname in line 2? How do we tell it's line 4 that's missing (to put the comma right)?
# 3  
Old 08-23-2018
Quote:
Originally Posted by RudiC
So, if there's no guarantee it's always five lines of data per record, how then would you tell one record from the other? Is there always a time stamp in line 1? A hostname in line 2? How do we tell it's line 4 that's missing (to put the comma right)?
It could be that a line starting with a capital letter indicates the start of a 'record'.
But the OP would need to say if it's a safe assumption and/or if there's a better indication of a start of a record...
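If that assumption does hold, a minimal awk sketch could look like the following. It relies on the unconfirmed guess that only the timestamp line starts with a capital letter; the temporary file and variable names are made up for illustration.

```shell
# Hypothetical temp file with the OP's sample, one line missing in record 2.
input=$(mktemp)
cat > "$input" <<'EOF'
Thu Aug 23 15:43:28 UTC 2018,
hostname01,
232.02,
3,
0.00
Thu Aug 23 15:43:35 UTC 2018,
hostname02,
231.09,
0.31
Thu Aug 23 15:43:37 UTC 2018,
hostname03,
241.67,
4,
0.43
EOF

# LC_ALL=C keeps [A-Z] strictly to uppercase ASCII letters.
joined=$(LC_ALL=C awk '
  /^[A-Z]/ && NR > 1 { print rec; rec = "" }   # new record starts: flush the old one
  { rec = (rec == "" ? $0 : rec " " $0) }      # append the current line to the record
  END { if (rec != "") print rec }             # flush the last record
' "$input")
printf '%s\n' "$joined"
rm -f "$input"
```

This keeps each record on its own line even when it is short, but it does not say which field is absent.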

Furthermore, once we determine the boundaries of a 'block', how do we determine which field is missing?

Last edited by vgersh99; 08-23-2018 at 03:23 PM..
# 4  
Old 08-23-2018
Looks like a trailing comma indicates to append the next line.
And the record ends when there is no trailing comma.
That means, append the next line if the current line ends with a comma. Try four times.
Code:
sed '/,$/N; /,$/N; /,$/N; /,$/N; s/\n/ /g' test
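The same trailing-comma rule can be sketched in awk, which avoids the fixed limit of four attempts. This is an untested-in-production sketch; the temporary file and variable names are hypothetical.

```shell
# Hypothetical temp file with the OP's sample, one line missing in record 2.
input=$(mktemp)
cat > "$input" <<'EOF'
Thu Aug 23 15:43:28 UTC 2018,
hostname01,
232.02,
3,
0.00
Thu Aug 23 15:43:35 UTC 2018,
hostname02,
231.09,
0.31
Thu Aug 23 15:43:37 UTC 2018,
hostname03,
241.67,
4,
0.43
EOF

# A line ending in a comma is followed by a space (the record continues);
# any other line is followed by a newline (the record ends).
joined=$(awk '{ printf "%s%s", $0, ($0 ~ /,$/ ? " " : "\n") }' "$input")
printf '%s\n' "$joined"
rm -f "$input"
```

Because the decision is made per line, records of any length are handled, not only those of up to five lines.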

# 5  
Old 08-23-2018
Quote:
Originally Posted by MadeInGermany
Looks like a trailing comma indicates to append the next line.
And the record ends when there is no trailing comma.
That means, append the next line if the current line ends with a comma. Try four times.
Code:
sed '/,$/N; /,$/N; /,$/N; /,$/N; s/\n/ /g' test

I get the following on a sample file with missing field:
Code:
Thu Aug 23 15:43:28 UTC 2018, hostname01, 232.02, 3, 0.00
Thu Aug 23 15:43:35 UTC 2018, hostname02, 231.09, 0.31
Thu Aug 23 15:43:37 UTC 2018, hostname03, 241.67, 4, 0.43

I thought the desired output (shown incorrectly by the OP) would be:
Code:
Thu Aug 23 15:43:28 UTC 2018, hostname01, 232.02, 3, 0.00
Thu Aug 23 15:43:35 UTC 2018, hostname02, 231.09, , 0.31
Thu Aug 23 15:43:37 UTC 2018, hostname03, 241.67, 4, 0.43

# 6  
Old 08-23-2018
I have provided a plausible answer to "How to determine the boundaries of a 'block'?". And it no longer "ruins" the following blocks.
The question "How do we determine which field is missing?" is still unanswered.
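Purely as an illustration, and only under the unconfirmed assumption that a four-field record is always missing its fourth field, the joined output could be padded like this. The temporary file and variable names are hypothetical.

```shell
# Hypothetical temp file with the OP's sample, one line missing in record 2.
input=$(mktemp)
cat > "$input" <<'EOF'
Thu Aug 23 15:43:28 UTC 2018,
hostname01,
232.02,
3,
0.00
Thu Aug 23 15:43:35 UTC 2018,
hostname02,
231.09,
0.31
Thu Aug 23 15:43:37 UTC 2018,
hostname03,
241.67,
4,
0.43
EOF

# Step 1: join lines on the trailing-comma rule (the sed command from post 4).
# Step 2: ASSUMPTION: a record with only four fields lacks its fourth one,
#         so an empty field is inserted in front of the last value.
padded=$(sed '/,$/N; /,$/N; /,$/N; /,$/N; s/\n/ /g' "$input" |
  awk -F', ' 'NF == 4 { $0 = $1 FS $2 FS $3 FS FS $4 } { print }')
printf '%s\n' "$padded"
rm -f "$input"
```

If a different field can go missing, this guess puts the empty field in the wrong place, so the OP would still need to confirm which fields are optional.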