Transpose Messy Data Post: 302943056

Forum: UNIX for Advanced & Expert Users. Post 302943056 by Don Cragun on Tuesday 5th of May 2015 03:33:28 AM

If I understand your problem correctly, a single awk script is all you need for this. Try:

Code:

awk '
BEGIN {	FS = OFS = "|"
}
{	while(NF < 5) {
		if(NF <= 1) {
			# Read a continuation line for field 5 or 1st line
			# of next record.
			if(getline != 1) {
				# Break out on EOF
				break
			}
		} else {	# Read continuation line for field 4.
			if((getline x) != 1) {
				# We should not hit EOF in the middle of a
				# continued line, but check for it anyway.
				break
			}
			$0 = $0 " " x	# Replace incorrect <newline> with a
					# space.
			$1 = $1		# Reset NF after combining lines.
		}
	}
	# Discard <carriage-return>s.
	gsub(/\r/, "")
	n = split($4, sf, ";")
	for(i = 1; i <= n; i++)
		print $1, sf[i]
}' patienttest2.txt

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk.
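
If the same command has to run on both Solaris and other systems, one way to avoid editing it by hand is to pick the awk at run time. This is only a sketch: `pick_awk` is a name I made up for illustration, and it assumes that an executable /usr/xpg4/bin/awk is the one you want whenever it is present.

```shell
# Hypothetical helper (not part of the script above): print the awk to use.
# Assumption: if /usr/xpg4/bin/awk exists and is executable, prefer it;
# otherwise fall back to whatever "awk" is found on PATH.
pick_awk() {
	if [ -x /usr/xpg4/bin/awk ]; then
		echo /usr/xpg4/bin/awk
	else
		echo awk
	fi
}

AWK=$(pick_awk)
# Then run the script with "$AWK" in place of awk, for example:
#	"$AWK" '...script text...' patienttest2.txt
```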

If patienttest2.txt contains:

Code:

ID1|VAR2|VAR3|VAR4|VAR5
ID2|VAR2|VAR3|PART1;PART2|1;2
ID3|VAR2|VAR3|A, B, C;PART2;BEFORE LF
AFTER LF|1;2;3
ID4|VAR2|VAR3|1;2;3,;4|1;2;3;4
ID5|VAR2|VAR3|1,
2;3
4;5
6|f5
f6
f7
ID6|VAR2|VAR3|A,b;C,d|a
con

(with either <carriage-return><newline> or <newline> line terminators), the script above produces the output:

Code:

ID1|VAR4
ID2|PART1
ID2|PART2
ID3|A, B, C
ID3|PART2
ID3|BEFORE LF AFTER LF
ID4|1
ID4|2
ID4|3,
ID4|4
ID5|1, 2
ID5|3 4
ID5|5 6
ID6|A,b
ID6|C,d
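
If the line-joining part is the unclear bit, here is a cut-down sketch of just that idea. It is not the full script: `join_broken_lines` is a made-up name, it assumes every short line is continued on the next line, and it deliberately leaves out the NF <= 1 branch, so it would not cope with the stray field 5 continuation lines (f6, f7, con) in the sample above. It only shows that getline x reads the next line into x without touching $0 or NF, and that assigning to $0 re-splits the record.

```shell
# Hypothetical, simplified helper: join continuation lines until the
# record has 5 "|"-separated fields, then print the repaired record.
join_broken_lines() {
	awk '
	BEGIN {	FS = OFS = "|"
	}
	{	# getline x sets x (and NR) but leaves $0 and NF alone.
		while(NF < 5 && (getline x) == 1)
			$0 = $0 " " x	# Assigning $0 re-splits the fields.
		# Discard <carriage-return>s.
		gsub(/\r/, "")
		print
	}'
}

# Usage: a record whose 4th field was split by a line break.
printf 'ID3|VAR2|VAR3|A;BEFORE LF\nAFTER LF|1;2\n' | join_broken_lines
```

The full script does the same join and then splits $4 on ";" instead of printing the whole record.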

Does this match what you're trying to do?
