Transpose Messy Data


 
# 8  
Old 05-05-2015
Yes!

This looks great! Thanks so much!

This code was producing a record for each ID, plus an extra line with a blank field 4 after the records with data, so I changed

Code:
for(i = 1; i <= n; i++)

to

Code:
for(i = 1; i < n; i++)

and that seemed to give the desired results, which seemed worthwhile given the millions of records.

One additional aspect is that the federal government delivers this data monthly as four files:

thisyear.txt
thismonth.txt
changes.txt
previousyears.txt

I know I could make a copy of this code for each file and change the input datafile, but I was wondering how to loop over the four input files in one program.

Thanks so much for all your help!

---------- Post updated at 12:45 PM ---------- Previous update was at 10:08 AM ----------

I got rid of the leading

Code:
awk '

and the closing single quote from the last line and ran the script via

Code:
awk -f myscript.awk thismonth.txt

and that seemed to work.

Thanks!
# 9  
Old 05-05-2015
Quote:
Originally Posted by 91674io
Yes!

This looks great! Thanks so much!

This code was producing a record for each ID, plus an extra line with a blank field 4 after the records with data, so I changed

Code:
for(i = 1; i <= n; i++)

to

Code:
for(i = 1; i < n; i++)

and that seemed to give the desired results, which seemed worthwhile given the millions of records.
If field 4 contains semicolon-terminated subfields rather than semicolon-separated subfields, that is a good change. If field 4 contains semicolon-separated subfields, as shown in your sample input, this change will discard the last subfield in field 4 on every line and will completely skip lines whose field 4 has only one subfield (ended by the field separator | rather than a semicolon). If some lines have an empty subfield after the last semicolon, you could check for an empty subfield before printing an output line.
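The empty-subfield check mentioned above could look something like the following. This is only a sketch: the function name split_field4 is made up, and it assumes that | separates the fields and that each output line should carry fields 1 through 3 followed by one subfield, which may not match your real script.

```shell
# Hypothetical helper: print one line per non-empty subfield of field 4.
# Assumptions: "|" separates fields, ";" separates or terminates subfields,
# and the output keeps fields 1-3 (the real output layout may differ).
split_field4() {
    awk '
    BEGIN { FS = OFS = "|" }
    {
        n = split($4, part, ";")
        for (i = 1; i <= n; i++)
            if (part[i] != "")      # skip the empty piece after a trailing ";"
                print $1, $2, $3, part[i]
    }' "$@"
}

# Sample input with one terminated line ("x;y;") and one separated line ("z").
printf '%s\n' 'id1|a|b|x;y;' 'id2|c|d|z' | split_field4
```

With the loop left as i <= n and the test on part[i], the same script handles both terminated and separated input, so nothing is lost if a file ever mixes the two styles.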

Quote:
One additional aspect is that the federal government delivers this data monthly as four files:

thisyear.txt
thismonth.txt
changes.txt
previousyears.txt

I know I could make a copy of this code for each file and change the input datafile, but I was wondering how to loop over the four input files in one program.

Thanks so much for all your help!

---------- Post updated at 12:45 PM ---------- Previous update was at 10:08 AM ----------

I got rid of the leading

Code:
awk '

and the closing single quote from the last line and ran the script via

Code:
awk -f myscript.awk thismonth.txt

and that seemed to work.

Thanks!
If you want to produce one output file from the concatenated four input files, just change:
Code:
awk -f myscript.awk thismonth.txt

to:
Code:
awk -f myscript.awk thisyear.txt thismonth.txt changes.txt previousyears.txt

But if changes.txt contains both additions and deletions, you'll need to modify the script to ignore deletions and print only additions (and likewise if the combined input repeats lines that have already been processed and need to be removed).

And, if you want the script to switch output files when it starts processing a new input file, you'll also need to make some minor changes to the script for that.
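As a rough illustration of switching output files, awk resets FNR to 1 and updates FILENAME each time it starts a new input file, so a rule on FNR == 1 can pick a new output name. The function name per_file_output and the ".out" suffix are invented for this sketch, and the plain copy in the last rule stands in for whatever your script really prints.

```shell
# Hypothetical wrapper: send the results for each input file to "<input name>.out".
# The ".out" naming and the per-line action (a plain copy) are placeholders.
per_file_output() {
    awk '
    FNR == 1 {                      # first line of each input file
        if (out != "") close(out)   # finish the previous output file
        out = FILENAME ".out"
    }
    { print > out }                 # replace with the real per-line processing
    ' "$@"
}
```

It would be called as per_file_output thisyear.txt thismonth.txt changes.txt previousyears.txt, which would leave thisyear.txt.out, thismonth.txt.out, and so on in the current directory. Closing the previous output file before opening the next keeps the number of open files small.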

If you need help with additional changes like this, give detailed descriptions of how field 4 is formatted, how output file names are related to input file names, etc. for the changes that you want to make.
This User Gave Thanks to Don Cragun For This Post:
# 10  
Old 05-06-2015
Thanks so much for the helpful reply!

Yes, field 4 is terminated by
Code:
;

and not merely separated.