Large file - columns into rows etc


 
# 1  
Old 06-08-2010
Large file - columns into rows etc

I have done a couple of searches on this and found many threads, but none that is useful to me. That is probably because I have only a very basic grasp of perl and beginner-level shell, so adapting a script that has already been posted may be beyond my capabilities.

Anyway - I have a huge file (247 columns, over 500,000 lines). What I ultimately want to do is transpose the entire file, so that the columns become rows and the rows become columns. Is there an easy way to do this in perl and/or shell? If so, how?

Cheers.
# 2  
Old 06-09-2010
Convert Columns to Rows

Code:
perl -F, -lane 'for my $i ( 0 .. $#F ) { $rows[$i] = defined $rows[$i] ? "$rows[$i],$F[$i]" : $F[$i] } END { print for @rows }' data

If the file "data" contains CSV data, like this:

Code:
1,2,3,4,5,6,7
a,b,c,d,e,f,g

The code above will output this:

Code:
1,a
2,b
3,c
4,d
5,e
6,f
7,g

Whatever separator your data uses can be substituted for the comma, both after the "-F" switch and inside the quoted string that joins the fields.

This should work on large files, as long as the transposed result fits in memory.

Hope That Helps

P.S. You are talking about 123,500,000 cells in a 247 by 500,000 matrix, so memory could become a problem for the @rows array, particularly if you are on 32-bit. We build up the whole result in @rows and wait until the end to print it out. I can work on a streaming solution if you get the old "Out of Memory!" error.
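As a rough idea of what a low-memory approach could look like, here is a minimal sketch that makes one pass over the file per column instead of holding everything in memory. The function name transpose_stream is made up for illustration, and the sketch assumes a single-character delimiter and the same number of fields on every line. It reads the file once per column (247 times for the file in this thread), so it trades speed for memory.

```shell
# transpose_stream FILE [DELIM]
# Hypothetical helper: prints FILE transposed, one input column per output line.
# Assumes DELIM is a single character and every line has the same field count.
transpose_stream() {
    file=$1
    delim=${2:-,}

    # Count the fields on the first line only.
    ncols=$(head -n 1 "$file" | awk -F"$delim" '{ print NF }')

    i=1
    while [ "$i" -le "$ncols" ]; do
        # cut extracts column i; paste -s joins its lines into one row.
        cut -d"$delim" -f"$i" "$file" | paste -s -d"$delim" -
        i=$((i + 1))
    done
}
```

It would be called as `transpose_stream data > data.transposed` for comma-separated input, with the delimiter passed as a second argument for anything else.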

Last edited by deindorfer; 06-09-2010 at 03:51 AM.. Reason: Caveat about Memory
# 3  
Old 06-09-2010
Yes - I am getting the out of memory error... but I don't know why. I can easily open the original file on a 32-bit OS, but when I transpose, all hell breaks loose. I tried to grep out a single line from the new data file, but I got this:

Code:
grep: line too long

What does that mean? That my data file is one single line?

Cheers
# 4  
Old 06-09-2010
It could. There are many ways to find out if that is the case. Here is one way:

Code:
shell_prompt-> head -n 1 datafile

If that kicks back the whole file, there are no line endings.
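Another way to check, sketched here with a made-up helper name (line_stats), is to count the lines and measure the longest one. Bear in mind that a correct transpose of a 247-column, 500,000-line file has 247 lines of 500,000 fields each, so very long lines are expected even when the line endings are intact. Some awk implementations may themselves struggle with lines that long.

```shell
# line_stats FILE
# Hypothetical helper: reports how many lines FILE has and the length
# (in characters) of its longest line.
line_stats() {
    awk '{ if (length($0) > max) max = length($0) }
         END { printf "lines=%d longest=%d\n", NR, max }' "$1"
}
```

Running `line_stats datafile` and seeing `lines=1` would point to missing line endings, while a few hundred very long lines would suggest the transpose worked and grep simply cannot handle lines of that length.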
# 5  
Old 06-09-2010
Mmmmm... this is where I get stupid. Typing that in doesn't give me anything...
# 6  
Old 06-09-2010
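Use the Perl module Array::Transpose (available from CPAN).

If the file is whitespace-separated:
Code:
perl -MArray::Transpose -wlane '

push @in, [ @F ];    # store each input row as an array reference

END {
    @out = transpose(\@in);    # one array reference per output row
    print "@$_" for @out;      # join each row with spaces and print it
}
' infile.txt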

# 7  
Old 06-09-2010
Thanks - I'll give this a go. Though, my file is actually tab-delimited - will this matter?
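On the tab question: perl's -a switch splits on runs of whitespace unless -F is given, the same way awk splits by default, so tabs are handled but empty fields between two tabs are silently dropped and fields containing spaces are broken up. Passing -F'\t' (to either perl or awk) splits on each tab instead. The sketch below uses awk and two made-up helper names purely to show the difference in field counts; it is an illustration, not part of anyone's posted solution.

```shell
# Hypothetical helpers: print the number of fields found on each input line.
count_fields_whitespace() { awk '{ print NF }'; }        # default whitespace splitting
count_fields_tab()        { awk -F'\t' '{ print NF }'; } # split on every single tab

# Sample row with three tab-separated fields, the middle one empty.
printf 'a\t\tc\n' | count_fields_whitespace   # prints 2: the empty field is lost
printf 'a\t\tc\n' | count_fields_tab          # prints 3: the empty field is kept
```

If the tab-delimited file has no empty fields and no spaces inside fields, the default splitting gives the same result as splitting on tabs.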