Sponsored Content
Top Forums Shell Programming and Scripting Large file - columns into rows etc Post 302428915 by deindorfer on Friday 11th of June 2010 08:31:27 AM
Old 06-11-2010
The algorithm in Array::Transpose is better than the one I was working together, I would suggest using that. The out of memory error is still there because there is still a single array holding all the data at once.

One way around this problem is to break the run into smaller chunks of data

Code:
bash> head -100000 | tail -100000 | perl -e '....' > out.1
bash> head -200000 | tail -100000 | perl -e '....' > out.2
bash> head -300000 | tail -100000 | perl -e '....' > out.3
bash> head -400000 | tail -100000 | perl -e '....' > out.4
bash> head -500000 | tail -100000 | perl -e '....' > out.5

Then /bin/cat the out files together:

Code:
bash> cat out.1 out.2 out.3 out.4 out.5 > concattted.out

This will break the data in five more managable chunks. Let us know if you are still getting the "Out of Memory!" Error. you can always break the datafile into 50_000 line segments, and so on...

HTH,

Scott
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to changes rows to columns in a file

Hi, I have a small requirement in chainging the rows to columns. The below example.txt contains info as shown Name:Person1 Age:30 Name:Person2 Age:40 Name:Person3 Age:50 I want to make it displayed as hown below Name:Person1 Age:30 Name:person2 Age:40 Name:Person3 Age:50 I... (4 Replies)
Discussion started by: oracle123
4 Replies

2. Shell Programming and Scripting

How to delete rows by RowNumber from a Large text file

Friends, I have text file with 700,000 rows. Once I load this file to our database via our cutom process, it logs the row number for rejected rows. How do I delete rows from a Large text file based on the Row Number? Thanks, Prashant (8 Replies)
Discussion started by: ppat7046
8 Replies

3. Shell Programming and Scripting

Rows to Columns - File Transpose

Hi I have an input file and I want to transpose it but I need to take care that if any field is missing for a record it should be popoulated with space for that field - using a shell script INFILE ---------- emp=1 sal=2 loc=abc emp=2 sal=21 sal=22 loc=xyz emp=5 loc=abc OUTFILE... (10 Replies)
Discussion started by: 46019
10 Replies

4. Shell Programming and Scripting

Deleting specific rows in large files having rows greater than 100000

Hi Guys, I need help in modifying a large text file containing more than 1-2 lakh rows of data using unix commands. I am quite new to the unix language the text file contains data in a pipe delimited format sdfsdfs sdfsdfsd START_ROW sdfsd|sdfsdfsd|sdfsdfasdf|sdfsadf|sdfasdf... (9 Replies)
Discussion started by: manish2009
9 Replies

5. UNIX for Dummies Questions & Answers

Delete large number of columns rom file

Hi, I have a data file that contains 61 columns. I want to delete all the columns except columns, 3,6 and 8. The columns are tab de-limited. How would I achieve this on the terminal? Thanks (2 Replies)
Discussion started by: lost.identity
2 Replies

6. Shell Programming and Scripting

Convert columns to rows in a file

Hello, I have a huge tab delimited file with around 40,000 columns and 900 rows I want to convert columns to a row. INPUT file look like this. the first line is a headed of a file. ID marker1 marker2 marker3 marker4 b1 A G A C ... (5 Replies)
Discussion started by: ryan9011
5 Replies

7. Shell Programming and Scripting

Dedup a large file(30M rows)

Hi, I have a large file with number of records in there. I need some help to find only first row based on a key and ignore other rows with the same key. I tried few things but file is huge(30 million rows). So need some solution that is very efficient. e.g Junk|Apple|7|Random|data|here...... (2 Replies)
Discussion started by: ran123
2 Replies

8. Shell Programming and Scripting

Deleting all the fields(columns) from a .csv file if all rows in that columns are blanks

Hi Friends, I have come across some files where some of the columns don not have data. Key, Data1,Data2,Data3,Data4,Data5 A,5,6,,10,, A,3,4,,3,, B,1,,4,5,, B,2,,3,4,, If we see the above data on Data5 column do not have any row got filled. So remove only that column(Here Data5) and... (4 Replies)
Discussion started by: ks_reddy
4 Replies

9. UNIX for Dummies Questions & Answers

Help with solution to add together columns of large file

Hi everyone. I have a file with ~500 columns and I would like to perform a simple calculation on every two columns. The file looks like this: $cat input id A B C D E F.....X 1 2 4 2 3 4 1 n 2 4 6 4 6 4 5 n 3 4 7 5 2 2 3 n 4 ... (5 Replies)
Discussion started by: torchij
5 Replies

10. UNIX for Dummies Questions & Answers

Extract spread columns from large file

Dear all, I want to extract around 300 columns from a very large file with almost 2million columns. There are no headers, but I can find out which column numbers I want. I know I can extract with the function 'cut -f2' for example just the second column but how do I do this for such a large... (1 Reply)
Discussion started by: fndijk
1 Replies
LIBBASH(7)							  libbash Manual							LIBBASH(7)

NAME
libbash -- A bash shared libraries package. DESCRIPTION
libbash is a package that enables bash dynamic-like shared libraries. Actually its a tool for managing bash scripts whose functions you may want to load and use in scripts of your own. It contains a 'dynamic loader' for the shared libraries ( ldbash(1)), a configuration tool (ldbashconfig(8)), and some libraries. Using ldbash(1) you are able to load loadable bash libraries, such as getopts(1) and hashstash(1). A bash shared library that can be loaded using ldbash(1) must answer 4 requirments: 1. It must be installed in $LIBBASH_PREFIX/lib/bash (default is /usr/lib/bash). 2. It must contain a line that begins with '#EXPORT='. That line will contain (after the '=') a list of functions that the library exports. I.e. all the function that will be usable after loading that library will be listed in that line. 3. It must contain a line that begins with '#REQUIRE='. That line will contain (after the '=') a list of bash libraries that are required for our library. I.e. every bash library that is in use in our bash library must be listed there. 4. The library must be listed (For more information, see ldbashconfig(8)). Basic guidelines for writing library of your own: 1. Be aware, that your library will be actually sourced. So, basically, it should contain (i.e define) only functions. 2. Try to declare all variables intended for internal use as local. 3. Global variables and functions that are intended for internal use (i.e are not defined in '#EXPORT=') should begin with: __<library_name>_ For example, internal function myfoosort of hashstash library should be named as __hashstash_myfoosort This helps to avoid conflicts in global name space when using libraries that come from different vendors. 4. See html manual for full version of this guide. AUTHORS
Hai Zaar <haizaar@haizaar.com> Gil Ran <ril@ran4.net> SEE ALSO
ldbash(1), ldbashconfig(8), getopts(1), hashstash(1) colors(1) messages(1) urlcoding(1) locks(1) Linux Epoch Linux
All times are GMT -4. The time now is 12:56 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy