Extracting a portion of data from a very large tab delimited text file
Post 302412060 by Franklin52 on Sunday 11th of April 2010 11:36:32 AM
04-11-2010
Quote:
Originally Posted by Lucky Ali
Hi All

I wanted to know how to effectively delete some columns in a large tab delimited file.

I have a file that contains 5 columns and almost 100,000 rows.

Code:
3456 f g t t
3456 g h
456   f  h
4567 f g h z
345   f g
567   h j k l

This is a very large data file and tab delimited.

I need to extract the rows that have values in all the 5 columns. At present, there are several rows that contain only 3 values.

Please let me know the best way to extract the rows with all 5 values.

Thanks.
LA
Code:
awk 'NF==5' file > newfile
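
A note on why the one-liner works, as a sketch under stated assumptions: with awk's default field splitting, a run of tabs or blanks counts as a single separator, so empty cells are not counted and NF is the number of non-empty values on the line. This relies on the values themselves containing no spaces. If they might, one alternative is to split strictly on tabs and count the non-empty fields. The sample rows and the temporary file below are made up for illustration.

```shell
# Hypothetical sample data: 5 tab-separated columns, some cells empty.
tmpdir=$(mktemp -d)
printf '3456\tf\tg\tt\tt\n3456\tg\th\t\t\n456\tf\t\th\t\n4567\tf\tg\th\tz\n' > "$tmpdir/file"

# Default splitting: runs of tabs/blanks act as one separator,
# so NF counts only the non-empty values.
default_split=$(awk 'NF==5' "$tmpdir/file")

# Strict tab splitting: every tab separates a field, so empty cells
# are fields too; count the non-empty ones explicitly.
tab_split=$(awk -F'\t' '{ n = 0; for (i = 1; i <= NF; i++) if ($i != "") n++ } n == 5' "$tmpdir/file")

printf '%s\n' "$default_split"
rm -r "$tmpdir"
```

Both variants keep the same two complete rows here. awk reads the input one line at a time, so a file of about 100,000 rows should not be a problem for either.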

 

RS(1)                     BSD General Commands Manual                    RS(1)

NAME
     rs -- reshape a data array

SYNOPSIS
     rs [-CcSs[x]] [-GgKkw N] [-EeHhjmnTtyz] [rows [cols]]

DESCRIPTION
     rs reads the standard input, interpreting each line as a row of
     blank-separated entries in an array, transforms the array according to
     the options, and writes it on the standard output. With no arguments it
     transforms stream input into a columnar format convenient for terminal
     viewing.

     The shape of the input array is deduced from the number of lines and the
     number of columns on the first line. If that shape is inconvenient, a
     more useful one might be obtained by skipping some of the input with the
     -k option. Other options control interpretation of the input columns.

     The shape of the output array is influenced by the rows and cols
     specifications, which should be positive integers. If only one of them
     is a positive integer, rs computes a value for the other which will
     accommodate all of the data. When necessary, missing data are supplied
     in a manner specified by the options and surplus data are deleted. There
     are options to control presentation of the output columns, including
     transposition of the rows and columns.

     The options are as follows:

     -C[x]   Output columns are delimited by the single character x. A
             missing x is taken to be '^I'.

     -c[x]   Input columns are delimited by the single character x. A
             missing x is taken to be '^I'.

     -E      Consider each character of input as an array entry.

     -e      Consider each line of input as an array entry.

     -GN     The gutter width has N percent of the maximum column width
             added to it.

     -gN     The gutter width (inter-column space), normally 2, is taken to
             be N.

     -H      Like -h, but also print the length of each line.

     -h      Print the shape of the input array and do nothing else. The
             shape is just the number of lines and the number of entries on
             the first line.

     -j      Right adjust entries within columns.

     -KN     Like -k, but print the ignored lines.

     -kN     Ignore the first N lines of input.

     -m      Do not trim excess delimiters from the ends of the output
             array.

     -n      On lines having fewer entries than the first line, use null
             entries to pad out the line. Normally, missing entries are
             taken from the next line of input.

     -S[x]   Like -C, but padded strings of x are delimiters.

     -s[x]   Like -c, but maximal strings of x are delimiters.

     -T      Print the pure transpose of the input, ignoring any rows or
             cols specification.

     -t      Fill in the rows of the output array using the columns of the
             input array, that is, transpose the input while honoring any
             rows and cols specifications.

     -wN     The width of the display, normally 80, is taken to be the
             positive integer N.

     -y      If there are too few entries to make up the output dimensions,
             pad the output by recycling the input from the beginning.
             Normally, the output is padded with blanks.

     -z      Shrink column widths to fit the largest entries appearing in
             them.

     With no arguments, rs transposes its input, and assumes one array entry
     per input line unless the first non-ignored line is longer than the
     display width. Option letters which take numerical arguments interpret
     a missing number as zero unless otherwise indicated.

EXAMPLES
     rs can be used as a filter to convert the stream output of certain
     programs (e.g., spell, du, file, look, nm, who, and wc(1)) into a
     convenient "window" format, as in

           $ who | rs

     This function has been incorporated into the ls(1) program, though for
     most programs with similar output rs suffices.

     To convert stream input into vector output and back again, use

           $ rs 1 0 | rs 0 1

     A 10 by 10 array of random numbers from 1 to 100 and its transpose can
     be generated with

           $ jot -r 100 | rs 10 10 | tee array | rs -T >tarray

     In the editor vi(1), a file consisting of a multi-line vector with 9
     elements per line can undergo insertions and deletions, and then be
     neatly reshaped into 9 columns with

           :1,$!rs 0 9

     Finally, to sort a database by the first line of each 4-line field, try

           $ rs -eC 0 4 | sort | rs -c 0 1

SEE ALSO
     jot(1), pr(1), sort(1), vi(1)

BUGS
     Handles only two dimensional arrays.

     The algorithm currently reads the whole file into memory, so files that
     do not fit in memory will not be reshaped.

     Fields cannot be defined yet on character positions.

     Re-ordering of columns is not yet possible.

     There are too many options.

BSD                             April 14, 2012                             BSD
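
On systems where rs is not installed, the pure transpose that the man page describes for -T can be approximated with awk. This is a swapped-in technique and not how rs is implemented: the sketch below separates output entries with a single space instead of aligning them in columns, and the 2 by 3 input array is a made-up example. Like rs, it holds the whole input in memory.

```shell
# Hypothetical 2x3 input array, blank-separated as rs expects.
input='1 2 3
4 5 6'

# Store every entry by (row, column), then print each input column
# as one output row.
transposed=$(printf '%s\n' "$input" | awk '
{
    if (NF > ncols) ncols = NF
    for (i = 1; i <= NF; i++) cell[NR, i] = $i
}
END {
    for (i = 1; i <= ncols; i++) {
        line = cell[1, i]
        for (j = 2; j <= NR; j++) line = line " " cell[j, i]
        print line
    }
}')

printf '%s\n' "$transposed"
```

Rows shorter than the widest row produce empty entries in the output, which is only loosely comparable to the padding rs does with -n.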