UNIX for Beginners Questions & Answers: Command to extract empty field in a large UNIX file?
Post 303032879 by RudiC, Monday 25th of March 2019, 06:54:11 PM
Would sed be OK? This prints every line that contains three comma-delimited fields followed by a run of 17 consecutive commas, i.e. a stretch of empty fields:
Code:
sed -n '/\([^,]*,\)\{3\},\{17\}/p' file

or, with extended regular expressions (GNU sed):

Code:
sed -rn '/([^,]*,){3},{17}/p' file
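
If the requirement is more generally to flag any record that contains an empty comma-separated field (adjacent commas, or a leading or trailing comma), an awk sketch along these lines should also work; it assumes comma-separated input, like the sed versions above:
Code:
awk -F',' '{ for (i = 1; i <= NF; i++) if ($i == "") { print; next } }' file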

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Extract data from large file 80+ million records

Hello, I have got one file with more than 120+ million records (35 GB in size). I have to extract some relevant data from file based on some parameter and generate other output file. What will be the best and fastest way to extract the ne file. sample file format :--... (2 Replies)
Discussion started by: learner16s
2 Replies
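For a file this size, a single awk pass that tests the relevant field and writes the matching records out is usually the fastest portable approach, since it reads the 35 GB file only once. A minimal sketch, assuming a pipe-delimited file and an exact match on the third field; both the delimiter and the field number are assumptions, as the sample format above is truncated:
Code:
awk -F'|' '$3 == "TARGET_VALUE"' bigfile.dat > extracted.dat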

2. Shell Programming and Scripting

split large file based on field criteria

I have a file containing date/time sorted data of the form ... 2009/06/10,20:59:59.950,XAG/USD,Q,1,1115, 14.3025,100,1,1 2009/06/10,20:59:59.950,XAG/USD,Q,1,1116, 14.3026,125,1,1 2009/06/10,20:59:59.950,XAG/USD,R,0,0, , 0,0,0 2009/06/10,20:59:59.950,XAG/USD,R,1,0, 14.1910,100,1,1... (6 Replies)
Discussion started by: asriva
6 Replies
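A common way to split on a field value is to let awk open one output file per distinct value of the chosen field. A minimal sketch, assuming the split criterion is the fourth comma-separated field (the Q/R flag in the sample above); the output names are made up for illustration. With only a handful of distinct values this is safe, but with many values each file should be close()d to avoid running out of file descriptors:
Code:
awk -F',' '{ print > ("split_" $4 ".csv") }' quotes.csv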

3. Shell Programming and Scripting

extract unique pattern from large text file

Hi All, I am trying to extract data from a large text file. I want to extract lines which contain a five digit number followed by a hyphen, like 12345-. I tried with egrep, e.g.: egrep "+" text.txt but that returns all the lines which contain any number of digits followed by a hyphen,... (19 Replies)
Discussion started by: shijujoe
19 Replies
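A minimal grep -E sketch for "exactly five digits followed by a hyphen"; the leading (^|[^0-9]) alternation keeps it from matching the tail end of a longer number (text.txt is the file name used in the post):
Code:
grep -E '(^|[^0-9])[0-9]{5}-' text.txt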

4. Shell Programming and Scripting

awk - if field is empty, move line to new file

I have a script with this statement: /usr/xpg4/bin/awk -F"" 'NR==FNR{s=$2;next}{printf "%s\"%s\"\n", $0, s}' LOOKUP.TXT finallistnew.txt >test.txt I want to include logic or an additional step that says if there is no data in field 3, move the whole line out of test.txt into an additional... (9 Replies)
Discussion started by: scriptr2be
9 Replies
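One way is to post-process test.txt with awk, sending every line whose third field is empty to a separate file and keeping the rest. A minimal sketch with hypothetical output names; it assumes whitespace-separated fields, so adjust -F to the real delimiter:
Code:
awk '$3 == "" { print > "empty_field3.txt"; next }
             { print > "test_filtered.txt" }' test.txt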

5. Shell Programming and Scripting

Format the file by deleting empty field

I have the test data with 10 column separated by comma and each column has more than 1000000 rows. Can anyone help me to find empty field in all columns and delete that empty field alone and lift that column up by one row. Data with empty field: A74203XYZ,A21718XYZ,A72011XYZ,A41095XYZ,... (7 Replies)
Discussion started by: zooby
7 Replies
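"Lifting a column up" amounts to collecting the non-empty cells of each column in order and reprinting them row by row. A rough awk sketch, assuming comma-separated input in a hypothetical data.csv; note that it holds the whole table in memory, which matters with a million-plus rows:
Code:
awk -F',' -v OFS=',' '
{
    for (c = 1; c <= NF; c++)
        if ($c != "") val[c, ++cnt[c]] = $c   # stack non-empty cells per column
    if (NF > maxc) maxc = NF
}
END {
    for (r = 1; ; r++) {
        any = 0; line = ""
        for (c = 1; c <= maxc; c++) {
            if (cnt[c] >= r) any = 1
            line = line (c > 1 ? OFS : "") val[c, r]
        }
        if (!any) break                       # all columns exhausted
        print line
    }
}' data.csv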

6. Shell Programming and Scripting

How to extract a field from ls-l command and display?

So I want to put a line at the end of my script which greps for keywords from syslog.log that outputs the following after it is done: "This file was last modified on (thisdate)" I know I can use the following to get the date: rtidsvb(izivanov):/home/izivanov> ll /var/adm/syslog/syslog.log ... (4 Replies)
Discussion started by: zixzix01
4 Replies
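A minimal sketch that pulls the modification date out of ls -l output; the field positions ($6-$8) depend on platform and locale, so they are an assumption to verify on your system:
Code:
echo "This file was last modified on $(ls -l /var/adm/syslog/syslog.log | awk '{print $6, $7, $8}')"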

7. Shell Programming and Scripting

Splitting large file and renaming based on field

I am trying to update an older program on a small cluster. It uses individual files to send jobs to each node. However the newer database comes as one large file, containing over 10,000 records. I therefore need to split this file. It looks like this: HMMER3/b NAME 1-cysPrx_C ACC ... (2 Replies)
Discussion started by: fozrun
2 Replies
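HMMER3 flat files end each record with // and carry a NAME line, so awk can buffer one record at a time and write it out under its own name. A rough sketch; the input file name is hypothetical, and close() is needed because 10,000+ open output files would exhaust the file descriptor limit:
Code:
awk '
/^HMMER3/ { buf = "" }                  # a new record starts here
          { buf = buf $0 "\n" }         # accumulate the current record
/^NAME/   { name = $2 }                 # remember the record name
/^\/\//   { out = name ".hmm"; printf "%s", buf > out; close(out) }
' database.hmm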

8. Shell Programming and Scripting

How to remove empty field in a text file?

Hi all, I want to remove empty fields in a text file. I tried to use sed, but it failed. Input: LG10_PM_map_19_LEnd 1000560 G AG AG LG10_PM_map_19_LEnd 1005621 G AG LG10_PM_map_19_LEnd 1011214 A AG AG LG10_PM_map_19_LEnd 1011673 T CT CT ... (3 Replies)
Discussion started by: huiyee1
3 Replies
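If the columns are tab-separated and the problem is genuinely empty cells, awk can rebuild each line from its non-empty fields only. A minimal sketch; the tab delimiter is an assumption, since the sample above has lost its original spacing:
Code:
awk -F'\t' -v OFS='\t' '{
    out = ""
    for (i = 1; i <= NF; i++)
        if ($i != "") out = (out == "" ? $i : out OFS $i)
    print out
}' input.txt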

9. UNIX for Dummies Questions & Answers

Extract spread columns from large file

Dear all, I want to extract around 300 columns from a very large file with almost 2million columns. There are no headers, but I can find out which column numbers I want. I know I can extract with the function 'cut -f2' for example just the second column but how do I do this for such a large... (1 Reply)
Discussion started by: fndijk
1 Replies
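cut handles this directly as long as the file is tab-delimited (cut's default separator). For around 300 columns it is easier to keep the wanted column numbers in a small helper file, one number per line, and build the -f list from it; columns.txt and hugefile.txt are hypothetical names:
Code:
# a short, explicit list of columns
cut -f2,17,305 hugefile.txt > subset.txt
# or build the list from a file of column numbers, one per line
cut -f"$(paste -s -d, columns.txt)" hugefile.txt > subset.txt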

10. Shell Programming and Scripting

Need to extract 8 characters from a large file.

Hi All!! I have a large file containing millions of records. My purpose is to extract 8 characters immediately from the given file. 222222222|ZRF|2008.pdf|2008|01/29/2009|001|B|C|C 222222222|ZRF|2009.pdf|2009|01/29/2010|001|B|C|C 222222222|ZRF|2010.pdf|2010|01/29/2011|001|B|C|C... (5 Replies)
Discussion started by: pavand
5 Replies
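If "extract 8 characters" means the first eight characters of every line, cut does it in a single pass over the file; which eight characters are actually wanted is an assumption, since the post does not say:
Code:
cut -c1-8 input.dat
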
TABLIFY(1p)             User Contributed Perl Documentation            TABLIFY(1p)

NAME
    tablify - turn a delimited text file into a text table

SYNOPSIS
    tablify [options] file

    Options:

      -h|--help            Show help
      --no-headers         Assume first line is data, not headers
      --no-pager           Do not use $ENV{'PAGER'} even if defined
      --strip-quotes       Strip " or ' around fields
      -l|--list            List the fields in the file (for use with -f)
      -f|--fields=f1[,f2]  Show only fields in comma-separated list; when
                           used in conjunction with "no-headers" the list
                           should be field numbers (starting at 1);
                           otherwise, should be field names
      -w|--where=f<cmp>v   Apply the "cmp" Perl operator to restrict output
                           where field "f" matches the value "v"; acceptable
                           operators include ==, eq, >, >=, <=, and =~
      -v|--vertical        Show records vertically
      --limit=n            Limit to first "n" records
      --fs=x               Use "x" as the field separator (default is tab "\t")
      --rs=x               Use "x" as the record separator (default is
                           newline "\n")
      --as-html            Create an HTML table instead of plain text

DESCRIPTION
    This script is essentially a quick way to parse a delimited text file
    and view it as a nice ASCII table. By selecting only certain fields,
    employing a where clause to only select records where a field matches
    some value, and using the limit to only see some of the output, you
    almost have a mini-database front-end for a simple text file.

EXAMPLES
    Given a data file like this:

        name,rank,serial_no,is_living,age
        George,General,190293,0,64
        Dwight,General,908348,0,75
        Attila,Hun,,0,56
        Tojo,Emporor,,0,87
        Tommy,General,998110,1,54

    To find the fields you can reference, use the list option:

        $ tablify --fs ',' -l people.dat
        +-----------+-----------+
        | Field No. | Field     |
        +-----------+-----------+
        | 1         | name      |
        | 2         | rank      |
        | 3         | serial_no |
        | 4         | is_living |
        | 5         | age       |
        +-----------+-----------+

    To extract just the name and serial numbers, use the fields option:

        $ tablify --fs ',' -f name,serial_no people.dat
        +--------+-----------+
        | name   | serial_no |
        +--------+-----------+
        | George | 190293    |
        | Dwight | 908348    |
        | Attila |           |
        | Tojo   |           |
        | Tommy  | 998110    |
        +--------+-----------+
        5 records returned

    To extract the first through third fields and the fifth field (where
    field numbers start at "1" -- tip: use the list option to quickly
    determine field numbers), use this syntax for fields:

        $ tablify --fs ',' -f 1-3,5 people.dat
        +--------+---------+-----------+------+
        | name   | rank    | serial_no | age  |
        +--------+---------+-----------+------+
        | George | General | 190293    | 64   |
        | Dwight | General | 908348    | 75   |
        | Attila | Hun     |           | 56   |
        | Tojo   | Emporor |           | 87   |
        | Tommy  | General | 998110    | 54   |
        +--------+---------+-----------+------+
        5 records returned

    To select only the ones with six-digit serial numbers, use a where
    clause:

        $ tablify --fs ',' -w 'serial_no=~/^\d{6}$/' people.dat
        +--------+---------+-----------+-----------+------+
        | name   | rank    | serial_no | is_living | age  |
        +--------+---------+-----------+-----------+------+
        | George | General | 190293    | 0         | 64   |
        | Dwight | General | 908348    | 0         | 75   |
        | Tommy  | General | 998110    | 1         | 54   |
        +--------+---------+-----------+-----------+------+
        3 records returned

    To find Dwight's record, you would do this:

        $ tablify --fs ',' -w 'name eq "Dwight"' people.dat
        +--------+---------+-----------+-----------+------+
        | name   | rank    | serial_no | is_living | age  |
        +--------+---------+-----------+-----------+------+
        | Dwight | General | 908348    | 0         | 75   |
        +--------+---------+-----------+-----------+------+
        1 record returned

    To find the name of all the people with a serial number who are living:

        $ tablify --fs ',' -f name -w 'is_living==1' -w 'serial_no>0' people.dat
        +-------+
        | name  |
        +-------+
        | Tommy |
        +-------+
        1 record returned

    To filter outside of the program and simply format the results, use "-"
    as the last argument to force reading of STDIN (and probably assume no
    headers):

        $ grep General people.dat | tablify --fs ',' -f 1-3 --no-headers -
        +---------+--------+--------+
        | Field1  | Field2 | Field3 |
        +---------+--------+--------+
        | General | 190293 | 0      |
        | General | 908348 | 0      |
        | General | 998110 | 1      |
        +---------+--------+--------+
        3 records returned

    When dealing with data lacking field names, you can specify
    "no-headers" and then refer to fields by number (starting at one),
    e.g.:

        $ tail -5 people.dat | tablify --fs ',' --no-headers -w '3 eq "General"' -
        +--------+---------+--------+--------+--------+
        | Field1 | Field2  | Field3 | Field4 | Field5 |
        +--------+---------+--------+--------+--------+
        | George | General | 190293 | 0      | 64     |
        | Dwight | General | 908348 | 0      | 75     |
        | Tommy  | General | 998110 | 1      | 54     |
        +--------+---------+--------+--------+--------+
        3 records returned

    If your file has many fields which are hard to see across the screen,
    consider using the vertical display with "-v" or "--vertical", e.g.:

        $ tablify --fs ',' -v --limit 1 people.dat
        ************ Record 1 ************
        name      : George
        rank      : General
        serial_no : 190293
        is_living : 0
        age       : 64
        1 record returned

SEE ALSO
    o   Text::RecordParser
    o   Text::TabularDisplay
    o   DBD::CSV

    Although I don't use this module, the idea was much the inspiration for
    this. I just didn't want to have to install DBI and DBD::CSV to get this
    kind of functionality. I think my interface is simpler.

AUTHOR
    Ken Youens-Clark <kclark@cpan.org>.

LICENSE AND COPYRIGHT
    Copyright (C) 2006-10 Ken Youens-Clark. All rights reserved.

    This program is free software; you can redistribute it and/or modify it
    under the terms of the GNU General Public License as published by the
    Free Software Foundation; version 2.

    This program is distributed in the hope that it will be useful, but
    WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
    Public License for more details.

perl v5.10.1                       2010-07-26                       TABLIFY(1p)