Fast way to cut fields


 
# 1  
Old 07-13-2006

I have a tab-delimited file with 90 fields and approximately 15 million rows.

I only need the first 78 fields, and I am using the following command:

Code:
cut -f1-78 filename > outputfile

It is taking a lot of time...

Is there a faster way of doing this? Can we use NAWK to gain better performance?
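
Whether nawk beats cut here depends on the implementations and on the data, so it is worth measuring on a sample of the file first. As a minimal sketch, assuming a POSIX-style awk and tab-delimited input, the helper below prints the first N fields with an explicit loop so it can be timed against cut. The function name first_fields_awk and the output file names are hypothetical, not from the post. Running cut under LC_ALL=C is another thing to try, because the locale controls how bytes are interpreted as characters; any gain from that is an assumption to verify, not a guarantee.

```shell
# Print the first n tab-separated fields of each line using awk.
# Hypothetical helper, meant for comparison with: cut -f1-"$n" "$file"
first_fields_awk() {
    n=$1
    file=$2
    awk -F '\t' -v n="$n" '{
        out = $1
        # Stop at n fields, or earlier if the line has fewer fields.
        for (i = 2; i <= n && i <= NF; i++)
            out = out "\t" $i
        print out
    }' "$file"
}

# Usage sketch (uncomment to time each variant on a sample of the real file):
#   time cut -f1-78 filename > outputfile
#   time first_fields_awk 78 filename > outputfile.awk
#   time env LC_ALL=C cut -f1-78 filename > outputfile.c
```

The loop avoids assigning to NF, which not every older awk handles the same way, at the cost of some string concatenation per line.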

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Cut command: can't make it cut fields

I'm a complete beginner in UNIX (and not a computer science student either), just undergoing a tutoring course. Trying to replicate the instructions on my own I directed output of the ls listing command (lists all files of my home directory ) to My_dir.tsv file (see the screenshot) to make use of... (9 Replies)
Discussion started by: scrutinizerix
9 Replies

2. Shell Programming and Scripting

Cut counting consecutive delimiters as fields

When cut encounters consecutive delimiters it seems to count each instance as a field, at least with spaces. Is this typical behavior for any delimiter? #:~$ ifconfig eth0 | grep HWaddr eth0 Link encap:Ethernet HWaddr 94:de:80:a7:6d:e1 #:~$ ifconfig eth0 | grep HWaddr | cut -d " " -f... (6 Replies)
Discussion started by: Riker1204
6 Replies

3. Shell Programming and Scripting

How To Count Fields For Cut?

I am new to cut and I want to use the field option with a space delimiter on an Apache log file. For example, if I wanted to find the 200 HTTP code using cut in this manner on the file below cat access_abc.log | cut -d' ' -f7 | grep "200" 157.55.39.183 - - "GET /content/696-news041305... (4 Replies)
Discussion started by: sharingsunshine
4 Replies

4. UNIX for Dummies Questions & Answers

Cut fields between delimiters

I'm having bother getting both lines contained in a file to output as the same value. A simple example: john:123456:123:456:doe john:123456:123:doe cut -d: -f1,4 input file john:456 john:doe ^ first line should be same as second. trick one for me, i know why it's because of the... (4 Replies)
Discussion started by: landofus
4 Replies

5. Shell Programming and Scripting

Cut 2 fields and write to a output file

Hi, I am writing a code where the file is a pipe delimited and I would need to extract the 2nd part of field2 if it is "ATTN", "C/O" or "%" and check to see if field9 is populated or not. If field9 is already populated then leave it as is but if field9 is not populated then take the 2nd part of... (3 Replies)
Discussion started by: msalam65
3 Replies

6. Shell Programming and Scripting

how to cut fields in file

Hi, I have data in following format. 10001, John, Daves, Architecture, -2219 10002, Jim, Cirners, Businessman, -2219 1003, Tom, Katch, Engineer, -14003 I want to select the last column of the above given file and paste it on a different file in the following manner. File TEST column... (11 Replies)
Discussion started by: kam786sim
11 Replies

7. Shell Programming and Scripting

awk sed cut? to rearrange random number of fields into 3 fields

I'm working on formatting some attendance data to meet a vendors requirements to upload to their system. With some help on the forums here, I have the data close. But they've since changed what they want. The vendor wants me to submit three fields to them. Field 1 is the studentid field,... (4 Replies)
Discussion started by: axo959
4 Replies

8. Shell Programming and Scripting

cut: get either one or two fields

Hello, I would like to extract one or two fields of a line. At the moment, I am extracting the first field of a line: command | cut -f 1 -d '.' > file The line can have two or three fields delimited with a dot. if three fields, I want to be able to get the first two ie if line =... (3 Replies)
Discussion started by: maxvirrozeito
3 Replies

9. Shell Programming and Scripting

Cut Last 3 Fields

I have a text: dsj khfksjdh <time> EST 2006 ab cgnr jkkjt <time> EST 2006 gfhdgjghg <time> EST 2006 fkdjh kjhsekjrh kdjhfkh jhdfkhfdkjh kjdf <time> EST 2006 In the above file i need to extract time from every line... which is always the third from the last... Pls help! Cheers, Bouren (4 Replies)
Discussion started by: bourne
4 Replies

10. Shell Programming and Scripting

how to cut fields

I want to cut two columns simultaneously and paste them into another file, for example: cut -d ' ' -f3,4 xxx | paste yyy - > zzz; from the above I want to cut two fields, 3 and 4, and paste them as the last columns of a single file (zzz). How can I solve this? Regards, rajan (1 Reply)
Discussion started by: rajan_ka1
1 Reply
cut(1)							      General Commands Manual							    cut(1)

NAME
cut - Displays specified parts from each line of a file

SYNOPSIS
cut -b list [-n] [file...]
cut -c list [file...]
cut -f list [-d delim] [-s] [file...]

STANDARDS
Interfaces documented on this reference page conform to industry standards as follows:

cut: XCU5.0

Refer to the standards(5) reference page for more information about industry standards and associated tags.

OPTIONS
-b list
    Cuts based on a list of bytes. Each selected byte is output, unless you also specify the -n option. For example, if you specify -b 1-72, the cut command writes out the first 72 bytes in each line of the file.
-c list
    Cuts based on a list of characters. It is not an error if you specify a character not in the input.
-d delim
    Uses the specified character as the field delimiter (separator) when you specify the -f option. You must quote characters with special meaning to the shell, such as the space character. Any character can be used as delim. The default field delimiter is a tab character.
-f list
    Specifies a list of fields assumed to be separated in the file by a field delimiter character, specified by the -d option or the tab character by default. For example, if you specify -f 1,7, the cut command writes out only the first and seventh fields of each line. If a line contains no field delimiters, the cut command passes it through intact (useful for table subheadings), unless you specify the -s option.
-n
    Does not split characters. When specified with the -b option, each element in list of the form low-high (hyphen-separated numbers) is modified as follows:
        If the byte selected by low is not the first byte of a character, low is decremented to select the first byte of the character originally selected by low.
        If the byte selected by high is not the last byte of a character, high is decremented to select the last byte of the character prior to the character originally selected by high, or zero (0) if there is no prior character.
        If the resulting range element has high equal to zero (0) or low greater than high, the list element is dropped from list for that input line without causing an error.
    Each element in list of the form low- is treated as previously described with high set to the number of bytes in the current line, not including the terminating newline character.
    Each element in list of the form -high is treated as previously described with low set to 1.
    Each element in list of the form number (a single number) is treated as previously described with low set to number and high set to number.
-s
    Suppresses lines that do not contain delimiter characters (use only with the -f option). Unless you include this option, lines with no delimiters are passed through.

OPERANDS
file
    The path name of the file to be examined. If you do not specify a file or you specify a hyphen (-), the cut command reads standard input.

DESCRIPTION
The cut command locates the specified fields in each line of the specified file and writes the characters in those fields to standard output.

You must specify the -b option (to select bytes), the -c option (to select characters) or the -f option (to select fields). The list argument (see the -b, -c, and -f options) must be a space-separated or comma-separated list of positive numbers and ranges. Ranges can be in three forms:

    Two positive numbers separated by a hyphen (-), as in the form low-high, which represents all fields from the first number to the second number.
    A positive number preceded by a hyphen (-), as in the form -high, which represents all fields from field number 1 to that number.
    A positive number followed by a hyphen (-), as in the form low-, which represents that number to the last field, inclusive.

The elements in list can be repeated, can overlap, and can be specified in any order.

Some sample list specifications are as follows:

    1,4,7     First, fourth, and seventh bytes or fields.
    1-3,8     First through third and eighth bytes or fields.
    -5,10     First through fifth and tenth bytes or fields.
    3-        Third through last bytes or fields.

The fields specified by list can be a fixed number of byte positions, or the length can vary from line to line and be marked with a field delimiter character, such as a tab character.

[Tru64 UNIX] You can also use the grep command to make horizontal cuts through a file and the paste command to put the files back together. To change the order of columns in a file, use the cut and paste commands.

EXIT STATUS
The following exit values are returned:

    0     Successful completion.
    >0    An error occurred.

EXAMPLES
To display several fields of each line of a file, enter:

    cut -f 1,5 -d : /etc/passwd

This displays the login name and full user name fields of the system password file. These are the first and fifth fields (-f 1,5) separated by colons (-d :).

So, if the /etc/passwd file looks like this:

    su:UHuj9Pgdvz0J":0:0:User with special privileges:/:
    daemon:*:1:1::/etc:
    bin:*:2:2::/usr/bin:
    sys:*:3:3::/usr/src:
    adm:*:4:4:System Administrator:/usr/adm:
    pierre:*:200:200:Pierre Harper:/u/pierre:
    joan:*:202:200:Joan Brown:/u/joan:

Then, cut -f 1,5 -d : /etc/passwd produces this output:

    su:User with special privileges
    daemon:
    bin:
    sys:
    adm:System Administrator
    pierre:Pierre Harper
    joan:Joan Brown

ENVIRONMENT VARIABLES
The following environment variables affect the execution of cut:

LANG
    Provides a default value for the internationalization variables that are unset or null. If LANG is unset or null, the corresponding value from the default locale is used. If any of the internationalization variables contain an invalid setting, the utility behaves as if none of the variables had been defined.
LC_ALL
    If set to a non-empty string value, overrides the values of all the other internationalization variables.
LC_CTYPE
    Determines the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as opposed to multibyte characters in arguments and input files).
LC_MESSAGES
    Determines the locale for the format and contents of diagnostic messages written to standard error.
NLSPATH
    Determines the location of message catalogues for the processing of LC_MESSAGES.

SEE ALSO
Commands: grep(1), fold(1), join(1), paste(1)

Standards: standards(5)
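
The list forms and the -s option described in the reference page above can be tried on a small sample. This is a minimal sketch, assuming a POSIX shell with mktemp available; the sample data, the variable names, and the file names under the temporary directory are made up for illustration. The last step follows the page's suggestion of combining cut and paste to change column order, since the cut implementations I know of write the selected fields in their input order regardless of the order given in list.

```shell
# Small colon-delimited sample; the middle line has no delimiter at all.
sample='a:b:c:d:e
no delimiter here
1:2:3:4:5'

# A single field plus a closed range: fields 1 and 3-4.
picked=$(printf '%s\n' "$sample" | cut -d : -f 1,3-4)

# Open-ended ranges: "-2" is fields 1-2, "4-" is field 4 through the last.
head2=$(printf '%s\n' "$sample" | cut -d : -f -2)
tail4=$(printf '%s\n' "$sample" | cut -d : -f 4-)

# Lines without the delimiter pass through unless -s is given.
only_delimited=$(printf '%s\n' "$sample" | cut -d : -f 2 -s)

# Swap columns 1 and 2: cut each into its own file, then paste them back.
dir=$(mktemp -d)
printf '%s\n' "$sample" | cut -d : -f 1 -s > "$dir/col1"
printf '%s\n' "$sample" | cut -d : -f 2 -s > "$dir/col2"
swapped=$(paste -d : "$dir/col2" "$dir/col1")
rm -rf "$dir"

printf '%s\n' "$picked" "$head2" "$tail4" "$only_delimited" "$swapped"
```

Both cut calls in the swap step use -s so that the two column files stay line-aligned for paste.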