Extract columns into separate file
Post 303000208 by Ads89 on Thursday 6th of July 2017 09:30:58 AM

I have a comma-delimited file like the one below. I currently extract the values of two columns (COL1 & COL6) to produce a smaller, trimmed-down version of the file that contains only the columns we need:

Code:
COL1,COL2,COL3,COL4,COL5,COL6,COL7,COL8,COL9
1111,AAAA,AAAA,AAAA,AAAA,X100,AAAA,AAAA,XXXX
2222,AAAA,AAAA,AAAA,AAAA,X200,AAAA,AAAA,
3333,AAAA,AAAA,AAAA,AAAA,X300,AAAA,AAAA,XXXX
4444,AAAA,AAAA,AAAA,AAAA,X400,AAAA,AAAA,XXXX
5555,AAAA,AAAA,AAAA,AAAA,X500,AAAA,AAAA,

I now have an additional requirement: extract the values of COL1 & COL6 only when COL9 has a value present (it could be anything), i.e. data rows 1, 3 and 4 (1111, 3333 and 4444).
The output would therefore look something like:

Code:
COL1,COL2
1111,X100
3333,X300
4444,X400

The code below extracts only COL1 & COL6 (skipping the header row), but I need to add the additional functionality:


Code:
 
 awk -F, 'BEGIN {OFS=","} {gsub(/^[ \t]+/, "", $1); gsub(/[ \t]+$/, "", $1); gsub(/^[ \t]+/, "", $6); gsub(/[ \t]+$/, "", $6)} {if (NR>1) {print $1,$6}}' input.csv > output.csv
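One possible way to add the COL9 condition, offered as a minimal sketch and not a definitive answer: keep the trimming of fields 1 and 6 from the command above and print a row only when field 9 contains at least one non-blank character. The file names input.csv and output.csv come from the post; treating a stray carriage return as "blank" is an assumption in case the file has Windows line endings. Like the original command, this skips the header row, so the header shown in the desired output would have to be printed separately.

```shell
# Recreate the sample input from the post.
cat > input.csv <<'EOF'
COL1,COL2,COL3,COL4,COL5,COL6,COL7,COL8,COL9
1111,AAAA,AAAA,AAAA,AAAA,X100,AAAA,AAAA,XXXX
2222,AAAA,AAAA,AAAA,AAAA,X200,AAAA,AAAA,
3333,AAAA,AAAA,AAAA,AAAA,X300,AAAA,AAAA,XXXX
4444,AAAA,AAAA,AAAA,AAAA,X400,AAAA,AAAA,XXXX
5555,AAAA,AAAA,AAAA,AAAA,X500,AAAA,AAAA,
EOF

awk -F, 'BEGIN { OFS = "," }
NR == 1 { next }                        # skip the header row, as before
$9 ~ /[^ \t\r]/ {                       # COL9 holds something other than blanks
    gsub(/^[ \t]+|[ \t]+$/, "", $1)     # trim COL1
    gsub(/^[ \t]+|[ \t]+$/, "", $6)     # trim COL6
    print $1, $6
}' input.csv > output.csv

cat output.csv
```

With the sample data this keeps the rows starting 1111, 3333 and 4444. If an empty COL9 can never contain stray blanks, the pattern could be simplified to `$9 != ""`.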

 

CUT(1)                    BSD General Commands Manual                   CUT(1)

NAME
     cut -- select portions of each line of a file

SYNOPSIS
     cut -b list [-n] [file ...]
     cut -c list [file ...]
     cut -f list [-d delim] [-s] [file ...]

DESCRIPTION
     The cut utility selects portions of each line (as specified by list) from each file and writes them to the standard output. If no file arguments are specified, or a file argument is a single dash ('-'), cut reads from the standard input. The items specified by list can be in terms of column position or in terms of fields delimited by a special character. Column numbering starts from 1.

     The list option argument is a comma or whitespace separated set of increasing numbers and/or number ranges. Number ranges consist of a number, a dash ('-'), and a second number and select the fields or columns from the first number to the second, inclusive. Numbers or number ranges may be preceded by a dash, which selects all fields or columns from 1 to the first number. Numbers or number ranges may be followed by a dash, which selects all fields or columns from the last number to the end of the line. Numbers and number ranges may be repeated, overlapping, and in any order. It is not an error to select fields or columns not present in the input line.

     The options are as follows:

     -b list
             The list specifies byte positions.

     -c list
             The list specifies character positions.

     -d delim
             Use the first character of delim as the field delimiter character instead of the tab character.

     -f list
             The list specifies fields, delimited in the input by a single tab character. Output fields are separated by a single tab character.

     -n      Do not split multi-byte characters.

     -s      Suppress lines with no field delimiter characters. Unless specified, lines with no delimiters are passed through unmodified.

ENVIRONMENT
     The LANG, LC_ALL and LC_CTYPE environment variables affect the execution of cut if the -n option is specified. Their effect is described in environ(7).

EXAMPLES
     Extract users' login names and shells from the system passwd(5) file as ``name:shell'' pairs:

           cut -d : -f 1,7 /etc/passwd

     Show the names and login times of the currently logged in users:

           who | cut -c 1-16,26-38

DIAGNOSTICS
     The cut utility exits 0 on success, and >0 if an error occurs.

SEE ALSO
     paste(1)

STANDARDS
     The cut utility conforms to IEEE Std 1003.2-1992 (``POSIX.2'').

HISTORY
     A cut command appeared in AT&T System III UNIX.

BUGS
     The -c option is a synonym for the -b option, which causes incorrect behaviour in locales that support multibyte characters. When operating on fields (-f option is specified), cut does not recognise multibyte characters, and the delim character is recognised in the middle of multibyte sequences.

BSD                              June 6, 1993                              BSD
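
Applied to the comma-delimited sample from the post, the -d and -f options described in the man page above can select COL1 and COL6, but cut on its own has no way to test COL9. The sketch below pairs it with grep to drop rows whose last field is empty. It assumes COL9 is always the final field and that an empty COL9 leaves the line ending in a bare comma with no trailing blanks or carriage return; the output file names cut_output.csv and filtered.csv are hypothetical.

```shell
# Recreate the sample input from the post.
cat > input.csv <<'EOF'
COL1,COL2,COL3,COL4,COL5,COL6,COL7,COL8,COL9
1111,AAAA,AAAA,AAAA,AAAA,X100,AAAA,AAAA,XXXX
2222,AAAA,AAAA,AAAA,AAAA,X200,AAAA,AAAA,
3333,AAAA,AAAA,AAAA,AAAA,X300,AAAA,AAAA,XXXX
4444,AAAA,AAAA,AAAA,AAAA,X400,AAAA,AAAA,XXXX
5555,AAAA,AAAA,AAAA,AAAA,X500,AAAA,AAAA,
EOF

# Fields 1 and 6 of every line, using a comma instead of the default tab.
cut -d , -f 1,6 input.csv > cut_output.csv

# Drop lines that end in a comma (empty COL9) before selecting the fields.
grep -v ',$' input.csv | cut -d , -f 1,6 > filtered.csv

cat filtered.csv
```

Unlike the awk command in the post, this keeps the header row and does not trim blanks around the selected fields, so it is only a fit when the data is already clean.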