Sponsored Content
Top Forums Shell Programming and Scripting awk to format file and combine two fields using comma Post 302994904 by cmccabe on Wednesday 29th of March 2017 01:01:42 PM
Old 03-29-2017
Thank you both very much, works great Smilie.

---------- Post updated at 12:01 PM ---------- Previous update was at 09:34 AM ----------

Since my actual data can vary on length (that is have multiple commas in it), I can not seem to account for that. There will never less less than 1 comma to split on. Thank you Smilie.


file
Code:
1799	NM_002036.3	chr1	+	159173802	159176290	159174749	159176240	2	159173802,159175250,	159174770,159176290,	0	ACKR1	cmpl	cmpl	0,0,
978	NM_002443.3	chr10	+	51549552	51562518	51549584	51562400	4	51549552,51555730,51556750,51562270,	51549587,51555836,51556856,51562518,	0	MSMB	cmpl	cmpl	0,0,1,2,
732	NM_002404.2	chr17	-	19286754	19290532	19287774	19290468	6	19286754,19288411,19288670,19289622,19290072,19290462,	19288022,19288594,19288767,19289777,19290151,19290532,	0	MFAP4	cmpl	cmpl	1,1,0,1,0,0,
1629	NM_003467.2	chr2	-	136871918	136875725	136872438	136875630	1	136871918,	136873482,	0	CXCR4	cmpl	cmpl	0,0,

current output
Code:
awk -F '\t' '{split($10,s1,",");split($11,s2,","); print $9,$13,$2,s1[1]"-"s2[1],s1[2]"-"s2[2]}' file
2 ACKR1 NM_002036.3 159173802-159174770 159175250-159176290
4 MSMB NM_002443.3 51549552-51549587 51555730-51555836
6 MFAP4 NM_002404.2 19286754-19288022 19288411-19288594
1 CXCR4 NM_003467.2 136871918-136873482 -

desired output
Code:
2 ACKR1 NM_002036.3 159173802-159174770 159175250-159176290
4 MSMB NM_002443.3 51549552-51549587 51555730-51555836 51556750-51556856 51562270-51562518
6 MFAP4 NM_002404.2 19286754-19288022 19288411-19288594 19288670-19288767 19289622-19289777 19290072-19290151 19290462-19290532
1 CXCR4 NM_003467.2 136871918-136873482

 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Remove whitespaces between comma separated fields from file

Hello all, I am a unix dummy. I am trying to remove spaces between fields. I have the file in the following format 12332432, 2345 , asdfsdf ,100216 , 9999999 12332431, 2341 , asdfsd2 ,100213 , 9999999 &... (2 Replies)
Discussion started by: nitinbjoshi
2 Replies

2. UNIX for Dummies Questions & Answers

Sort the fields in a comma delimited file

Hi, I have a comma delimited file. I want to sort the fields alphabetically and again store them in a comma delimited file. For example, My file looks like this. abc,aaa,xyz,xxx,def pqr,ggg,eee,iii,qqq zyx,lmo,pqr,abc,fff and I want my output to look like this, all fields sorted... (3 Replies)
Discussion started by: swethapatil
3 Replies

3. Shell Programming and Scripting

How to combine first,second and third fields in a file

Hi Guys, I have a file as follows: 1 2 3 There are 5 spaces between 1 and 2 and one space between 2 and 3. I want: 123 How can I do this using awk? I tried using: (8 Replies)
Discussion started by: npatwardhan
8 Replies

4. Shell Programming and Scripting

awk print - fields separated with comma's need to ignore inbetween double quotes

I am trying to re-format a .csv file using awk. I have 6 fields in the .csv file. Some of the fields are enclosed in double quotes and contain comma's inside the quotes. awk is breaking this into multiple fields. Sample lines from the .csv file: Device Name,Personnel,Date,Solution... (1 Reply)
Discussion started by: jxrst
1 Replies

5. UNIX for Dummies Questions & Answers

Awk: Wondering how to format numbers with comma

I have tried the following commands and can't get a number to format with commas: echo 1234567.12 |awk '{printf("%-12s %20s\n", $0, comma($0)) }' This prints out value 50000 without a comma for i in *13*; do (cd $i && du -sk . && echo $i);done|grep -v 0000|gawk -F OFS="," ' {SUM += $1}... (8 Replies)
Discussion started by: newbie2010
8 Replies

6. Shell Programming and Scripting

Awk: Combine multiple lines based on number of fields

If a file has following kind of data, comma delimited 1,2,3,4 1 1 1,2,3,4 1,2 2 2,3,4 My required output must have only 4 columns with comma delimited 1,2,3,4 111,2,3,4 1,222,3,4 I have tried many awk command using ORS="" but couldnt progress (10 Replies)
Discussion started by: mdkm
10 Replies

7. Shell Programming and Scripting

Combine two Fields if there's a comma between

Hi All, Seeking for your assistance on how to combine two fields if there's a comma between them. What i did was, extract the records from database. file1.csv(extracted from db) 82|Supplies Station, Inc.|112012|KARISSA APPAREL, INC. - 112012|NON TRADE F/A AND... (2 Replies)
Discussion started by: znesotomayor
2 Replies

8. Shell Programming and Scripting

awk to combine matches and use a field to adjust coordinates in other fields

Trying to output a result that uses the data from file to combine and subtract specific lines. If $4 matches in each line then the last $6 value is added to $2 and that becomes the new$3. Each matching line in combined into one with $1 then the original $2 then the new$3 then $5. For the cases... (4 Replies)
Discussion started by: cmccabe
4 Replies

9. Shell Programming and Scripting

awk to combine all matching fields in input but only print line with largest value in specific field

In the below I am trying to use awk to match all the $13 values in input, which is tab-delimited, that are in $1 of gene which is just a single column of text. However only the line with the greatest $9 value in input needs to be printed. So in the example below all the MECP2 and LTBP1... (0 Replies)
Discussion started by: cmccabe
0 Replies

10. UNIX for Beginners Questions & Answers

How to extract fields from a CSV i.e comma separated where some of the fields having comma as value?

can anyone help me!!!! How to I parse the CSV file file name : abc.csv (csv file) The above file containing data like abv,sfs,,hju,',',jkk wff,fst,,rgr,',',rgr ere,edf,erg,',',rgr,rgr I have a requirement like i have to extract different field and assign them into different... (4 Replies)
Discussion started by: J.Jena
4 Replies
IGAWK(1)							 Utility Commands							  IGAWK(1)

NAME
igawk - gawk with include files SYNOPSIS
igawk [ all gawk options ] -f program-file [ -- ] file ... igawk [ all gawk options ] [ -- ] program-text file ... DESCRIPTION
Igawk is a simple shell script that adds the ability to have ``include files'' to gawk(1). AWK programs for igawk are the same as for gawk, except that, in addition, you may have lines like @include getopt.awk in your program to include the file getopt.awk from either the current directory or one of the other directories in the search path. OPTIONS
See gawk(1) for a full description of the AWK language and the options that gawk supports. EXAMPLES
cat << EOF > test.awk @include getopt.awk BEGIN { while (getopt(ARGC, ARGV, "am:q") != -1) ... } EOF igawk -f test.awk SEE ALSO
gawk(1) Effective AWK Programming, Edition 1.0, published by the Free Software Foundation, 1995. AUTHOR
Arnold Robbins (arnold@skeeve.com). Free Software Foundation Nov 3 1999 IGAWK(1)
All times are GMT -4. The time now is 04:34 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy