Post 302406245 by anthony.cros on Monday 22nd of March 2010 11:24:41 AM

awk script to (un)/concatenate fields in file

Hi everyone,

I'm trying to use "join" on more than one field. Since it doesn't support that directly, I want to take my input files and concatenate the joining fields into a single field (separated by "|"). I wrote two awk scripts to do and undo this (see below). However, I'm new to awk and I'm certain it could be done much more efficiently.
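As a rough sketch of the end goal, assuming both files already carry the concatenated key as their first field and are sorted on it, a plain "join" on field 1 is then enough. The file names and sample rows below are made up for illustration.

```shell
# Hypothetical sample data: field 1 is the "|"-concatenated key, fields are tab-separated.
tmp=$(mktemp -d)
printf 'x|1\tleft1\ny|2\tleft2\n'   > "$tmp/a"
printf 'x|1\tright1\nz|9\tright3\n' > "$tmp/b"

tab=$(printf '\t')
# join matches on field 1 by default; -t sets the field separator.
joined=$(join -t "$tab" "$tmp/a" "$tmp/b")
printf '%s\n' "$joined"

rm -r "$tmp"
```

Only the row whose composite key appears in both files is printed; the key can then be split back into its original columns.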

I found various topics around the question, but the syntax proposed is often a bit of a mystery to me. For instance, someone posted this:

BEGIN{FS=OFS="\t"}NR==FNR{a[$1$2]=$4;b[$1$2]=$5;c[$1$2]=$6;next}{$4=$4-a[$1$2];$5=$5-b[$1$2];$6=$6-c[$1$2]}1

What does the trailing '1' mean? Why are there two separate {} blocks, and what distinguishes them? Finally, where can I find documentation on this kind of question (googling "awk trailing digit" didn't help me much!)?
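For anyone landing here with the same question: an awk program is a list of "pattern { action }" rules, and either half may be left out. A rule with no pattern runs for every record, and a rule with no action prints the record when its pattern is true. A bare 1 is therefore a pattern that is always true with the default print action. In the one-liner quoted above, the NR==FNR rule only fires while the first file is being read, and "next" stops the later rules from running on those records. The toy input below is made up; the gawk manual's chapter on patterns and actions covers this in detail.

```shell
# Three rules on a three-line input; each rule is "pattern { action }".
result=$(printf '1\n2\n3\n' | awk '
    NR == 2 { next }     # pattern + action: skip the second record entirely
    { $1 = $1 * 10 }     # no pattern: the action runs for every remaining record
    1                    # pattern only: always true, so the default action (print) runs
')
printf '%s\n' "$result"
```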

Here are my scripts. I don't care much about syntax shortcuts; I only care about speed of execution!

Any help would be greatly appreciated.

to concatenate:

Code:

# concatene.awk
#
# usage:
#     nawk -F$'\t' -v JF=3,5 -f concatene.awk ~/tmp/tmp15
#     nawk -F$'\t' -v JF=15,16,17,18 -f concatene.awk split/snp_j > concat
#
# JF stands for "join fields"
BEGIN {
    FS = OFS = "\t"
    # -v assignments are already visible in BEGIN, so parse JF once here
    N = split(JF, JFS, ",")
    for (i = 1; i <= N; i++) {
        RJFS[JFS[i]] = i    # reverse map: field number -> position in JF
    }
}
{
    # build the key from the join fields, separated by "|"
    LINE = ""
    for (i = 1; i <= N; i++) {
        LINE = (i == 1 ? "" : LINE "|") $(JFS[i])
    }
    # append the remaining fields in their original order;
    # "in" tests membership without creating empty array entries
    for (i = 1; i <= NF; i++) {
        if (!(i in RJFS)) {
            LINE = LINE OFS $i
        }
    }
    print LINE
}

example:
input: a b c d e f
output: c|e a b d f

to "un"concatenate:

Code:

# unconcatene.awk
# usage: nawk -F$'\t' -v JF=3,5 -f unconcatene.awk test
BEGIN {
    FS = OFS = "\t"
    # -v assignments are already visible in BEGIN, so parse JF once here
    N = split(JF, JFS, ",")
    for (i = 1; i <= N; i++) {
        RJFS[JFS[i]] = i    # original field number -> position in the key
    }
}
{
    N2 = split($1, JFS2, "|")    # expect N2 == N

    SIZE = NF - 1 + N
    FIELD_INDEX = 2
    LINE = ""
    for (NEW_FIELD_INDEX = 1; NEW_FIELD_INDEX <= SIZE; NEW_FIELD_INDEX++) {
        LINE = LINE (NEW_FIELD_INDEX == 1 ? "" : OFS)
        # test membership, not the value: an empty or "0" key part
        # would otherwise be mistaken for a non-join field
        if (NEW_FIELD_INDEX in RJFS) {
            LINE = LINE JFS2[RJFS[NEW_FIELD_INDEX]]
        } else {
            LINE = LINE $FIELD_INDEX
            FIELD_INDEX++
        }
    }
    print LINE
}

Thanks!!

example:
input: c|e a b d f
output: a b c d e f
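A quick way to check that the two steps really are inverses is a round trip on sample data. The sketch below inlines compact versions of both steps as shell functions (the names concat and unconcat are made up here) and includes a row with empty and "0" join fields, which are the easy cases to get wrong. It assumes tab-separated input and that no field value contains "|".

```shell
# Hypothetical helpers; the join-field list (JF) is the first argument.
concat() {
    awk -v JF="$1" '
        BEGIN { FS = OFS = "\t"; n = split(JF, jf, ","); for (i = 1; i <= n; i++) pos[jf[i]] = i }
        {
            line = ""
            for (i = 1; i <= n; i++) line = (i == 1 ? "" : line "|") $(jf[i])
            for (i = 1; i <= NF; i++) if (!(i in pos)) line = line OFS $i
            print line
        }'
}

unconcat() {
    awk -v JF="$1" '
        BEGIN { FS = OFS = "\t"; n = split(JF, jf, ","); for (i = 1; i <= n; i++) pos[jf[i]] = i }
        {
            split($1, key, "|")
            total = NF - 1 + n; src = 2; line = ""
            for (i = 1; i <= total; i++) {
                # key positions come from the split key, the rest from fields 2..NF
                value = (i in pos) ? key[pos[i]] : $(src++)
                line = (i == 1 ? "" : line OFS) value
            }
            print line
        }'
}

original=$(printf 'a\tb\tc\td\te\tf\nq\t\t0\tr\t\ts')
roundtrip=$(printf '%s\n' "$original" | concat 3,5 | unconcat 3,5)
[ "$roundtrip" = "$original" ] && echo "round trip OK"
```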

Anthony



