How to remove mth and nth column from a file?

08-06-2013

Try this:

Code:

awk -F, '{sub($a FS,x)} sub($(b-1) FS,x)' a=7 b=12 file

Franklin52

08-06-2013

It should work without the braces too:

Code:

awk -F, 'sub($a FS,x) sub($(b-1) FS,x)' a=7 b=12 file

This works by replacing the column together with its field separator ($a FS) with x.
Since x is not defined, it is an empty string.

Jotne

08-06-2013

This is done. Thanks, everybody, for your help.

zaq1xsw2

08-06-2013

Or you can use cut

Code:

m=7
n=12
cut -d, --complement -s -f$m,$n file

Jotne

08-06-2013

Quote:

Originally Posted by Jotne

This should do it

Code:

awk -F, '{$a=$b="";gsub(FS "+",FS)} 1' a=$m b=$n OFS=, df_test_removing_column

If the first or last column is deleted, it is not surrounded by a pair of commas, so collapsing runs of separators with FS "+" leaves a stray leading or trailing comma. Also, if a column was originally empty, that column would be deleted as well. It's better to just use split and array looping instead:

Code:

awk -v m=3 -v n=9 -- '{
    last = split($0, a, /,/)
    append = 0
    for (i = 1; i < last; ++i) {
        if (i != m && i != n) {
            if (append) {
                printf "," a[i]
            } else {
                printf a[i]
                append = 1
            }
        }
    }
    print ""
}' file

Or

Code:

awk -v m=3 -v n=9 -- '{ last = split($0, a, /,/); append = 0; for (i = 1; i < last; ++i) { if (i != m && i != n) { if (append) { printf "," a[i]; } else { printf a[i]; append = 1; }; }; } print ""; }' file

If we use bash for the shell it could be simpler:

Code:

#!/bin/bash

function remove_columns {
    local A LINE IFS=,
    while read -ra LINE; do
        for A; do
            # Columns are numbered from 1, but bash array indices start at 0.
            unset "LINE[$((A - 1))]"
        done
        echo "${LINE[*]}"
    done
}

remove_columns "$m" "$n" < file  ## could be more than two columns specified
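
The trouble with the quoted gsub(FS "+", FS) command can be seen on a small, made-up record. The sketch below assumes a hypothetical line a,,c,d,e and removes columns 3 and 5 (the values of a and b are chosen only for illustration); the correct result would be a,,d.

```shell
# Hypothetical sample record: column 2 is already empty.
line='a,,c,d,e'

# The quoted command, reading standard input ("-") instead of a file.
result=$(printf '%s\n' "$line" |
    awk -F, '{$a=$b="";gsub(FS "+",FS)} 1' a=3 b=5 OFS=, -)

printf '%s\n' "$result"   # prints: a,d,
```

The empty second column has been swallowed, and deleting the last column leaves a trailing comma behind.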

konsolebox

08-06-2013

Quote:

Originally Posted by Franklin52

Try this:

Code:

awk -F, '{sub($a FS,x)} sub($(b-1) FS,x)' a=7 b=12 file

Quote:

Originally Posted by Jotne

It should work without the braces too:

Code:

awk -F, 'sub($a FS,x) sub($(b-1) FS,x)' a=7 b=12 file

This works by replacing the column together with its field separator ($a FS) with x.
Since x is not defined, it is an empty string.

These suggestions should not be trusted for a moment. Blindly treating literal text data as regular expressions is asking for trouble. Both sub($a FS, x) and sub($(b-1) FS, x) are problematic in multiple respects.

If that text contains a regular expression metacharacter, who knows where in $0 and how much of $0 it will match.

Even if there are no metacharacters, neither substitution is guaranteed to occur at the correct field; if a preceding field matches the regular expression, that earlier field takes precedence. For example, if a is 5, nothing prevents $5 FS from matching at any point between $1 and $5 inclusive.
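
As a small illustration of that point, the sketch below runs only the first substitution from the quoted one-liner on a made-up record in which columns 1 and 3 hold the same text. The sample data and a=3 are assumptions chosen for the demonstration.

```shell
# Hypothetical record: columns 1 and 3 both contain "7".
result=$(printf '%s\n' '7,x,7,z' |
    awk -F, '{sub($a FS, x)} 1' a=3 -)

# Column 3 was requested, but the leftmost match wins and column 1 is removed.
printf '%s\n' "$result"   # prints: x,7,z
```

The intended output would have been 7,x,z.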

If there are metacharacters, the situation is worse, because a literal string treated as a regular expression may not even match itself. For example, both of the following expressions are false: "(1)" FS ~ "(1)" FS and "[a]" FS ~ "[a]" FS. This means that for a=5, $5 FS may not match itself, yet it could match at some other location, either before or after $5.
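
This is easy to check directly. The following sketch, which uses a few made-up strings, asks awk whether each string followed by FS matches itself when used as a regular expression.

```shell
result=$(awk 'BEGIN {
    FS = ","
    # Hypothetical field values: two with metacharacters, one without.
    n = split("(1) [a] abc", text, " ")
    for (i = 1; i <= n; i++) {
        s = text[i] FS
        print s, ((s ~ s) ? "matches itself" : "does not match itself")
    }
}')

printf '%s\n' "$result"
# prints:
# (1), does not match itself
# [a], does not match itself
# abc, matches itself
```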

In Franklin52's code, since the numeric return value of sub($(b-1) FS,x) evaluated in a boolean context controls printing, and since (as just pointed out) literal text as a regular expression may not match itself, entire records could be silently deleted if the controlling sub makes no substitutions.
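
A minimal sketch of that failure, assuming two made-up records and a=1 b=3 instead of the original column numbers:

```shell
# The second hypothetical record has "(1)" in column 3.
result=$(printf '%s\n' 'a,b,c,d' 'a,b,(1),d' |
    awk -F, '{sub($a FS,x)} sub($(b-1) FS,x)' a=1 b=3 -)

# Only the first record survives; the second is dropped without any warning.
printf '%s\n' "$result"   # prints: b,d
```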

In Jotne's version, the two numeric return values are converted to strings, concatenated, and the result is evaluated in a boolean context. Since there will always be a numeric return value, and since no number converts to a null string, the string which is evaluated in a boolean context is never empty and so is always true. Even if there were no other problems with the code, I would recommend against this because of the subtlety involved. The chances are extremely high that whoever inherits this code will not fully understand it.
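
To illustrate, the sketch below feeds Jotne's command a made-up record on which neither substitution can succeed (a=3 and b=5 are assumptions for the demonstration). Both calls return 0, the pattern becomes the string "00", and the record is printed anyway, unmodified.

```shell
# Hypothetical record: columns 3 and 4 contain parentheses, so neither
# "(1)," nor "(2)," matches as a regular expression.
result=$(printf '%s\n' 'a,b,(1),(2)' |
    awk -F, 'sub($a FS,x) sub($(b-1) FS,x)' a=3 b=5 -)

printf '%s\n' "$result"   # prints: a,b,(1),(2)
```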

Finally, there's also the possibility that the text is an undefined regular expression, which could produce different output on different awk implementations given identical input.

Quote:

Originally Posted by Jotne

Or you can use cut

Code:

m=7
n=12
cut -d, --complement -s -f$m,$n file

Great solution, so long as portability isn't a concern.

If portability is a constraint, --complement is disallowed, since it is a GNU extension that POSIX does not specify. In that case, using cut would require constructing the -f option-argument from $m and $n, yielding something similar to 1-($m-1),($m+1)-($n-1),($n+1)-. However, the logic required to properly handle all boundary conditions isn't worth the trouble when there's a simple, portable AWK solution available (presented below).
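
For the easy interior case only, the field list could be built with shell arithmetic, roughly as follows. This is a sketch that assumes m > 1 and n > m + 1; it deliberately ignores the boundary conditions (m = 1, adjacent columns, and so on), which would each need their own handling. The sample record is made up.

```shell
m=3
n=6

# Assumes 1 < m and m + 1 < n; otherwise cut would be given an invalid range.
fields="1-$((m - 1)),$((m + 1))-$((n - 1)),$((n + 1))-"

result=$(printf '%s\n' 'a,b,c,d,e,f,g,h' | cut -d, -f"$fields")

printf '%s\n' "$fields"   # prints: 1-2,4-5,7-
printf '%s\n' "$result"   # prints: a,b,d,e,g,h
```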

Quote:

Originally Posted by konsolebox

It's better to just use split and array looping instead:

Code:

awk -v m=3 -v n=9 -- '{
    last = split($0, a, /,/)
    append = 0
    for (i = 1; i < last; ++i) {
        if (i != m && i != n) {
            if (append) {
                printf "," a[i]
            } else {
                printf a[i]
                append = 1
            }
        }
    }
    print ""
}' file

I agree with you that the best (only?) AWK solution is to iterate over the fields, excluding the undesirables. It is a robust approach that's immune to all the problems arising from treating literal text as regular expressions.

I did not test your code, but looking at it there appears to be an off-by-one bug at i < last. last corresponds to the final field and it is never printed. It should be i <= last.

Aside from that, your implementation is also a bit overcomplicated. There is no need to explicitly split the record into an array when AWK has already split it into field variables for your convenience.

For portability, simplicity, and flexibility, I recommend:

Code:

{
    for (i=1; i<=NF; i++)
        if (i != m  &&  i != n)
            s = s OFS $i
    print substr(s, length(OFS)+1)
    s=""
}

Obviously, FS and OFS must be set to the appropriate values.
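
One way to supply those values from the command line is sketched below; the sample record and the choice of m=2, n=4 are assumptions made for illustration.

```shell
m=2
n=4

# -F sets FS, and -v sets OFS, m and n before any input is read.
result=$(printf '%s\n' 'a,b,c,d,e' |
    awk -F, -v OFS=, -v m="$m" -v n="$n" '{
        for (i=1; i<=NF; i++)
            if (i != m  &&  i != n)
                s = s OFS $i
        print substr(s, length(OFS)+1)
        s=""
    }')

printf '%s\n' "$result"   # prints: a,c,e
```

Because the fields are compared by position and never used as patterns, metacharacters and empty columns in the data pass through untouched.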

Quote:

Originally Posted by zaq1xsw2

This is done. Thanks, everybody, for your help.

For the sake of those who follow in your footsteps, seeking a solution to an identical or similar problem, the least you can do is state how you solved your problem. This is especially true if all the suggestions provided to you were inadequate and you either crafted your own approach or found one elsewhere.

Regards,
Alister

Last edited by alister; 08-06-2013 at 06:55 PM..

alister

08-06-2013

Another formatting trick

Code:

{
  sep=""
  for (i=1; i<=NF; i++)
    if (i != m  &&  i != n) {
      printf sep"%s", $i
      sep=OFS
    }
  print ""
}

MadeInGermany
