Make copy of text file with columns removed (based on header)
Post 302931829 by LMHmedchem on Thursday 15th of January 2015 07:10:17 PM
Quote:
Originally Posted by RudiC
If you want those fields removed in every record, not just the header, try:
Code:
awk     'NR==3          {MX=split (RM, T, " ")
                         for (i=1; i<=NF; i++)
                             for (n=1; n<=MX; n++)
                                 if ($i==T[n]) T[n]=i
                        }
         !(NR%3)        {for (n=1; n<=MX; n++) $(T[n])=""
                         $0=$0; $1=$1
                        }
         1
        ' FS="\t+" OFS="\t" RM="dxv1 k2" file

This approach does not seem to work. The input and output files still have the same number of columns. The values dxv1 and k2 have been removed from the third row, but in the rest of the file one column has been removed from every third row only, instead of the entire column being removed from every row.
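A likely reason is that the pattern !(NR%3) only matches rows whose number is a multiple of 3, so every other row is printed untouched. Below is a minimal sketch of one way to drop the named columns from every row. It is not RudiC's code: the function name remove_cols is hypothetical, and it assumes single-tab separators, no empty fields, and header names on row 3. The file is read twice so that rows 1 and 2 can be trimmed as well.

```shell
# Sketch only: remove_cols is a hypothetical helper, not part of the thread.
# $1: space-separated header names to drop, $2: tab-separated input file.
remove_cols() {
    awk -v RM="$1" '
        BEGIN {
            FS = OFS = "\t"
            n = split(RM, names, " ")
            for (i = 1; i <= n; i++) wanted[names[i]] = 1
        }
        # first pass: record the positions of the unwanted headers on row 3
        NR == FNR {
            if (FNR == 3)
                for (i = 1; i <= NF; i++)
                    if ($i in wanted) drop[i] = 1
            next
        }
        # second pass: rebuild every row without the recorded positions
        {
            out = ""; sep = ""
            for (i = 1; i <= NF; i++)
                if (!(i in drop)) { out = out sep $i; sep = OFS }
            print out
        }
    ' "$2" "$2"
}
```

It could be called as remove_cols "dxv1 k2" original.txt > revised.txt, where both file names are placeholders.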

I have attached the original file,
original_f0_RSV_1912_A_S1v6_RI7_1916_15-01-10.txt

the file as modified by the code above,
modified_f0_RSV_1912_A_S1v6_RI7_1916_15-01-10.txt

and the output I was trying to create,
intended_f0_RSV_1912_A_S1v6_RI7_1916_15-01-10.txt

The method posted by RavinderSingh13 modifies the third row, but not the rest of the file.

This code does what I want,
Code:
# assign value of header for column to be removed
REMOVE='dxv1'
# assign data input file for $FOLD
BASE_INPUT_FILE_LIST=($(ls './'$SET'/input_data/base/'$FOLD'_'*'_'$SET'_'*'.txt'))
# assign modified input file directory
MOD_INPUT_FILE_DIR=$(ls -d './'$SET'/input_data/')
echo $MOD_INPUT_FILE_DIR

for BASE_INPUT_FILE in "${BASE_INPUT_FILE_LIST[@]}"
do
   echo $BASE_INPUT_FILE
   # change path to filename
   REVISED_FILE=$(echo $BASE_INPUT_FILE | awk 'BEGIN {FS="/"} {print $5}')
   REVISED_FILE='./'$SET'/input_data/'$REVISED_FILE
   echo $REVISED_FILE

   # find the location of the column to be removed
   HEADER_ROW_LIST=($(sed -n '3p' "$BASE_INPUT_FILE"))
   ELEMENT_COUNTER='0';  HEADER_POSITION='0'

   # loop through headers
   for HEADER_ROW in "${HEADER_ROW_LIST[@]}"
   do
      # increment counter
      (( ELEMENT_COUNTER++ ))
      echo $HEADER_ROW
      if [ "$HEADER_ROW" == "$REMOVE" ]; then
         echo "found remove at position" $ELEMENT_COUNTER
         HEADER_POSITION=$ELEMENT_COUNTER
      fi
   done
   echo $REMOVE "was found at position" $HEADER_POSITION

   # create values before and after position to be removed
   let "REMOVE_m1=$HEADER_POSITION-1";  let "REMOVE_p1=$HEADER_POSITION+1";

   echo "REMOVE_m1" $REMOVE_m1
   echo "REMOVE_p1" $REMOVE_p1

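   # remove column from file
   # (assumes the header was found and is not the first column; otherwise the field list is invalid)
   cut --output-delimiter=$'\t' -f1-$REMOVE_m1,$REMOVE_p1-  "$BASE_INPUT_FILE" > "$REVISED_FILE"
done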

This does not currently allow for more than one column to be removed, though the code could be called separately for each column.
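One way the cut approach might be extended to several columns is to build the list of fields to keep and pass it to cut in a single call. The sketch below is only an illustration under stated assumptions: remove_columns_cut is a hypothetical name, the header names are on row 3 with no empty cells, and row 3 has as many fields as the data rows (any columns beyond the header count would be dropped).

```shell
# Sketch only: remove_columns_cut is a hypothetical helper.
# usage: remove_columns_cut INPUT OUTPUT HEADER [HEADER...]
# The body runs in a subshell so its variables and set -f do not leak out.
remove_columns_cut() (
    input=$1; output=$2; shift 2
    keep=""; pos=0
    set -f    # no filename expansion while word-splitting the header row
    # split row 3 on whitespace, as the loop in the script above does
    for header in $(sed -n '3p' "$input"); do
        pos=$((pos + 1))
        drop=0
        for name in "$@"; do
            if [ "$header" = "$name" ]; then drop=1; fi
        done
        # collect the position unless it matched a name to remove
        if [ "$drop" -eq 0 ]; then keep="${keep:+$keep,}$pos"; fi
    done
    # cut uses tab as its default delimiter; lines without a tab pass through whole
    cut -f "$keep" "$input" > "$output"
)
```

Inside the loop above it could be called as remove_columns_cut "$BASE_INPUT_FILE" "$REVISED_FILE" dxv1 k2. Building a keep list avoids the m1/p1 arithmetic, so a match in the first column needs no special case.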

Thanks,

LMHmedchem

Last edited by LMHmedchem; 01-15-2015 at 08:38 PM..
 
