Removing Headers and a Column


 
# 15  
Old 01-30-2008
Sample

Due to the nature of the data I can't provide an actual example. However, I have attached a file in which I placed a T where the text is, an N where numbers are, and a b for the data I do not want.

I showed how the data looks originally, what I want it to look like, and what it is coming out as.

I also attached the totals section; hopefully it makes sense. If you have any questions, let me know.
# 16  
Old 01-30-2008
I'm impressed how much Smiling Dragon did with such vague and confusing requirements to go by. With the benefit of seeing the sample data I came up with this.

Code:
#!/usr/local/bin/perl
use strict;
use warnings;
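while (<>) {
    next if $. < 7; # skip the first 6 lines (the header)
    if (/Organization Totals/) {
        print;      # print the totals heading itself
        print <>;   # print everything that remains, until EOF
        exit();
    }
    {
        no warnings; # silence "substr outside of string" on short or blank lines
        next if (/^\s{58}/);      # skip lines whose first 58 characters are all whitespace
        print substr $_, 0, 58;   # keep offsets 0..57
        print substr $_, 85;      # keep offset 85 onward, dropping offsets 58..84
    }
}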

The "no warnings" block is in there because I have no idea how the lines of the real data file are formatted. If there are blank or short lines, the substr() function will throw warnings along the lines of "substr outside of string at blah blah blah", and the print statements will throw warnings as well. You could check the length of each line to avoid this, but it is probably not necessary unless all lines need to be padded to a certain length.
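If you did want the length check instead of turning warnings off, a minimal sketch could look like the one below. The helper name cut_column is made up for illustration; the offsets 58 and 85 are the ones from the script above, and padding with spaces is just one way of handling short lines.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): remove the
# characters at offsets 58..84 from one line. Short lines are padded with
# spaces up to offset 85 first, so substr() never reads outside the string.
sub cut_column {
    my ($line) = @_;
    chomp $line;
    my $resume = 85;    # offset where the kept right-hand part begins
    $line .= ' ' x ($resume - length $line) if length($line) < $resume;
    return substr($line, 0, 58) . substr($line, $resume) . "\n";
}

# Demo on two made-up lines: one full width, one short.
my $full  = ('A' x 58) . ('b' x 27) . "CCCCC\n";
my $short = "abc\n";
print cut_column($full);
print cut_column($short);
```

In the real script this would replace the two print substr lines, for example print cut_column($_); inside the loop. Note that the padding means short lines come out 58 characters wide.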

You say the file is 51 lines long but the sample data only accounts for about 27 lines.

I assume you know how to direct the input and output when running the script.

Last edited by KevinADC; 01-30-2008 at 04:38 AM.. Reason: changed perl code
# 17  
Old 01-30-2008
The output from running the script on the sample data:

Code:
      NNNNN TTTTTTTTT,  T T            TTTT NNN NNN-NNNN  NN NNTTTNN      NN.NN-       NN.NN-  NN       
            NNNNN TTTTTT TTTTTT        TTTT NNN NNN-NNNN  NN NNTTTNN       N.NN+       NN.NN-    
        TNN TTTT TTTTT, TT NNNNN                          NN NNTTTNN      NN.NN-       NN.NN-    
                                                          NN NNTTTNN     NNN.NN-        N.NN     

Organization Totals      tttttt O/D  tttttt  %    t/t ttttttt   %                   $ttttt  tttttt  %    t/t ttttttt   %
                     tttt:  nn Days   nnn  nn.n     n,nnn.nn  nn.n           Thru:  $nn.nn    nn   n.n        nn.nn   n.n
                            nn Days    nn  nn.n     n,nnn.nn  nn.n                  $nn.nn    nn  nn.n       nnn.nn   n.n
                            nn Days    nn   n.n     n,nnn.nn  nn.n                  $nn.nn    nn  nn.n     n,nnn.nn   n.n
                            nn Days    nn   n.n     n,nnn.nn  nn.n                  $nn.nn    nn  nn.n       nnn.nn   n.n
                            nn Days     n               n.nn                        $nn.nn    nn  nn.n     n,nnn.nn   n.n
                           nnn Days     n               n.nn                       $nnn.nn    nn  nn.n     n,nnn.nn  nn.n
                           nnn Days     n               n.nn                       $nnn.nn    nn  nn.n     n,nnn.nn  nn.n
                           nnn Days     n               n.nn                     $n,nnn.nn     n   n.n     n,nnn.nn  nn.n
                    ttttt: nnn Days     n               n.nn              ttttt: $n,nnn.nn     n   n.n     n,nnn.nn  nn.n
                                     ----         ----------                                ----         ----------
                                      nnn         $nn,nnn.nn                                 nnn         $nn,nnn.nn


Last edited by KevinADC; 01-30-2008 at 04:40 AM..
# 18  
Old 01-30-2008
The script you provided seems to have done the trick for the most part. It does cut into the header (there is a header at the top of each page), but I am fine with that. I replaced all of the text that was cut out using the sed command, and it comes out decent. The only thing that may be annoying is that printing may still use the same number of pages if I do not cut out the page break that is in the file. I will have to look at it further to see what I can do about that.

The character that separates the pages is causing a problem. I am trying to replace it with a space or something similar, but then the headers will not end up at the top of the page. I am not too sure what I am going to do with this one yet.
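One way the page-break problem might be approached, sketched under two assumptions that the thread does not confirm: that the separator is a form feed ("\f", ASCII 12), and that it is acceptable to keep the header only once and drop the repeats. The helper name merge_pages and the header length passed to it are hypothetical.

```perl
use strict;
use warnings;

# Assumption: pages are separated by a form feed character ("\f").
# Hypothetical helper: split the report into pages, drop the first
# $header_lines lines of every page after the first, and join the rest
# back together without any page breaks.
sub merge_pages {
    my ($text, $header_lines) = @_;
    my @pages = split /\f/, $text;
    my @out;
    for my $i (0 .. $#pages) {
        my @lines = split /^/, $pages[$i];    # split into lines, keeping newlines
        splice @lines, 0, $header_lines if $i > 0;
        push @out, @lines;
    }
    return join '', @out;
}

# Demo with a made-up two-page report that has a 2-line header on each page.
my $report = "HEADER 1\nHEADER 2\nrow a\nrow b\n\fHEADER 1\nHEADER 2\nrow c\n";
print merge_pages($report, 2);
```

This reads the whole report into one string first, which should be fine for a file of a few dozen lines. Whatever prints the result would then decide where the pages break.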


All in all, thank you both for all the help. I will deal with the minor things that remain from here. On the upside, I learned a good bit from doing this.

Side note: the script does seem to be removing the 4th line of column 5. I will keep playing with it to see if I can figure out why. If there are 4 lines in column 5, it leaves just 3. If there are fewer than 4, it keeps whatever is there.


Thanks again.

Last edited by DerangedNick; 01-30-2008 at 10:21 AM..
# 19  
Old 01-30-2008
Your original explanation included the widths for each column, and that is what I used. If the widths vary, there will be problems. If they are fixed-width columns this is a pretty simple task, but if they are not, trying to verify the data and extract a column looks to be very, very difficult.

If you want to keep the header, remove the line that skips the header lines (next if $. < 7;), or adjust the number 7 to 6 or 8 and see if one of those works better.

Good luck.
# 20  
Old 01-30-2008
The widths of the columns do appear to be fixed. It is just that on a line that has no data other than column 4, it seems to be removing the data in column 5 at the bottom.

Example:

If the address (column 1) is 4 lines long, it will keep all 4 lines in column 5. If the address is 3 lines long, it will only keep 3 lines of column 5.

The header does not appear to be an issue though; it just removes the very top, which is fine.
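One possible explanation, offered as a guess since the real data is not available: the script's test next if (/^\s{58}/) skips every line whose first 58 characters are whitespace, and that includes anything further to the right on the same line. The sketch below uses made-up rows and a hypothetical helper is_skipped to show the effect.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 when the script's test /^\s{58}/ would
# cause the line to be skipped, 0 otherwise.
sub is_skipped {
    my ($line) = @_;
    return $line =~ /^\s{58}/ ? 1 : 0;
}

# Made-up rows: 58 characters on the left, 27 characters of unwanted data,
# then the data to keep. Only the first row has anything on the left.
my $with_address    = sprintf "%-58s%-27s%s\n", 'TTTT TTTTT', 'bbbb', 'NN NNTTTNN';
my $without_address = sprintf "%-58s%-27s%s\n", '',           'bbbb', 'NN NNTTTNN';

printf "%-16s skipped: %d\n", 'with address',    is_skipped($with_address);
printf "%-16s skipped: %d\n", 'without address', is_skipped($without_address);
```

If that turns out to be the cause, testing for a completely blank line (/^\s*$/) instead of /^\s{58}/ might keep those rows, at the cost of printing 58 leading spaces for them.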
# 21  
Old 01-30-2008
Try changing this:

print substr $_,85;

to:

print substr $_,84;

It's a bit of a long shot, though.
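To see what that one-character change does, here is a small sketch using a made-up "ruler" line in which every character is the last digit of its own offset. The helper name drop_span is hypothetical; the offsets 58, 84 and 85 are the ones discussed in the thread.

```perl
use strict;
use warnings;

# A 100-character ruler: the character at offset N is the digit N % 10.
my $line = join '', map { $_ % 10 } 0 .. 99;

# Hypothetical helper: keep offsets 0..57, then resume at offset $resume.
sub drop_span {
    my ($s, $resume) = @_;
    return substr($s, 0, 58) . substr($s, $resume);
}

my $from_85 = drop_span($line, 85);    # drops offsets 58..84 (27 characters)
my $from_84 = drop_span($line, 84);    # drops offsets 58..83 (26 characters)

print length($from_85), " characters kept when resuming at 85\n";
print length($from_84), " characters kept when resuming at 84\n";
```

Resuming at 84 keeps one extra character at the start of the right-hand part, so it only helps if the wanted data really begins one position earlier than assumed.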