Removing Headers and a Column

# 1  
Old 01-29-2008

I have a text file on Unix with a fixed-width layout like this (character positions for each column):

Column 1 - 1-12
Column 2 - 13-39
Column 3 - 40-58
Column 4 - 59-85
Column 5 - 86-120
Column 6 - 121-131
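
The layout above maps directly onto character slices. A minimal sketch in Python (used here only for illustration; the script later in this thread is Perl), assuming the ranges are 1-based and inclusive. The names `COLUMNS` and `drop_column` are made up for this example:

```python
# 1-based, inclusive character ranges taken from the layout above.
COLUMNS = {
    1: (1, 12),
    2: (13, 39),
    3: (40, 58),
    4: (59, 85),
    5: (86, 120),
    6: (121, 131),
}


def drop_column(line, number):
    """Return `line` without the characters that belong to column `number`."""
    start, end = COLUMNS[number]
    # A 1-based inclusive range (start, end) is the 0-based slice [start - 1:end].
    return line[:start - 1] + line[end:]
```

Slicing past the end of a short line is harmless in Python, so lines that stop before column 4 come back unchanged.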

The file also has a header on the first 6 lines of each page, and each page is 51 lines long. So first off I want to remove the header from each page and rewrite the file. I have figured that out for the most part, it just isn't very clean yet.

All I have been able to do so far is overwrite the header with blank spaces, which just leaves me with a mess.

But here is the ultimate goal. Remove the first 6 lines of each page and remove column 4 from the entire report. Rewrite the report with a new header and just put all of the data in a new report, excluding column 4.
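
That goal can be sketched as follows, in Python rather than Perl and under two assumptions that the post does not confirm: every page is exactly 51 lines, and the file starts on a page boundary. The function name and the `new_header` argument are hypothetical:

```python
PAGE_SIZE = 51    # lines per page, from the post
HEADER_SIZE = 6   # header lines at the top of each page, from the post


def strip_headers_and_column(lines, new_header):
    """Yield the new header once, then every data line without column 4."""
    yield from new_header
    for index, line in enumerate(lines):
        if index % PAGE_SIZE < HEADER_SIZE:
            continue  # one of the first 6 lines of a page: drop it
        text = line.rstrip("\n")
        yield text[:58] + text[85:]  # remove characters 59-85 (column 4)
```

The yielded lines carry no trailing newline, so a caller would add one when writing the new report.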

Honestly I haven't really done anything quite this complex (at least it seems complex to me), so I am not really sure where to get started. The most I have really done is just rewriting strings in a text file or removing specific words.

Any help would be appreciated with this.

One thing I forgot to mention, and I am not even sure it is possible: at the very end of the report is a totals section. A specific word identifies where it starts, and I'd need to reprint everything from that point on at the end of the new report. Removing column 4 would obviously cut into that section, so it seems the only way to save it would be to write that one area into a new file and append it to the end of the new report once it has been completed. I understand the concept; the hard part is figuring out how to do it.
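
Setting the totals aside does not need a second file if the report fits in memory. A small Python sketch of the idea, where `split_totals` is a made-up name and `marker` stands for whatever word actually starts the totals section (the post does not say what it is):

```python
def split_totals(lines, marker):
    """Split `lines` into (body, totals) at the first line containing `marker`."""
    for index, line in enumerate(lines):
        if marker in line:
            return lines[:index], lines[index:]
    return lines, []  # marker never seen: treat the whole report as body
```

The body can then have column 4 removed while the totals lines are written back unchanged after it.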

Thinking about it a little more, if it were possible to leave the headers in place and just skip removing column 4 from the header section on each page, that would work as well. I actually do not mind the headers being there, and it may create printing problems if I do not keep each page 51 lines long. I imagine I'll deal with it, but if there were another way I am sure I could handle that as well. The same idea may also work for the totals section at the very end: just tell it to ignore the last 12 lines of the report somehow. Just a thought.
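
This alternative keeps the line count, and so the 51-line pages, intact. A rough Python sketch, assuming the input lines have already had their newlines stripped (for example via `str.splitlines()`) and that the totals really are the last 12 lines; the function name is hypothetical:

```python
PAGE_SIZE = 51     # lines per page
HEADER_SIZE = 6    # header lines at the top of each page
TOTALS_SIZE = 12   # "the last 12 lines of the report"


def drop_column_4_keep_layout(lines):
    """Remove column 4 from data lines only; headers and the tail pass through."""
    result = []
    tail_start = len(lines) - TOTALS_SIZE
    for index, line in enumerate(lines):
        in_header = index % PAGE_SIZE < HEADER_SIZE
        if in_header or index >= tail_start:
            result.append(line)                     # copy unchanged
        else:
            result.append(line[:58] + line[85:])    # drop characters 59-85
    return result
```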

Thank you

Last edited by DerangedNick; 01-29-2008 at 07:04 PM.. Reason: Forgot something
# 2  
Old 01-29-2008

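#!/usr/bin/perl -w
use strict;
my $PAGESIZE   = 51;  # lines per page
my $HEADERSIZE = 6;   # header lines at the top of each page
my $intotals   = 0;   # set once the totals section starts
my $linenumber = 0;   # zero-based count of lines read so far
while (<>) {
  $intotals = 1 if /^whatever line indicates the start of the "totals" section$/;
  if ($intotals) {
    print $_;                      # totals section: copy unchanged
  } elsif ($linenumber % $PAGESIZE >= $HEADERSIZE) {
    if (/^(.{58}).{27}(.*)$/) {
      print "$1$2\n";              # drop column 4 (characters 59-85)
    } elsif (/^(.{58}).{1,27}$/) {
      print "$1\n";                # line ends inside column 4
    } else {
      print $_;                    # line ends before column 4
    }
  }
  $linenumber++;
}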

Untested and you'll have to replace "whatever line indicates the start of the "totals" section" with something sensible.

Last edited by Smiling Dragon; 01-29-2008 at 09:26 PM.. Reason: Fixed a few bugs
# 3  
Old 01-29-2008
I am currently trying to use the script you provided. However, my knowledge of running it against the file is rather slim, since most of the commands I have run in the past do not call a script. If you wouldn't mind providing some more information on how to run this against the file, I'd appreciate it. In the meantime I will continue messing with it to see if I can get anything. Thanks for the help. (Ignore the above.)

I seem to have gotten it to run OK, but I am currently getting these errors:

syntax error at testscript line 12, near "<>"
syntax error at testscript line 15, near "} else"
Execution of testscript aborted due to compilation errors.

Last edited by DerangedNick; 01-29-2008 at 08:10 PM.. Reason: Running
# 4  
Old 01-29-2008
Oops, my bad, have fixed it in the original post
(change the <> to !=)
# 5  
Old 01-29-2008
The script ran through the file and gave me an output, however it didn't remove the 4th column. Everything that was there originally still seems to be there, but it is scattered all over the place instead of in columns. Not really sure why.

What part of the request was the script addressing? I will keep playing with it for the time being to see if I can get different results. Thanks for the help

Looking over the file again, it does seem to have removed something, but I am not quite sure at which point yet. Will update once I know. I do know that a lot of the data I wanted removed is still in place, however.

OK, what it appears to be doing is this: once it removes the columns on the first line, it pulls the second line up onto the first line, then removes that same section from the second line, and so on down the entire document. The totals section appears to be intact, however it lost its formatting, so it is rather hard to tell since it is scattered.


Last edited by DerangedNick; 01-29-2008 at 08:25 PM.. Reason: Findings
# 6  
Old 01-29-2008
Yeah, I had some bugs :/ It should do everything you are after (I hope)

Fixed more bugs in the original:
Added \n to the print $1$2 line
Replaced the / symbol in the pagebreak calculation with % (modulo arithmetic)
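
To illustrate the `/` versus `%` fix: dividing a line number by the page size says which page the line is on, while the remainder says how far into that page it is, and only the remainder can be compared against the header size. A small Python sketch with hypothetical helper names, assuming zero-based line numbers and 51-line pages with 6 header lines:

```python
PAGE_SIZE = 51   # lines per page, from the original post
HEADER_SIZE = 6  # header lines per page, from the original post


def page_and_offset(line_number):
    """Split a zero-based line number into (page index, offset within that page)."""
    return line_number // PAGE_SIZE, line_number % PAGE_SIZE


def is_header_line(line_number):
    """True when the line is one of the first HEADER_SIZE lines of its page."""
    return line_number % PAGE_SIZE < HEADER_SIZE
```

For example, line 57 is on page index 1 at offset 6, which makes it the first data line of the second page.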

Edit: Whoops, didn't read your request right - I've been removing the first line of each page, not the first 6... Will fix...
# 7  
Old 01-29-2008
OK, this one looks a lot better. The totals are intact, however it needs to start cutting 1 character earlier (which I think I may be able to change).

The problem now, however, is that some lines have no data at the beginning of the line but do have data in column 4 (so columns 1, 2, 3, 5 and 6 are blank). That data is still being printed; it is just moved over into what was column 5.

The next issue is that it is cutting sections out of the header. I don't know if this can be fixed or not.

I will try to fix the width issue. I am not sure where to start on getting it to cut out the remaining column 4 data, though.

Thanks a lot for all the help.

I'd rather not remove the first 6 lines of each page if we can just ignore those lines somehow. Each page's header starts the same way (though each of the 6 header lines begins differently).

This is how the first 6 lines of each page look
Line 1: (this has a square control character) I imagine it is used as the page sep
Line 2: XXXXXX (always the same, different word obviously)
Line 3: ALL
Line 4: (blank line)
Line 5: ACCOUNT (4 blank spaces before this)
Line 6: --------- (4 blank spaces before this)

Line 7 is blank and data starts under that. That is how the header begins on each page. If it was possible to ignore that the entire way down that would be ideal.
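
Since every page starts with the same control character, the header could be recognised by content instead of by counting 51 lines per page. A hedged Python sketch, assuming the "square" character is a form feed (ASCII 12), which the post does not confirm, and that lines have had their newlines stripped; the function name is made up:

```python
FORM_FEED = "\f"   # assumption: the "square" control character is ASCII 12
HEADER_SIZE = 6    # header lines per page, from the post


def drop_column_4_outside_headers(lines):
    """Copy each page's 6 header lines unchanged; remove column 4 elsewhere."""
    result = []
    header_left = 0  # header lines still to copy on the current page
    for line in lines:
        if FORM_FEED in line:
            header_left = HEADER_SIZE  # a new page starts on this line
        if header_left > 0:
            result.append(line)
            header_left -= 1
        else:
            result.append(line[:58] + line[85:])  # drop characters 59-85
    return result
```

This keeps working even if some page turns out not to be exactly 51 lines long.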

Last error:

Name "main::HEADERSIZE" used only once: possible typo at testfile line 3.

Last edited by DerangedNick; 01-29-2008 at 08:53 PM.. Reason: other thoughts