#1
Removing Headers and a Column
I have a text file in Unix with a layout like this:

Column 1 - 1-12
Column 2 - 13-39
Column 3 - 40-58
Column 4 - 59-85
Column 5 - 86-120
Column 6 - 121-131

The file also has a header on the first 6 lines of each page. Each page is 51 lines long. So first I want to remove the header from each page and rewrite the file. I have figured that out for the most part, it just isn't very clean yet: all I have been able to do is overwrite the header with blank spaces, which just leaves a mess.

Here is the ultimate goal: remove the first 6 lines of each page and remove column 4 from the entire report, then rewrite the report with a new header and put all of the data, excluding column 4, in a new report.

Honestly, I haven't done anything quite this complex (at least it seems complex to me), so I am not sure where to get started. The most I have done is rewrite strings in a text file or remove specific words. Any help would be appreciated.

One thing I forgot to mention, and I am not even sure it is possible: at the very end of the report is a totals section. A specific word identifies where it starts, and I'd need to reprint everything from that point down at the end of the new report. Removing column 4 would obviously cut into that section, so it seems the only way to save it would be to write that one area into a new file and append it to the new report once it has been completed. I understand the concept; the matter is figuring out how to do it.

Thinking about it a little more, it would work just as well to leave the headers in place and simply skip the column 4 removal on those lines of each page. I do not mind the headers being there, and it may create printing problems if I do not keep each page 51 lines long. The same approach may also work for the totals at the very end: just tell it to ignore the last 12 lines of the report, somehow. Just a thought.

Thank you

Last edited by DerangedNick; 01-29-2008 at 04:04 PM. Reason: Forgot something
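A minimal sketch of the column cut described in this post, assuming the widths implied by the layout (12, 27, 19, 27, 35 and 11 characters, 131 in total). The helper name drop_column is hypothetical, and padding short lines and trimming trailing blanks are assumptions rather than anything the poster asked for.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Column widths taken from the layout in the post: 12+27+19+27+35+11 = 131.
my @widths = (12, 27, 19, 27, 35, 11);
my $total  = 0;
$total += $_ for @widths;

# Remove one column (1-based) from a fixed-width line.
# drop_column is a hypothetical helper name, not part of any library.
sub drop_column {
    my ($line, $col) = @_;
    my $start = 0;
    $start += $widths[$_] for 0 .. $col - 2;    # 0-based offset of the column
    my $width = $widths[$col - 1];
    chomp $line;
    # Assumption: pad short lines so the cut never runs past the end.
    $line = sprintf '%-*s', $total, $line;
    substr($line, $start, $width, '');          # 4-argument substr deletes in place
    $line =~ s/\s+$//;                          # trim the padding again
    return $line;
}

# Sample line in which every column is filled with its own digit.
my $sample = join '', map { my $d = $_ + 1; $d x $widths[$_] } 0 .. $#widths;
print length(drop_column($sample, 4)), "\n";    # 131 - 27 = 104
```

For column 4 the offset works out to 58 and the width to 27, which matches the 58/27 split used in the script posted in reply.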
#2
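Code:
#!/usr/bin/perl -w
use strict;

my $PAGESIZE   = 51;   # lines per page
my $HEADERSIZE = 6;    # header lines at the top of each page
my $linenumber = 0;
my $intotals   = 0;

while (<>) {
    $linenumber++;
    # Replace this pattern with the word that marks the start of the totals section.
    if (/^whatever line indicates the start of the "totals" section$/) {
        $intotals = 1;
    }
    if ($intotals) {
        # Totals section: print unchanged.
        print $_;
    } elsif (($linenumber - 1) % $PAGESIZE >= $HEADERSIZE) {
        # Data line (header lines are dropped): cut column 4, characters 59-85.
        if (/^(.{58}).{27}(.*)$/) {
            print "$1$2\n";
        } elsif (/^(.{58}).{1,27}$/) {
            # The line ends inside column 4.
            print "$1\n";
        } else {
            print $_;
        }
    }
}

Last edited by Smiling Dragon; 01-29-2008 at 06:26 PM. Reason: Fixed a few bugs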
#3
I am currently trying to use the script you provided. However, I know very little about running it against the file, since most of the commands I have run in the past do not call a script. If you wouldn't mind providing some more information on how to get this to run against the file, I'd appreciate it. In the meantime I will continue messing with it to see if I can get anything. Thanks for the help. (Ignore above)

I seem to have gotten it to run OK, but I am currently getting these errors:

syntax error at testscript line 12, near "<>"
syntax error at testscript line 15, near "} else"
Execution of testscript aborted due to compilation errors.

Last edited by DerangedNick; 01-29-2008 at 05:10 PM. Reason: Running
#4
Oops, my bad, I have fixed it in the original post (change the <> to !=).
#5
The script ran through the file and gave me an output, however it didn't remove the 4th column. Everything that was there originally seems to still be there, but it is scattered all over the place instead of in columns. Not really sure why.

What part of the request was the script addressing? I will keep playing with it for the time being to see if I can get different results. Thanks for the help.

Looking over the file again, it does seem to have removed something, but I am not sure at which point yet. Will update once I know. I do know that a lot of the data that I wanted removed is still in place, however.

::Update:: OK, what it appears to be doing is this: once it removes the columns on the first line, it pulls the second line up onto the first line, then removes that same section from the second line, and so on down the entire document. The totals section appears to be intact, however it lost its formatting, so it is rather hard to tell since it is scattered.

Thanks

Last edited by DerangedNick; 01-29-2008 at 05:25 PM. Reason: Findings
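The "second line pulled up onto the first" symptom is what one would expect when the regex captures are printed without a line ending: in Perl, "." does not match a newline and "$" matches just before a trailing one, so neither capture carries the "\n". A small sketch with a made-up sample line (58 characters, then 27, then 5) shows the effect; the variable names are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Made-up sample: 58 chars to keep, 27 chars of "column 4", 5 more to keep.
my $line = ('a' x 58) . ('b' x 27) . ('c' x 5) . "\n";

# In list context the match returns the two captures.
my ($head, $tail) = $line =~ /^(.{58}).{27}(.*)$/;

my $joined   = "$head$tail";      # no newline: the next line would be glued on
my $restored = "$head$tail\n";    # newline put back explicitly

print length($joined), " ", length($restored), "\n";    # prints "63 64"
```

Printing $joined for every line produces one long run of text, which looks like data scattered out of its columns.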
#6
Yeah, I had some bugs :/ It should do everything you are after (I hope).

Fixed more bugs in the original:
- Added \n to the print $1$2 line
- Replaced the / symbol in the page break calculation with % (modulo arithmetic)

Edit: Whoops, didn't read your request right. I've been removing the first line of each page, not the first 6... Will fix...
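For the page break calculation, a rough sketch of how modulo arithmetic picks out the first 6 lines of every 51-line page, assuming line numbers start at 1 and every page is exactly 51 lines long. The helper name is_header_line is hypothetical. Subtracting 1 before taking the remainder keeps line 51 on the first page and makes line 52 the first header line of the second.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $PAGESIZE   = 51;   # lines per page, from the original post
my $HEADERSIZE = 6;    # header lines per page, from the original post

# True when a 1-based line number falls in the header of its page.
# is_header_line is a hypothetical helper name.
sub is_header_line {
    my ($n) = @_;
    return (($n - 1) % $PAGESIZE) < $HEADERSIZE;
}

# Header lines among the first 60 lines of the report.
print join(',', grep { is_header_line($_) } 1 .. 60), "\n";
# prints 1,2,3,4,5,6,52,53,54,55,56,57
```

Division (/) gives a fractional page count rather than a position within the page, which is why the remainder operator is the one to use here.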
#7
OK, this one looks a lot better. The totals are intact, however it needs to start cutting off 1 character earlier (which I think I may be able to change).

The problem now is that some lines do not have data at the beginning of the line, but column 4 does have data in it (so columns 1, 2, 3, 5 and 6 are blank). This is still being printed; it is just moving over into what was column 5.

The next part is that it is cutting sections of the header out. I don't know if this can be fixed or not. I will try to fix the width issue. I am not sure where to start on getting it to cut out the other parts of column 4, though. Thanks a lot for all the help.

I'd rather not remove the first 6 lines if we can just ignore those lines somehow. Each header line always starts the same way on every page (though each of the 6 lines starts differently from the others). This is how the first 6 lines of each page look:

Line 1: (this has a square control character; I imagine it is used as the page separator)
Line 2: XXXXXX (always the same, different word obviously)
Line 3: ALL
Line 4: (blank line)
Line 5: ACCOUNT (4 blank spaces before this)
Line 6: --------- (4 blank spaces before this)

Line 7 is blank and data starts under that. That is how the header begins on each page. If it was possible to ignore that the entire way down, that would be ideal.

Last error:

Name "main::HEADERSIZE" used only once: possible typo at testfile line 3.

Last edited by DerangedNick; 01-29-2008 at 05:53 PM. Reason: other thoughts
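One way to approach both requests in this post is to pass header lines through untouched and to cut column 4 even when the line ends inside it. The sketch below assumes the 51-line page and 6-line header from the first post and the 58/27 split from the script; process_line is a hypothetical helper, and trimming trailing blanks is an extra assumption. Totals handling is left out here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my ($PAGESIZE, $HEADERSIZE) = (51, 6);

# Hypothetical helper: returns the line as it would appear in the new report.
# $n is the 1-based line number within the whole file.
sub process_line {
    my ($line, $n) = @_;
    # Header lines keep their original text, so each page stays 51 lines long.
    return $line if (($n - 1) % $PAGESIZE) < $HEADERSIZE;
    chomp(my $text = $line);
    # Delete up to 27 characters starting at offset 58 (columns 59-85).
    # Lines that end inside column 4 are cut too; shorter lines are untouched.
    substr($text, 58, 27, '') if length($text) > 58;
    $text =~ s/\s+$//;    # assumption: trailing blanks are not significant
    return "$text\n";
}

my $short = (' ' x 58) . "COL4DATA\n";      # only column 4 has data
print process_line($short, 10);             # prints an empty line
print process_line("    ACCOUNT\n", 56);    # 5th header line of page 2, unchanged
```

Because the header test depends only on the line's position in the page, it does not need to recognise the control character or the header words themselves.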