Best way to remove sections of text from a file


 
# 1  
Old 09-22-2008

Greetings! I found this forum via a Google search on "sed expressions".

I have a file that contains notices, all of which are the same length in lines. For example, the file might contain 15 notices of 26 lines each. I need some way to eliminate notices that have an "S" in a particular column of a particular line, let's say line 8, position 30: if it contains an "S" the notice is omitted; if it's an "L" it's kept.

I considered writing a Perl script for it, but I thought there might be an easy shell script or sed expression that could do it. My regex is not all that it's cracked up to be, but I can work with it.

Any examples or suggestions would be most appreciated.

Thank you
Carl
# 2  
Old 09-22-2008
Without more details, perhaps this will give you a start

Code:
> cat file1
123456789012345678901234567890123456789S
01abcdefghijklmnopqrstuvwxyzabcdefghijkS
02abcdefghijklmnopqrstuvwxyzabcdefghijkS
03abcdefghijklmnopqrstuvwxyzabcdefghijkS
04abcdefghijklmnopqrstuvwxyzabcdefghijkS
05abcdefghijklmnopqrstuvwxyzabcdefghijkS
06abcdefghijklmnopqrstuvwxyzabcdefghijkS
07abcdefghijklmnopqrstuvwxyzabcdefghijkL
08abcdefghijklmnopqrstuvwxyzabcdefghijkS
09abcdefghijklmnopqrstuvwxyzabcdefghijkS
10abcdefghijklmnopqrstuvwxyzabcdefghijkT
11abcdefghijklmnopqrstuvwxyzabcdefghijkT
12abcdefghijklmnopqrstuvwxyzabcdefghijkS
13abcdefghijklmnopqrstuvwxyzabcdefghijkL
14abcdefghijklmnopqrstuvwxyzabcdefghijkS
15abcdefghijklmnopqrstuvwxyzabcdefghijkS
> cat file1 | grep "[0-9a-z]\{39\}S"
123456789012345678901234567890123456789S
01abcdefghijklmnopqrstuvwxyzabcdefghijkS
02abcdefghijklmnopqrstuvwxyzabcdefghijkS
03abcdefghijklmnopqrstuvwxyzabcdefghijkS
04abcdefghijklmnopqrstuvwxyzabcdefghijkS
05abcdefghijklmnopqrstuvwxyzabcdefghijkS
06abcdefghijklmnopqrstuvwxyzabcdefghijkS
08abcdefghijklmnopqrstuvwxyzabcdefghijkS
09abcdefghijklmnopqrstuvwxyzabcdefghijkS
12abcdefghijklmnopqrstuvwxyzabcdefghijkS
14abcdefghijklmnopqrstuvwxyzabcdefghijkS
15abcdefghijklmnopqrstuvwxyzabcdefghijkS
> cat file1 | grep "[0-9a-z]\{39\}T"
10abcdefghijklmnopqrstuvwxyzabcdefghijkT
11abcdefghijklmnopqrstuvwxyzabcdefghijkT
> cat file1 | grep "[0-9a-z]\{39\}L"
07abcdefghijklmnopqrstuvwxyzabcdefghijkL
13abcdefghijklmnopqrstuvwxyzabcdefghijkL
>
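
The grep above selects single lines by column, but the question is about dropping whole notices. A minimal sketch of that idea follows, assuming every notice really has the same number of lines. The function name filter_notices and the values passed for record length, flag line, and flag column are illustrative assumptions, not taken from the real file.

```shell
#!/bin/sh
# Sketch: drop fixed-length notices whose flag character is "S".
# filter_notices is a hypothetical helper; the record length, flag line and
# flag column are assumptions to be adjusted to the real layout.

filter_notices() {
    # $1 = input file, $2 = lines per notice, $3 = line within notice, $4 = column
    awk -v reclen="$2" -v flagline="$3" -v flagcol="$4" '
        {
            pos = (NR - 1) % reclen + 1        # line number within the current notice
            buf[pos] = $0                      # hold the notice until it is complete
            if (pos == flagline)
                flag = substr($0, flagcol, 1)  # character that decides keep or drop
            if (pos == reclen) {               # last line of the notice reached
                if (flag != "S")
                    for (i = 1; i <= reclen; i++) print buf[i]
                flag = ""
            }
        }
    ' "$1"
}

# Tiny demo: three 3-line notices, flag in line 2, column 5.
demo=$(mktemp)
printf '%s\n' 'head1' 'abcdS' 'tail1' \
              'head2' 'abcdL' 'tail2' \
              'head3' 'abcdS' 'tail3' > "$demo"
kept=$(filter_notices "$demo" 3 2 5)
printf '%s\n' "$kept"
rm -f "$demo"
```

For the layout described in the first post the call would be something like filter_notices file1 26 8 30. A trailing notice that is shorter than the record length is silently dropped by this sketch, so it only suits files where the line count per notice never varies.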

# 3  
Old 09-22-2008
I should have added that the text is not the same on each line; each notice contains 26 lines, and not all of them are the same. For example:
Code:
                         WE HAVE TRANSFERRED FUNDS FROM YOUR OVERDRAFT PROTECTION
                         SOURCE(S) AS SHOWN BELOW TO PAY THE FOLLOWING TRANSACTION(S):
                         PLEASE RECORD THE TOTAL TRANSFER AMOUNT AS A DEPOSIT IN YOUR
                         DRAFT REGISTER.



    12345-01 JOHN R DOE                     TRAN DATE:09/19/08 EFF DATE:09/19/08 BR:   1

                                         OVERDRAFT TRANSFER TRANSACTION  DRAFT(S) PAID
                                         TYPE  ACCOUNT    AMOUNT    FEE  NUMBER   AMOUNT

                                         S    12345-00    393.28    .00       0   484.00

                                         TOTALS:       393.28    .00







       JOHN R DOE
       123 ANY ST
       ANYWHERE         TX 12345-1234







                         WE HAVE TRANSFERRED FUNDS FROM YOUR OVERDRAFT PROTECTION
                         SOURCE(S) AS SHOWN BELOW TO PAY THE FOLLOWING TRANSACTION(S):
                         PLEASE RECORD THE TOTAL TRANSFER AMOUNT AS A DEPOSIT IN YOUR
                         DRAFT REGISTER.



    12345-01 JOHN R DOE                     TRAN DATE:09/19/08 EFF DATE:09/19/08 BR:   1

                                         OVERDRAFT TRANSFER TRANSACTION  DRAFT(S) PAID
                                         TYPE  ACCOUNT    AMOUNT    FEE  NUMBER   AMOUNT

                                         S    12345-00    393.28    .00       0   484.00

                                         TOTALS:        17.56    .00







       JOHN R DOE
       123 ANY ST
       ANYWHERE         TX 12345-1234







                         WE HAVE TRANSFERRED FUNDS FROM YOUR OVERDRAFT PROTECTION
                         SOURCE(S) AS SHOWN BELOW TO PAY THE FOLLOWING TRANSACTION(S):
                         PLEASE RECORD THE TOTAL TRANSFER AMOUNT AS A DEPOSIT IN YOUR
                         DRAFT REGISTER.



    12345-01 JOHN R DOE                     TRAN DATE:09/19/08 EFF DATE:09/19/08 BR:   1

                                         OVERDRAFT TRANSFER TRANSACTION  DRAFT(S) PAID
                                         TYPE  ACCOUNT    AMOUNT    FEE  NUMBER   AMOUNT

                                         S    12345-00    393.28    .00       0   484.00

                                         TOTALS:        60.54    .00







       JOHN R DOE
       123 ANY ST
       ANYWHERE         TX 12345-1234







                         WE HAVE TRANSFERRED FUNDS FROM YOUR OVERDRAFT PROTECTION
                         SOURCE(S) AS SHOWN BELOW TO PAY THE FOLLOWING TRANSACTION(S):
                         PLEASE RECORD THE TOTAL TRANSFER AMOUNT AS A DEPOSIT IN YOUR
                         DRAFT REGISTER.



    12345-01 JOHN R DOE                     TRAN DATE:09/19/08 EFF DATE:09/19/08 BR:   1

                                         OVERDRAFT TRANSFER TRANSACTION  DRAFT(S) PAID
                                         TYPE  ACCOUNT    AMOUNT    FEE  NUMBER   AMOUNT

                                         L    12345-20    393.28    .00       0   484.00

                                         TOTALS:        32.24    .00







       JOHN R DOE
       123 ANY ST
       ANYWHERE         TX 12345-1234
                                        






                         WE HAVE TRANSFERRED FUNDS FROM YOUR OVERDRAFT PROTECTION
                         SOURCE(S) AS SHOWN BELOW TO PAY THE FOLLOWING TRANSACTION(S):
                         PLEASE RECORD THE TOTAL TRANSFER AMOUNT AS A DEPOSIT IN YOUR
                         DRAFT REGISTER.



    12345-01 JOHN R DOE                     TRAN DATE:09/19/08 EFF DATE:09/19/08 BR:   1

                                         OVERDRAFT TRANSFER TRANSACTION  DRAFT(S) PAID
                                         TYPE  ACCOUNT    AMOUNT    FEE  NUMBER   AMOUNT

                                         S    12345-00    393.28    .00       0   484.00

                                         TOTALS:        49.58    .00







       JOHN R DOE
       123 ANY ST
       ANYWHERE         TX 12345-1234







                         WE HAVE TRANSFERRED FUNDS FROM YOUR OVERDRAFT PROTECTION
                         SOURCE(S) AS SHOWN BELOW TO PAY THE FOLLOWING TRANSACTION(S):
                         PLEASE RECORD THE TOTAL TRANSFER AMOUNT AS A DEPOSIT IN YOUR
                         DRAFT REGISTER.



    12345-01 JOHN R DOE                     TRAN DATE:09/19/08 EFF DATE:09/19/08 BR:   1

                                         OVERDRAFT TRANSFER TRANSACTION  DRAFT(S) PAID
                                         TYPE  ACCOUNT    AMOUNT    FEE  NUMBER   AMOUNT

                                         L    12345-01    393.28    .00       0   484.00

                                         TOTALS:       454.34    .00







       JOHN R DOE
       123 ANY ST
       ANYWHERE         TX 12345-1234

I want to keep the notices with "L" and discard the ones with "S". Hope this helps. Thank you!
# 4  
Old 09-22-2008
A convoluted approach, but it appears to work

Determine the records (account #'s) you want to keep
Code:
> cut -c42-54 file1 | grep "^[SL]    [0-9]\{5\}-[0-9]\{2\}" | grep "^L" >L_FILES
> cat L_FILES
L    12345-20
L    12345-01

Now, a strange set of commands piped together
Code:
> cat file1 | sed "s/                         WE HAVE/~                         WE HAVE/" | tr "\n" "|" | tr "~" "\n" | egrep -f L_FILES | tr "\n" "~" | tr "|" "\n" | tr -d "~"

And the output looks right
Code:
                         WE HAVE TRANSFERRED FUNDS FROM YOUR OVERDRAFT PROTECTION
                         SOURCE(S) AS SHOWN BELOW TO PAY THE FOLLOWING TRANSACTION(S):
                         PLEASE RECORD THE TOTAL TRANSFER AMOUNT AS A DEPOSIT IN YOUR
                         DRAFT REGISTER.



    12345-01 JOHN R DOE                     TRAN DATE:09/19/08 EFF DATE:09/19/08 BR:   1

                                         OVERDRAFT TRANSFER TRANSACTION  DRAFT(S) PAID
                                         TYPE  ACCOUNT    AMOUNT    FEE  NUMBER   AMOUNT

                                         L    12345-20    393.28    .00       0   484.00

                                         TOTALS:        32.24    .00







       JOHN R DOE
       123 ANY ST
       ANYWHERE         TX 12345-1234
                                        






                         WE HAVE TRANSFERRED FUNDS FROM YOUR OVERDRAFT PROTECTION
                         SOURCE(S) AS SHOWN BELOW TO PAY THE FOLLOWING TRANSACTION(S):
                         PLEASE RECORD THE TOTAL TRANSFER AMOUNT AS A DEPOSIT IN YOUR
                         DRAFT REGISTER.



    12345-01 JOHN R DOE                     TRAN DATE:09/19/08 EFF DATE:09/19/08 BR:   1

                                         OVERDRAFT TRANSFER TRANSACTION  DRAFT(S) PAID
                                         TYPE  ACCOUNT    AMOUNT    FEE  NUMBER   AMOUNT

                                         L    12345-01    393.28    .00       0   484.00

                                         TOTALS:       454.34    .00







       JOHN R DOE
       123 ANY ST
       ANYWHERE         TX 12345-1234

Although I have to believe there is an awk command to verify the account #. I will need to think about that, or perhaps someone else here who is better at awk could help.
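
One possible awk version, offered as a sketch rather than a tested replacement for the pipeline above: it treats each "WE HAVE TRANSFERRED FUNDS" line as the start of a notice and keeps the notice if any detail row has an L in the TYPE column. The function name keep_l_notices is hypothetical, and column 42 is assumed from the cut -c42-54 used earlier.

```shell
#!/bin/sh
# Sketch: keep a notice if any detail row has type "L" in the assumed column.
# keep_l_notices is a hypothetical helper; typecol=42 is taken from the
# cut -c42-54 in the earlier command and may differ in the real file.

keep_l_notices() {
    # $1 = input file
    awk -v typecol=42 '
        function emit() {
            if (keep) printf "%s", rec         # print the buffered notice unchanged
            rec = ""; keep = 0
        }
        /^ *WE HAVE TRANSFERRED FUNDS/ { emit() }   # header line starts a new notice
        {
            rec = rec $0 "\n"
            # detail row: type letter at typecol, then spaces and an account number
            if (substr($0, typecol) ~ /^L +[0-9]+-[0-9]+/) keep = 1
        }
        END { emit() }                         # do not forget the last notice
    ' "$1"
}

# Tiny demo: one S-only notice and one notice with both an S and an L row.
demo=$(mktemp)
pad=$(printf '%41s' '')                        # 41 spaces put the type in column 42
{
    printf '%s\n' '   WE HAVE TRANSFERRED FUNDS FROM YOUR OVERDRAFT PROTECTION'
    printf '%sS    12345-00    393.28\n' "$pad"
    printf '%s\n' '   WE HAVE TRANSFERRED FUNDS FROM YOUR OVERDRAFT PROTECTION'
    printf '%sS    12345-00     90.45\n' "$pad"
    printf '%sL    12345-06    142.34\n' "$pad"
} > "$demo"
result=$(keep_l_notices "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```

Because the notice is delimited by its header line instead of a fixed line count, this does not need the L_FILES step and does not care how many blank lines a notice has.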
# 5  
Old 09-22-2008
That looks right, I will test it on an actual file and let you know.
# 6  
Old 09-22-2008
Oh bugger, I just discovered something that may throw a kink in things. Here is one that has both an S and an L; in this case, since it has an L record in it, we want to keep it :-(

Code:
                         WE HAVE TRANSFERRED FUNDS FROM YOUR OVERDRAFT PROTECTION
                         SOURCE(S) AS SHOWN BELOW TO PAY THE FOLLOWING TRANSACTION(S):
                         PLEASE RECORD THE TOTAL TRANSFER AMOUNT AS A DEPOSIT IN YOUR
                         DRAFT REGISTER.



    12345-01 JOHN R DOE                       TRAN DATE:09/22/08 EFF DATE:09/22/08 BR:   1

                                         OVERDRAFT TRANSFER TRANSACTION  DRAFT(S) PAID
                                         TYPE  ACCOUNT    AMOUNT    FEE  NUMBER   AMOUNT

                                         S    12345-00     90.45    .00       0   232.79
                                         L    12345-06    142.34    .00

                                         TOTALS:       232.79    .00






       JOHN R DOE
       123 ANYWHERE AVE
       HERETHERE       WY 12345-1234

# 7  
Old 09-22-2008
I think you are perhaps still good with two account #'s

The egrep will still 'pass' since an account-to-keep will match.
Test and update on status.
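
A small sketch of why that holds, using made-up one-line records in place of the real flattened notices: once a notice is joined into a single line, one matching pattern anywhere in it is enough. The temporary pattern file stands in for L_FILES, and the record text is illustrative only.

```shell
#!/bin/sh
# Sketch: a flattened notice with both an S row and an L row still matches.
patterns=$(mktemp)
printf '%s\n' 'L    12345-06' > "$patterns"    # stands in for L_FILES

# Each notice joined into a single line with "|", as the tr step does.
mixed='HEADER|S    12345-00     90.45|L    12345-06    142.34|TOTALS'
s_only='HEADER|S    12345-00    393.28|TOTALS'

# Count how many of the two flattened notices match any pattern in the file.
matched=$(printf '%s\n' "$mixed" "$s_only" | grep -E -c -f "$patterns")
echo "$matched"
rm -f "$patterns"
```

Note that the patterns are unanchored, so a pattern matches wherever that text appears in the flattened notice, not only in the TYPE column.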