Removing blocks from a file


 
# 1  
Old 12-02-2009
Removing blocks from a file

I have a file like the one below. Records are separated by lines containing a single >
Between the separators are lines consisting of 3 numeric values separated by spaces.

For each block between two > signs, I need to read the first number on each line.

Then I take the first number on the first line after the > sign and the first number on the last line before the next > sign, and check whether the absolute difference is greater than a certain value. If it is, the block is removed from the file.

For example, for the following block, I check whether ABS(12.9306 - 10) > 38

If it is greater than 38, the block has to be removed.

Can someone help, please?

>
12.9306 0 5.80696
12.722 0.138373 5.31509
12.3915 0.298905 4.65587
12.0588 0.409492 4.04942
11.7234 0.473844 3.46864
11.3851 0.492713 2.89112
11.0435 0.464082 2.29359
10.6984 0.382409 1.6451
10.3501 0.236171 0.891863
10 0 0
>
# 2  
Old 12-02-2009
Quote:
Originally Posted by kristinu
...
Then take the first number on the first line after the > sign and the first number on the last line before the next > sign. Check whether the absolute difference is greater than a certain value. If it is, the block is removed from the file.

For example, in the following block, I check whether ABS(12.9306 - 10) > 38

If it is greater than 38, the block has to be removed
...
>
12.9306 0 5.80696
12.722 0.138373 5.31509
12.3915 0.298905 4.65587
12.0588 0.409492 4.04942
11.7234 0.473844 3.46864
11.3851 0.492713 2.89112
11.0435 0.464082 2.29359
10.6984 0.382409 1.6451
10.3501 0.236171 0.891863
10 0 0
>
Here's one way to do it with Perl:

Code:
$ 
$ cat -n f8
     1  >  
     2  12.9306 0 5.80696
     3  12.722 0.138373 5.31509
     4  12.3915 0.298905 4.65587
     5  12.0588 0.409492 4.04942
     6  11.7234 0.473844 3.46864
     7  11.3851 0.492713 2.89112
     8  11.0435 0.464082 2.29359
     9  10.6984 0.382409 1.6451 
    10  10.3501 0.236171 0.891863
    11  10 0 0                   
    12  >                        
    13  52.9306 0 5.80696        
    14  12.722 0.138373 5.31509  
    15  10.3501 0.236171 0.891863
    16  10 0 0                   
    17  >                        
    18  12.9306 0 5.80696        
    19  10.3501 0.236171 0.891863
    20  10 0 0                   
    21  50.9306 0 5.80696        
    22  >                        
    23  12.9306 0 5.80696        
    24  12.9306 0 5.80696        
$                                
$
$ ##
$ perl -lne 'BEGIN{$lim=38; undef $/}
>            while(/^(>\n([\d.]+)[^>]*\n([\d.]+) ([\d.]+) ([\d.]+))/msg) {
>              print $1 if abs($2-$3) <= $lim;
>            }' f8
>
12.9306 0 5.80696
12.722 0.138373 5.31509
12.3915 0.298905 4.65587
12.0588 0.409492 4.04942
11.7234 0.473844 3.46864
11.3851 0.492713 2.89112
11.0435 0.464082 2.29359
10.6984 0.382409 1.6451
10.3501 0.236171 0.891863
10 0 0
>
12.9306 0 5.80696
10.3501 0.236171 0.891863
10 0 0
50.9306 0 5.80696
>
12.9306 0 5.80696
12.9306 0 5.80696
$
$

tyler_durden
# 3  
Old 12-02-2009
Using awk

I was wondering how I could do this in awk.
# 4  
Old 12-02-2009
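This is close (but no cigar!): it prints the blocks whose difference exceeds 38 instead of removing them, and it does not take the absolute value.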

Code:
$ cat file1
>
99 0 5.80696
12.722 0.138373 5.31509
12.3915 0.298905 4.65587
12.0588 0.409492 4.04942
11.7234 0.473844 3.46864
11.3851 0.492713 2.89112
11.0435 0.464082 2.29359
10.6984 0.382409 1.6451
10.3501 0.236171 0.891863
10 0 0
>
82.9306 0 5.80696
12.722 0.138373 5.31509
12.3915 0.298905 4.65587
12.0588 0.409492 4.04942
11.7234 0.473844 3.46864
11.3851 0.492713 2.89112
11.0435 0.464082 2.29359
10.6984 0.382409 1.6451
10.3501 0.236171 0.891863
10 0 0
>
60.9306 0 5.80696
12.722 0.138373 5.31509
12.3915 0.298905 4.65587
12.0588 0.409492 4.04942
11.7234 0.473844 3.46864
11.3851 0.492713 2.89112
11.0435 0.464082 2.29359
10.6984 0.382409 1.6451
10.3501 0.236171 0.891863
10 0 0
>
39.9306 0 5.80696
12.722 0.138373 5.31509
12.3915 0.298905 4.65587
12.0588 0.409492 4.04942
11.7234 0.473844 3.46864
11.3851 0.492713 2.89112
11.0435 0.464082 2.29359
10.6984 0.382409 1.6451
10.3501 0.236171 0.891863
10 0 0
>
$ awk -v RS=">" -v ORS=">" '(NF > 2) && (($1 - $(NF-2) > 38))' file1

99 0 5.80696
12.722 0.138373 5.31509
12.3915 0.298905 4.65587
12.0588 0.409492 4.04942
11.7234 0.473844 3.46864
11.3851 0.492713 2.89112
11.0435 0.464082 2.29359
10.6984 0.382409 1.6451
10.3501 0.236171 0.891863
10 0 0
>
82.9306 0 5.80696
12.722 0.138373 5.31509
12.3915 0.298905 4.65587
12.0588 0.409492 4.04942
11.7234 0.473844 3.46864
11.3851 0.492713 2.89112
11.0435 0.464082 2.29359
10.6984 0.382409 1.6451
10.3501 0.236171 0.891863
10 0 0
>
60.9306 0 5.80696
12.722 0.138373 5.31509
12.3915 0.298905 4.65587
12.0588 0.409492 4.04942
11.7234 0.473844 3.46864
11.3851 0.492713 2.89112
11.0435 0.464082 2.29359
10.6984 0.382409 1.6451
10.3501 0.236171 0.891863
10 0 0
>
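
One possible way to close the gap in the one-liner above, offered as a sketch and not a tested drop-in: invert the comparison so that blocks over the limit are dropped, and take the absolute value of the difference. The function name keep_small_blocks and its two parameters are made up for illustration. The sketch assumes every data line has exactly three fields (so $(NF-2) is the first number on the last line) and that the file ends with a > line.

```shell
keep_small_blocks() {
    # $1 = limit, $2 = input file (hypothetical parameters for this sketch)
    awk -v lim="$1" '
        function abs(v) { return v < 0 ? -v : v }
        # Split records on ">" and re-emit the separator by hand.
        BEGIN { RS = ">"; ORS = "" }
        # $1 is the first number of the block, $(NF-2) the first number
        # on its last line (assumes three fields per line).
        NF >= 3 && abs($1 - $(NF - 2)) <= lim { print ">" $0 }
        # Assumption: the input ended with a ">" line, so write one back.
        END { print ">\n" }
    ' "$2"
}
```

Called as keep_small_blocks 38 file1 on the file1 shown above, only the block starting with 39.9306 should remain, since the other three differ from 10 by more than 38.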

# 5  
Old 12-02-2009
Code:
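awk  'function abs(val) { return val > 0 ? val : -val }
BEGIN { RS = ORS = ">" }
# $1 is the first number of the block; $28 is the first number on its
# 10th line (10 lines x 3 fields). Keep the block unless the difference exceeds 38.
{ if (abs($1 - $28) <= 38) { print } }' urfile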

# 6  
Old 12-02-2009
$28 is not good.

From the list I need to take the first column, then subtract the first value from the last and take the absolute value. If the result is greater than 38, I remove the block.

>
99 0 5.80696
12.722 0.138373 5.31509
12.3915 0.298905 4.65587
12.0588 0.409492 4.04942
11.7234 0.473844 3.46864
11.3851 0.492713 2.89112
11.0435 0.464082 2.29359
10.6984 0.382409 1.6451
10.3501 0.236171 0.891863
10 0 0
>
# 7  
Old 12-02-2009
There are 30 numbers between > and > (10 lines of 3 numbers each). $28 means the 28th number, i.e. the first number on the last line (not to be confused with the number 38).


http://en.wikipedia.org/wiki/Ordinal_number
http://en.wikipedia.org/wiki/Cardinal_number
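
The fixed index $28 only fits blocks of exactly ten three-field lines. For blocks of varying length, a line-by-line sketch along these lines might be easier to adapt. It is an illustration under assumptions, not something posted in the thread: the function name drop_wide_blocks and its parameters are hypothetical, separators are assumed to be lines starting with >, and a separator with no data lines after it (such as a trailing >) is passed through unchanged.

```shell
drop_wide_blocks() {
    # $1 = limit, $2 = input file (hypothetical parameters for this sketch)
    awk -v lim="$1" '
        function abs(x) { return x < 0 ? -x : x }
        # Print the buffered block unless |first - last| exceeds lim.
        # A bare separator with no data lines (n == 0) is always printed.
        function emit() {
            if (buf != "" && (n == 0 || abs(first - last) <= lim))
                printf "%s", buf
            buf = ""; n = 0
        }
        # A separator line ends the previous block and starts a new buffer.
        /^>/ { emit(); buf = $0 "\n"; next }
        # Track column 1 of the first and the most recent data line.
        NF   { if (n++ == 0) first = $1; last = $1 }
             { buf = buf $0 "\n" }
        END  { emit() }
    ' "$2"
}
```

Usage would be something like drop_wide_blocks 38 urfile > urfile.new. Because it tracks the first column per line, it does not depend on how many lines a block has or how many fields follow the first.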