How to get data only inside polygon created by points which is part of whole data from file?


 
# 1  
Old 04-08-2010

Hi, please help me out. I have a huge set of data stored in a file. Two columns of this file hold the latitude and longitude of points in a region. I have a program that asks for a number of points and then asks the user to enter a latitude and longitude for each of them, all within the same region. The first and last points must have the same latitude and longitude so that the points form a closed polygon somewhere inside the region covered by the file.
Now I want a program, using awk or other Unix tools, that takes the data from this file and the points entered by the user, compares them, and stores in another file only the data that lies inside the polygon.
In short, I want a program that writes only the polygon region's data to another file, not the entire region of the file.
Please help me out.
For example:
A few lines of the file are as follows.
a.dat:
Code:
   BDA 1908  8 20  9 53  0.00  32.0000N  89.0000E  60.0   0 0.00   0 0.00 0.00  6.60   0  7.00  7.00   0 NULL
   BDA 1915 12  3  2 39 19.00  29.5000N  91.5000E  60.0   0 0.00   0 0.00 0.00  6.70   0  7.10  7.10   0 NULL
   SIG 1927  5 22  0  0  0.00  36.0000N  96.0000E   0.0   0 0.00   0 0.00 0.00  7.50   0  8.00  8.00   0 NULL
   BDA 1934 12 15  1 57 37.00  31.3000N  89.3000E  60.0   0 0.00   0 0.00 0.00  6.70   0  7.10  7.10   0 NULL
   SIG 1937  1  7  0  0  0.00  35.5000N  98.0000E   0.0   0 0.00   0 0.00 0.00  7.10   0  7.60  7.60   0 NULL
   LEE 1937  1  7 13 20 41.00  35.5000N  97.6000E   0.0   0 0.00   0 0.00 0.00  7.10   0  7.60  7.60  10 NULL
   SIG 1947  7 29  0  0  0.00  28.5000N  94.0000E  60.0   0 0.00   0 0.00 0.00  7.30   0  7.80  7.80   0 NULL
   BDA 1947  7 29 13 43 22.00  28.5000N  94.0000E  60.0   0 0.00   0 0.00 0.00  7.40   0  7.90  7.90   0 NULL
   G-R 1950  8 15 14  9 30.00  28.5000N  96.5000E  25.0   0 0.00   0 0.00 0.00  8.10   0  8.70  8.70  10 NULL
   ISS 1950  9 13 11  7 27.00  27.5000N  96.4000E   0.0   0 0.00   0 0.00 0.00  6.60   0  7.00  7.00   7 NULL
   G-R 1951 11 18  9 35 47.00  30.5000N  91.0000E  25.0   0 0.00   0 0.00 0.00  7.40   0  7.90  7.90   4 NULL
   LEE 1951 11 18  9 35 50.00  31.1000N  91.4000E   0.0   0 0.00   0 0.00 0.00  7.50   0  8.00  8.00   0 NULL
   G-R 1952  8 17 16  2  7.00  30.5000N  91.5000E   0.0   0 0.00   0 0.00 0.00  7.00   0  7.50  7.50   9 NULL
   BDA 1963  4 19  7 35 24.00  35.8000N  96.9000E  33.0   0 0.00   0 0.00 0.00  6.60   0  7.00  7.00   0 NULL
   PDE 2001 11 14  9 26 10.01  35.9500N  90.5400E  10.0   0 0.00   0 0.00 7.80  8.30   0  0.00  8.30   0 NULL
Here the 8th and 9th columns are the latitude and longitude.

The small program I have asks:

Enter no of points as : 7
The user then enters the latitude and longitude of these 7 points in a for loop. For example, the 7 points I am entering are:
Code:
29.45, 89.43
32.47, 90.98
27.25, 95.63
27.29, 98.27
36.74, 96.32
31.90, 87.67
29.45, 89.43
These points form a polygon in the region covered by the file. Now I want a program that stores only the records whose latitude and longitude fall inside the polygon in another file, say b.dat.

For the above example, the output file produced from a.dat and these points is
b.dat:
Code:
   BDA 1908  8 20  9 53  0.00  32.0000N  89.0000E  60.0   0 0.00   0 0.00 0.00  6.60   0  7.00  7.00   0 NULL
   SIG 1927  5 22  0  0  0.00  36.0000N  96.0000E   0.0   0 0.00   0 0.00 0.00  7.50   0  8.00  8.00   0 NULL
   BDA 1934 12 15  1 57 37.00  31.3000N  89.3000E  60.0   0 0.00   0 0.00 0.00  6.70   0  7.10  7.10   0 NULL
   G-R 1950  8 15 14  9 30.00  28.5000N  96.5000E  25.0   0 0.00   0 0.00 0.00  8.10   0  8.70  8.70  10 NULL
   ISS 1950  9 13 11  7 27.00  27.5000N  96.4000E   0.0   0 0.00   0 0.00 0.00  6.60   0  7.00  7.00   7 NULL
Please help me out.
I have a region full of data, from latitude 0 to 40 and longitude 60 to 100.
I ask for a few points, and however many points are entered, their latitudes and longitudes (x, y) form a closed loop, or polygon, inside the region 0 to 40 and 60 to 100. What I need is only the data that lies inside that polygon. Please help me out with a simple program.
# 2  
Old 04-08-2010
I don't have a simple program for you. It's not even a simple problem. Furthermore, just considering this a polygon may not be strictly correct, since we're concerned with points on a sphere, not a plane; if you just consider latitude as "Y" and longitude as "X", your polygon would need curved lines. You'd have to map it into a Mercator projection or something to get a correct representation. But if a naive solution will do, you could do it in a similar way to how they draw polygons. Please excuse my awful ASCII art:

Code:
        /\      /--\
......./**\..../****\....
      /    \  /      \
      |     \/       /
      |             /
      \            /
       \          /
        \---------

So for a given data point, you take the edges that cross its latitude and loop through them left to right. The first and second crossings bound one span inside the polygon, the third and fourth bound the next, and so on until you run out of edges; the data point is inside if it falls within any of the spans you'd be "drawing" in. Note that the left-to-right order of the edges can change depending on the latitude of the data point in question, so you have to keep re-sorting the list!
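
The span idea can also be expressed without sorting: shoot a horizontal line from the data point and count how many edges it crosses, and an odd count means the point is inside. Below is a minimal sketch in plain Perl, assuming flat X/Y coordinates (it gives the same answer for latitude/longitude pairs as long as the point and the polygon use the same order). The name point_in_polygon is made up for illustration, and this is only one possible way to code the test:

```perl
use strict;
use warnings;

# Even-odd (crossing number) test. $pt is [x, y]; @poly is a list of
# [x, y] vertices. The polygon may or may not repeat its first point.
sub point_in_polygon {
    my ($pt, @poly) = @_;
    my ($x, $y) = @$pt;
    my $inside = 0;

    # Walk every edge (previous vertex -> current vertex), wrapping around.
    for my $i (0 .. $#poly) {
        my ($x1, $y1) = @{ $poly[$i - 1] };
        my ($x2, $y2) = @{ $poly[$i] };

        # Does this edge straddle the horizontal line through the point?
        next unless ($y1 > $y) != ($y2 > $y);

        # X coordinate where the edge crosses that horizontal line.
        my $cross_x = $x1 + ($y - $y1) * ($x2 - $x1) / ($y2 - $y1);

        # Each crossing to the right of the point toggles inside/outside.
        $inside = !$inside if $x < $cross_x;
    }
    return $inside ? 1 : 0;
}

my @square = ([0, 0], [10, 0], [10, 10], [0, 10]);
print point_in_polygon([5, 5],  @square) ? "inside\n" : "outside\n";
print point_in_polygon([15, 5], @square) ? "inside\n" : "outside\n";
```

For the square above this prints "inside" and then "outside". Points lying exactly on an edge are a boundary case that this sketch does not try to settle either way.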

To speed this up you could first check the data points against a bounding box of [min, min]-[max,max] which your polygon fits inside. If it's outside that box you can skip it without the bother of re-sorting your list of edges and trawling them one by one.
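
A rough sketch of that bounding-box shortcut in Perl, using the sample polygon from the first post as [latitude, longitude] pairs. The helper names bounding_box and in_bounding_box are invented for this example, and the test point [10.00, 70.00] is a made-up value that is not taken from the data file:

```perl
use strict;
use warnings;
use List::Util qw(min max);

# Return (min_x, min_y, max_x, max_y) for a list of [x, y] vertices.
sub bounding_box {
    my @poly = @_;
    my @xs = map { $_->[0] } @poly;
    my @ys = map { $_->[1] } @poly;
    return (min(@xs), min(@ys), max(@xs), max(@ys));
}

# True if the point lies within the box, i.e. it still needs the full test.
sub in_bounding_box {
    my ($pt, $min_x, $min_y, $max_x, $max_y) = @_;
    return $pt->[0] >= $min_x && $pt->[0] <= $max_x
        && $pt->[1] >= $min_y && $pt->[1] <= $max_y;
}

my @poly = ([29.45, 89.43], [32.47, 90.98], [27.25, 95.63], [27.29, 98.27],
            [36.74, 96.32], [31.90, 87.67], [29.45, 89.43]);
my @box = bounding_box(@poly);

printf "box: [%.2f, %.2f]-[%.2f, %.2f]\n", @box;
print "skip 10.00, 70.00\n" unless in_bounding_box([10.00, 70.00], @box);
```

The box is computed once per polygon, so each data point costs only four comparisons before the more expensive edge walk is even considered.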

Last edited by Corona688; 04-08-2010 at 01:52 PM..
# 3  
Old 04-09-2010
Yes, I agree with you. It's not a simple problem.
Consider them as just X and Y coordinates and tell me.
That's what I am asking: how do I loop through the list of my edges? I am not getting it. At least can you tell me that part of the code?
# 4  
Old 04-09-2010
Quote:
Originally Posted by reva
Yes, I agree with you. It's not a simple problem.
Consider them as just X and Y coordinates and tell me.
Just did, I thought... what do you want to know?
Quote:
That's what I am asking: how do I loop through the list of my edges? I am not getting it.
Okay. You have a list of points:
Code:
29.45, 89.43
32.47, 90.98
27.25, 95.63
27.29, 98.27
36.74, 96.32
31.90, 87.67
29.45, 89.43

Convert it to a list of lines, where line A is [ point 1 ], [ point 2 ], line B is [ point 2 ], [ point 3 ], and so on. Since your last point already repeats the first, the final line is [ point n-1 ], [ point n ], which closes the polygon:

Code:
[ 29.45, 89.43 ], [ 32.47, 90.98 ]
[ 32.47, 90.98 ], [ 27.25, 95.63 ]
[ 27.25, 95.63 ], [ 27.29, 98.27 ]
[ 27.29, 98.27 ], [ 36.74, 96.32 ]
[ 36.74, 96.32 ], [ 31.90, 87.67 ]
[ 31.90, 87.67 ], [ 29.45, 89.43 ]
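
As a small illustration, the conversion from points to lines could look like this in Perl. The helper name edges_from_points is hypothetical. Because the list already ends on its first point, pairing each point with its successor yields six edges and closes the polygon without an extra wrap-around edge:

```perl
use strict;
use warnings;

# Pair each point with the one that follows it. Expects a closed list,
# i.e. the last point repeats the first.
sub edges_from_points {
    my @points = @_;
    return map { [ $points[$_], $points[$_ + 1] ] } 0 .. $#points - 1;
}

my @points = ([29.45, 89.43], [32.47, 90.98], [27.25, 95.63], [27.29, 98.27],
              [36.74, 96.32], [31.90, 87.67], [29.45, 89.43]);

for my $edge (edges_from_points(@points)) {
    my ($from, $to) = @$edge;
    printf "[ %.2f, %.2f ], [ %.2f, %.2f ]\n", @$from, @$to;
}
```

If the user had left off the repeated closing point, one more edge from the last point back to the first would have to be added by hand.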

Quote:
At least can you tell me that part of the code?
I don't know of an off-the-shelf solution offhand, and you can google as well as I can, so I figured you wanted to know how to build one. There are some interesting-looking Perl modules that may help you, particularly Math::Polygon::Calc and its polygon_contains_point function.

---------- Post updated at 10:41 AM ---------- Previous update was at 09:53 AM ----------

Code:
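#!/usr/bin/perl

use strict;
use warnings;
use Math::Polygon::Calc;

# A closed triangle: the last point repeats the first.
my @poly = (    [       0,      0       ],
                [       10,     0       ],
                [       10,     10      ],
                [       0,      0       ]       );

# Test whether the point (9, 1) lies inside the polygon.
if(polygon_contains_point([9, 1], @poly))
{
        print "Point is inside\n";
}
else
{
        print "Point is not inside\n";
}

exit 0;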

You'll of course need to install the Math::Polygon::Calc CPAN module. You can do that by running the interactive 'cpan' program as root. Once it's done configuring itself, tell it 'install Math::Polygon::Calc' and it'll do so. Then 'quit' to exit.
# 5  
Old 04-09-2010
Thanks for the help.
As you said, I loop through the edges, but what should I do after that? I really am not getting how to do it.
Help me out.
I checked out the link you posted and the function, but I don't know how to use it, and I don't know Perl even a little, so I can't use that. If you can tell me how to do it in awk or a simpler program I would be much more thankful. I actually found a function that tests whether a point is inside a polygon, but I don't know how to use it in Unix. Could you check it out, implement it in Unix, and tell me?
The link is posted below.
Code:
http://local.wasp.uwa.edu.au/~pbourke/geometry/insidepoly/
Or, if this is too complicated, please just tell me how to check whether a point lies inside or outside a polygon and then print the points that are inside the polygon.
# 6  
Old 04-09-2010
Quote:
Originally Posted by reva
But I don't know how to use it, and I don't know Perl even a little, so I can't use that. If you can tell me how to do it in awk or a simpler program I would be much more thankful.
The code you've given me is C, and doesn't define most of its data structures, and if you're terrified of copy-pasting a Perl script I'm certainly not going to be able to talk you through using a compiler. Doing this in awk, bash, etc. is going to be a herculean task; most of these scripting languages don't have complex data structures, and bash can't even compute floating point numbers (awk can, but standard awk only has flat associative arrays). Perl on the other hand has both... and a pre-made library that does what you want...
Quote:
Or, if this is too complicated, please just tell me how to check whether a point lies inside or outside a polygon and then print the points that are inside the polygon.
I've already given you theory, pseudocode, and working code that does so; it doesn't take an intricate knowledge of Perl to copy-paste it into a script and run it. If I modify it a little into a script that takes command-line parameters, will that work for you? Like
Code:
./polygon.pl "29.45, 89.43" "32.47, 90.98" "27.25, 95.63" "27.29, 98.27" "36.74, 96.32" "31.90, 87.67" "29.45, 89.43" < datafile

Another thing. Are these coordinates all decimals, or are some of them minutes?

---------- Post updated at 03:57 PM ---------- Previous update was at 03:14 PM ----------

First draft of the script assuming all coordinates are decimals:
Code:
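#!/usr/bin/perl

use strict;
use warnings;
use Math::Polygon::Calc;

die "usage: $0 \"lat, lon\" \"lat, lon\" \"lat, lon\" ... < datafile\n"
        unless @ARGV >= 3;

# Each argument is a point on the polygon like "94.3,37.2"
my @poly;
foreach my $arg (@ARGV)
{
        my ($lat, $lon) = split(/\s*,\s*/, $arg);
        push(@poly, [ $lat, $lon ]);
}

# The last point needs to be the same as the first; close the polygon
# only if the caller has not already done so.
push(@poly, [ @{ $poly[0] } ])
        unless $poly[0][0] == $poly[-1][0] && $poly[0][1] == $poly[-1][1];

# Read one line at a time so a huge file is never held in memory.
while (my $line = <STDIN>)
{
        # split ' ' skips leading whitespace, so latitude and longitude
        # (columns 8 and 9) are always at indexes 7 and 8.
        my @arr = split(' ', $line);
        next if @arr < 9;

        # Strip the hemisphere letter. This assumes every value is N or E;
        # S or W values would need their sign flipped.
        (my $lat = $arr[7]) =~ s/[A-Z]//g;
        (my $lon = $arr[8]) =~ s/[A-Z]//g;

        print $line if polygon_contains_point([ $lat, $lon ], @poly);
}

exit 0;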

And a trial run on your sample polygon and sample data:
Code:
$ ./poly.pl "29.45, 89.43" "32.47, 90.98" "27.25, 95.63" "27.29, 98.27" "36.74, 96.32" "31.90, 87.67" < data
   BDA 1908  8 20  9 53  0.00  32.0000N  89.0000E  60.0   0 0.00   0 0.00 0.00  6.60   0  7.00  7.00   0 NULL
   SIG 1927  5 22  0  0  0.00  36.0000N  96.0000E   0.0   0 0.00   0 0.00 0.00  7.50   0  8.00  8.00   0 NULL
   BDA 1934 12 15  1 57 37.00  31.3000N  89.3000E  60.0   0 0.00   0 0.00 0.00  6.70   0  7.10  7.10   0 NULL
   G-R 1950  8 15 14  9 30.00  28.5000N  96.5000E  25.0   0 0.00   0 0.00 0.00  8.10   0  8.70  8.70  10 NULL
   ISS 1950  9 13 11  7 27.00  27.5000N  96.4000E   0.0   0 0.00   0 0.00 0.00  6.60   0  7.00  7.00   7 NULL
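
One caveat on the script above: it simply strips the hemisphere letter, which is fine for this sample because every value is N or E. If the data ever contained S or W values, something like the following sketch could keep the sign. parse_coord is a hypothetical helper written for this example; it is not part of Math::Polygon::Calc:

```perl
use strict;
use warnings;

# Turn "32.0000N" or "89.0000E" into a signed decimal number.
# S and W become negative; returns undef if the field does not match.
sub parse_coord {
    my ($field) = @_;
    my ($value, $hemi) = $field =~ /^(\d+(?:\.\d+)?)([NSEW])$/
        or return undef;
    my $num = $value + 0;               # force numeric context
    return ($hemi eq 'S' || $hemi eq 'W') ? -$num : $num;
}

my $line = "   BDA 1908  8 20  9 53  0.00  32.0000N  89.0000E  60.0   0 0.00";
my @cols = split ' ', $line;            # split ' ' ignores leading blanks
my ($lat, $lon) = map { parse_coord($_) } @cols[7, 8];
print "$lat, $lon\n";                   # prints "32, 89"
```

The polygon points passed on the command line would need to follow the same sign convention for the comparison to make sense.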


Last edited by Corona688; 04-09-2010 at 07:51 PM..
# 7  
Old 04-12-2010
These coordinates are all decimals, not minutes.
I am very new to Perl. Please explain the Perl code to me, and
how do I compile and execute a Perl program?
 