How to filter out almost duplicate X Y (Easting Northing) coordinates? Post 302454972 by kenneth.mcbride on Monday 20th of September 2010 04:38:55 PM
Thank you.

Jim,

{$1$2} was a typo.

rdcwayx,

My files are not Latitude Longitude Degrees, but State Plane Feet.
They typically contain 3 million records.

Each record has four fields, PointNumber Easting Northing Elevation, as follows:

PointNumber_0000001 1000000.123456 1000000.123456 10000.123456
PointNumber_0000010 1000001.234567 1000002.234567 10345.234567
PointNumber_0000100 1000010.345678 1000020.456789 10030.987654
PointNumber_0001000 1000050.345678 1000050.456789 10030.987654
PointNumber_0010000 1000123.123456 1000456.123456 10789.123456
PointNumber_0100000 1000123.123456 1000456.123456 10789.123456
PointNumber_1000000 1000000.123456 1000000.123456 10000.123456
PointNumber_2000000 1000011.345678 1000021.456789 10030.987654
PointNumber_3000000 1000051.000678 1000049.999000 10030.987654

Where, relative to fields 2 and 3:
PointNumber_1000000 is an "exact duplicate" of PointNumber_0000001
PointNumber_0100000 is an "exact duplicate" of PointNumber_0010000
Where, relative to fields 2 and 3, and within a user defined range of + or - 2.0:
PointNumber_2000000 is a "near duplicate" of PointNumber_0000100
PointNumber_3000000 is a "near duplicate" of PointNumber_0001000

So a point/record is a "near duplicate" when both its easting and its northing are within a user defined range of another record's. For example, if I use a value of 2.5 feet for the range, then any record whose easting and northing are within 2.5 feet of any other record is considered a "near duplicate" and deleted.
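
To make the definition above concrete, here is a minimal sketch of the comparison in Python (not the awk code discussed in this thread). The function name `is_near_duplicate` is made up for illustration, and it assumes "within range" means both the easting difference and the northing difference are less than or equal to the range, which is my reading of the description rather than something stated exactly.

```python
def is_near_duplicate(a, b, tolerance):
    """Return True when points a and b, given as (easting, northing)
    tuples, differ by no more than `tolerance` on BOTH axes.

    Assumption: the range is inclusive and applies to each axis
    separately (a square window, not a circular distance).
    """
    return (abs(a[0] - b[0]) <= tolerance
            and abs(a[1] - b[1]) <= tolerance)


# Values taken from the sample records, with a range of 2.0 feet.
p_0000100 = (1000010.345678, 1000020.456789)
p_2000000 = (1000011.345678, 1000021.456789)
p_0000001 = (1000000.123456, 1000000.123456)
p_0000010 = (1000001.234567, 1000002.234567)

print(is_near_duplicate(p_0000100, p_2000000, 2.0))  # differences are about 1.0 and 1.0
print(is_near_duplicate(p_0000001, p_0000010, 2.0))  # northing differs by about 2.11
```

An exact duplicate is simply the case where both differences are zero, so the same test covers it.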

If possible, it would be great if I could get two files from the input file:
1. An output file with the near duplicates removed.
2. An output file with the near duplicates that were removed.
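
One possible way to produce those two outputs is sketched below in Python. This is an illustration under assumptions, not code from this thread: the function name `split_near_duplicates` is hypothetical, the first record seen at a location is the one kept, and "within range" is taken as inclusive on each axis. Comparing every record against every other would be far too slow for 3 million records, so the sketch buckets kept points into grid cells the size of the range and only checks the 3 x 3 block of cells around each new point.

```python
import math
from collections import defaultdict


def split_near_duplicates(lines, tolerance):
    """Split records into (kept, removed) lists of the original lines.

    Each line is expected to look like:
        PointNumber Easting Northing Elevation
    A record is removed when its easting and northing are both within
    `tolerance` of a record that was already kept.
    """
    grid = defaultdict(list)   # (cell_x, cell_y) -> list of kept (e, n)
    kept, removed = [], []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue           # skip blank or malformed lines
        e, n = float(fields[1]), float(fields[2])
        cx = math.floor(e / tolerance)
        cy = math.floor(n / tolerance)
        # Any kept point within `tolerance` on both axes must sit in
        # this cell or one of its eight neighbours.
        is_dup = any(
            abs(e - ke) <= tolerance and abs(n - kn) <= tolerance
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            for ke, kn in grid.get((cx + dx, cy + dy), ())
        )
        if is_dup:
            removed.append(line)
        else:
            grid[(cx, cy)].append((e, n))
            kept.append(line)
    return kept, removed
```

The two returned lists can then be written to the two output files. Note that the result depends on input order: a record is only compared against records already kept, so a chain of points each within range of the next is not collapsed to a single point.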

Thank you again,
Kenny.

---------- Post updated at 01:38 PM ---------- Previous update was at 09:25 AM ----------

Jim,

When I use your code on the sample data set in my previous post, it prints the whole file.

Kenny.
 
