“Spatially clean” geometries with ST_SnapToGrid()

Recently I was faced with 2D-geometries that should be “spatially clean” but they weren’t. Postgres/Postgis to the rescue, the data could be cleaned and the problem was solved.

“Spatially clean” in the project meant:

  • no geometry should overlap another geometry
  • the geometries should touch at least one other geometry

Let’s have a look at an example. First we’ll create a table with polygons and insert 4 triangles that together build a rectangle:


CREATE TABLE translate_polygons (
  "id" SERIAL NOT NULL PRIMARY KEY,
  "name" char(1) NOT NULL,
  "the_geom" geometry NOT NULL
);
-- the data
INSERT INTO translate_polygons (name, the_geom) VALUES (
  'A',
  'POLYGON((2 1, 0 3, 2 3, 2 1))'::geometry
), (
  'B',
  'POLYGON((0 3, 2 5, 2 3, 0 3))'::geometry
), (
  'C',
  'POLYGON((2 5, 4 3, 2 3, 2 5))'::geometry
), (
  'D',
  'POLYGON((4 3, 2 1, 2 3, 4 3))'::geometry
);

Those geometries are displayed in the following image.

the geometries used in the example

the geometries used in the example

The data inserted matches the criterias noted above:


-- no overlapping
  SELECT t1.name || ' overlaps ' || t2.name || ' = ' || ST_Overlaps(t1.the_geom, t2.the_geom)::text AS overlap_test
    FROM translate_polygons AS t1,
         translate_polygons AS t2
   WHERE t1.name <> t2.name
ORDER BY t1.name, t2.name;
     overlap_test
----------------------
 A overlaps B = false
 A overlaps C = false
 A overlaps D = false
 B overlaps A = false
 B overlaps C = false
 B overlaps D = false
 C overlaps A = false
 C overlaps B = false
 C overlaps D = false
 D overlaps A = false
 D overlaps B = false
 D overlaps C = false

-- in this case every triangle touches all other triangles:
  SELECT t1.name || ' touches ' || t2.name || ' = ' || ST_Touches(t1.the_geom, t2.the_geom)::text AS touch_test
    FROM translate_polygons AS t1,
         translate_polygons AS t2
   WHERE t1.name <> t2.name
ORDER BY t1.name, t2.name;
 
     touch_test
--------------------
 A touches B = true
 A touches C = true
 A touches D = true
 B touches A = true
 B touches C = true
 B touches D = true
 C touches A = true
 C touches B = true
 C touches D = true
 D touches A = true
 D touches B = true
 D touches C = true

Let’s use ST_Translate to move two of the triangles:


-- triangle to the lower left
-- move it to "southwest"
-- this produces a gap
UPDATE translate_polygons
   SET the_geom = ST_Translate(the_geom, -0.4, -0.2)
 WHERE name = 'A';
-- triangle to the upper right
-- move it to "southwest"
-- this produces overlapping geometries
UPDATE translate_polygons
   SET the_geom = ST_Translate(the_geom, -0.1, -0.4)
 WHERE name = 'C';

Now the geometries look like this:

The altered geometries with gaps and overlapping

The altered geometries with gaps and overlapping

We now have geometries that overlap and are not touching at least one other geometry:

     touch_test
---------------------
 A touches B = false
 A touches C = false
 A touches D = false
 B touches A = false
 B touches C = false
 B touches D = true
 C touches A = false
 C touches B = false
 C touches D = false
 D touches A = false
 D touches B = true
 D touches C = false

       overlap_test
----------------------
 A overlaps B = false
 A overlaps C = false
 A overlaps D = false
 B overlaps A = false
 B overlaps C = true
 B overlaps D = false
 C overlaps A = false
 C overlaps B = true
 C overlaps D = true
 D overlaps A = false
 D overlaps B = false
 D overlaps C = true

How can we correct the geometries when we do not want to ST_Translate()-calls with (manually) adjusted Parameters? Postgis provides a handy function ST_SnapToGrid():

Function signatures:

  1. geometry ST_SnapToGrid(geometry geomA, float originX, float originY, float sizeX, float sizeY);
  2. geometry ST_SnapToGrid(geometry geomA, float sizeX, float sizeY);
  3. geometry ST_SnapToGrid(geometry geomA, float size);
  4. geometry ST_SnapToGrid(geometry geomA, geometry pointOrigin, float sizeX, float sizeY, float sizeZ, float sizeM);

ST_SnapToGrid — Snap all points of the input geometry to the grid defined by its origin and cell size. Remove consecutive points falling on the same cell, eventually returning NULL if output points are not enough to define a geometry of the given type. Collapsed geometries in a collection are stripped from it. Useful for reducing precision.

(From the Postgis-SVN-Manuals)

We can snap the geometries to a 1 by 1-grid when calling


UPDATE translate_polygons
   SET the_geom = ST_SnapToGrid(the_geom, 1)

Now the geometries look neat again:

the geometries used in the example

the corrected geometries.

2008-12-15: fixed typos.

Advertisement

A look at ST_Intersects(), ST_Overlaps() and ST_Intersection()

In this post I want to have a look at the behaviour of the OpenGIS functions ST_Intersects, ST_Overlaps and ST_Intersection. The database used is of course PostgreSQL which gives us all we need when used together with PostGIS.

First let’s create three tables with geographic data, one for points, lines and polygons each:

CREATE TABLE example_points (
  "id" SERIAL NOT NULL PRIMARY KEY,
  "name" char(1) NOT NULL,
  "the_geom" geometry NOT NULL
);
CREATE TABLE example_lines (
  "id" SERIAL NOT NULL PRIMARY KEY,
  "name" char(1) NOT NULL,
  "the_geom" geometry NOT NULL
);
CREATE TABLE example_polygons (
  "id" SERIAL NOT NULL PRIMARY KEY,
  "name" char(1) NOT NULL,
  "the_geom" geometry NOT NULL
);

Now let’s populate those tables with data, we’ll insert 3 points, 3 lines and 4 polygons:

-- some points
INSERT INTO example_points (name, the_geom) VALUES (
  'A',
  'POINT(6 1)'::geometry
), (
  'B',
  'POINT(5.5 4.5)'::geometry
), (
  'C',
  'POINT(4.5 5.5)'::geometry
);
-- some lines
INSERT INTO example_lines (name, the_geom) VALUES (
  'D',
  'LINESTRING(3 1, 5 1)'::geometry
), (
  'E',
  'LINESTRING(3 7, 5 5)'::geometry
), (
  'F',
  'LINESTRING(5 6, 5 8)'::geometry
);
-- some polygons
INSERT INTO example_polygons (name, the_geom) VALUES (
  'G',
  'POLYGON((0 0, 0 2, 2 2, 2 0, 0 0))'::geometry
), (
  'H',
  'POLYGON((2.5 2.5, 2.5 4.5, 4.5 4.5, 4.5 2.5, 2.5 2.5))'::geometry
), (
  'I',
  'POLYGON((4 4, 4 6, 6 6, 6 4, 4 4))'::geometry
), (
  'J',
  'POLYGON((6 5, 6 7, 8 7, 8 5, 6 5))'::geometry
), (
  'K',
  'POLYGON((6 2, 6 4, 8 4, 8 2, 6 2))'::geometry
);

For further examinations let’s create a view containing all the above geometries:

CREATE VIEW example_geometries AS (
	(
		SELECT name, the_geom
		  FROM example_points
	) UNION ALL (
		SELECT name, the_geom
		  FROM example_lines
	) UNION ALL (
		SELECT name, the_geom
		  FROM example_polygons
	)
);

If you cannot imagine the relationship of these entities, here is a picture showing the locations of each geometry:

The geometries we just inserted

The geometries we just inserted

Now where do these geometries intersect? Let’s create a table which we can use to diplay the intersection geometries:

CREATE TABLE example_intersections_a AS (
	SELECT ST_Intersection(part_1.the_geom, part_2.the_geom)
	  FROM example_geometries AS part_1,
	       example_geometries AS part_2
	 WHERE part_1.name <> part_2.name
	   AND ST_Intersects(part_1.the_geom, part_2.the_geom)
);
-- add a primary key so we can display the data within QGIS
ALTER TABLE example_intersections_a ADD COLUMN id SERIAL;
ALTER TABLE example_intersections_a ADD PRIMARY KEY (id);

Here are the intersections:

The inserted geometries and their intersections (in green)

The inserted geometries and their intersections (in green)

Note that we have 16 intersections of many kinds: Point with line, line with polygon, polygon with polygon, etc. All intersections are listed twice since if geometry X intersects with geometry Y, both X ∩ Y and Y ∩ X are counted.

ST_Intersection() actually gave us the geometries of the intersection, which can be of any geometry type, e.g. a polygon/polygon intersection can result in a point intersection, if the geometries share exactly one point (see the intersection of K with I).

Here is a tabular view of all the computation that took place to create the intersection table:

A visualisation of the computed iterations to find intersections

A visualisation of the computed iterations to find intersections

If you only want distinct intersections, use the DISTINCT SQL-Keyword:

CREATE TABLE example_intersections_b AS (
	SELECT DISTINCT ST_Intersection(part_1.the_geom, part_2.the_geom)
	  FROM example_geometries AS part_1,
	       example_geometries AS part_2
	 WHERE part_1.name <> part_2.name
	   AND ST_Intersects(part_1.the_geom, part_2.the_geom)
);
-- add a primary key so we can display the data within QGIS
ALTER TABLE example_intersections_b ADD COLUMN id SERIAL;
ALTER TABLE example_intersections_b ADD PRIMARY KEY (id);

In our case there are 7 distinct intersections. If you expected 8, thinking X ∩ Y and Y ∩ X would only be counted once, you’re only halfway right: since the intersection of C with I and C with E are the same, only one get counted of these.

If you are interested in intersection of the same dimension (see OGC SPEC s2.1.1.1 – dimension is 0 for points, 1 for lines, 2 for polygons) of the geometries you compare, use ST_Overlaps() in the WHERE-clause instead of ST_Intersects():

CREATE TABLE example_intersections_c as (
	SELECT DISTINCT ST_Intersection(part_1.the_geom, part_2.the_geom)
	  FROM example_geometries AS part_1,
	       example_geometries AS part_2
	 WHERE part_1.name <> part_2.name
	   AND ST_Overlaps(part_1.the_geom, part_2.the_geom)
);
-- add a primary key so we can display the data within QGIS
ALTER TABLE example_intersections_c ADD COLUMN id SERIAL;
ALTER TABLE example_intersections_c ADD PRIMARY KEY (id);

PostGIS: Substracting Geometries with ST_Difference()

outer

The outer polygon

In a recent PostgreSQL/PostGIS project I was faced with the following problem:

  • given two geometries A and B
  • calculate that part of A that is (spatially) outside of B
  • call this geometry C

Say we have two geometries “outer polygon” and “inner polygon”:
SELECT ST_AsText('POLYGON((0 0, 0 3, 3 3, 3 0, 0 0))'::geometry) AS "outer polygon";
SELECT ST_AsText('POLYGON((1 1, 1 2, 2 2, 2 1, 1 1))'::geometry) AS "inner polygon";

The outer and the inner polygon

The outer and the inner polygon

Using the function ST_Difference() it is easy two get the geometry of outer polygon without the inner polygon (a nice documentatioon about this and other sPatial Functions can be found in the DB2-Documentation):

SELECT ST_Difference(
  'POLYGON((0 0, 0 3, 3 3, 3 0, 0 0))'::geometry,
  'POLYGON((1 1, 1 2, 2 2, 2 1, 1 1))'::geometry
) AS "the difference";

Displayed as Well Known Text (WKT) this is: POLYGON((0 0,0 3,3 3,3 0,0 0),(1 1,2 1,2 2,1 2,1 1))

The resulting geometry as difference of the input geometries

The resulting geometry as difference of the input geometries

To test if the geometry is what we are expecting it to be lets test if the POINT(1.5 1.5) is contained in the resulting geometry… it should not:


SELECT ST_Contains(
  'POINT(1.5 1.5)'::geometry,
  ST_Difference(
    'POLYGON((0 0, 0 3, 3 3, 3 0, 0 0))'::geometry,
    'POLYGON((1 1, 1 2, 2 2, 2 1, 1 1))'::geometry
  )
) AS "Point is contained in difference (expect f)"; 

Update 2008-11-05, added images and links