Obscurify privacy sensitive data with PostGIS

If you want to show privacy-sensitive data in your GIS or web map, you should obscure it first to avoid privacy problems. With PostGIS this is easy to do by creating an altered copy of the original table.

Example data

For this example I created a PostGIS table with points on a 100m x 100m grid, which makes it easier to show how the data is obscured. Your own data will more likely be a table of address points or something similar. If you obscure your own data, always make sure the resulting table doesn’t contain any other privacy-sensitive attributes (names, addresses,…).

The example table (public.example) has 2 fields: gid as a serial and geom as a geometry. The resulting table will have the same fields.
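
A table like this could be set up roughly as follows. This is only a sketch: the SRID (EPSG:28992, a metre-based CRS), the grid origin and the 10 x 10 grid size are assumptions chosen for illustration and are not taken from the original data.

```sql
-- Hypothetical setup of the example table.
-- Assumption: EPSG:28992 (units in metres); use the CRS of your own data.
CREATE TABLE public.example (
    gid  serial PRIMARY KEY,
    geom geometry(Point, 28992)
);

-- Fill the table with a 10 x 10 grid of points spaced 100 m apart.
-- The origin (155000, 463000) is an arbitrary illustrative value.
INSERT INTO public.example (geom)
SELECT ST_SetSRID(ST_MakePoint(x, y), 28992)
FROM generate_series(155000, 155900, 100) AS x,
     generate_series(463000, 463900, 100) AS y;
```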

The green dots are the original example data. To explain the effect of the obscurification, they are spread equally over the map.

The plan

To obscure the geometry data we will move the original point to a random place inside a 100m x 100m rectangle around the original point and save it as a new table.

The plan is to create new points inside the 100m x 100m rectangles around the original points.

The code

We will create a new table public.example_obscured in which the only thing that changes is the location of each point. Each point is moved a random distance of up to 50 meters to the left or to the right and, independently, up to 50 meters up or down. The displacement is in meters because the CRS of the source data uses meters. If your table’s CRS uses feet, the displacement will be in feet.

The code to achieve this result is:

-- Shift every point by a random offset between -50 and +50 in both x and y
CREATE TABLE example_obscured AS
    SELECT gid,
      st_point(st_x(geom)-50+random()*100, st_y(geom)-50+random()*100) geom
    FROM example;
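
One detail to be aware of: st_point(x, y) returns a point without an SRID, so the new table loses the CRS of the source data. The variant below is a sketch of the same statement that copies the SRID from the original geometry, followed by a query to check that no point moved more than 50 units on either axis. The table name example_obscured_srid is made up for this illustration.

```sql
-- Variant that keeps the SRID of the source geometry.
-- Assumption: public.example holds single points with a valid SRID.
CREATE TABLE public.example_obscured_srid AS
SELECT gid,
       ST_SetSRID(
           ST_Point(ST_X(geom) - 50 + random() * 100,
                    ST_Y(geom) - 50 + random() * 100),
           ST_SRID(geom)
       ) AS geom
FROM public.example;

-- Sanity check: both values should be at most 50.
SELECT max(abs(ST_X(o.geom) - ST_X(e.geom))) AS max_dx,
       max(abs(ST_Y(o.geom) - ST_Y(e.geom))) AS max_dy
FROM public.example AS e
JOIN public.example_obscured_srid AS o USING (gid);
```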

The green dots are the original example data and the red dots are the obscured points.

The result

The result is that your sensitive data is no longer linked to an exact location, but instead can be seen on the map somewhere in the area of the original point.

The red dots are the obscured points. They still show the general pattern of the data, but the points can no longer be linked to an exact location (address, person,…).

Other options

Not copying the data

Instead of creating a new table you could also create a view or a materialized view. This could be an option if your original data changes often.
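
A materialized view could look like the sketch below. The name example_obscured_mv is made up for this illustration. Keep in mind that random() is evaluated again every time the query runs, so a plain view would return different positions on every read, and someone averaging many reads could approximate the original location. A materialized view keeps the offsets fixed until the next refresh, which may make it the safer of the two.

```sql
-- Sketch of a materialized view with obscured points.
-- The offsets are drawn once, when the view is created or refreshed.
CREATE MATERIALIZED VIEW public.example_obscured_mv AS
SELECT gid,
       ST_SetSRID(
           ST_Point(ST_X(geom) - 50 + random() * 100,
                    ST_Y(geom) - 50 + random() * 100),
           ST_SRID(geom)
       ) AS geom
FROM public.example;

-- Run this after the source table has changed.
-- Note: every refresh draws new random offsets for all points.
REFRESH MATERIALIZED VIEW public.example_obscured_mv;
```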

Changing the amount of obscurification

Depending on the type of data it can be useful to change the amount of obscurification.
In the example we used 100m x 100m rectangles and therefore used the following randomizing code:

st_point(st_x(geom)-50+random()*100, st_y(geom)-50+random()*100) geom 

If you would like to move the points inside a 500m x 500m rectangle you should use the following randomizing code:

st_point(st_x(geom)-250+random()*500, st_y(geom)-250+random()*500) geom 
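
If you want to experiment with different sizes, the size of the rectangle can be turned into a parameter. The function below is a minimal sketch. Its name obscure_point and its parameters are hypothetical and not part of PostGIS, and it assumes the input is a single point.

```sql
-- Hypothetical helper: move a point to a random place inside a
-- box_size x box_size square centred on the original point.
CREATE OR REPLACE FUNCTION public.obscure_point(pt geometry, box_size double precision)
RETURNS geometry
LANGUAGE sql
VOLATILE  -- the result differs on every call because of random()
AS $$
    SELECT ST_SetSRID(
        ST_Point(ST_X(pt) - box_size / 2 + random() * box_size,
                 ST_Y(pt) - box_size / 2 + random() * box_size),
        ST_SRID(pt)
    );
$$;

-- Usage: obscure inside a 500 x 500 square (in the units of the CRS).
SELECT gid, public.obscure_point(geom, 500) AS geom
FROM public.example;
```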

Placing your points on a grid

If your original data is not a grid (unlike the example above) another option would be to round the coordinates so they end up in a grid. The disadvantage of this approach is that if you have multiple points near each other in the original data they will end up on top of each other in the obscured data.

The obscurification code to place the points on a 100m x 100m grid would be:

st_point(round(st_x(geom)/100)*100, round(st_y(geom)/100)*100) geom
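
As a complete statement this could look like the sketch below, followed by a query that shows where points ended up on top of each other. The table name example_gridded is made up for this illustration.

```sql
-- Sketch: round every point to the nearest 100 m grid position.
CREATE TABLE public.example_gridded AS
SELECT gid,
       ST_SetSRID(
           ST_Point(round(ST_X(geom) / 100) * 100,
                    round(ST_Y(geom) / 100) * 100),
           ST_SRID(geom)
       ) AS geom
FROM public.example;

-- List the grid positions that now hold more than one point.
SELECT ST_X(geom) AS x,
       ST_Y(geom) AS y,
       count(*)   AS n_points
FROM public.example_gridded
GROUP BY ST_X(geom), ST_Y(geom)
HAVING count(*) > 1
ORDER BY n_points DESC;
```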

Use the same technique in QGIS

You can use the same technique in QGIS by using a Geometry Generator style. An example of this style can be found in my “QGIS Geometry Generator examples” repository on GitLab.
