6. Projecting Data¶
Setting up the conda env:
conda create -n sql python
conda activate sql
conda install ipython-sql sqlalchemy psycopg2 notebook pandas -c conda-forge
Sample dataset:
nyc_data.zip (Watch this video to load data into PostGIS)
References:
6.1. Connecting to the database¶
%load_ext sql
import os
host = "localhost"
database = "nyc"
user = os.getenv('SQL_USER')
password = os.getenv('SQL_PASSWORD')
connection_string = f"postgresql://{user}:{password}@{host}/{database}"
%sql $connection_string
%%sql
SELECT * from nyc_subway_stations LIMIT 5
* postgresql://postgres:***@localhost/nyc
5 rows affected.
id | geom | objectid | name | alt_name | cross_st | long_name | label | borough | nghbhd | routes | transfers | color | express | closed |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
376 | 010100002026690000371775B5C3CE2141CBD2347771315141 | 1 | Cortlandt St | None | Church St | Cortlandt St (R,W) Manhattan | Cortlandt St (R,W) | Manhattan | None | R,W | R,W | YELLOW | None | None |
2 | 010100002026690000CBE327F938CD21415EDBE1572D315141 | 2 | Rector St | None | None | Rector St (1) Manhattan | Rector St (1) | Manhattan | None | 1 | 1 | RED | None | None |
1 | 010100002026690000C676635D10CD2141A0ECDB6975305141 | 3 | South Ferry | None | None | South Ferry (1) Manhattan | South Ferry (1) | Manhattan | None | 1 | 1 | RED | None | None |
125 | 010100002026690000F4CF3E3654032241B5704681A73C5141 | 4 | 138th St | Grand Concourse | Grand Concourse | 138th St / Grand Concourse (4,5) Bronx | 138th St / Grand Concourse (4,5) | Bronx | None | 4,5 | 4,5 | GREEN | None | None |
126 | 01010000202669000084DADF7AED0422410C380E6E3A3D5141 | 5 | 149th St | Grand Concourse | Grand Concourse | 149th St / Grand Concourse (4) Bronx | 149th St / Grand Concourse (4) | Bronx | None | 4 | 2,4,5 | GREEN | express | None |
6.2. Checking SRID¶
The earth is not flat, and there is no simple way of putting it down on a flat paper map (or computer screen), so people have come up with all sorts of ingenious solutions, each with pros and cons. Some projections preserve area, so all objects have a relative size to each other; other projections preserve angles (conformal) like the Mercator projection; some projections try to find a good intermediate mix with only little distortion on several parameters. Common to all projections is that they transform the (spherical) world onto a flat Cartesian coordinate system, and which projection to choose depends on how you will be using the data.
We’ve already encountered projections when we
loaded our nyc data.
(Recall that pesky SRID 26918). Sometimes, however, you need to
transform and re-project between spatial reference systems. PostGIS
includes built-in support for changing the projection of data, using the
ST_Transform(geometry, srid)
function. For managing the spatial reference identifiers on geometries,
PostGIS provides the ST_SRID(geometry)
and ST_SetSRID(geometry, srid)
functions.
We can confirm the SRID of our data with the ST_SRID
function:
%%sql
SELECT ST_SRID(geom) FROM nyc_streets LIMIT 1;
* postgresql://postgres:***@localhost/nyc
1 rows affected.
st_srid |
---|
26918 |
And what is definition of “26918”? As we saw in
loading data section,
the definition is contained in the spatial_ref_sys
table. In fact,
two definitions are there. The “well-known text”
(WKT
) definition is in the srtext
column, and there is a second definition in “proj.4” format in the
proj4text
column.
%%sql
SELECT * FROM spatial_ref_sys WHERE srid = 26918
* postgresql://postgres:***@localhost/nyc
1 rows affected.
srid | auth_name | auth_srid | srtext | proj4text |
---|---|---|---|---|
26918 | EPSG | 26918 | PROJCS["NAD83 / UTM zone 18N",GEOGCS["NAD83",DATUM["North_American_Datum_1983",SPHEROID["GRS 1980",6378137,298.257222101,AUTHORITY["EPSG","7019"]],TOWGS84[0,0,0,0,0,0,0],AUTHORITY["EPSG","6269"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4269"]],PROJECTION["Transverse_Mercator"],PARAMETER["latitude_of_origin",0],PARAMETER["central_meridian",-75],PARAMETER["scale_factor",0.9996],PARAMETER["false_easting",500000],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],AUTHORITY["EPSG","26918"]] | +proj=utm +zone=18 +datum=NAD83 +units=m +no_defs |
In fact, for the internal PostGIS re-projection calculations, it is the
contents of the proj4text
column that are used. For our 26918
projection, here is the proj.4 text:
%%sql
SELECT proj4text FROM spatial_ref_sys WHERE srid = 26918
* postgresql://postgres:***@localhost/nyc
1 rows affected.
proj4text |
---|
+proj=utm +zone=18 +datum=NAD83 +units=m +no_defs |
In practice, both the srtext
and the proj4text
columns are
important: the srtext
column is used by external programs like
GeoServer, QGIS, and
FME and others; the proj4text
column is used
internally.
6.3. Comparing Data¶
Taken together, a coordinate and an SRID define a location on the globe. Without an SRID, a coordinate is just an abstract notion. A “Cartesian” coordinate plane is defined as a “flat” coordinate system placed on the surface of Earth. Because PostGIS functions work on such a plane, comparison operations require that both geometries be represented in the same SRID.
If you feed in geometries with differing SRIDs you will just get an error:
# %%sql
# SELECT ST_Equals(
# ST_GeomFromText('POINT(0 0)', 4326),
# ST_GeomFromText('POINT(0 0)', 26918)
# )
Be careful of getting too happy with using
ST_Transform
for on-the-fly
conversion. Spatial indexes are built using SRID of the stored
geometries. If comparison are done in a different SRID, spatial indexes
are (often) not used. It is best practice to choose one SRID for all
the tables in your database. Only use the transformation function when
you are reading or writing data to external applications.
6.4. Transforming Data¶
If we return to our proj4 definition for SRID 26918, we can see that our working projection is UTM (Universal Transverse Mercator) of zone 18, with meters as the unit of measurement.
+proj=utm +zone=18 +ellps=GRS80 +datum=NAD83 +units=m +no_defs
Let’s convert some data from our working projection to geographic coordinates – also known as “longitude/latitude”.
To convert data from one SRID to another, you must first verify that your geometry has a valid SRID. Since we have already confirmed a valid SRID, we next need the SRID of the projection to transform into. In other words, what is the SRID of geographic coordinates?
The most common SRID for geographic coordinates is 4326, which corresponds to “longitude/latitude on the WGS84 spheroid”. You can see the definition at the spatialreference.org site:
You can also pull the definitions from the spatial_ref_sys
table:
%%sql
SELECT srtext FROM spatial_ref_sys WHERE srid = 4326;
* postgresql://postgres:***@localhost/nyc
1 rows affected.
srtext |
---|
GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]] |
Let’s convert the coordinates of the ‘Broad St’ subway station into geographics:
%%sql
SELECT ST_AsText(geom)
FROM nyc_subway_stations
WHERE name = 'Broad St';
* postgresql://postgres:***@localhost/nyc
1 rows affected.
st_astext |
---|
POINT(583571.905921312 4506714.34119218) |
%%sql
SELECT ST_AsText(ST_Transform(geom,4326))
FROM nyc_subway_stations
WHERE name = 'Broad St';
* postgresql://postgres:***@localhost/nyc
1 rows affected.
st_astext |
---|
POINT(-74.0106714688734 40.7071048155876) |
If you load data or create a new geometry without specifying an SRID,
the SRID value will be 0. Recall in geometries
, that when we created our geometries
table we didn’t
specify an SRID. If we query our database, we should expect all the
nyc_
tables to have an SRID of 26918, while the geometries
table
defaulted to an SRID of 0.
To view a table’s SRID assignment, query the database’s
geometry_columns
table.
%%sql
SELECT f_table_name AS name, srid
FROM geometry_columns;
* postgresql://postgres:***@localhost/nyc
6 rows affected.
name | srid |
---|---|
nyc_homicides | 26918 |
nyc_census_blocks | 26918 |
nyc_neighborhoods | 26918 |
nyc_streets | 26918 |
nyc_subway_stations | 26918 |
geometries | 0 |
However, if you know what the SRID of the coordinates is supposed to be,
you can set it post-facto, using ST_SetSRID
on the geometry. Then you will be able to transform the
geometry into other systems.
%%sql
SELECT ST_AsText(
ST_Transform(
ST_SetSRID(geom,26918),
4326)
)
FROM geometries;
* postgresql://postgres:***@localhost/nyc
5 rows affected.
st_astext |
---|
POINT(-79.488743884387 0) |
LINESTRING(-79.488743884387 0,-79.4887349253904 0.000009019375921,-79.4887259663935 0.000009019376033,-79.4887259663937 0.000018038752065) |
POLYGON((-79.488743884387 0,-79.4887349253903 0,-79.4887349253904 0.000009019375921,-79.4887438843871 0.00000901937581,-79.488743884387 0)) |
POLYGON((-79.488743884387 0,-79.4886542944147 0,-79.4886542944202 0.000090193769243,-79.4887438843926 0.000090193758097,-79.488743884387 0),(-79.4887349253904 0.000009019375921,-79.4887349253905 0.000018038751842,-79.4887259663937 0.000018038752065,-79.4887259663935 0.000009019376033,-79.4887349253904 0.000009019375921)) |
GEOMETRYCOLLECTION(POINT(-79.4887259663935 0),POLYGON((-79.488743884387 0,-79.4887349253903 0,-79.4887349253904 0.000009019375921,-79.4887438843871 0.00000901937581,-79.488743884387 0))) |
6.5. Function List¶
ST_AsText: Returns the Well-Known Text (WKT) representation of the geometry/geography without SRID metadata.
ST_SetSRID(geometry, srid): Sets the SRID on a geometry to a particular integer value.
ST_SRID(geometry): Returns the spatial reference identifier for the ST_Geometry as defined in spatial_ref_sys table.
ST_Transform(geometry, srid): Returns a new geometry with its coordinates transformed to the SRID referenced by the integer parameter.