GeoPoint-to-GeoPoint 3D distance inner join

Supported in: Batch

Inner joins the left and right datasets based on the distance between point geometries. The geometries must represent points and may optionally include a z-coordinate. Geometries are converted internally to the given projected coordinate reference system before the join and converted back to WGS84 afterward. Null and non-point geometries are ignored. The entire right dataset must fit into driver and executor memory; a 3 GB executor should be able to handle up to 4 million points in the right (neighbors) dataset.
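The join semantics described above can be sketched in plain Python. This is only an illustration, not the transform's implementation: it skips the reprojection step and uses a brute-force loop, and the function names (parse_point, within_distance, distance_inner_join) are hypothetical. It also assumes that the distance boundary is inclusive and that, when z-coordinates are used, a pair is comparable only if both points carry a z value; the source confirms neither. The sample rows are taken from Example 6 below.

```python
import json
import math


def parse_point(geojson_text):
    """Return the coordinates of a GeoJSON Point, or None for null/non-point input."""
    if geojson_text is None:
        return None
    geometry = json.loads(geojson_text)
    if geometry.get("type") != "Point":
        return None
    return geometry["coordinates"]


def within_distance(a, b, distance, use_z):
    """Compare two points (assumed already projected) against the join distance."""
    if use_z:
        # Assumption: 3D comparison needs a z value on both points.
        if len(a) < 3 or len(b) < 3:
            return False
        return math.dist(a[:3], b[:3]) <= distance  # inclusive bound is an assumption
    return math.dist(a[:2], b[:2]) <= distance


def distance_inner_join(left_rows, right_rows, distance, use_z=False):
    """Brute-force inner join; each row is a (geojson_text, payload) pair."""
    right_points = [(parse_point(g), p) for g, p in right_rows]
    right_points = [(pt, p) for pt, p in right_points if pt is not None]
    joined = []
    for geojson_text, payload in left_rows:
        point = parse_point(geojson_text)
        if point is None:
            continue  # null and non-point geometries never match
        for right_point, right_payload in right_points:
            if within_distance(point, right_point, distance, use_z):
                joined.append((payload, right_payload))
    return joined


polygon = '{"coordinates": [[[20.0, 10.0], [27.0, 10.0], [27.0, 17.0], [20.0, 17.0], [20.0, 10.0]]], "type": "Polygon"}'
left = [
    ('{"coordinates": [15.0, 5.0], "type":"Point"}', 42.0),
    ('{"coordinates": [55.0, 5.0], "type":"Point"}', 43.0),
    (polygon, 44.0),
    (None, 45.0),
]
right = [
    ('{"coordinates": [15.0, 5.0], "type":"Point"}', "rhsVal1"),
    (polygon, "rhsVal3"),
    (None, "rhsVal4"),
]
result = distance_inner_join(left, right, distance=10.0)
print(result)  # [(42.0, 'rhsVal1')]
```

Only the pair of identical points survives: the second left point is 40 units away, and the polygon and null rows are dropped on both sides.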

Transform categories: Geospatial, Join

Declared arguments

  • Condition for columns to select on the left - All columns in the left input schema will be tested to see if they match this condition. If they match, the column will be selected in the output.
    ColumnPredicate
  • Condition for columns to select on the right - All columns in the right input schema will be tested to see if they match this condition. If they match, the column will be selected in the output.
    ColumnPredicate
  • Distance - The distance within which to join geometries, in the units of the projected coordinate system.
    Literal<DefiniteNumeric>
  • Join key - The GeoJSON columns from the left and right inputs on which to join.
    Tuple<Column<Geometry>, Column<Geometry>>
  • Left dataset - Left dataset to use in join.
    Table
  • Projected coordinate system - Input geometries will be converted to this coordinate system prior to the join, and distance will be measured in the units of the given coordinate system. Formatted as "authority:code"; for example, UTM zone 18N is identified by EPSG:32618.
    Literal<String>
  • Right dataset - Right dataset to use in join.
    Table
  • Use z-coordinate - Whether to include z-coordinates and calculate the 3 dimensional distance. If false, z-coordinates are ignored and 2 dimensional distances are calculated.
    Literal<Boolean>
  • optional Prefix for columns from right - Prefix to add to all columns on the right hand side.
    Literal<String>
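As a rough illustration of how the two column conditions and the optional prefix shape the output schema, the sketch below mimics a columnNameIsIn predicate in Python. The helper names (column_name_is_in, select_output_columns) are hypothetical and not part of the transform; the schemas and argument values are those of Example 1.

```python
def column_name_is_in(names):
    """Build a predicate that accepts a column when its name is in the given list."""
    allowed = set(names)
    return lambda column: column in allowed


def select_output_columns(left_schema, right_schema, left_pred, right_pred, right_prefix=None):
    """Test every input column against its predicate; prefix the selected right columns."""
    prefix = right_prefix or ""  # a null prefix leaves right column names unchanged
    left_cols = [c for c in left_schema if left_pred(c)]
    right_cols = [prefix + c for c in right_schema if right_pred(c)]
    return left_cols + right_cols


columns = select_output_columns(
    left_schema=["geometryColLhs", "lhs-1"],
    right_schema=["geometryCol", "col1", "arrayCol"],
    left_pred=column_name_is_in(["geometryColLhs", "lhs-1"]),
    right_pred=column_name_is_in(["geometryCol", "arrayCol"]),
    right_prefix="rhs_",
)
print(columns)  # ['geometryColLhs', 'lhs-1', 'rhs_geometryCol', 'rhs_arrayCol']
```

Column selection only affects the output. As Examples 4 and 5 show, the join key columns are still used for matching even when they are not selected.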

Examples

Example 1: Base case

Argument values:

  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryColLhs, lhs-1],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryCol, arrayCol],
    )
  • Distance: 2.5
  • Join key: (geometryColLhs, geometryCol)
  • Left dataset: ri.foundry.main.dataset.left
  • Projected coordinate system: EPSG:4326
  • Right dataset: ri.foundry.main.dataset.right
  • Use z-coordinate: false
  • Prefix for columns from right: rhs_

Inputs:

ri.foundry.main.dataset.left

geometryColLhs | lhs-1
{"coordinates": [0.0, 0.0, 0.0], "type":"Point"} | 42.0
{"coordinates": [0.0, 0.0, 5.0], "type":"Point"} | 43.0
{"coordinates": [0.0, 0.0], "type":"Point"} | 44.0

ri.foundry.main.dataset.right

geometryCol | col1 | arrayCol
{"coordinates": [0.0, 0.0, 2.0], "type":"Point"} | rhsVal1 | [ 0.0, 1.0 ]
{"coordinates": [0.0, 1.0], "type":"Point"} | rhsVal2 | [ 0.0, 1.0 ]

Output:

geometryColLhs | lhs-1 | rhs_geometryCol | rhs_arrayCol
{"coordinates": [0.0, 0.0, 0.0], "type":"Point"} | 42.0 | {"coordinates": [0.0, 0.0, 2.0], "type":"Point"} | [ 0.0, 1.0 ]
{"coordinates": [0.0, 0.0, 0.0], "type":"Point"} | 42.0 | {"coordinates": [0.0, 1.0], "type":"Point"} | [ 0.0, 1.0 ]
{"coordinates": [0.0, 0.0, 5.0], "type":"Point"} | 43.0 | {"coordinates": [0.0, 0.0, 2.0], "type":"Point"} | [ 0.0, 1.0 ]
{"coordinates": [0.0, 0.0, 5.0], "type":"Point"} | 43.0 | {"coordinates": [0.0, 1.0], "type":"Point"} | [ 0.0, 1.0 ]
{"coordinates": [0.0, 0.0], "type":"Point"} | 44.0 | {"coordinates": [0.0, 0.0, 2.0], "type":"Point"} | [ 0.0, 1.0 ]
{"coordinates": [0.0, 0.0], "type":"Point"} | 44.0 | {"coordinates": [0.0, 1.0], "type":"Point"} | [ 0.0, 1.0 ]

Example 2: Base case

Argument values:

  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryColLhs, lhs-1],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryCol, arrayCol],
    )
  • Distance: 2.5
  • Join key: (geometryColLhs, geometryCol)
  • Left dataset: ri.foundry.main.dataset.left
  • Projected coordinate system: EPSG:4326
  • Right dataset: ri.foundry.main.dataset.right
  • Use z-coordinate: true
  • Prefix for columns from right: rhs_

Inputs:

ri.foundry.main.dataset.left

geometryColLhs | lhs-1
{"coordinates": [0.0, 0.0, 0.0], "type":"Point"} | 42.0
{"coordinates": [0.0, 0.0, 5.0], "type":"Point"} | 43.0
{"coordinates": [0.0, 5.0], "type":"Point"} | 44.0

ri.foundry.main.dataset.right

geometryCol | col1 | arrayCol
{"coordinates": [0.0, 0.0, 2.0], "type":"Point"} | rhsVal1 | [ 0.0, 1.0 ]
{"coordinates": [1.0, 1.0, 6.0], "type":"Point"} | rhsVal2 | [ 0.0, 1.0 ]
{"coordinates": [0.0, 0.0, 3.0], "type":"Point"} | rhsVal3 | [ 0.0, 1.0 ]
{"coordinates": [0.0, 0.0], "type":"Point"} | rhsVal4 | [ 0.0, 1.0 ]

Output:

geometryColLhs | lhs-1 | rhs_geometryCol | rhs_arrayCol
{"coordinates": [0.0, 0.0, 0.0], "type":"Point"} | 42.0 | {"coordinates": [0.0, 0.0, 2.0], "type":"Point"} | [ 0.0, 1.0 ]
{"coordinates": [0.0, 0.0, 5.0], "type":"Point"} | 43.0 | {"coordinates": [0.0, 0.0, 3.0], "type":"Point"} | [ 0.0, 1.0 ]
{"coordinates": [0.0, 0.0, 5.0], "type":"Point"} | 43.0 | {"coordinates": [1.0, 1.0, 6.0], "type":"Point"} | [ 0.0, 1.0 ]
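The output of this example can be checked by hand with the 3D Euclidean distance. The short Python sketch below recomputes the distance for each pair of points that has a z-coordinate, keyed by the lhs-1 and col1 values from the input tables. It assumes an inclusive distance bound, which the source does not state. The two points without a z-coordinate (lhs-1 = 44.0 and rhsVal4) are left out of the sketch: they do not appear in the output above, even though rhsVal4 coincides with the first left point in 2D, and the source does not describe how a missing z value is handled.

```python
import math

DISTANCE = 2.5

# Points with a z-coordinate, keyed by their lhs-1 / col1 values.
left = {42.0: (0.0, 0.0, 0.0), 43.0: (0.0, 0.0, 5.0)}
right = {
    "rhsVal1": (0.0, 0.0, 2.0),
    "rhsVal2": (1.0, 1.0, 6.0),
    "rhsVal3": (0.0, 0.0, 3.0),
}

matches = []
for left_key, left_point in left.items():
    for right_key, right_point in right.items():
        d = math.dist(left_point, right_point)  # 3D Euclidean distance
        if d <= DISTANCE:  # inclusive bound is an assumption
            matches.append((left_key, right_key))

print(matches)  # [(42.0, 'rhsVal1'), (43.0, 'rhsVal2'), (43.0, 'rhsVal3')]
```

The three surviving pairs are at distances 2.0, about 1.73, and 2.0. The remaining pairs are at 3.0 or more and fall outside the 2.5 threshold.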

Example 3: Base case

Argument values:

  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryColLhs, lhs-1],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryColRhs, rhs-1],
    )
  • Distance: 1641
  • Join key: (geometryColLhs, geometryColRhs)
  • Left dataset: ri.foundry.main.dataset.left
  • Projected coordinate system: epsg:2868
  • Right dataset: ri.foundry.main.dataset.right
  • Use z-coordinate: false
  • Prefix for columns from right: null

Inputs:

ri.foundry.main.dataset.left

geometryColLhs | lhs-1
{"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0
null | 43.0

ri.foundry.main.dataset.right

geometryColRhs | rhs-1
{"coordinates": [-112.14560508728029,33.44082430962016], "type":"Point"} | rhsVal1
{"coordinates": [-112.11796760559083,33.440895931474124], "type":"Point"} | rhsVal2

Output:

geometryColLhs | lhs-1 | geometryColRhs | rhs-1
{"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 | {"coordinates": [-112.14560508728029,33.44082430962016], "type":"Point"} | rhsVal1
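Because the distance is measured in the units of the projected coordinate system, the value 1641 in this example is not in degrees or meters. Assuming EPSG:2868 is defined in international feet (an assumption worth verifying against the EPSG registry), 1641 corresponds to roughly 500 meters. The Python sketch below approximates the two candidate distances with a spherical haversine formula instead of the actual projection, so the figures are only indicative; the helper name haversine_feet is hypothetical.

```python
import math

FEET_PER_METER = 1 / 0.3048   # assumption: EPSG:2868 uses international feet
EARTH_RADIUS_M = 6371008.8    # mean Earth radius; spherical approximation


def haversine_feet(lon1, lat1, lon2, lat2):
    """Approximate great-circle distance in feet between two WGS84 points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a)) * FEET_PER_METER


DISTANCE_FT = 1641
left_point = (-112.14843750000001, 33.440609443703586)
rhs_val1 = (-112.14560508728029, 33.44082430962016)
rhs_val2 = (-112.11796760559083, 33.440895931474124)

d1 = haversine_feet(*left_point, *rhs_val1)
d2 = haversine_feet(*left_point, *rhs_val2)
print(d1 <= DISTANCE_FT, d2 <= DISTANCE_FT)  # True False
```

Under these assumptions the first right point is under 1,000 feet away and the second is over 9,000 feet away, which is consistent with only rhsVal1 appearing in the output.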

Example 4: Base case

Argument values:

  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryColLhs, lhs-1],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [],
    )
  • Distance: 10.0
  • Join key: (geometryColLhs, geometryCol)
  • Left dataset: ri.foundry.main.dataset.left
  • Projected coordinate system: EPSG:4326
  • Right dataset: ri.foundry.main.dataset.right
  • Use z-coordinate: false
  • Prefix for columns from right: rhs_

Inputs:

ri.foundry.main.dataset.left

geometryColLhs | lhs-1
{"coordinates": [15.0, 5.0], "type":"Point"} | 42.0
{"coordinates": [55.0, 5.0], "type":"Point"} | 43.0

ri.foundry.main.dataset.right

geometryCol | col1 | arrayCol
{"coordinates": [15.0, 5.0], "type":"Point"} | rhsVal1 | [ 0.0, 1.0 ]

Output:

geometryColLhs | lhs-1
{"coordinates": [15.0, 5.0], "type":"Point"} | 42.0

Example 5: Base case

Argument values:

  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryCol, arrayCol],
    )
  • Distance: 10.0
  • Join key: (geometryColLhs, geometryCol)
  • Left dataset: ri.foundry.main.dataset.left
  • Projected coordinate system: EPSG:4326
  • Right dataset: ri.foundry.main.dataset.right
  • Use z-coordinate: false
  • Prefix for columns from right: rhs_

Inputs:

ri.foundry.main.dataset.left

geometryColLhs | lhs-1
{"coordinates": [15.0, 5.0], "type":"Point"} | 42.0
{"coordinates": [55.0, 5.0], "type":"Point"} | 43.0

ri.foundry.main.dataset.right

geometryCol | col1 | arrayCol
{"coordinates": [55.0, 5.0], "type":"Point"} | rhsVal1 | [ 0.0, 1.0 ]

Output:

rhs_geometryCol | rhs_arrayCol
{"coordinates": [55.0, 5.0], "type":"Point"} | [ 0.0, 1.0 ]

Example 6: Null case

Argument values:

  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryColLhs, lhs-1],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryCol, col1, arrayCol],
    )
  • Distance: 10.0
  • Join key: (geometryColLhs, geometryCol)
  • Left dataset: ri.foundry.main.dataset.left
  • Projected coordinate system: EPSG:4326
  • Right dataset: ri.foundry.main.dataset.right
  • Use z-coordinate: false
  • Prefix for columns from right: rhs_

Inputs:

ri.foundry.main.dataset.left

geometryColLhs | lhs-1
{"coordinates": [15.0, 5.0], "type":"Point"} | 42.0
{"coordinates": [55.0, 5.0], "type":"Point"} | 43.0
{"coordinates": [[[20.0, 10.0], [27.0, 10.0], [27.0, 17.0], [20.0, 17.0], [20.0, 10.0]]], "type": "Polygon"} | 44.0
null | 45.0

ri.foundry.main.dataset.right

geometryCol | col1 | arrayCol
{"coordinates": [15.0, 5.0], "type":"Point"} | rhsVal1 | [ 0.0, 1.0 ]
{"coordinates": [[[21.0, 21.0], [27.0, 21.0], [27.0, 27.0], [21.0, 27.0], [21.0, 21.0]]], "type": "Polygon"} | rhsVal2 | [ 0.0, 1.0 ]
{"coordinates": [[[20.0, 10.0], [27.0, 10.0], [27.0, 17.0], [20.0, 17.0], [20.0, 10.0]]], "type": "Polygon"} | rhsVal3 | [ 0.0, 1.0 ]
null | rhsVal4 | [ 0.0, 1.0 ]

Output:

geometryColLhs | lhs-1 | rhs_geometryCol | rhs_col1 | rhs_arrayCol
{"coordinates": [15.0, 5.0], "type":"Point"} | 42.0 | {"coordinates": [15.0, 5.0], "type":"Point"} | rhsVal1 | [ 0.0, 1.0 ]