Geometry knn left join

Supported in: Batch

Selects the k closest points from the neighbors dataset for each valid input geometry in the base dataset. Internally, the input datasets are converted to the given projected coordinate reference system for the distance calculation, and the results are converted back to WGS84. The entire neighbors dataset must fit into both driver and executor memory. As a rough guide, a 3 GB executor should be able to handle up to 1 million points in the neighbors dataset.

Transform categories: Geospatial, Join
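The selection described above can be pictured as a nearest-neighbor search over points held in memory. The following Python sketch is illustrative only and is not the transform's implementation: it assumes the coordinates have already been projected into a planar coordinate system, and the function name `k_nearest` and the tuple layout are hypothetical.

```python
import heapq
import math

def k_nearest(base_xy, neighbors, k):
    """Return up to k neighbors closest to base_xy.

    base_xy   -- (x, y) in projected units (assumed already projected)
    neighbors -- list of (x, y, payload) tuples, all held in memory
    """
    def dist(neighbor):
        # Planar Euclidean distance in the units of the projected system.
        return math.hypot(neighbor[0] - base_xy[0], neighbor[1] - base_xy[1])

    # nsmallest returns fewer than k items when fewer neighbors exist.
    return heapq.nsmallest(k, neighbors, key=dist)

neighbors = [(0.0, 0.0, "a"), (3.0, 4.0, "b"), (10.0, 0.0, "c")]
nearest = k_nearest((1.0, 0.0), neighbors, k=2)
print([n[2] for n in nearest])  # ['a', 'b']
```

Because every base geometry is compared against the same in-memory list, the size of the neighbors dataset drives the memory requirement noted above.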

Declared arguments

  • Base dataset - Base dataset to use in join.
    Table
  • Condition for columns to select on the left - All columns in the left input schema will be tested to see if they match this condition. If they match, the column will be selected in the output.
    ColumnPredicate
  • Condition for columns to select on the right - All columns in the right input schema will be tested to see if they match this condition. If they match, the column will be selected in the output.
    ColumnPredicate
  • Join key - The GeoJSON geometry column from the base dataset and the GeoPoint column from the neighbors dataset.
    Tuple<Column<Geometry>, Column<GeoPoint>>
  • K - The number of neighbors to select from the neighbors (right) dataset for each valid geometry in the base (left) dataset.
    Literal<Integer>
  • Neighbors dataset - Dataset of potential neighbors to use in join.
    Table
  • Projected coordinate system - Input geometries will be converted to this coordinate system prior to the join, and distance will be measured in the units of the given coordinate system. Formatted as "authority:code"; for example, UTM zone 18N is identified by EPSG:32618.
    Literal<String>
  • optional Prefix for columns from right - Prefix to add to all column names on the right hand side.
    Literal<String>
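The interaction between the two column conditions and the optional prefix can be sketched as follows. This is a minimal illustration in Python, not the transform's API: the function `output_columns` and its parameters are hypothetical, and the column names are taken from the examples on this page.

```python
def output_columns(left_cols, right_cols, keep_left, keep_right, right_prefix=None):
    """Compute the output column names for a join (illustrative only).

    keep_left / keep_right are predicates applied to every column name,
    mirroring the "condition for columns to select" arguments.
    """
    prefix = right_prefix or ""  # a null prefix leaves right-hand names unchanged
    selected_left = [c for c in left_cols if keep_left(c)]
    selected_right = [prefix + c for c in right_cols if keep_right(c)]
    return selected_left + selected_right

cols = output_columns(
    left_cols=["geometryCol", "lhsCol"],
    right_cols=["geometryCol", "col1", "arrayCol", "toDrop"],
    keep_left=lambda c: True,  # comparable to allColumns()
    keep_right=lambda c: c in {"geometryCol", "col1", "arrayCol"},
    right_prefix="rhs_",
)
print(cols)
# ['geometryCol', 'lhsCol', 'rhs_geometryCol', 'rhs_col1', 'rhs_arrayCol']
```

In this sketch the prefix keeps the two `geometryCol` columns distinct in the output, and `toDrop` is omitted because it does not match the right-hand condition.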

Examples

Example 1: Base case

Argument values:

  • Base dataset: ri.foundry.main.dataset.left
  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryCol, lhsCol],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryCol, col],
    )
  • Join key: (geometryCol, geometryCol)
  • K: 2
  • Neighbors dataset: ri.foundry.main.dataset.right
  • Projected coordinate system: epsg:2868
  • Prefix for columns from right: rhs_

Inputs:

ri.foundry.main.dataset.left

| geometryCol | lhsCol |
| --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 |

ri.foundry.main.dataset.right

| geometryCol | col |
| --- | --- |
| { latitude: 33.440609443703586, longitude: -112.14843750000001 } | rhsVal1 |
| { latitude: 33.44082430962016, longitude: -112.14560508728029 } | rhsVal2 |
| { latitude: 33.440895931474124, longitude: -112.11796760559083 } | rhsVal3 |

Output:

| geometryCol | lhsCol | rhs_geometryCol | rhs_col |
| --- | --- | --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 | { latitude: 33.440609443703586, longitude: -112.14843750000001 } | rhsVal1 |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 | { latitude: 33.44082430962016, longitude: -112.14560508728029 } | rhsVal2 |
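The ranking in this example can be checked by hand. The sketch below uses the haversine great-circle distance as a stand-in; the transform itself measures distance in the projected coordinate system (here epsg:2868), so the absolute numbers differ, but at this scale the ordering of the three neighbors should be the same. The helper name `haversine_m` and the Earth radius value are assumptions made for the illustration.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters on a sphere of radius 6,371 km."""
    r = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Base point and neighbors copied from the example (latitude, longitude).
base = (33.440609443703586, -112.14843750000001)
neighbors = [
    ("rhsVal1", 33.440609443703586, -112.14843750000001),
    ("rhsVal2", 33.44082430962016, -112.14560508728029),
    ("rhsVal3", 33.440895931474124, -112.11796760559083),
]

ranked = sorted(neighbors, key=lambda n: haversine_m(base[0], base[1], n[1], n[2]))
print([name for name, _, _ in ranked[:2]])  # ['rhsVal1', 'rhsVal2']
```

With K set to 2, the two closest neighbors are kept and `rhsVal3` is dropped, which matches the output table.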

Example 2: Empty base and neighbors datasets

Argument values:

  • Base dataset: ri.foundry.main.dataset.left
  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryColLhs, lhs-1],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryColRhs, rhs-1],
    )
  • Join key: (geometryColLhs, geometryColRhs)
  • K: 1
  • Neighbors dataset: ri.foundry.main.dataset.right
  • Projected coordinate system: epsg:2868
  • Prefix for columns from right: null

Inputs:

ri.foundry.main.dataset.left

| geometryColLhs | lhs-1 |
| --- | --- |

ri.foundry.main.dataset.right

| geometryColRhs | rhs-1 |
| --- | --- |

Output:

| geometryColLhs | lhs-1 | geometryColRhs | rhs-1 |
| --- | --- | --- | --- |

Example 3: Multiple base rows with identical geometries

Argument values:

  • Base dataset: ri.foundry.main.dataset.left
  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryCol, lhsCol],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryCol, col],
    )
  • Join key: (geometryCol, geometryCol)
  • K: 2
  • Neighbors dataset: ri.foundry.main.dataset.right
  • Projected coordinate system: epsg:2868
  • Prefix for columns from right: rhs_

Inputs:

ri.foundry.main.dataset.left

| geometryCol | lhsCol |
| --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 43.0 |

ri.foundry.main.dataset.right

| geometryCol | col |
| --- | --- |
| { latitude: 33.440609443703586, longitude: -112.14843750000001 } | rhsVal1 |
| { latitude: 33.44082430962016, longitude: -112.14560508728029 } | rhsVal2 |
| { latitude: 33.440895931474124, longitude: -112.11796760559083 } | rhsVal3 |

Output:

| geometryCol | lhsCol | rhs_geometryCol | rhs_col |
| --- | --- | --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 | { latitude: 33.440609443703586, longitude: -112.14843750000001 } | rhsVal1 |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 43.0 | { latitude: 33.440609443703586, longitude: -112.14843750000001 } | rhsVal1 |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 | { latitude: 33.44082430962016, longitude: -112.14560508728029 } | rhsVal2 |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 43.0 | { latitude: 33.44082430962016, longitude: -112.14560508728029 } | rhsVal2 |

Example 4: Duplicate points in the neighbors dataset

Argument values:

  • Base dataset: ri.foundry.main.dataset.left
  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryCol, lhsCol],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryCol, col],
    )
  • Join key: (geometryCol, geometryCol)
  • K: 3
  • Neighbors dataset: ri.foundry.main.dataset.right
  • Projected coordinate system: epsg:2868
  • Prefix for columns from right: rhs_

Inputs:

ri.foundry.main.dataset.left

| geometryCol | lhsCol |
| --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 |

ri.foundry.main.dataset.right

| geometryCol | col |
| --- | --- |
| { latitude: 33.440609443703586, longitude: -112.14843750000001 } | rhsVal1 |
| { latitude: 33.440609443703586, longitude: -112.14843750000001 } | rhsVal1 |
| { latitude: 33.44082430962016, longitude: -112.14560508728029 } | rhsVal2 |
| { latitude: 33.44082430962016, longitude: -112.14560508728029 } | rhsVal2 |
| { latitude: 33.440895931474124, longitude: -112.11796760559083 } | rhsVal3 |

Output:

| geometryCol | lhsCol | rhs_geometryCol | rhs_col |
| --- | --- | --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 | { latitude: 33.440609443703586, longitude: -112.14843750000001 } | rhsVal1 |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 | { latitude: 33.440609443703586, longitude: -112.14843750000001 } | rhsVal1 |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 | { latitude: 33.44082430962016, longitude: -112.14560508728029 } | rhsVal2 |

Example 5: Empty base dataset

Argument values:

  • Base dataset: ri.foundry.main.dataset.left
  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryColLhs, lhs-1],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryColRhs, rhs-1],
    )
  • Join key: (geometryColLhs, geometryColRhs)
  • K: 1
  • Neighbors dataset: ri.foundry.main.dataset.right
  • Projected coordinate system: epsg:2868
  • Prefix for columns from right: null

Inputs:

ri.foundry.main.dataset.left

| geometryColLhs | lhs-1 |
| --- | --- |

ri.foundry.main.dataset.right

| geometryColRhs | rhs-1 |
| --- | --- |
| { latitude: 33.44082430962016, longitude: -112.14560508728029 } | null |
| { latitude: 33.440895931474124, longitude: -112.11796760559083 } | rhsVal2 |
| null | rhsVal3 |

Output:

| geometryColLhs | lhs-1 | geometryColRhs | rhs-1 |
| --- | --- | --- | --- |

Example 6: Empty neighbors dataset

Argument values:

  • Base dataset: ri.foundry.main.dataset.left
  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryColLhs, lhs-1],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryColRhs, rhs-1],
    )
  • Join key: (geometryColLhs, geometryColRhs)
  • K: 1
  • Neighbors dataset: ri.foundry.main.dataset.right
  • Projected coordinate system: epsg:2868
  • Prefix for columns from right: null

Inputs:

ri.foundry.main.dataset.left

| geometryColLhs | lhs-1 |
| --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 |
| {"coordinates": [-112.14560508728029,33.44082430962016], "type":"Point"} | 43.0 |

ri.foundry.main.dataset.right

| geometryColRhs | rhs-1 |
| --- | --- |

Output:

| geometryColLhs | lhs-1 | geometryColRhs | rhs-1 |
| --- | --- | --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 | null | null |
| {"coordinates": [-112.14560508728029,33.44082430962016], "type":"Point"} | 43.0 | null | null |

Example 7: K larger than the number of neighbors

Argument values:

  • Base dataset: ri.foundry.main.dataset.left
  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryCol, lhsCol],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryCol, col1, arrayCol],
    )
  • Join key: (geometryCol, geometryCol)
  • K: 5
  • Neighbors dataset: ri.foundry.main.dataset.right
  • Projected coordinate system: epsg:4326
  • Prefix for columns from right: rhs_

Inputs:

ri.foundry.main.dataset.left

| geometryCol | lhsCol |
| --- | --- |
| {"coordinates": [[[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [0.0, 0.0]]], "type": "Polygon"} | 42.0 |

ri.foundry.main.dataset.right

| geometryCol | col1 | arrayCol | toDrop |
| --- | --- | --- | --- |
| { latitude: 33.440609443703586, longitude: -112.14843750000001 } | rhsVal1 | [ 0.0, 1.1 ] | 1.0 |
| { latitude: 33.44082430962016, longitude: -112.14560508728029 } | rhsVal2 | [ 0.0, 1.1 ] | 1.0 |
| { latitude: 33.440895931474124, longitude: -112.11796760559083 } | rhsVal3 | [ 0.0, 1.1 ] | 1.0 |

Output:

| geometryCol | lhsCol | rhs_geometryCol | rhs_col1 | rhs_arrayCol |
| --- | --- | --- | --- | --- |
| {"coordinates": [[[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [0.0, 0.0]]], "type": "Polygon"} | 42.0 | { latitude: 33.440609443703586, longitude: -112.14843750000001 } | rhsVal1 | [ 0.0, 1.1 ] |
| {"coordinates": [[[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [0.0, 0.0]]], "type": "Polygon"} | 42.0 | { latitude: 33.44082430962016, longitude: -112.14560508728029 } | rhsVal2 | [ 0.0, 1.1 ] |
| {"coordinates": [[[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [0.0, 0.0]]], "type": "Polygon"} | 42.0 | { latitude: 33.440895931474124, longitude: -112.11796760559083 } | rhsVal3 | [ 0.0, 1.1 ] |

Example 8: Null geometries

Argument values:

  • Base dataset: ri.foundry.main.dataset.left
  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryColLhs, lhs-1],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryColRhs, rhs-1],
    )
  • Join key: (geometryColLhs, geometryColRhs)
  • K: 1
  • Neighbors dataset: ri.foundry.main.dataset.right
  • Projected coordinate system: epsg:2868
  • Prefix for columns from right: null

Inputs:

ri.foundry.main.dataset.left

| geometryColLhs | lhs-1 |
| --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 |
| null | 43.0 |

ri.foundry.main.dataset.right

| geometryColRhs | rhs-1 |
| --- | --- |
| { latitude: 33.44082430962016, longitude: -112.14560508728029 } | null |
| { latitude: 33.440895931474124, longitude: -112.11796760559083 } | rhsVal2 |
| null | rhsVal3 |

Output:

| geometryColLhs | lhs-1 | geometryColRhs | rhs-1 |
| --- | --- | --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 | { latitude: 33.44082430962016, longitude: -112.14560508728029 } | null |
| null | 43.0 | null | null |
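The null handling shown here, together with the empty-neighbors case in Example 6, follows the usual left-join pattern: every base row appears in the output, and right-hand values are null when no neighbor can be matched. The sketch below is a hypothetical model of that behavior in Python, not the transform's implementation. It treats (longitude, latitude) pairs as planar coordinates purely for illustration, and the function name `knn_left_join` is an assumption.

```python
import math

def knn_left_join(base_rows, neighbor_rows, k):
    """Left-join each base row to its k nearest neighbors (illustrative only).

    base_rows, neighbor_rows -- lists of (xy_or_None, value) tuples
    Returns rows of (base_xy, base_value, neighbor_xy, neighbor_value).
    """
    # Neighbors without a location can never be matched.
    candidates = [n for n in neighbor_rows if n[0] is not None]
    out = []
    for geom, value in base_rows:
        if geom is None or not candidates:
            # Keep the base row and fill the right-hand side with nulls.
            out.append((geom, value, None, None))
            continue
        ranked = sorted(candidates, key=lambda n: math.dist(geom, n[0]))
        for n_geom, n_value in ranked[:k]:
            out.append((geom, value, n_geom, n_value))
    return out

base_rows = [((-112.14843750000001, 33.440609443703586), 42.0), (None, 43.0)]
neighbor_rows = [
    ((-112.14560508728029, 33.44082430962016), None),
    ((-112.11796760559083, 33.440895931474124), "rhsVal2"),
    (None, "rhsVal3"),
]
for row in knn_left_join(base_rows, neighbor_rows, k=1):
    print(row)
```

In this model the first base row is matched to the closer of the two located neighbors, whose `rhs-1` value happens to be null, and the base row with a null geometry is retained with null right-hand columns.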

Example 9: No columns selected from the right

Argument values:

  • Base dataset: ri.foundry.main.dataset.left
  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [geometryColLhs, lhs-1],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [],
    )
  • Join key: (geometryColLhs, geometryColRhs)
  • K: 1
  • Neighbors dataset: ri.foundry.main.dataset.right
  • Projected coordinate system: epsg:2868
  • Prefix for columns from right: null

Inputs:

ri.foundry.main.dataset.left

| geometryColLhs | lhs-1 |
| --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 |
| {"coordinates": [-112.14560508728029,33.44082430962016], "type":"Point"} | 43.0 |

ri.foundry.main.dataset.right

| geometryColRhs | rhs-1 |
| --- | --- |
| { latitude: 33.44082430962016, longitude: -112.14560508728029 } | null |
| { latitude: 33.440895931474124, longitude: -112.11796760559083 } | rhsVal2 |
| null | rhsVal3 |

Output:

| geometryColLhs | lhs-1 |
| --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 |
| {"coordinates": [-112.14560508728029,33.44082430962016], "type":"Point"} | 43.0 |

Example 10: No columns selected from the left

Argument values:

  • Base dataset: ri.foundry.main.dataset.left
  • Condition for columns to select on the left:
    columnNameIsIn(
     columnNames: [],
    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryColRhs, rhs-1],
    )
  • Join key: (geometryColLhs, geometryColRhs)
  • K: 1
  • Neighbors dataset: ri.foundry.main.dataset.right
  • Projected coordinate system: epsg:2868
  • Prefix for columns from right: null

Inputs:

ri.foundry.main.dataset.left

| geometryColLhs | lhs-1 |
| --- | --- |
| {"coordinates": [-112.14843750000001,33.440609443703586], "type":"Point"} | 42.0 |
| {"coordinates": [-112.14560508728029,33.44082430962016], "type":"Point"} | 43.0 |

ri.foundry.main.dataset.right

| geometryColRhs | rhs-1 |
| --- | --- |
| { latitude: 33.44082430962016, longitude: -112.14560508728029 } | null |
| { latitude: 33.440895931474124, longitude: -112.11796760559083 } | rhsVal2 |
| null | rhsVal3 |

Output:

| geometryColRhs | rhs-1 |
| --- | --- |
| { latitude: 33.44082430962016, longitude: -112.14560508728029 } | null |
| { latitude: 33.44082430962016, longitude: -112.14560508728029 } | null |

Example 11: Mixed geometry types

Argument values:

  • Base dataset: ri.foundry.main.dataset.left
  • Condition for columns to select on the left:
    allColumns(

    )
  • Condition for columns to select on the right:
    columnNameIsIn(
     columnNames: [geometryCol, col1, arrayCol],
    )
  • Join key: (geometryCol, geometryCol)
  • K: 1
  • Neighbors dataset: ri.foundry.main.dataset.right
  • Projected coordinate system: epsg:4326
  • Prefix for columns from right: rhs_

Inputs:

ri.foundry.main.dataset.left

| geometryCol | lhsCol |
| --- | --- |
| {"coordinates": [[[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [0.0, 0.0]]], "type": "Polygon"} | 42.0 |
| {"coordinates": [55.0, 5.0], "type":"Point"} | 43.0 |
| {"coordinates": [[40.0, 0.0], [0.0, 40.0]], "type":"LineString"} | 44.0 |
| {"coordinates": [[[20.0, 10.0], [27.0, 10.0], [27.0, 17.0], [20.0, 17.0], [20.0, 10.0]]], "type": "Polygon"} | 45.0 |
| {"coordinates": [[[21.0, 21.0], [27.0, 21.0], [27.0, 27.0], [21.0, 27.0], [21.0, 21.0]]], "type": "Polygon"} | 46.0 |
| {"coordinates": [[[[2.0, 2.0], [7.0, 2.0], [7.0, 7.0], [2.0, 7.0], [2.0, 2.0]]], [[[12.0, 12.0], [17.0, 12.0], [17.0, 17.0], [12.0, 17.0], [12.0, 12.0]]]], "type":"MultiPolygon"} | 47.0 |
| {"coordinates": [[[[170.0, 170.0], [190.0, 170.0], [190.0, 190.0], [170.0, 190.0], [170.0, 170.0]]], [[[12.0, 12.0], [17.0, 12.0], [17.0, 17.0], [12.0, 17.0], [12.0, 12.0]]]], "type":"MultiPolygon"} | 48.0 |

ri.foundry.main.dataset.right

| geometryCol | col1 | arrayCol | toDrop |
| --- | --- | --- | --- |
| { latitude: 5.0, longitude: 5.0 } | rhsVal1 | [ 0.0, 1.1 ] | 1.0 |
| { latitude: 100.0, longitude: 100.0 } | rhsVal2 | [ 0.0, 1.1 ] | 1.0 |

Output:

| geometryCol | lhsCol | rhs_geometryCol | rhs_col1 | rhs_arrayCol |
| --- | --- | --- | --- | --- |
| {"coordinates": [[[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [0.0, 0.0]]], "type": "Polygon"} | 42.0 | { latitude: 5.0, longitude: 5.0 } | rhsVal1 | [ 0.0, 1.1 ] |
| {"coordinates": [55.0, 5.0], "type":"Point"} | 43.0 | { latitude: 5.0, longitude: 5.0 } | rhsVal1 | [ 0.0, 1.1 ] |
| {"coordinates": [[40.0, 0.0], [0.0, 40.0]], "type":"LineString"} | 44.0 | { latitude: 5.0, longitude: 5.0 } | rhsVal1 | [ 0.0, 1.1 ] |
| {"coordinates": [[[20.0, 10.0], [27.0, 10.0], [27.0, 17.0], [20.0, 17.0], [20.0, 10.0]]], "type": "Polygon"} | 45.0 | { latitude: 5.0, longitude: 5.0 } | rhsVal1 | [ 0.0, 1.1 ] |
| {"coordinates": [[[21.0, 21.0], [27.0, 21.0], [27.0, 27.0], [21.0, 27.0], [21.0, 21.0]]], "type": "Polygon"} | 46.0 | { latitude: 5.0, longitude: 5.0 } | rhsVal1 | [ 0.0, 1.1 ] |
| {"coordinates": [[[[2.0, 2.0], [7.0, 2.0], [7.0, 7.0], [2.0, 7.0], [2.0, 2.0]]], [[[12.0, 12.0], [17.0, 12.0], [17.0, 17.0], [12.0, 17.0], [12.0, 12.0]]]], "type":"MultiPolygon"} | 47.0 | { latitude: 5.0, longitude: 5.0 } | rhsVal1 | [ 0.0, 1.1 ] |
| {"coordinates": [[[[170.0, 170.0], [190.0, 170.0], [190.0, 190.0], [170.0, 190.0], [170.0, 170.0]]], [[[12.0, 12.0], [17.0, 12.0], [17.0, 17.0], [12.0, 17.0], [12.0, 12.0]]]], "type":"MultiPolygon"} | 48.0 | { latitude: 5.0, longitude: 5.0 } | rhsVal1 | [ 0.0, 1.1 ] |
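For non-point geometries, the distance to a neighbor is naturally the shortest distance from the point to any part of the geometry, which is zero when the point lies inside a polygon. The sketch below is one way to compute that planar distance for the GeoJSON types used in this example. It is an assumption-laden illustration rather than the transform's implementation: all function names are hypothetical, coordinates are treated as planar (x, y) = (longitude, latitude) values as they would be with epsg:4326, and a production system would typically rely on a geometry library instead.

```python
import math

def _point_segment(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its endpoints.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def _point_path(p, coords):
    """Distance from p to a polyline or ring given as a list of vertices."""
    return min(_point_segment(p, a, b) for a, b in zip(coords, coords[1:]))

def _inside_ring(p, ring):
    """Ray-casting point-in-ring test."""
    x, y = p
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def _point_polygon(p, rings):
    """rings[0] is the exterior ring; any further rings are holes."""
    if _inside_ring(p, rings[0]) and not any(_inside_ring(p, h) for h in rings[1:]):
        return 0.0
    return min(_point_path(p, r) for r in rings)

def distance(geometry, p):
    """Planar distance from a GeoJSON-like geometry dict to point p."""
    kind, coords = geometry["type"], geometry["coordinates"]
    if kind == "Point":
        return math.hypot(p[0] - coords[0], p[1] - coords[1])
    if kind == "LineString":
        return _point_path(p, coords)
    if kind == "Polygon":
        return _point_polygon(p, coords)
    if kind == "MultiPolygon":
        return min(_point_polygon(p, poly) for poly in coords)
    raise ValueError(f"unsupported geometry type: {kind}")

near, far = (5.0, 5.0), (100.0, 100.0)  # the two neighbors as (x, y)
square = {"type": "Polygon",
          "coordinates": [[[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0], [0.0, 0.0]]]}
line = {"type": "LineString", "coordinates": [[40.0, 0.0], [0.0, 40.0]]}
print(distance(square, near))          # 0.0, the point lies inside the polygon
print(round(distance(line, near), 3))  # 21.213
print(distance(line, near) < distance(line, far))  # True
```

Under these assumptions, the neighbor at (5.0, 5.0) is closer than the one at (100.0, 100.0) for every base geometry in the example, which is consistent with `rhsVal1` being selected in each output row.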