Parse JSON string

Supported in: Batch, Streaming

Parses a JSON string according to the given schema definition, ignoring any fields that are not in the schema.

Expression categories: Data preparation, Popular, String, Struct

Declared arguments

  • JSON - JSON to be parsed using the schema.
    Expression<String>
  • Schema - Schema definition used when parsing the JSON strings.
    Type<Array<AnyType> | Map<String, String> | Struct>
  • optional Output mode - In 'simple' mode, fields that fail to parse become null. In 'with errors' mode, the result is a struct with two fields: 'error', which holds any error found during parsing, and 'ok', which holds the parsed value when parsing succeeds.
    Enum<Simple, With errors>

Output type: Array<AnyType> | Map<String, String> | Struct
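
The two output modes can be illustrated with a minimal Python sketch. This is not the expression's actual implementation: the nested-dict schema encoding and the names `project` and `parse_json_string` are hypothetical, type checking is omitted, and returning null for malformed input in simple mode is an assumption. The error messages are the ones shown in the examples on this page.

```python
import json
from typing import Any, Optional

# Hypothetical schema encoding for this sketch: a dict maps each field name
# to None (a leaf value) or to a nested dict (a nested struct).
Schema = dict


def project(value: Any, schema: Optional[Schema]) -> Any:
    """Keep only the fields named in the schema; missing fields become None."""
    if schema is None or value is None:
        return value
    if not isinstance(value, dict):
        return None  # assumption: a non-object where a struct is expected becomes null
    return {name: project(value.get(name), sub) for name, sub in schema.items()}


def parse_json_string(raw: Optional[str], schema: Schema, with_errors: bool = False) -> Any:
    """Parse raw JSON against the schema in 'simple' or 'with errors' mode."""
    error, ok = None, None
    if raw is None or raw.strip() == "":
        error = "JSON input is null or empty"
    else:
        try:
            ok = project(json.loads(raw), schema)
        except json.JSONDecodeError:
            error = "The JSON content is invalid or malformed"
    if with_errors:
        return {"error": error, "ok": ok}
    return ok  # simple mode: a failed parse collapses to None (assumption)


schema = {"airline": None, "airport": {"id": None, "miles": None}}
print(parse_json_string('{"airline": "XB-112", "airport": {"id": "JFK"}, "gate": 7}', schema))
print(parse_json_string("invalid", schema, with_errors=True))
```

In this sketch the undeclared `gate` field is dropped and the missing `miles` field comes back as None, mirroring the behavior the examples below describe.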

Examples

Example 1: Base case

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, miles>>
  • Output mode: null

json:
{
 "airline": "XB-112",
 "airport": {
  "id": "JFK",
  "miles": 2000
 }
}

Output:
{
 airline: XB-112,
 airport: {
  id: JFK,
  miles: 2000,
 },
}

Example 2: Base case

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, miles>>
  • Output mode: WITH_ERRORS

json:
{
 "airline": "XB-112",
 "airport": {
  "id": "JFK",
  "miles": 2000
 }
}

Output:
{
 error: null,
 ok: {
  airline: XB-112,
  airport: {
   id: JFK,
   miles: 2000,
  },
 },
}

Example 3: Null case

Description: When a requested field is missing from the input JSON, the field becomes null.

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, miles>>
  • Output mode: null

json:
{
 "airline": "XB-112",
 "airport": {
  "id": "JFK"
 }
}

Output:
{
 airline: XB-112,
 airport: {
  id: JFK,
  miles: null,
 },
}

Example 4: Null case

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, miles>>
  • Output mode: null

json:
null

Output:
null

Example 5: Null case

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, miles>>
  • Output mode: WITH_ERRORS

json:
null

Output:
{
 error: JSON input is null or empty,
 ok: null,
}

Example 6: Null case

Description: When a requested field is null in the input JSON, the field becomes null.

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, miles>>
  • Output mode: null

json:
{
 "airline": "XB-112",
 "airport": {
  "id": "JFK",
  "miles": null
 }
}

Output:
{
 airline: XB-112,
 airport: {
  id: JFK,
  miles: null,
 },
}

Example 7: Null case

Description: A struct field can be an array.

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, countries<String>>>
  • Output mode: null

json:
{
 "airline": "XB-112",
 "airport": {
  "id": "JFK",
  "countries": ["USA", "Canada"]
 }
}

Output:
{
 airline: XB-112,
 airport: {
  countries: [ USA, Canada ],
  id: JFK,
 },
}

Example 8: Null case

Description: A struct field can be an empty string.

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, countries<String>>>
  • Output mode: null

json:
{
 "airline": "XB-112",
 "airport": {
  "id": "",
  "countries": ["USA", "Canada"]
 }
}

Output:
{
 airline: XB-112,
 airport: {
  countries: [ USA, Canada ],
  id: empty string,
 },
}

Example 9: Null case

Description: A struct field can be an array that contains a null element.

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, countries<String>>>
  • Output mode: null

json:
{
 "airline": "XB-112",
 "airport": {
  "id": "JFK",
  "countries": ["USA", null]
 }
}

Output:
{
 airline: XB-112,
 airport: {
  countries: [ USA, null ],
  id: JFK,
 },
}

Example 10: Null case

Description: A string field of a struct can be null.

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, countries<String>>>
  • Output mode: null

json:
{
 "airline": "XB-112",
 "airport": {
  "id": null,
  "countries": ["USA", "Canada"]
 }
}

Output:
{
 airline: XB-112,
 airport: {
  countries: [ USA, Canada ],
  id: null,
 },
}

Example 11: Null case

Description: A struct field can be a map.

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, countries<String, Integer>>>
  • Output mode: null

json:
{
 "airline": "XB-112",
 "airport": {
  "id": "JFK",
  "countries": {"USA": 4}
 }
}

Output:
{
 airline: XB-112,
 airport: {
  countries: {
   USA -> 4,
  },
  id: JFK,
 },
}

Example 12: Null case

Description: Parse a struct with a double field.

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, miles>>
  • Output mode: null

json:
{
 "airline": "XB-112",
 "airport": {
  "id": "JFK",
  "miles": 4.2
 }
}

Output:
{
 airline: XB-112,
 airport: {
  id: JFK,
  miles: 4.2,
 },
}

Example 13: Null case

Description: Integer values parsed into a double field are returned as doubles.

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, miles>>
  • Output mode: null

json:
{
 "airline": "XB-112",
 "airport": {
  "id": "JFK",
  "miles": 4
 }
}

Output:
{
 airline: XB-112,
 airport: {
  id: JFK,
  miles: 4.0,
 },
}
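
The integer-to-double widening shown in Examples 12 and 13 can be sketched in Python as follows. The helper name `as_double` is hypothetical, and the sketch assumes the `miles` field is declared as a double in the schema; it illustrates the coercion only and is not the expression's implementation.

```python
import json


def as_double(value):
    """Widen a decoded JSON number to a float; pass nulls through unchanged."""
    if value is None:
        return None
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError("The JSON content does not match the expected type")
    return float(value)


record = json.loads('{"airline": "XB-112", "airport": {"id": "JFK", "miles": 4}}')
miles = as_double(record["airport"]["miles"])
print(miles)  # the JSON integer 4 is widened to a float
```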

Example 14: Null case

Description: When a map in the input has a null value, the corresponding map entry in the resulting struct is null.

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, countries<String, Integer>>>
  • Output mode: null

json:
{
 "airline": "XB-112",
 "airport": {
  "id": "JFK",
  "countries": {"USA": null}
 }
}

Output:
{
 airline: XB-112,
 airport: {
  countries: {
   USA -> null,
  },
  id: JFK,
 },
}

Example 15: Edge case

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, miles>>
  • Output mode: WITH_ERRORS

json:
invalid

Output:
{
 error: The JSON content is invalid or malformed,
 ok: null,
}

Example 16: Edge case

Argument values:

  • JSON: json
  • Schema: Struct<boolVal, byteVal, shortVal, intVal, longVal, doubleVal, floatVal, dateVal, timestampVal, decimalVal(9, 4), myMap<String, Integer>, myArray<Integer>>
  • Output mode: WITH_ERRORS

json:
{
 "timestampVal": "This is a string."
}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}

json:
{"boolVal": 5}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}

json:
{"byteVal": "This is not a byte."}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}

json:
{"shortVal": "This is not a short."}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}

json:
{"longVal": "This is not a long."}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}

json:
{"intVal": 5.2}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}

json:
{"doubleVal": "This is not a double."}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}

json:
{"floatVal": "This is not a float."}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}

json:
{"dateVal": "32/13/2020"}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}

json:
{"decimalVal": "This is not a decimal."}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}

json:
{"myMap": {"a": "str"}}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}

json:
{"myArray": ["a", "b"]}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}
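
In each row of Example 16, a single field of the wrong type makes the whole result fail: 'ok' is null and 'error' carries the mismatch message. The Python sketch below mimics that all-or-nothing check for a few of the types. The type names (`"boolean"`, `"integer"`, and so on), the tuple encoding for arrays and maps, and the functions `matches` and `parse_with_errors` are hypothetical, and the sketch assumes the input is already a well-formed JSON object.

```python
import json

MISMATCH = "The JSON content does not match the expected type"


def matches(value, expected):
    """Return True when a decoded JSON value fits the expected type."""
    if value is None:
        return True  # nulls are accepted for every type in this sketch
    if expected == "boolean":
        return isinstance(value, bool)
    if expected == "integer":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)
    if expected == "double":
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if expected == "string":
        return isinstance(value, str)
    if isinstance(expected, tuple) and expected[0] == "array":
        return isinstance(value, list) and all(matches(v, expected[1]) for v in value)
    if isinstance(expected, tuple) and expected[0] == "map":
        return isinstance(value, dict) and all(matches(v, expected[1]) for v in value.values())
    raise ValueError(f"unsupported type in this sketch: {expected!r}")


def parse_with_errors(raw, schema):
    """Return {'error', 'ok'}; any mismatched field fails the whole record."""
    record = json.loads(raw)
    for name, expected in schema.items():
        if not matches(record.get(name), expected):
            return {"error": MISMATCH, "ok": None}
    return {"error": None, "ok": {name: record.get(name) for name in schema}}


schema = {
    "boolVal": "boolean",
    "intVal": "integer",
    "myMap": ("map", "integer"),
    "myArray": ("array", "integer"),
}
print(parse_with_errors('{"intVal": 5.2}', schema))
print(parse_with_errors('{"boolVal": true}', schema))
```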

Example 17: Edge case

Argument values:

  • JSON: json
  • Schema: Struct<airport<id, miles>, boolVal>
  • Output mode: WITH_ERRORS

json:
{
 "boolVal": true
}

Output:
{
 error: null,
 ok: {
  airport: null,
  boolVal: true,
 },
}

json:
{
 "boolVal": "This is a string."
}

Output:
{
 error: The JSON content does not match the expected type,
 ok: null,
}

Example 18: Edge case

Argument values:

  • JSON: json
  • Schema: Struct<airline, airport<id, miles>>
  • Output mode: WITH_ERRORS

json:
{
 "arrival_time":

Output:
{
 error: There was an unexpected EOF while parsing the JSON,
 ok: null,
}