Drop duplicates

Supported in: Batch

Drops duplicate rows from the input.

Transform categories: Other

Declared arguments

  • Dataset - Dataset whose rows will be deduplicated.
    Table
  • optional Column subset - If any columns are specified, only those columns are used to determine uniqueness. If none are specified, rows are compared on all columns.
    Set<Column<AnyType>>

Examples

Example 1: Base case

Argument values:

  • Dataset: ri.foundry.main.dataset.aggregate
  • Column subset: {tail_number}

Input:

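| tail_number | airline         | miles | factor |
|-------------|-----------------|-------|--------|
| XB-123      | foundry air     | 124   | 2      |
| MT-222      | new airline     | 1123  | 5      |
| XB-123      | foundry airline | 335   | 5      |
| MT-222      | new air         | 565   | 4      |
| KK-452      | new air         | 222   | 1      |
| XB-123      | foundry airline | 1134  | 3      |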

Output:

| tail_number | airline     | miles | factor |
|-------------|-------------|-------|--------|
| XB-123      | foundry air | 124   | 2      |
| MT-222      | new airline | 1123  | 5      |
| KK-452      | new air     | 222   | 1      |

Example 2: Base case

Description: With no column subset, only rows that match in every column are treated as duplicates.

Argument values:

  • Dataset: ri.foundry.main.dataset.aggregate
  • Column subset: {}

Input:

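| tail_number | airline     | miles | factor |
|-------------|-------------|-------|--------|
| XB-123      | foundry air | 124   | 2      |
| XB-123      | foundry air | 124   | 2      |
| XB-123      | foundry air | 124   | 2      |
| MT-222      | new airline | 1123  | 5      |
| MT-222      | new airline | 1123  | 5      |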

Output:

| tail_number | airline     | miles | factor |
|-------------|-------------|-------|--------|
| XB-123      | foundry air | 124   | 2      |
| MT-222      | new airline | 1123  | 5      |
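
The two examples above can be approximated outside the transform with a short, plain-Python sketch. It is only an illustration of the semantics, not the transform's implementation: the function name `drop_duplicates`, the `subset` parameter, and the "first row seen wins" rule are assumptions chosen to match the example outputs, and the page does not state which duplicate a real batch run keeps. The sample rows are modelled on Example 1.

```python
from typing import Dict, Hashable, Iterable, List, Optional, Sequence

Row = Dict[str, Hashable]


def drop_duplicates(rows: Iterable[Row],
                    subset: Optional[Sequence[str]] = None) -> List[Row]:
    """Keep the first row seen for each distinct key (hypothetical helper).

    Assumption: keeping the first occurrence mirrors the example outputs;
    the actual transform may not guarantee which duplicate survives.
    """
    seen = set()
    result = []
    for row in rows:
        # An empty or missing subset means every column is compared.
        columns = subset if subset else sorted(row)
        key = tuple(row[column] for column in columns)
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


columns = ("tail_number", "airline", "miles", "factor")
flights = [dict(zip(columns, values)) for values in [
    ("XB-123", "foundry air", 124, 2),
    ("MT-222", "new airline", 1123, 5),
    ("XB-123", "foundry airline", 335, 5),
    ("MT-222", "new air", 565, 4),
    ("KK-452", "new air", 222, 1),
    ("XB-123", "foundry airline", 1134, 3),
]]

# Example 1: uniqueness is judged on tail_number only.
by_tail = drop_duplicates(flights, subset=["tail_number"])
print([row["tail_number"] for row in by_tail])  # ['XB-123', 'MT-222', 'KK-452']

# Example 2: an empty subset compares whole rows, so only exact copies collapse.
repeated = [flights[0]] * 3 + [flights[1]] * 2
print(len(drop_duplicates(repeated, subset=[])))  # 2
```

Note that with a column subset, rows that differ in other columns (here airline, miles, and factor) still collapse into one, so the values retained for those columns depend on which row is kept.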