Migrating from HyperAuto V1 to V2

With the general availability of HyperAuto V2, V1 will be sunset. The creation of new V1 pipelines is discouraged, and over the next 12 months, users should migrate from HyperAuto V1 to V2 as detailed below. HyperAuto V2 is a significant upgrade from HyperAuto V1 and offers enhanced performance and functionality, including:

  • An easier configuration process using a point-and-click wizard.
  • The ability to generate Pipeline Builder pipelines, which provides full transparency on how the data is being processed, a comprehensive change management workflow, and significant performance improvements.
  • Real-time streaming of input data, enabling time critical operational applications.

Notable feature differences between HyperAuto V1 and V2

Significant HyperAuto V2 updates and changes are described below.

Source-type support

As of 29 April 2024, HyperAuto V2 only supports SAP data. Users of V1 with SAP data are strongly encouraged to start migrating their usage to V2 (see Getting started).

Multi-source support

In HyperAuto V1, users could connect a single pipeline to multiple sources and perform a wide union at the end. However, this approach could produce unexpected results and is now discouraged. In particular, primary and foreign keys which were using the source as a prefix could break if a source name change occurred.

In HyperAuto V2, each pipeline can only be connected to one source. As a consequence, the source column is not produced in output datasets and is no longer used as a prefix in primary or foreign keys generation.

Users requiring this feature are encouraged to re-implement a pipeline performing a union downstream of HyperAuto V2.

Foreign key generation

HyperAuto V1 implemented a permissive approach to foreign key generation, which often resulted in the creation of foreign keys between tables that did not accurately reflect the underlying data relationships, leading to potential inaccuracies and misleading interpretations.

The logic to generate keys in HyperAuto V2 has been updated to use a more conservative approach to improve accuracy; as such, the list of foreign key columns is different. If you believe that a foreign key has been mistakenly omitted in V2, contact your Palantir representative.

Column renaming

HyperAuto V2 uses richer metadata to rename columns, which may generate different column names in output datasets compared to HyperAuto V1.

Custom cleaning functions

HyperAuto V2 does not support the implementation of custom cleaning functions to be applied as part of the pipeline. Users are advised to create a pipeline downstream of HyperAuto to implement their custom logic.

Batch unioning of inputs

HyperAuto V2 does not support the configuration of multiple syncs linking to the same output table (known as batch union components in V1). Users are advised to union their inputs prior to HyperAuto V2, and then configure a folder-based pipeline to consume from HyperAuto.

Migrating existing HyperAuto V1 pipelines to HyperAuto V2

Users are encouraged to gradually migrate their pipelines from HyperAuto V1 to V2 by:

  1. Creating a new HyperAuto V2 pipeline that replicates existing V1 configurations and consumes the same inputs.
  2. Identifying downstream consumers of the pipeline (repositories, analyses, applications) and gradually pointing them to the new HyperAuto V2 outputs.

In cases when a decision has been made to not migrate to HyperAuto V2, existing V1 repositories will be left intact but “severed” from the original template. This means that the repository will be converted to a regular Python Transforms repository and will be owned by users just like any other custom repository.

After severing a HyperAuto V1 repository from the original template, the automatic pull request creation process will be discontinued, and users will have to manually create pull requests to update their V1 configurations.