The Directory connector is a sunset connector documented here for historical reference. It only works with an agent worker and cannot be used with a Foundry worker.
We recommend using alternative file-sharing connectors, such as SFTP, SMB, or FTP, whenever they are available. If the files can only be accessed via the host itself, we recommend using external transforms with a REST API source instead of a Directory source.
The Directory connector allows you to ingest files located directly on the host where a Data Connection agent is running. This connector is useful for scenarios where files are generated or stored locally on the agent machine and need to be synced into Foundry.
| Capability | Status |
|---|---|
| Exploration | 🟡 Sunset |
| Batch syncs | 🟡 Sunset |
| Incremental | 🟡 Sunset |
The connector can transfer files of any type into Foundry datasets. File formats are preserved, and no schemas are applied during or after the transfer. Apply any necessary schema to the output dataset, or write a downstream transformation to access the data.
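Because no schema is applied during transfer, a downstream transformation typically parses the raw file bytes itself. The sketch below illustrates the parsing step for a CSV file using only the Python standard library; the function name and column names are illustrative, and in Foundry this logic would live inside a transform that reads the raw files from the ingested dataset.

```python
import csv
import io


def parse_raw_csv(file_bytes: bytes) -> list[dict]:
    """Parse raw CSV bytes (as synced, schema-free) into a list of row dicts.

    Illustrative only: column names and encoding are assumptions about
    the synced files, not something the connector guarantees.
    """
    text = file_bytes.decode("utf-8")
    reader = csv.DictReader(io.StringIO(text))
    return list(reader)


rows = parse_raw_csv(b"id,name\n1,alpha\n2,beta\n")
```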
Learn more about setting up a connector in Foundry.
| Option | Required? | Description |
|---|---|---|
| Root directory | Yes | The directory on the agent host that will be used as the starting directory for all requests via this connection. |
The Directory connector uses the file-based sync interface.
For more flexibility and control, you can ingest files from an agent host using external transforms. This approach allows you to run the sync logic on a Foundry worker while still accessing files on a remote agent host.
This approach requires that the agent host accept connections on port 22 (SSH). The following example demonstrates how to connect to an agent host via SSH and read files into a Foundry dataset using the Paramiko ↗ Python library.
```python
from transforms.api import transform, Output, Input, LightweightOutput, LightweightInput, lightweight
from transforms.external.systems import external_systems, Source, ResolvedSource
import paramiko


@lightweight
@external_systems(
    agent_source=Source("<source_rid>")  # Replace with your REST API source RID
)
@transform(
    output_dataset=Output("<output_dataset_rid>"),  # Replace with your output dataset RID
    files_to_read=Input("<input_dataset_rid>"),  # Dataset containing file paths to read
)
def compute(
    agent_source: ResolvedSource,
    output_dataset: LightweightOutput,
    files_to_read: LightweightInput,
):
    """
    Read files from a remote agent host via SSH and write them to a Foundry dataset.
    """
    # 1. SSH connection setup
    hostname = "<agent_hostname>"  # Replace with your agent hostname
    username = "<ssh_username>"  # Replace with your SSH username
    password = agent_source.get_secret("<password_secret_name>")  # Replace with your secret name

    # 2. Establish SSH connection
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(hostname, username=username, password=password)

    # 3. Read file paths from input dataset
    remote_file_paths = files_to_read.pandas()["remote_file_path"].tolist()

    # 4. Open SFTP connection
    sftp = client.open_sftp()

    # 5. Read each file and write to output dataset
    for remote_path in remote_file_paths:
        with sftp.open(remote_path, "rb") as remote_file:
            file_binary_data = remote_file.read()

        # Extract filename from path and write to output
        filename = remote_path.split("/")[-1]
        with output_dataset.filesystem().open(filename, "wb") as f:
            f.write(file_binary_data)

    # 6. Close connections
    sftp.close()
    client.close()
```
Ensure that the paramiko library is installed in your Python transforms repository. You can add it via the Libraries tab in the left side panel of your code repository.
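As a side note on the filename handling in the example above: `remote_path.split("/")[-1]` works because remote agent paths are POSIX-style, but the standard library's `posixpath.basename` expresses the same intent more explicitly and behaves correctly regardless of the operating system the transform runs on. A minimal illustration, using a hypothetical remote path:

```python
import posixpath

# posixpath always treats "/" as the separator, so this is safe for
# remote POSIX paths even when the code runs on Windows.
filename = posixpath.basename("/data/exports/report.csv")
```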