SharePoint Online

Connect to SharePoint Online to import files from specified SharePoint libraries into Foundry.

Supported capabilities

Capability | Status
Exploration | 🟢 Generally available
Bulk import | 🟢 Generally available
Incremental | 🟢 Generally available
Export tasks | 🟡 Sunset
File exports | 🟢 Generally available

Data model

The connector can transfer files of any type into Foundry datasets. File formats are preserved, and no schemas are applied during or after the transfer. Apply any necessary schema to the output dataset, or write a downstream transformation to access the data.

Performance and limitations

There is no limit to the size of transferable files. However, network issues can result in failures of large-scale transfers. In particular, direct cloud syncs that take more than two days to run will be interrupted. To avoid network issues, we recommend using smaller file sizes and limiting the number of files that are ingested in every execution of the sync. Syncs can be scheduled to run frequently.

Connections to on-premise SharePoint servers are not supported.

Setup

  1. Open the Data Connection application and select + New Source in the upper right corner of the screen.
  2. Select SharePoint Online from the available connector types.
  3. Choose to use a direct connection over the Internet or to connect through an intermediary agent.
  4. Follow the additional configuration prompts to continue the setup of your connector using the information in the sections below.

Learn more about setting up a connector in Foundry.

Authentication

Authentication for the SharePoint Online source requires an application in Microsoft Entra ID (formerly known as Azure Active Directory). If you are not an Entra ID administrator, contact your IT department to request access.

Follow the initial steps below to access Azure application credentials:

  1. Create an application registration in Azure by following the instructions in the Microsoft documentation ↗.
    • At Step 5, select Accounts in this organizational directory only and skip Redirect URL (optional).
  2. Note the client ID and tenant ID once registration is complete.

Then, choose between the two available authentication methods:

  • Client credentials: Recommended when a wide range of access is required for every SharePoint site.
  • Username/password: Recommended for limiting access to one or a few SharePoint sites.

Client credentials

In your Microsoft Entra admin center, complete the following steps:

  1. Go to API Permissions in the left sidebar.

  2. Select Add a Permission.

  3. Select Microsoft Graph.

  4. Select Application Permissions.

    • If you would like your application to read all SharePoint sites, add Sites.Read.All.
      • If you plan to configure export tasks, use Sites.ReadWrite.All instead.
    • If you would like your application to read only selected SharePoint sites, add Sites.Selected.
  5. If you are an Entra ID administrator, select Grant admin consent for [tenant].

  6. If you added Sites.Selected above, add your application to specific sites ↗.

    • The available options for the "roles" array parameter are "write" and/or "read". The "read" option is sufficient to ingest files from the SharePoint site.
    • To easily send a POST with proper authentication, use the Graph Explorer ↗.
    • To retrieve metadata about a site, send a GET request to https://graph.microsoft.com/v1.0/sites/[tenantName]:/sites/[siteName] (for example: https://graph.microsoft.com/v1.0/sites/contoso.sharepoint.com:/sites/mySite). The response includes an ID that is a composite of three values: the site collection hostname, the site collection unique ID, and the site unique ID. The middle value is the siteId needed to run the permissions POST.
  7. Generate a client secret ↗.
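
The Sites.Selected steps above can be sketched in Python. This is a minimal illustration, not part of the connector: the helper names are hypothetical, the composite ID is assumed to be comma-separated, and the request body follows the shape we understand the Microsoft Graph site permissions endpoint to accept. Nothing is sent; the functions only build the URL and JSON body that you could paste into Graph Explorer.

```python
import json
from urllib.parse import quote

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def site_metadata_url(hostname: str, site_name: str) -> str:
    """URL for the GET request that returns site metadata, including the composite ID."""
    return f"{GRAPH_BASE}/sites/{hostname}:/sites/{quote(site_name)}"


def middle_site_id(composite_id: str) -> str:
    """Return the middle value of an ID assumed to look like 'hostname,id1,id2'."""
    parts = composite_id.split(",")
    if len(parts) != 3:
        raise ValueError(f"expected three comma-separated values, got {len(parts)}")
    return parts[1]


def permission_request(composite_id, client_id, display_name, roles=("read",)):
    """Build the URL and JSON body for the permissions POST (nothing is sent)."""
    url = f"{GRAPH_BASE}/sites/{middle_site_id(composite_id)}/permissions"
    body = {
        # "read" is sufficient to ingest files; add "write" only if required.
        "roles": list(roles),
        "grantedToIdentities": [
            {"application": {"id": client_id, "displayName": display_name}}
        ],
    }
    return url, json.dumps(body, indent=2)
```

For example, `permission_request("contoso.sharepoint.com,<collection-id>,<site-id>", "<client-id>", "Foundry connector")` returns the POST URL and a body granting the "read" role to the app registration. The display name is a placeholder.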

Set the following source configurations in Data Connection:

Option | Required? | Description
Azure Client ID | Yes | The ID of the app registration; also called Application ID.
Azure Tenant ID | Yes | The unique identifier of the Microsoft Entra ID instance.
Client secret | Yes | The secret generated in the app registration.
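
Before entering these values in Data Connection, you may want to confirm that they work together. The sketch below builds a standard OAuth 2.0 client credentials token request against the Microsoft identity platform using only the Python standard library. It is an assumption about how to test the credentials by hand, not a description of how the connector authenticates internally; the function name and the host parameters are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def client_credentials_request(
    tenant_id: str,
    client_id: str,
    client_secret: str,
    login_host: str = "login.microsoftonline.com",
    graph_host: str = "graph.microsoft.com",
) -> Request:
    """Build (but do not send) a token request for the client credentials flow."""
    url = f"https://{login_host}/{tenant_id}/oauth2/v2.0/token"
    form = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            # ".default" requests the application permissions already granted.
            "scope": f"https://{graph_host}/.default",
        }
    ).encode("ascii")
    return Request(url, data=form, method="POST")
```

Passing the returned object to `urllib.request.urlopen` sends the request; a successful response is a JSON document containing an `access_token` field. Avoid logging the request body, since it contains the client secret.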

Username/password

The username/password flow involves creating a user account that can sign in to Microsoft 365. The Graph API does not support two-factor authentication for the username/password authentication method. Because of this, we strongly recommend creating a randomly generated password of at least 32 characters in length.

In your Entra admin center, complete the following steps:

  1. Go to API Permissions in the left sidebar.
  2. Select Add a Permission.
  3. Select Microsoft Graph.
  4. Select Delegated Permissions.
  5. Add the Sites.Read.All permission.
    • If you plan to configure export tasks, use Sites.ReadWrite.All instead.
  6. If you are an Entra ID administrator, select Grant admin consent for [tenant].
  7. Go to Authentication in the left sidebar.
  8. Change Allow public client flows to Yes.
  9. Create a user in Microsoft Entra ID with a randomly generated password of at least 32 characters.
  10. Add that user to any SharePoint sites that you would like it to read or write.
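
One way to produce the randomly generated password from step 9 is Python's `secrets` module, which draws from a cryptographically secure source. This is a minimal sketch: the function name is hypothetical and the symbol set is an assumption, so adjust it to your tenant's password policy.

```python
import secrets
import string

# Assumed symbol set; change it if your password policy differs.
SYMBOLS = "!#$%&*+-=?@^_"
ALPHABET = string.ascii_letters + string.digits + SYMBOLS


def generate_password(length: int = 32) -> str:
    """Return a random password with lowercase, uppercase, digit, and symbol characters."""
    if length < 32:
        raise ValueError("use at least 32 characters")
    while True:
        password = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry until every character class is present at least once.
        if (
            any(c.islower() for c in password)
            and any(c.isupper() for c in password)
            and any(c.isdigit() for c in password)
            and any(c in SYMBOLS for c in password)
        ):
            return password
```

Store the generated value in a password manager or secret store rather than in a script or shell history.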

Set the following source configurations in Data Connection:

Option | Required? | Description
Azure Client ID | Yes | The ID of the app registration; also called Application ID.
Username | Yes | The user's email address.
Password | Yes | The generated password.

XML-based permissioning for SharePoint Add-ins

If you are using SharePoint Add-ins for authorization and authentication ↗, and your SharePoint Add-in uses XML for permission management, you must ensure that the correct scope is set in the scope URI to avoid access issues when connecting to SharePoint.

Follow the steps below to verify and configure the correct scope:

  1. Locate the AppManifest.xml file containing the permission settings for your SharePoint Add-in.
  2. In the AppManifest.xml file, identify the scope URI within the XML file, which should look similar to this:

     <AppPermissionRequests AllowAppOnlyPolicy="true">
       <AppPermissionRequest Scope="http://sharepoint/content/sitecollection/web" Right="FullControl" />
     </AppPermissionRequests>

  3. Verify that the scope value (in this example, http://sharepoint/content/sitecollection/web) matches the SharePoint site to which you are connecting. If it does not match, adjust the scope value accordingly.
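
To review the scopes without reading the XML by hand, the permission requests can be listed with Python's standard XML parser. This is a minimal sketch with a hypothetical function name. It strips any XML namespace from element names on the assumption that a real AppManifest.xml may declare one.

```python
import xml.etree.ElementTree as ET


def permission_scopes(manifest_xml: str) -> list[tuple[str, str]]:
    """Return (Scope, Right) pairs for every AppPermissionRequest element."""
    root = ET.fromstring(manifest_xml)
    results = []
    for element in root.iter():
        # Drop a leading "{namespace}" so the match works with or without one.
        tag = element.tag.rsplit("}", 1)[-1]
        if tag == "AppPermissionRequest":
            results.append((element.get("Scope", ""), element.get("Right", "")))
    return results
```

Calling `permission_scopes(open("AppManifest.xml", encoding="utf-8").read())` lists each scope URI alongside the right it requests, so you can compare them with the site you are connecting to.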

Networking

The SharePoint Online connector requires network access to the following domains on port 443:

  • login.microsoftonline.com
  • graph.microsoft.com
  • Your SharePoint URL; for example, contoso.sharepoint.com

If you are using a GovCloud SharePoint instance, use the following domains on port 443 instead:

  • login.microsoftonline.us
  • graph.microsoft.us
  • Your SharePoint URL; for example, contoso.sharepoint.us
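
A quick way to check these requirements is to attempt a TCP connection to each domain from the host that will run the connection (for example, the agent host). The sketch below is an assumption about how to test reachability, with hypothetical helper names. It picks the GovCloud list when the SharePoint hostname ends in `.sharepoint.us`, mirroring the examples above, and it does not verify TLS or proxy behavior.

```python
import socket

COMMERCIAL_HOSTS = ["login.microsoftonline.com", "graph.microsoft.com"]
GOVCLOUD_HOSTS = ["login.microsoftonline.us", "graph.microsoft.us"]


def required_hosts(sharepoint_host: str) -> list[str]:
    """Return the domains that must be reachable for the given SharePoint hostname."""
    is_gov = sharepoint_host.lower().endswith(".sharepoint.us")
    return (GOVCLOUD_HOSTS if is_gov else COMMERCIAL_HOSTS) + [sharepoint_host]


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, looping over `required_hosts("contoso.sharepoint.com")` and printing `can_connect(host)` for each entry shows which domains are blocked.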

Configuration options

The following configuration options are available for the SharePoint Online connector:

Option | Required? | Description
SharePoint Library URL | Yes | A single SharePoint site may have several document libraries; your URL must point to a specific library. Must be in the format https://[tenant].sharepoint.com/sites/[site]/[library].
Credentials settings | Yes | Configure using the Authentication guidance shown above.
Proxy settings | No | Enable to use a proxy while connecting to SharePoint Online.
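
The SharePoint Library URL format can be checked before you save the source. The following sketch is a hypothetical validator, not the connector's own validation: it assumes the path contains exactly the three segments shown in the format above, so a URL copied from a browser with a trailing page such as `/Forms/AllItems.aspx` is rejected.

```python
from urllib.parse import unquote, urlsplit


def parse_library_url(url: str) -> dict[str, str]:
    """Split a library URL into tenant, site, and library, or raise ValueError."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    segments = [unquote(s) for s in parts.path.split("/") if s]
    if parts.scheme != "https":
        raise ValueError("URL must start with https://")
    if ".sharepoint." not in host:
        raise ValueError("host must look like [tenant].sharepoint.com")
    if len(segments) != 3 or segments[0] != "sites":
        raise ValueError("path must be /sites/[site]/[library]")
    return {
        "tenant": host.split(".")[0],
        "site": segments[1],
        "library": segments[2],
    }
```

For example, `parse_library_url("https://contoso.sharepoint.com/sites/mySite/Shared%20Documents")` yields the tenant `contoso`, the site `mySite`, and the library `Shared Documents`.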

Sync data from SharePoint Online

The SharePoint Online connector uses the file-based sync interface.

Export data to SharePoint Online

To export to a SharePoint site, first enable exports for your SharePoint Online connector. Then, create a new export.

Export configuration options

Option | Required? | Default | Description
Directory path | Yes | / | The path to the folder in the SharePoint library where files should be exported. The full path for an exported file is calculated as <SharePoint Library URL>/<Directory Path>/<Exported File Path>.
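
The path calculation above can be illustrated with a short Python sketch. The function name is hypothetical, and the handling of leading and trailing slashes is an assumption made so that the default directory path `/` does not produce doubled separators; the connector may normalize paths differently.

```python
def exported_file_url(library_url: str, directory_path: str, exported_file_path: str) -> str:
    """Join <SharePoint Library URL>/<Directory Path>/<Exported File Path>."""
    pieces = [library_url.rstrip("/")]
    for part in (directory_path, exported_file_path):
        part = part.strip("/")
        # Skip empty parts, such as the default directory path "/".
        if part:
            pieces.append(part)
    return "/".join(pieces)
```

With the library URL `https://contoso.sharepoint.com/sites/mySite/Docs`, the directory path `exports/daily`, and the exported file path `reports/summary.csv`, the sketch returns `https://contoso.sharepoint.com/sites/mySite/Docs/exports/daily/reports/summary.csv`.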