Connect to SharePoint Online to import files from specified SharePoint libraries into Foundry.
Capability | Status |
---|---|
Exploration | 🟢 Generally available |
Bulk import | 🟢 Generally available |
Incremental | 🟢 Generally available |
Export tasks | 🟡 Sunset |
File exports | 🟢 Generally available |
The connector can transfer files of any type into Foundry datasets. File formats are preserved, and no schemas are applied during or after the transfer. Apply any necessary schema to the output dataset, or write a downstream transformation to access the data.
There is no limit to the size of transferable files. However, network issues can result in failures of large-scale transfers. In particular, direct cloud syncs that take more than two days to run will be interrupted. To avoid network issues, we recommend using smaller file sizes and limiting the number of files that are ingested in every execution of the sync. Syncs can be scheduled to run frequently.
Connections to on-premises SharePoint servers are not supported.
Learn more about setting up a connector in Foundry.
Authentication for the SharePoint Online source requires an application in Microsoft Entra ID (formerly known as Azure Active Directory). If you are not an Entra ID administrator, contact your IT department to request access.
Follow the initial steps below to access Azure application credentials:
Then, choose between the two available authentication methods:
In your Microsoft Entra admin center, complete the following steps:
Go to API Permissions in the left sidebar.
Select Add a Permission.
Select Microsoft Graph.
Select Application Permissions.
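Add the Sites.Read.All permission to allow read access to all sites. If the connection also needs write access (for example, for file exports), add Sites.ReadWrite.All instead.
To limit access to specific sites, add Sites.Selected.
If you are an Entra Administrator, select Grant admin consent for [tenant].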
If you added Sites.Selected above, add your application to specific sites ↗.
The allowed values for the "roles" array parameter are "write" and/or "read". The "read" option is sufficient to ingest files from the SharePoint site.
To find the siteId, send a GET request to https://graph.microsoft.com/v1.0/sites/[tenantName]:/sites/[siteName] (for example: https://graph.microsoft.com/v1.0/sites/contoso.sharepoint.com:/sites/mySite). This request returns an ID that is a composite of three values: site collection hostname, site collection unique ID, and site unique ID. The middle value is the siteId needed to run the permissions POST.
Set the following source configurations in Data Connection:
Option | Required? | Description |
---|---|---|
Azure Client ID | Yes | The ID of the app registration; also called Application ID. |
Azure Tenant ID | Yes | The unique identifier of the Microsoft Entra ID instance. |
Client secret | Yes | The secret generated in the app registration. |
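The siteId lookup and the permissions request described in the Sites.Selected steps above can be sketched in Python. This is a minimal offline illustration, not the connector's implementation. The helper names (`site_lookup_url`, `extract_site_id`, `permission_payload`) are hypothetical, the comma split assumes the composite ID is returned as three comma-separated values, and the request body follows the Microsoft Graph create-permission shape as we understand it, so verify it against the Graph documentation before use.

```python
import json

GRAPH_BASE = "https://graph.microsoft.com/v1.0"


def site_lookup_url(tenant_host: str, site_name: str) -> str:
    """Build the GET URL that returns the composite site ID."""
    return f"{GRAPH_BASE}/sites/{tenant_host}:/sites/{site_name}"


def extract_site_id(composite_id: str) -> str:
    """Return the middle value of 'hostname,collectionId,siteUniqueId'.

    Assumption: the three values are separated by commas.
    """
    parts = composite_id.split(",")
    if len(parts) != 3:
        raise ValueError(f"expected 3 comma-separated values, got {len(parts)}")
    return parts[1]


def permission_payload(client_id: str, display_name: str, roles=("read",)) -> str:
    """Serialize a body for POST /sites/{siteId}/permissions (assumed shape)."""
    allowed = {"read", "write"}
    if not roles or not set(roles) <= allowed:
        raise ValueError("roles must be 'read' and/or 'write'")
    body = {
        "roles": list(roles),
        "grantedToIdentities": [
            {"application": {"id": client_id, "displayName": display_name}}
        ],
    }
    return json.dumps(body)
```

The default role is "read" because that is sufficient to ingest files; pass `roles=("read", "write")` only if the application also needs to write to the site.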
The username/password flow involves creating a user account that can sign in to Microsoft 365. The Graph API does not support two-factor authentication for the username/password authentication method. Because of this, we strongly recommend creating a randomly generated password of at least 32 characters in length.
In your Entra admin center, complete the following steps:
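Add the Sites.Read.All permission. If the connection also needs write access (for example, for file exports), add Sites.ReadWrite.All instead.
Under Authentication, set Allow public client flows to Yes.
Set the following source configurations in Data Connection: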
Option | Required? | Description |
---|---|---|
Azure Client ID | Yes | The ID of the app registration; also called Application ID. |
Username | Yes | The user's email address. |
Password | Yes | The generated password. |
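One way to produce the randomly generated password recommended above is Python's standard `secrets` module. This is a sketch under assumptions: the function name `generate_password` is hypothetical, and the character set and the lowercase/uppercase/digit check are our choices, so confirm them against your organization's password policy.

```python
import secrets
import string


def generate_password(length: int = 32) -> str:
    """Return a random password of at least 32 characters."""
    if length < 32:
        raise ValueError("use at least 32 characters")
    alphabet = string.ascii_letters + string.digits + string.punctuation
    while True:
        candidate = "".join(secrets.choice(alphabet) for _ in range(length))
        # Retry in the unlikely case that a character class is missing.
        if (
            any(c.islower() for c in candidate)
            and any(c.isupper() for c in candidate)
            and any(c.isdigit() for c in candidate)
        ):
            return candidate
```

`secrets` draws from the operating system's cryptographically secure random source, which makes it a better fit for credentials than the `random` module.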
If you are using SharePoint Add-ins for authorization and authentication ↗, and your SharePoint Add-in uses XML for permission management, you must ensure that the correct scope is set in the scope URI to avoid access issues when connecting to SharePoint.
Follow the steps below to verify and configure the correct scope:
Locate the AppManifest.xml file containing the permission settings for your SharePoint Add-in.
In the AppManifest.xml file, identify the scope URI within the XML file, which should look similar to this:
<AppPermissionRequests AllowAppOnlyPolicy="true">
  <AppPermissionRequest Scope="http://sharepoint/content/sitecollection/web" Right="FullControl" />
</AppPermissionRequests>
Verify that the scope value (in this example, http://sharepoint/content/sitecollection/web) matches the SharePoint site to which you are connecting; if the scope value does not match, adjust the scope value accordingly.
The SharePoint Online connector requires network access to the following domains on port 443:
login.microsoftonline.com
graph.microsoft.com
contoso.sharepoint.com
If you are using a GovCloud SharePoint instance, use the following domains on port 443 instead:
login.microsoftonline.us
graph.microsoft.us
contoso.sharepoint.us
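The domain lists above can be turned into a quick reachability check. The sketch below is illustrative only: `required_domains` and `can_connect` are hypothetical helper names, and the tenant name (`contoso` in the lists above) is assumed to be the only part that varies between environments.

```python
import socket


def required_domains(tenant: str, gov_cloud: bool = False) -> list[str]:
    """List the hosts the connector needs to reach on port 443."""
    suffix = "us" if gov_cloud else "com"
    return [
        f"login.microsoftonline.{suffix}",
        f"graph.microsoft.{suffix}",
        f"{tenant}.sharepoint.{suffix}",
    ]


def can_connect(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `[h for h in required_domains("contoso") if not can_connect(h)]` lists the hosts that could not be reached. A successful TCP connection does not prove that a proxy or TLS inspection layer will pass the traffic, so treat the result as a first check only.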
The following configuration options are available for the SharePoint Online connector:
Option | Required? | Description |
---|---|---|
SharePoint Library URL | Yes | A single SharePoint site may have several document libraries; your URL must point to a specific library. Must be in the format https://[tenant].sharepoint.com/sites/[site]/[library]. |
Credentials settings | Yes | Configure using the Authentication guidance shown above. |
Proxy settings | No | Enable to use a proxy while connecting to SharePoint Online. |
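The required SharePoint Library URL format can be checked before the source is configured. The following is a minimal sketch: `parse_library_url` is a hypothetical helper, and accepting the `.sharepoint.us` suffix for GovCloud tenants is our assumption rather than something the option description states.

```python
import re
from urllib.parse import unquote, urlsplit


def parse_library_url(url: str) -> dict:
    """Split https://[tenant].sharepoint.com/sites/[site]/[library] into parts."""
    parts = urlsplit(url)
    host = re.fullmatch(
        r"([A-Za-z0-9-]+)\.sharepoint\.(com|us)", parts.hostname or ""
    )
    if parts.scheme != "https" or host is None:
        raise ValueError("expected https://[tenant].sharepoint.com/...")
    # Drop empty segments and decode escapes such as %20.
    segments = [unquote(s) for s in parts.path.split("/") if s]
    if len(segments) != 3 or segments[0] != "sites":
        raise ValueError("expected a path of the form /sites/[site]/[library]")
    return {"tenant": host.group(1), "site": segments[1], "library": segments[2]}
```

A URL that stops at the site, such as https://contoso.sharepoint.com/sites/mySite, is rejected because it does not point to a specific library.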
The SharePoint Online connector uses the file-based sync interface.
To export to a SharePoint site, first enable exports for your SharePoint Online connector. Then, create a new export.
Option | Required? | Default | Description |
---|---|---|---|
Directory path | Yes | / | The path to the folder in the SharePoint library where files should be exported. The full path for an exported file is calculated as <SharePoint Library URL>/<Directory Path>/<Exported File Path>. |
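The full export path described in the table can be illustrated with a small helper. This is a sketch under assumptions: `export_target` is a hypothetical name, and the way leading and trailing slashes are normalized here is our choice, not confirmed connector behavior.

```python
def export_target(library_url: str, directory_path: str, exported_file_path: str) -> str:
    """Join <SharePoint Library URL>/<Directory Path>/<Exported File Path>."""
    pieces = [library_url.rstrip("/")]
    for part in (directory_path, exported_file_path):
        cleaned = part.strip("/")
        if cleaned:  # The default directory path "/" adds no segment.
            pieces.append(cleaned)
    return "/".join(pieces)
```

With the default directory path `/`, a file exported as `reports/a.csv` lands directly under the library URL; with a directory path of `/exports/daily`, it lands under that folder instead.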