Updated: Mar 29, 2019
Let us look at how to integrate an Azure Function and Azure Cognitive Services into Microsoft Flow to extract tags/categories from a document and update the corresponding SharePoint document item.
This article series covers a specific use case: extracting the content of Word documents uploaded to Office 365 SharePoint libraries, analyzing/classifying that content using Azure Cognitive Services, and then updating the document item with the classified data as tags/categories. The article links are shown below.
Using Azure Functions, Cognitive Services and Flow for classifying Office 365 SharePoint Word Documents - Part I (Previous Article) - The Azure function is briefly explained with code, showing how the data is extracted using the Open XML formats/references available.
Extract Code From GitHub
The Azure function created in the previous article is available in a GitHub repository (https://github.com/nakkeerann/analyze-sp-word-documents).
Clone the code from the GitHub repository to your local machine.
Open it in Visual Studio and make the necessary changes, such as updating the user credentials and the SharePoint site details.
Test Code Locally
Press F5 to run the solution on localhost. The function can then be tested locally using the localhost path, for example: http://localhost:7071/api/readspdocuments?filePath=https://nakkeerann.sharepoint.com/sites/teamsite/Shared Documents/testingflow.docx
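The file path in the test URL contains a space ("Shared Documents"). A browser usually encodes it automatically, but other HTTP clients may not. Below is a minimal Python sketch of building the same test URL with the filePath value percent-encoded. The helper name build_function_url is hypothetical; the route and the filePath parameter are taken from the URL above.

```python
from urllib.parse import urlencode, quote


def build_function_url(base_url: str, file_path: str) -> str:
    """Return base_url with file_path percent-encoded as the filePath query value."""
    # quote_via=quote encodes spaces as %20 instead of '+'.
    query = urlencode({"filePath": file_path}, quote_via=quote)
    return f"{base_url}?{query}"


# Values copied from the localhost example in this article.
local_url = build_function_url(
    "http://localhost:7071/api/readspdocuments",
    "https://nakkeerann.sharepoint.com/sites/teamsite/Shared Documents/testingflow.docx",
)
print(local_url)
```

The same helper could be pointed at the deployed function URL later by changing only the base URL.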
Creating Azure Function
Let us first look at deploying the Azure function with the code created in the previous article.
Log in to the Azure Portal.
Click on new resource and search for Function App. Then create a new Azure Function.
Download the publish profile of the Azure Function to deploy the code from the local machine.
Deploy Code to Azure Function
Open the code in Visual Studio.
Right-click on the solution and click Publish.
Click on New Profile -> Import Profile. Select the profile downloaded from Azure.
Once done, click on Publish.
The function is now ready to be used across sites.
Then copy the function URL from the Azure Portal and access it in the browser directly, passing the filePath parameter as we did when testing on localhost.
Text Analytics API
Create a Text Analytics API service under Azure Cognitive Services for analyzing or classifying content. The account name, account key and hosted site URL will be configured later in Microsoft Flow.
Microsoft Flow Configuration
Let us see the end-to-end flow using Microsoft Flow.
Create Trigger - The trigger “When a file is created in folder” is created to listen for any document/file uploads.
Provide the site and folder details.
Get the file and item properties - The action “Get file metadata using path” is created to retrieve the details of the item that the file is associated with.
Get File Content - An HTTP action is created to get the file content using the Azure function created in the steps above. The file path is passed as an additional parameter in the Azure function URL to get the file content.
Get the key phrases - The Key Phrases action is created to analyze and classify the content.
Provide the connection name, account key and site URL details of the Azure Cognitive Service created.
Update the file properties - Create an “Update File Properties” action (under the SharePoint connector) to write the key phrases into a column of the SharePoint item that the file is associated with.
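To make the Key Phrases and Update File Properties steps more concrete, here is a rough Python sketch of what happens behind those two actions. It assumes the Text Analytics v2.1 REST shape (a "documents" array with id, language and text, and the key sent in the Ocp-Apim-Subscription-Key header); the endpoint and key values are placeholders, the helper names are hypothetical, and the sample response is illustrative data rather than real service output. The request is only built here, not sent.

```python
import json
from urllib.request import Request


def build_key_phrases_request(endpoint: str, account_key: str, text: str) -> Request:
    """Build (but do not send) a key phrases request for a single document."""
    body = {"documents": [{"id": "1", "language": "en", "text": text}]}
    return Request(
        url=f"{endpoint.rstrip('/')}/text/analytics/v2.1/keyPhrases",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Ocp-Apim-Subscription-Key": account_key,
        },
        method="POST",
    )


def key_phrases_to_column_value(response_json: str, separator: str = "; ") -> str:
    """Join the key phrases of all returned documents into one text column value."""
    payload = json.loads(response_json)
    phrases = []
    for document in payload.get("documents", []):
        phrases.extend(document.get("keyPhrases", []))
    return separator.join(phrases)


# Placeholder endpoint and key; replace with the values of your own service.
request = build_key_phrases_request(
    "https://<region>.api.cognitive.microsoft.com", "<account-key>", "Sample document text"
)
print(request.full_url)

# Illustrative response in the assumed v2.1 shape, not real output.
sample_response = (
    '{"documents": [{"id": "1", "keyPhrases": ["Azure Functions", "SharePoint"]}],'
    ' "errors": []}'
)
print(key_phrases_to_column_value(sample_response))
```

Joining the phrases with a separator assumes the target SharePoint column is a plain text column; a managed metadata or choice column would need a different value format.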
Testing the Flow
Log in to the Office 365 SharePoint site and navigate to the document library (the site and library configured for the trigger in the steps above).
In the document library, upload a Word document.
The flow runs in the background.
Navigate to the SharePoint library to see the updated properties of the Word document item.
We have configured a Microsoft Flow that ran successfully to:
extract the content from a Word document uploaded to the library,
analyze the content using Azure Cognitive Services,
and update the document item properties with the tags identified.
Limitations of this use case:
A few scenarios that were not factored into this article series are:
Limit on the number of characters analyzed by the Azure Cognitive Services Text Analytics API. This can be handled by sending the text in parts, in a loop.
Other document types were not considered. If other types are uploaded, the Microsoft Flow has to be modified to process only Word documents. Analyzing other types of documents requires entirely different logic.
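Both limitations could be handled roughly as sketched below in Python. The helper names split_text and is_word_document are hypothetical, the default of 5000 characters is an assumed value (check the limit that applies to your service tier), and the extension check assumes only .docx files are supported, since the Azure function reads Open XML content.

```python
from typing import List


def split_text(text: str, max_chars: int = 5000) -> List[str]:
    """Split text into chunks of at most max_chars, breaking on whitespace where possible."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is cut into fixed-size pieces.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) > max_chars:
            # Adding the word would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def is_word_document(file_name: str) -> bool:
    """Return True only for .docx files (assumed to be the only supported type)."""
    return file_name.lower().endswith(".docx")


print(split_text("a b c d", max_chars=3))
print(is_word_document("testingflow.docx"))
```

Each chunk can then be sent to the key phrases step separately and the resulting phrases merged. In Microsoft Flow itself, the equivalent of is_word_document would be a condition on the file name placed before the HTTP action.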