There are plenty of excellent tutorials on how to train a model using Azure Machine Learning, and plenty of references on how to deploy a trained model to Azure Container Instances (development environment) or Azure Kubernetes Service (production environment).
However, consuming the deployed model can turn into a challenge, as the official documentation doesn't cover scoring a pandas dataframe. Since most developers apply their trained models to a pandas dataframe, I find it important to share some code snippets with the community. As usual, up we go!
Here are the steps we need to reproduce beforehand:
Create a workspace
Get started in Azure Machine Learning studio
Create and load dataset
Configure experiment run
Explore models
Deploy the best model
Here I've taken the example from the previous article and trained my Fake News classifier. The input is a text string, and the output is a label saying whether the text is fake or not.
So, after the model has been deployed successfully, go to Endpoints and save the REST endpoint as a variable.
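If you'd rather fetch the endpoint programmatically than copy it from the studio, here's a minimal sketch using the azureml-core SDK (the local config.json and the service name "fake-news-classifier" are illustrative assumptions):
from azureml.core import Workspace
from azureml.core.webservice import Webservice
# Load the workspace from a local config.json downloaded from the portal
ws = Workspace.from_config()
# "fake-news-classifier" is a hypothetical service name — use your own
service = Webservice(workspace=ws, name="fake-news-classifier")
print(service.scoring_uri)
Either way, once you have the URI, here's a function that scores a single text: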
import requests
import json
SCORING_URI = "your-scoring-url"
def get_scored_label(text):
    web_input = text
    data = {"data": web_input}
    # Convert to JSON string
    input_data = json.dumps(data)
    # Set the content type
    headers = {'Content-Type': 'application/json'}
    # Make the request and parse the response
    resp = requests.post(SCORING_URI, data=input_data, headers=headers)
    # The service returns a JSON string, so decode it twice
    output = json.loads(resp.json())['result']
    return output
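A quick smoke test (the news text and the returned label are illustrative — your model's output format may differ):
print(get_scored_label("The moon landing was staged in a basement"))
# e.g. ['FAKE']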
Great, we are now able to consume the web service. However, I've got about 4k lines in my validation set, and since each HTTP request carries noticeable overhead, scoring them one by one would take ages. So ideally, we need to send our data in "batch" mode, i.e. scoring multiple records in a single request.
But, as my input strings may be relatively long (news articles are longer than tweets or comments), there's a limit on how many items can be sent at a time. Consequently, we need to split our data into chunks and score the news chunk by chunk. This is slower than simply sending all the lines at once (which is impossible due to the payload limit) but much faster than calling the ML service once per line. The chunk size has been determined empirically and is set to 100.
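To get a feel for why 100 works for my data, you can check the JSON payload size of a single chunk (a rough sanity check; the file name matches the script below):
import json
import pandas as pd
# How big is the request body for one chunk of 100 news texts?
input_list = list(pd.read_csv("test_clean.csv")["text"])
payload = json.dumps({"data": [[t] for t in input_list[:100]]})
print(f"{len(payload) / 1024:.1f} KiB per request")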
Here's the full script:
import requests
import json
import pandas as pd
def get_scored_label(text_array):
    # The service expects a list of single-element lists, one per record
    web_input = [[text] for text in text_array]
    data = {"data": web_input}
    # Convert to JSON string
    input_data = json.dumps(data)
    # Set the content type
    headers = {'Content-Type': 'application/json'}
    # Make the request and parse the response
    resp = requests.post(scoring_uri, data=input_data, headers=headers)
    # The service returns a JSON string, so decode it twice
    output = json.loads(resp.json())['result']
    return output
# URL for the web service
scoring_uri = 'http://<id>.<region>.azurecontainer.io/score'
input_df = pd.read_csv("test_clean.csv")
input_list = list(input_df["text"])
chunk_size = 100
array_size = len(input_list)
i = 0
temp = []
while i < array_size:
    # extend (not append) so temp stays a flat list, one label per line
    temp.extend(get_scored_label(input_list[i: i + chunk_size]))
    i += chunk_size
    print("processed: " + str(i))
    # Checkpoint the labels scored so far, in case a later request fails
    temp_df = pd.DataFrame({"predicted_label": temp})
    temp_df.to_csv("temp" + str(i) + ".csv", index=False)
input_df['predicted_label'] = temp
input_df.to_csv("final_output_test.csv", index=False)
As you can see, we also save the intermediate results into a .csv file after every chunk, so a single failed request doesn't cost us everything scored so far. Hope this was useful.
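If you want to harden the loop further, a minimal retry wrapper is a natural addition (a sketch assuming failures are transient; the retry count and delay are arbitrary):
import time
def get_scored_label_with_retry(text_array, retries=3, delay=5):
    # Retry a failed chunk a few times before giving up
    for attempt in range(retries):
        try:
            return get_scored_label(text_array)
        except Exception as exc:
            print(f"attempt {attempt + 1} failed: {exc}")
            time.sleep(delay)
    raise RuntimeError(f"scoring failed after {retries} attempts")
Drop it in place of get_scored_label inside the while loop.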