[CARLOTTA]: Great, so I think we can start. Since the meeting is recorded, anyone who joins later can watch the recording. So, hi everyone, and welcome to this Cloud Skills Challenge study session on creating classification models with Azure Machine Learning designer. Today I'm thrilled to be here with John. John, do you mind briefly introducing yourself? [JOHN]: Thank you, Carlotta. Hello everyone. Welcome to our workshop today. I hope that you are all excited for it. I am John Aziz, a Gold Microsoft Learn Student Ambassador, and I will be here with Carlotta to do the practical part of this module of the Cloud Skills Challenge. Thank you for having me. [CARLOTTA]: Perfect, thanks John. For those who don't know me, I'm Carlotta Castelluccio, based in Italy and focused on AI and machine learning technologies and their use in education. This Cloud Skills Challenge study session is based on a dedicated Learn module. I sent you the link to this module in the chat so that you can follow along if you want, or just have a look at the module later at your own pace. Before starting, I would also like to remind you of the code of conduct and guidelines of our Student Ambassadors community. So please, during this meeting, be respectful and inclusive; be friendly, open, and welcoming; and be respectful of each other's differences. If you want to learn more about the code of conduct, you can use this link in the deck: aka.ms/SACoC. And now we are ready to start our session. As we mentioned, we are going to focus on classification models and Azure ML today. First of all, we are going to identify the kinds of scenarios in which you should choose to use a classification model. We're going to introduce Azure Machine Learning and Azure Machine Learning designer.
We're going to understand which steps to follow to create a classification model in Azure Machine Learning, and then John will lead an amazing demo about training and publishing a classification model in Azure ML designer. So, let's start from the beginning: identifying classification machine learning scenarios. First of all, what is classification? Classification is a form of machine learning that is used to predict which category, or class, an item belongs to. For example, we might want to develop a classifier able to identify whether an incoming email should be filtered or not according to the style, the sender, the length of the email, and so on. In this case, the characteristics of the email are the features, and the label is either a zero or a one, representing spam or not spam for the incoming email. This is an example of a binary classifier. If you want to assign multiple categories to the incoming email, like work letters, love letters, complaints, or other categories, a binary classifier is no longer enough, and we should develop a multi-class classifier. Classification is an example of what is called supervised machine learning, in which you train a model using data that includes both the features and known values for the label, so that the model learns to fit the feature combinations to the label. Then, after training has been completed, you can use the trained model to predict labels for new items for which the label is unknown. But let's see some examples of scenarios for classification machine learning models. We already mentioned an example of a solution in which we would need a classifier, but let's explore other scenarios for classification in other industries. For example, you can use a classification model for a health clinic scenario, and use clinical data to predict whether a patient will become sick or not. You can use... [NO AUDIO] [JOHN]: Carlotta, you are muted.
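The supervised workflow described here (train on features plus known labels, then predict labels for items whose label is unknown) can be sketched in a few lines of Python. The email features and the tiny dataset below are invented purely for illustration; the session itself uses the Azure ML designer rather than code.

```python
# A minimal, illustrative binary classifier for the spam example.
# Features per email (made up): [length_in_words, num_links, sender_known]
from sklearn.linear_model import LogisticRegression

X_train = [
    [120, 0, 1],  # normal email
    [300, 1, 1],  # normal email
    [200, 1, 1],  # normal email
    [40, 8, 0],   # spam: short, many links, unknown sender
    [55, 6, 0],   # spam
    [35, 9, 0],   # spam
]
y_train = [0, 0, 0, 1, 1, 1]  # known labels: 1 = spam, 0 = not spam

model = LogisticRegression().fit(X_train, y_train)

# After training, predict the label for a new email whose label is unknown
prediction = model.predict([[45, 7, 0]])[0]
```

A multi-class classifier would work the same way, just with more than two distinct values in the label list.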
[CARLOTTA]: Oh, sorry. So, have I been muted for a long time, or? [JOHN]: You were saying: "you can use some models for classification, for example, you can use..." [CARLOTTA]: Was I on this slide, or the previous one? [JOHN]: This one, you have been muted for, uh, one second [LAUGHS]. [CARLOTTA]: Okay, perfect. Sorry for that. So, I was talking about the possible scenarios in which you can use a classification model, like the health clinic scenario, a financial scenario, or, the third one, a business scenario: you can use the characteristics of a small business to predict whether a new venture will succeed or not, for example. And these are all types of binary classification. But today we are also going to talk about Azure Machine Learning. So let's see: what is Azure Machine Learning? Training and deploying an effective machine learning model involves a lot of work, much of it time-consuming and resource-intensive. Azure Machine Learning is a cloud-based service that helps simplify some of the tasks it takes to prepare data, train a model, and also deploy it as a predictive service. It helps data scientists increase their efficiency by automating many of the time-consuming tasks associated with creating and training a model. And it also enables them to use cloud-based compute resources that scale effectively to handle large volumes of data while incurring costs only when actually used. To use Azure Machine Learning, first things first, you need to create a workspace resource in your Azure subscription, and you can then use this workspace to manage data, compute resources, code, models, and other artifacts. After you have created an Azure Machine Learning workspace, you can develop solutions with the Azure Machine Learning service, either with developer tools or with the Azure Machine Learning studio web portal.
In particular, Azure Machine Learning studio is a web portal for machine learning solutions in Azure, and it includes a wide range of features and capabilities that help data scientists prepare data, train models, publish predictive services, and also monitor their usage. To begin using the web portal, you need to assign the workspace you created in the Azure portal to Azure Machine Learning studio. At its core, Azure Machine Learning is a service for training and managing machine learning models, for which you need compute resources on which to run the training process. Compute targets are one of the basic concepts of Azure Machine Learning. They are cloud-based resources on which you can run model training and data exploration processes. In Azure Machine Learning studio, you can manage the compute targets for your data science activities, and there are four kinds of compute targets you can create. We have compute instances, which are virtual machines set up for running machine learning code during development, so they are not designed for production. Then we have compute clusters, which are sets of virtual machines that can scale up automatically based on traffic. We have inference clusters, which are similar to compute clusters, but they are designed for deployment, so they are deployment targets for predictive services that use trained models. And finally, we have attached compute, which is any compute target that you manage yourself outside of Azure ML, like, for example, virtual machines or Azure Databricks clusters. So we talked about Azure Machine Learning, but we also mentioned Azure Machine Learning designer. What is Azure Machine Learning designer? In Azure Machine Learning studio, there are several ways to author classification machine learning models. One way is to use a visual interface, and this visual interface is called the designer; you can use it to train, test, and also deploy machine learning models.
The drag-and-drop interface makes use of clearly defined inputs and outputs that can be shared, reused, and also version controlled. Using the designer, you can identify the building blocks, or components, needed for your model, place and connect them on your canvas, and run a machine learning job. Each project in the designer is known as a pipeline. In the designer, we have a left panel for navigation and a canvas on the right-hand side in which you build your pipeline visually. Pipelines let you organize, manage, and reuse complex machine learning workflows across projects and users. A pipeline starts with the dataset from which you want to train the model, because everything begins with data when talking about data science and machine learning. And each time you run a pipeline, the configuration of the pipeline and its results are stored in your workspace as a pipeline job. The second main concept of Azure Machine Learning is the component. Going down hierarchically from the pipeline, we can say that each building block of a pipeline is called a component. In other words, an Azure Machine Learning component encapsulates one step in a machine learning pipeline. It's a reusable piece of code with inputs and outputs, something very similar to a function in any programming language. In a pipeline project, you can access data assets and components from the left panel's Asset Library tab, as you can see here in the screenshot in the deck. You can create data assets using a dedicated page called the Data page. A data asset is a reference to a data source location. This data source location could be a local file, a datastore, a web file, or even Azure Open Datasets. And these data assets will appear along with the standard sample datasets in the designer's Asset Library. Another basic concept of Azure ML is the Azure Machine Learning job.
Basically, when you submit a pipeline, you create a job which will run all the steps in your pipeline. A job executes a task against a specified compute target. Jobs enable systematic tracking of your machine learning experimentation in Azure ML, and once a job is created, Azure ML maintains a run record for the job. But let's move to the classification steps. Let's introduce how to create a classification model in Azure ML; you will see it in more detail in the hands-on demo that John will guide us through in a few minutes. You can think of the process to train and evaluate a classification machine learning model as four main steps. First of all, you need to prepare your data: you need to identify the features and the label in your dataset, and you need to pre-process, so clean and transform, the data as needed. The second step, of course, is training the model. For training the model, you need to split the data into two groups: a training set and a validation set. Then you train a machine learning model using the training dataset, and you test the machine learning model for performance using the validation dataset. The third step is performance evaluation, which means comparing how close the model's predictions are to the known labels, and this leads us to compute some evaluation performance metrics. And then finally... well, these three steps are not performed every time in a linear manner; it's more of an iterative process. But once you achieve a performance with which you are satisfied, you are ready to, let's say, go into production, and you can deploy your trained model as a predictive service to a real-time endpoint. To do so, you need to convert the training pipeline into a real-time inference pipeline, and then you can deploy the model as an application on a server or device so that others can consume it.
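The train/validation split in step two can be sketched like this in Python. The column names and values below are illustrative, not the module's actual diabetes dataset:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# A tiny made-up dataset with two feature columns and one label column
df = pd.DataFrame({
    "PlasmaGlucose": [85, 168, 74, 190, 92, 145, 78, 155],
    "BMI":           [23.1, 34.2, 21.8, 40.1, 25.0, 33.7, 22.4, 36.9],
    "Diabetic":      [0, 1, 0, 1, 0, 1, 0, 1],
})

features = df[["PlasmaGlucose", "BMI"]]
label = df["Diabetic"]

# Hold back 30% of the rows for validation, with a fixed random seed
X_train, X_valid, y_train, y_valid = train_test_split(
    features, label, test_size=0.3, random_state=123
)
```

The model is then fitted on `X_train`/`y_train` and scored against `X_valid`/`y_valid`, exactly the comparison the evaluation step describes.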
So let's start with the first step, which is preparing the data. Real-world data can contain many different issues that can affect the utility of the data and our interpretation of the results, and therefore also the machine learning model that you train using this data. For example, real-world data can be affected by bad recordings or bad measurements, and it can also contain missing values for some parameters. Azure Machine Learning designer has several pre-built components that can be used to prepare data for training. These components enable you to clean data, normalize features, join tables, and more. Let's come to training. To train a classification model, you need a dataset that includes historical features, so the characteristics of the entity for which we want to make a prediction, and known label values. The label is the class indicator we want to train a model to predict. It's common practice to train a model using a subset of the data while holding back some data with which to test the trained model. This enables you to compare the labels that the model predicts with the actual known labels in the original dataset. This operation can be performed in the designer using the Split Data component, as shown by the screenshot here in the deck. There's also another component that you should use, which is the Score Model component, to generate the predicted class label value using the validation data as input. Once you connect all these components (the component specifying the model we are going to use, the Split Data component, the Train Model component, and the Score Model component), you can run a new experiment in Azure ML, which will use the dataset on the canvas to train and score a model. After training a model, it is important, as we said, to evaluate its performance, to understand how well our model is performing. And there are many performance metrics and methodologies for evaluating how well a model makes predictions.
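The "missing values" issue mentioned above can be sketched with pandas. This is only an illustration of one strategy (mean substitution) similar to what the designer's pre-built cleaning components offer; the numbers are made up:

```python
import numpy as np
import pandas as pd

# Two feature columns with missing readings (values invented for illustration)
df = pd.DataFrame({
    "PlasmaGlucose": [85.0, np.nan, 74.0, 190.0],
    "BMI":           [23.1, 34.2, np.nan, 40.1],
})

# Replace each missing value with its column mean, so no row has to be
# dropped and every feature still has a plausible numeric value
cleaned = df.fillna(df.mean())
```

Other common strategies are dropping the affected rows or substituting a constant; which one is appropriate depends on how much data is missing and why.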
The component used to perform evaluation in Azure ML designer is called, as intuitive as it is, Evaluate Model. Once the job of training and evaluating the model is completed, you can review the evaluation metrics on the completed job page by right-clicking on the component. In the evaluation results, you can also find the so-called confusion matrix, which you can see here on the right side of this deck. A confusion matrix shows cases where both the predicted and actual values were one (the so-called true positives, at the top left) and also cases where both the predicted and the actual values were zero (the so-called true negatives, at the bottom right), while the other cells show cases where the predicted and actual values differ, called false positives and false negatives. This is an example of a confusion matrix for a binary classifier, while for a multi-class classification model the same approach is used to tabulate each possible combination of actual and predicted value counts. So, for example, a model with three possible classes would result in a three-by-three matrix. The confusion matrix is also useful for the metrics that can be derived from it, like accuracy, recall, or precision. As we said, the last step is deploying the trained model to a real-time endpoint as a predictive service. In order to automate your model into a service that makes continuous predictions, you need, first of all, to create and then deploy an inference pipeline. The process of converting the training pipeline into a real-time inference pipeline removes the training components and adds web service inputs and outputs to handle requests. The inference pipeline performs the same data transformations as the first pipeline, but for new data. Then it uses the trained model to infer, or predict, label values based on its features.
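The metrics derived from the confusion matrix follow directly from its four cells. A small worked example, with invented counts for a binary classifier:

```python
# Cells of a hypothetical binary confusion matrix (counts are invented)
tp, fp = 80, 10   # predicted 1: actually 1 / actually 0
fn, tn = 5, 105   # predicted 0: actually 1 / actually 0

total = tp + fp + fn + tn
accuracy = (tp + tn) / total        # share of all predictions that were correct
precision = tp / (tp + fp)          # how trustworthy a "1" prediction is
recall = tp / (tp + fn)             # share of actual 1s that were found
f1 = 2 * precision * recall / (precision + recall)  # balance of the two
```

With these counts, accuracy is 185/200 = 0.925, precision is 80/90 and recall is 80/85, which is the kind of breakdown the Evaluate Model page reports.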
So, I think I've talked a lot for now. I would like to let John show us something in practice with the hands-on demo. So please, John, go ahead, share your screen and guide us through this demo of creating a classification model with the Azure Machine Learning designer. [JOHN]: Thank you so much, Carlotta, for this interesting explanation of the Azure ML designer. And now I'm going to start with you on the practical demo part, so if you want to follow along, go to the link that Carlotta sent in the chat so you can do the demo, the practical part, with me. I'm just going to share my screen... and... go here. So, where am I right now? I'm inside the Microsoft Learn documentation. This is the exercise part of this module, and we will start by setting up two things which are a prerequisite for us to work inside this module, which are the resource group and the Azure Machine Learning workspace, and something extra, which is the compute cluster that Carlotta talked about. I just want to make sure that you all have a resource group created inside your portal, inside your Microsoft Azure platform. So this is my resource group. Inside this resource group, I have created an Azure Machine Learning workspace. I'm just going to access the workspace that I have already created from this link, which is the studio web URL. I am going to open it, and I will follow the steps. So what is this? This is your machine learning workspace, or machine learning studio. You can do a lot of things here, but we are going to focus mainly on the designer, the data, and the compute. Another prerequisite here, as Carlotta told you: we need some resources to power the classification, the processes that will happen. So we have created this compute cluster, and we have set some presets for it. Where can you find these presets? You go here: under "Create compute", you'll find everything that you need to do.
So the size is the Standard DS11 Version 2, and it's a CPU, not a GPU, because we don't need a GPU. It is ready for us to use. The next thing we will look into is the designer. How can you access the designer? You can either click on this icon or click on the navigation menu and click on the designer. Now I am inside my designer. What we are going to do now is the pipeline that Carlotta told you about. And where can I learn these steps? If you follow along in the Learn module, you will find everything that I'm doing right now in detail, with screenshots of course. So I'm going to create a new pipeline, and I can do so by clicking on this plus button. It's going to redirect me to the designer's pipeline authoring page, where I can drag and drop the data and components that Carlotta told you the difference between. Here I am going to make some changes to the settings. I am going to connect this with the compute cluster that I created previously so I can utilize it. From here I'm going to choose this compute cluster demo that I showed you before in the clusters, and I am going to change the name to something more meaningful. Instead of "Pipeline" and today's date, I'm going to name it "Diabetes"... let's just check this... training. Let's say "Training 0.1", or "01", okay? And I am going to close this tab in order to have a bigger space to work in, because this is where we will work, where everything will happen. So I will click on close, and I will go to the data and create a new dataset. How can I create a new dataset? There are multiple options here you can find: from local files, from a datastore, from web files, from open datasets. I'm going to choose "From web files", as this is the way we're going to create our data. The information for my dataset I'm going to get from the Microsoft Learn module.
So if we go to the step that says "Create a dataset", under it, it illustrates that you can access the data from inside the Asset Library, and inside your Asset Library you'll find the data and the components. And I'm going to select this link, because this is where my data is stored. If you open this link, you will find this is a CSV file, I think. Yeah. And, like, all the data are here. Now let's get back. You are going to name it something meaningful, but because I have already created it twice before, I'm going to add a number to the name. The dataset is tabular; there is also the "File" option, but this is a table, so we're going to choose "Tabular" for the dataset type. Now we will click on "Next". That's going to preview, or display for you, the content of this file that you have imported into this workspace. And these settings are related to our file format. So this is a delimited file; it's not plain text, it's not JSON. The delimiter is comma, as we have seen that they [INDISTINGUISHABLE] So I'm choosing comma because only the first five... [INDISTINGUISHABLE] ...for example. Okay, if you have any doubts, if you have any problems, please don't hesitate to write in the chat what is blocking you, and Carlotta and I will try to help you whenever possible. And now this is the preview of my new dataset. I can see that I have an ID, the patient ID; I have pregnancies; I have the age of the people; I have the body mass index; and whether they have diabetes or not, as a zero or a one. Zero indicates a negative, the person doesn't have diabetes, and one indicates a positive, that this person has diabetes. Okay. Now I'm going to click on "Next". Here I am defining my schema: all the data types inside my columns, the column names, which columns to include, which to exclude. Here we will include everything except the Path column. And we are going to review the data types of each column.
So let's review this first one. These are numbers, numbers, numbers, so it's an integer. And this one is, um, a decimal... a dotted... decimal number, so we are going to choose that data type. And this one says diabetic, and it's a zero or a one, and we are going to keep it as an integer. Now we are going to click on "Next" and move to reviewing everything. This is everything that we have defined together. I will click on "Create". And... now the first step has ended. We have gotten our data ready. Now... what now? We're going to utilize the designer's... power. We're going to drag and drop our dataset to create the pipeline. So I have clicked on it and dragged it into this space; it's going to appear for you. And we can inspect it by right-clicking and choosing "Preview data" to see what we have created together. From here, you can see everything that we have seen previously, but in more detail. And we are just going to close this. Now what? Now we are going to do the processing that Carlotta mentioned. These are some instructions about the data, about how you can look at them and how you can open them, but we are going to move on to the transformation, or the processing. As Carlotta told you, for any data we work with, we have to do some processing to make it easier for the model to be trained and easier to work with. So we're going to do normalization. Normalization means scaling our data, either down or up, but we're going to scale them down: we are going to relatively decrease all the values, to work with smaller numbers. If we are working with larger numbers, it's going to take more time; if we're working with smaller numbers, it's going to take less time to calculate them, and that's it. So where can I find Normalize Data? I can find it inside my components. I will choose the components and search for "Normalize Data".
I will drag and drop it as usual, and I will connect these two things by clicking on this port, this little circle, and dragging onto the next circle. Now we are going to define our normalization method. I'm going to double-click on the Normalize Data component, and it's going to open the settings for the normalization. The transformation method is the mathematical way according to which our data is going to be scaled. We're going to choose MinMax; for "Use 0 for constant columns" we are going to choose "True"; and we are going to define which columns to normalize. We are not going to normalize the whole dataset; we are going to choose a subset of the dataset to normalize. We're going to choose everything except for the patient ID and the diabetic column, because the patient ID is a number, but it's categorical data. It describes a patient; it's not a number that I can sum. I can't say "patient ID number one plus patient ID number two". No, this is a patient and another patient; it's not a number that I can do mathematical operations on, so I'm not going to choose it. So we will choose everything, as I said, except for the diabetic column and the patient ID. I will click on "Save". And it's not showing me a warning anymore; everything is good. Now I can click on "Submit" and review my normalization output. So, if you click on "Submit" here, you will choose "Create new" and set the name that is mentioned here inside the Learn module. It tells you to create a job and name the experiment "mslearn-diabetes-training", because you will continue working on it and adding components later. I have already created it, so we can review it together. Let me just open this in another tab. I think I have it... here. Okay. So, these are all the jobs that I have created, all the jobs there. These are all the jobs that I have submitted previously.
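The MinMax transformation chosen here has a simple definition: each value is rescaled into the 0 to 1 range relative to its column's minimum and maximum. A small sketch in plain Python (the ages are invented for illustration):

```python
def min_max(values):
    """Rescale a list of numbers into the 0-1 range (min-max normalization)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

ages = [21, 35, 50, 77]
scaled = min_max(ages)  # smallest value maps to 0.0, largest to 1.0
```

This is why, after the job runs, every normalized column previews with values between zero and one.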
And I think this one is the normalization job, so let's see its output. As you can see, it has a check mark, yes, which means that it worked, and we can preview it. How can I do that? Right-click on it and choose "Preview data", and as you can see, all the data are scaled down, so everything is between zero and one, I think. So everything is good for us. Now we can move forward to the next step, which is to create the whole pipeline. So, Carlotta told you that we're going to use a classification model to train on this dataset, so let me just drag and drop everything to get runtime, and we're doing [INDISTINGUISHABLE] about everything by [INDISTINGUISHABLE] So, as a result, we are going to explain [INDISTINGUISHABLE] Yeah. So, I'm going to get the Split Data component. I'm going to take the transformed data into Split Data and connect it like that. I'm going to get the Train Model component, because I want to train my model, and I'm going to put it right here. Okay, let's just move it down there. Okay. And we are going to use a classification model, a Two-Class Logistic Regression model. So I'm going to give this algorithm to my model so it can work. This is the untrained model; this is... here. The left circle, the left port, I'm going to connect to the dataset, and the right one we are going to connect to Evaluate Model. Evaluate Model... so let's search for "Evaluate Model" here. Because we want to do what? We want to evaluate our model and see how it has been doing. Is it good, is it bad? Um, sorry... this comes down there, after the Score Model. So we have to get the Score Model first; let's get it. This will take the trained model and the dataset to score our model and see if it's performing well or badly. And after that, we have finished everything. Now we are going to do, what? The presets for everything. As a starter, we will be splitting our data. So how are we going to do this, according to what?
According to the split rules. So I'm going to double-click on it and choose "Split Rows". The percentage is 70 percent of the data for the training and 30 percent of the data for the validation, or for the scoring, okay? I'm going to use randomization, so I'm going to split the data randomly, and the seed is, uh, 123, I think... yeah. And I think that's it. The stratified split is "False", and that's good. Now for the next one, which is the Train Model component: we are going to connect it as mentioned here. And we have done that, and... then why do I have a warning here? Let's double-click on it... yeah. It needs the label column that I am trying to predict. So from here, I'm going to choose "Diabetic". I'm going to save, and I'm going to close this one. So here the label is diabetic, and the model will predict a zero or a one, because this is a binary classification algorithm, so it's going to predict either this or that. And... I think that's everything needed to run the pipeline. So everything is done, everything is good for this one. We're just going to leave it for now, because this is the next step. This will be put in after the Score Model, but let's delete it for now. Okay. Now we have to submit the job in order to see its output. So I can click on "Submit" and choose the previous experiment, which is the one that I showed you before. And then let's review its output together here. So if I go to the jobs, if I go to "mslearn"... maybe it is "training"? I think it's the one that lasted the longest, this one here. So here I can see the job output, what happened inside the model, as you can see. The normalization we have seen before; the Split Data I can preview, result one or result two, as it splits the data into 70 percent here and 30 percent here. I can see the Score Model, which is something that we need to review. Inside the Score Model, from here, we can see that... let's get back here.
This is the data on which the model has been scored, and this is the scoring output. So it says the scored label is true, but the person is not diabetic, so this is a wrong prediction, let's say. For this one it's true and true, and this is a good, like, what do you say, prediction. And the probability of this score means the certainty of our model that this is really true: it's 80 percent. For this one it's 75 percent. So these are some cool metrics that we can review to understand how our model is performing. It's performing well for now. Let's check our Evaluate Model. So this is the extra one that I told you about: instead of the Score Model only, we are going to add, what, the Evaluate Model after it. So here we're going to go to our Asset Library, we are going to choose the Evaluate Model, we are going to put it here and connect it, and we are going to submit the job using the same experiment name that we used previously. Let's review it. So, after it finishes, you will find it here. I have already done it before; this is how I'm able to see the output. So let's see what the output of this evaluation process is. Here it shows you that there are some metrics, like the confusion matrix, which Carlotta told you about; there is the accuracy, the precision, the recall, and the F1 score. Every metric gives us some insight about our model. It helps us to understand it more, and understand if it's overfitting, if it's good, if it's bad, and really, like, understand how it's working. Now I'm just waiting for the job to load. Until it loads, we can continue to work on our model. So I will go to my designer. I'm just going to confirm this. And I'm going to continue working on it from where we stopped. Where have we stopped? We stopped at the Evaluate Model. So I'm going to choose this one.
And it says here "select experiment", "create inference pipeline", so I am going to go to the jobs and select my experiment. I hope this works. Okay. Finally, now we have our Evaluate Model output. Let's preview the evaluation results and... come on. Finally. Now we can create our inference pipeline. So, I think it says... select the experiment, then select "mslearn". So I am just going to select it, and finally. Now we can see the ROC curve, with the true positive rate and the false positive rate: the false positive rate is increasing along the curve, and so is the true positive rate. A true positive is something that the model predicted as positive, that the person has diabetes, and it's really true: the person really has diabetes. And for a false positive, it predicted that someone has diabetes, but that person doesn't have it. This is what true positive and false positive mean. This is the precision-recall curve, so we can review the metrics of our model. This is the lift curve. I can change the threshold of my confusion matrix here, and if Carlotta wants to add anything about the graphs, you can do so. [CARLOTTA]: Um, yeah, I just wanted to... if you go... yeah. I just wanted to comment, for the ROC curve, that from this graph the metric which we usually compute is the area under the curve. This coefficient, or metric, is a value that can span from zero to one, and the higher, the better the score. So the closer to one, so the larger the area under this curve, the higher the performance we've got from our model. And another thing is what John is playing with: this threshold for the logistic regression is the threshold used by the model to predict whether the category is zero or one.
So if the probability score is above the threshold, the category will be predicted as one, while if the probability is below the threshold, in this case, for example, 0.5, the category is predicted as zero. That's why it's very important to choose the threshold, because the performance can really vary with this threshold value. [JOHN]: Thank you so much, Carlotta. As I mentioned, now we are going to create our inference pipeline. We are going to select the latest job, which I already have open here. This is the one that we were reviewing together, this is where we stopped, and we're going to create an inference pipeline. We are going to choose a real-time inference pipeline. Where can I find this? Here, as it says, "Real-time inference pipeline". It's going to add some things to my workspace: the web service input and the web service output, because we will be creating it as a web service so we can access it from the internet. What are we going to do? We're going to remove this diabetes data, and we are going to get a component called...let me check...it's "Enter Data Manually". We already have the web service input present. So we are going to get the Enter Data Manually component, and we're going to connect it as it was connected before, like that. Also, I am not going to connect the score model directly to the web service output like that. I'm going to delete this connection and execute a Python script before I display my result, so this will be connected the other way around. From here, I am going to connect this with that. There is some data that we will be getting from the explanation here, and this is the data that will be entered into our web service manually, instead of the data that we had been getting from the dataset we created.
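The threshold behavior Carlotta describes can be sketched in a few lines of Python. The probability scores and the alternative threshold below are illustrative, not values from the demo:

```python
# Minimal sketch of the classification threshold: the model outputs a
# probability score, and the chosen threshold turns it into a 0/1 label.
# Scores and the 0.6 alternative threshold are made-up examples.

def predict_label(probability, threshold=0.5):
    """Return 1 if the scored probability meets the threshold, else 0."""
    return 1 if probability >= threshold else 0

scores = [0.15, 0.48, 0.52, 0.91]
print([predict_label(p) for p in scores])                  # default threshold 0.5
print([predict_label(p, threshold=0.6) for p in scores])   # stricter threshold
```

Raising the threshold makes the model more conservative about predicting the positive class, which is exactly why moving the slider changes the confusion matrix John is showing.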
So I'm just going to double-click on it, choose CSV, choose "it has headers", and copy this content and put it there. Let's do it. I think I have to click on "Edit code"; now I can click on "Save", and I can close it. Another thing is the Python script that we will be executing. We are going to remove this, too: we don't need the Evaluate Model module anymore, so we are going to remove it. The Python script that I will be executing, I can find here. This is the script that we will execute, and it says that this code selects only the patient's ID, the scored label, and the scored probability, and returns them to the web service output. We don't want to return all the columns, as we saw previously when it returned everything; we want to return only the specific columns that we will use from our endpoint. So I'm just going to select everything, delete it, and paste the code that I got from the Microsoft Learn docs. Now I can click on "Save" and close this. Let me check something, I don't think it saved. It's saved, but the display is wrong, okay. Now I think everything is good to go; I'm just gonna double-check everything. We are gonna change the name of this pipeline and call it "Predict diabetes". Now let's close it, and I think that we are good to go. Okay, I think everything is good for us; I just want to make sure of something. Is the data correct? Yeah, it's correct. Okay, now I can run the pipeline. Let's submit. Select an existing experiment, and we're going to choose "ms-learn-diabetes-training", which is the experiment we have been working on from the beginning of this module. I don't think this is going to take much time. So we have submitted the job and it's running. Until the job ends, we are going to set everything up for deploying a service.
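The actual script in the Microsoft Learn docs runs inside the Execute Python Script module and works on pandas DataFrames; as a rough pure-Python illustration of the same idea, keeping only the columns the web service should return (the column names and row values here are assumptions for illustration, not the real pipeline output):

```python
# Rough illustration of what the demo's Execute Python Script does:
# reduce each scored row to just the columns the endpoint should expose.
# Column names and the sample row are hypothetical.

KEEP = ("PatientID", "DiabetesPrediction", "Probability")

def select_columns(rows, keep=KEEP):
    """Return scored rows reduced to only the columns the endpoint returns."""
    return [{col: row[col] for col in keep} for row in rows]

scored = [
    {"PatientID": 1882185, "Pregnancies": 9, "Age": 43,
     "DiabetesPrediction": 1, "Probability": 0.80},
]
print(select_columns(scored))
```

The point of the step is the same either way: the caller of the web service gets back only the prediction and its probability, not every feature column that flowed through the pipeline.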
In order to deploy a service, I have to have the job ready; until it's ready, you can't deploy. So let's go to the job details from here. And until it finishes, Carlotta, do you think we can take the questions, and then we can get back to the job and deploy it? [CARLOTTA]: Yeah, yeah. So, guys, if you have any questions on what you just saw here or on the introduction, feel free. This is a good moment; we can discuss now while we wait for this job to finish. [JOHN]: And...can we have knowledge check one? What do you think? [CARLOTTA]: Yeah, we can also go to the knowledge check. Okay, so let me share my screen. If you don't have any questions for us, we can propose some questions to you, so you can check your knowledge so far, and you can answer these questions via chat. Do you see my screen? Can you see my screen? [JOHN]: Yes. [CARLOTTA]: So, John, I think I will read this question aloud and ask it to you, okay? Are you ready to answer? [JOHN]: Yes I am. [CARLOTTA]: You're using Azure Machine Learning designer to create a training pipeline for a binary classification model, like what we were doing in our demo, right? And you have added a dataset containing features and labels, a Two-Class Decision Forest module (we used a logistic regression model in our example; here it's a Two-Class Decision Forest model), and, of course, a Train Model module. You now plan to use Score Model and Evaluate Model modules to test the trained model with the subset of the dataset that wasn't used for training. But what are we missing? What's another module you should add? We have three options: Join Data, Split Data, or Select Columns in Dataset. So while John thinks about the answer, go ahead and answer yourselves. Give us your guess: put it in the chat, or just come off mute and answer. "A", "B".
[JOHN]: Yeah, what do you think is the correct answer for this one? I have to score my model, and I have to evaluate it, so I need something to enable me to do these two things. [CARLOTTA]: I think it's something you showed us in your pipeline, right, John? [JOHN]: Of course I did. [CARLOTTA]: We have no guesses in the chat? [JOHN]: Does someone want to guess? [CARLOTTA]: We have a "B". [JOHN]: So, to do this, I mentioned the module that helps me divide my data into two parts: 70 percent for the training and 30 percent for the evaluation. So what did I use? I used Split Data, because this is what splits my data randomly into training data and validation data. So the correct answer is "B". Good job, and thank you for participating. Next question, please. [CARLOTTA]: Yes, "B" is the correct answer, so thanks, John, for explaining the correct one. Shall we go with question two? [JOHN]: Yeah. So I'm going to ask you now, Carlotta. You use Azure Machine Learning designer to create a training pipeline for your classification model. What must you do before you deploy this model as a service? You have to do something before you deploy it. What do you think is the correct answer: "A", "B", or "C"? Share your thoughts with us in the chat, and I'm going to give you a minute to think about it before I tell you the answer. [CARLOTTA]: Yeah, so let me go through the possible answers. We have A: "Create an inference pipeline from the training pipeline"; B: "Add an Evaluate Model module to the training pipeline"; and C: "Clone the training pipeline with a different name". So what do you think is the correct answer, "A", "B", or "C"? This time too, I think it's something we mentioned both in the deck and in the demo, right? [JOHN]: Yes it is, it's something I did about five minutes ago. It's real-time, real-time.
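Coming back to John's answer to question one, the 70/30 split the Split Data module performs can be sketched like this (a minimal illustration under the assumption of a simple seeded shuffle, not the module's actual implementation):

```python
# Minimal sketch of a Split Data-style 70/30 random split: shuffle the
# rows with a fixed seed for reproducibility, then cut the list in two.

import random

def split_data(rows, train_fraction=0.7, seed=123):
    """Randomly partition rows into training and validation subsets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)   # seeded, so the split is repeatable
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

train, valid = split_data(range(100))
print(len(train), len(valid))
```

Holding out the 30 percent the model never saw during training is what makes the later Score Model and Evaluate Model steps an honest test rather than a memorization check.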
[CARLOTTA]: Yeah, so, think about it: you need to deploy the model as a service. I cannot evaluate the model after deploying it, right, because I cannot go into production if I'm not sure my model is performing well, if I'm not satisfied with it. So that's why I would exclude "B" from my answer. Thinking about "C", I didn't see you, John, cloning the training pipeline with a different name, so I don't think this is the right answer. What I have seen you do is create an inference pipeline from the training pipeline, and you just converted it using a one-click button, right? [JOHN]: Yeah, that's correct. So this is the right answer. Good job. So I created a real-time inference pipeline, and it's done: the job is finished, so we can now deploy. [LAUGHS] Exactly on time: it finished three or four seconds ago [LAUGHS]. So this is the job review, the details of the job that I already submitted. It's just opening, and once it opens...I don't know why it's so slow today; it's not like that usually. [CARLOTTA]: Yeah, it's probably because you are also sharing your screen on Teams, so that's the bandwidth of your connection. [JOHN]: Let me do something here because...yeah, finally. I can switch to my mobile internet if it does it again. So I will click on "Deploy", it's that simple. I'll just click on "Deploy", and I am going to deploy a new real-time endpoint. What am I going to name it? The name, description, and compute type are all already given for me here, so I'm just gonna copy and paste them, because we are running out of time. So it's on Azure Container Instance, not Azure Kubernetes Service, which is also a containerization service. Both are for containers, but each gives you something different.
For the advanced options, it doesn't tell us to do anything, so we are just gonna click on "Deploy", and now we can test our endpoint from the endpoints that we can find here; it's in progress. If I go here under the assets, I can find something called "Endpoints", with the real-time endpoints and the batch endpoints. We have created a real-time endpoint, so we are going to find it under that title. If I click on it, I should be able to test it once it's ready. It's still loading, but this is the input, and this is the output that we will get back. So if I click on "Test", from here I will input some data to the endpoint: the patient information, the columns that we have already seen in our dataset, the patient ID, the pregnancies, and so on. And of course I'm not going to enter the label that I'm trying to predict, so I'm not going to tell it whether the patient is diabetic or not; this endpoint is there to tell me that. The endpoint, at that URL, is going to give me back this information: whether someone has diabetes or not. So if I input this data, I'm just going to copy it, go to my endpoint, and click on "Test", and I'm gonna get the result back, which is the three columns that we defined inside our Python script: the patient ID, the diabetes prediction, and the probability, the certainty of whether someone is diabetic or not based on the prediction. So that's it. I think this is a really simple step; you can do it on your own and test it. And I think that I have finished, so thank you. [CARLOTTA]: Yes, we are running out of time. I just wanted to thank you, John, for this demo, for going through all these steps to create and train a classification model, and also deploy it as a predictive service.
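For anyone who wants to call the deployed endpoint from code rather than the Test tab, here is a hedged sketch of building the REST request. The URL, API key, and exact payload schema below are placeholders and assumptions for illustration; copy the real values from the endpoint's Consume tab in Azure Machine Learning studio:

```python
# Hedged sketch: build (but do not send) a scoring request for the
# deployed real-time endpoint. SCORING_URI, API_KEY, and the payload
# shape are placeholders/assumptions, not values from the real service.

import json
import urllib.request

SCORING_URI = "http://<your-aci-endpoint>/score"   # placeholder
API_KEY = "<your-api-key>"                         # placeholder

def build_request(patient):
    """Assemble the POST request for one patient record (label omitted)."""
    body = json.dumps({"Inputs": {"WebServiceInput0": [patient]},
                       "GlobalParameters": {}}).encode("utf-8")
    headers = {"Content-Type": "application/json",
               "Authorization": "Bearer " + API_KEY}
    return urllib.request.Request(SCORING_URI, data=body, headers=headers)

req = build_request({"PatientID": 1882185, "Pregnancies": 9, "Age": 43})
# urllib.request.urlopen(req) would then return JSON containing the
# patient ID, the diabetes prediction, and its probability.
```

Note that, exactly as John says, the request carries only the feature columns; the label is what the service sends back.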
And I encourage you all to go back to the Learn module and deepen all these topics at your own pace, and also maybe do this demo on your own, on your Azure for Students subscription. I would also like to recall that this is part of a series of Cloud Skills Challenge study sessions, so you will have more in the following days. These are to help you prepare for the Cloud Skills Challenge, which collects very interesting Learn modules that you can use to skill up on various topics, some of them focused on AI and ML. So if you are interested in these topics, you can select these Learn modules. Let me also copy the short link to the challenge into the chat. Remember that you have until the 13th of September to take the challenge. And also remember that on the 7th of October you can join the Student Developer Summit, which will be a virtual, or in some cases hybrid, event, so stay tuned, because you will have some surprises in the following days. If you want to learn more about this event, you can check the Microsoft Imagine Cup Twitter page and stay tuned. So thank you everyone for joining this session today, and thank you very much, John, for co-hosting this session with me. It was a pleasure. [JOHN]: Thank you so much, Carlotta, for having me with you today, and thank you for giving me this opportunity to be here with you. [CARLOTTA]: Great, thank you. [JOHN]: Yeah, I hope that we work again in the future. [CARLOTTA]: Sure, I hope so as well. So, thank you everyone, and have a nice rest of your day. Bye-bye. Speak to you soon. [JOHN]: Bye.