[CARLOTTA]: Great, so I think we can start. Since the meeting is recorded, anyone who joins later can watch the recording. So, hi everyone, and welcome to this Cloud Skills Challenge study session on creating classification models with Azure Machine Learning designer. Today I'm thrilled to be here with John. John, do you mind briefly introducing yourself? [JOHN]: Thank you, Carlotta. Hello everyone. Welcome to our workshop today. I hope that you are all excited for it. I am John Aziz, a Gold Microsoft Learn Student Ambassador, and I will be here with Carlotta to do the practical part of this module of the Cloud Skills Challenge. Thank you for having me. [CARLOTTA]: Perfect, thanks John. For those who don't know me, I'm Carlotta Castelluccio, based in Italy and focused on AI and machine learning technologies and their use in education. This Cloud Skills Challenge study session is based on a dedicated Learn module. I sent you the link to this module in the chat so that you can follow along if you want, or just have a look at the module later at your own pace. Before starting, I would also like to remind you of the code of conduct and guidelines of our Student Ambassadors community. So please, during this meeting, be respectful and inclusive; be friendly, open, and welcoming; and be respectful of each other's differences. If you want to learn more about the code of conduct, you can use this link in the deck: aka.ms/SACoC. And now we are ready to start our session. As we mentioned, we are going to focus on classification models and Azure ML today. First of all, we are going to identify the kinds of scenarios in which you should choose to use a classification model. We're going to introduce Azure Machine Learning and Azure Machine Learning designer.
We're going to understand which steps to follow to create a classification model in Azure Machine Learning, and then John will lead an amazing demo about training and publishing a classification model in Azure ML designer. So, let's start from the beginning: identifying classification machine learning scenarios. First of all, what is classification? Classification is a form of machine learning that is used to predict which category, or class, an item belongs to. For example, we might want to develop a classifier able to identify whether an incoming email should be filtered or not according to the style, the sender, the length of the email, and so on. In this case, the characteristics of the email are the features, and the label is either a zero or a one, representing spam or not spam for the incoming email. This is an example of a binary classifier. If you want to assign multiple categories to the incoming email, like work letters, love letters, complaints, or other categories, a binary classifier is no longer enough, and we should develop a multi-class classifier. Classification is an example of what is called supervised machine learning, in which you train a model using data that includes both the features and known values for the label, so that the model learns to fit the feature combinations to the label. Then, after training has been completed, you can use the trained model to predict labels for new items for which the label is unknown. But let's see some examples of scenarios for classification machine learning models. We already mentioned an example of a solution in which we would need a classifier, but let's explore other scenarios for classification in other industries. For example, you can use a classification model for a health clinic scenario, and use clinical data to predict whether a patient will become sick or not. You can use... [NO AUDIO] [JOHN]: Carlotta, you are muted.
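The supervised workflow described here (train on features plus known labels, then predict labels for items whose label is unknown) can be sketched in a few lines of Python. The email features and the tiny dataset below are invented purely for illustration; the session itself uses the Azure ML designer rather than code.

```python
# A minimal, illustrative binary classifier for the spam example.
# Features per email (made up): [length_in_words, num_links, sender_known]
from sklearn.linear_model import LogisticRegression

X_train = [
    [120, 0, 1],  # normal email
    [300, 1, 1],  # normal email
    [200, 1, 1],  # normal email
    [40, 8, 0],   # spam: short, many links, unknown sender
    [55, 6, 0],   # spam
    [35, 9, 0],   # spam
]
y_train = [0, 0, 0, 1, 1, 1]  # known labels: 1 = spam, 0 = not spam

model = LogisticRegression().fit(X_train, y_train)

# After training, predict the label for a new email whose label is unknown
prediction = model.predict([[45, 7, 0]])[0]
```

A multi-class classifier would work the same way, just with more than two distinct values in the label list.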
[CARLOTTA]: Oh, sorry. So, have I been muted for a long time, or? [JOHN]: You were saying: "you can use some models for classification, for example, you can use..." [CARLOTTA]: Was I on this slide, or the previous one? [JOHN]: This one, you have been muted for, uh, one second [LAUGHS]. [CARLOTTA]: Okay, perfect. Sorry for that. So, I was talking about the possible scenarios in which you can use a classification model, like the health clinic scenario, a financial scenario, or, the third one, a business scenario: you can use the characteristics of a small business to predict whether a new venture will succeed or not, for example. And these are all types of binary classification. But today we are also going to talk about Azure Machine Learning. So let's see: what is Azure Machine Learning? Training and deploying an effective machine learning model involves a lot of work, much of it time-consuming and resource-intensive. Azure Machine Learning is a cloud-based service that helps simplify some of the tasks it takes to prepare data, train a model, and also deploy it as a predictive service. It helps data scientists increase their efficiency by automating many of the time-consuming tasks associated with creating and training a model. And it also enables them to use cloud-based compute resources that scale effectively to handle large volumes of data while incurring costs only when actually used. To use Azure Machine Learning, first things first, you need to create a workspace resource in your Azure subscription, and you can then use this workspace to manage data, compute resources, code, models, and other artifacts. After you have created an Azure Machine Learning workspace, you can develop solutions with the Azure Machine Learning service, either with developer tools or with the Azure Machine Learning studio web portal.
In particular, Azure Machine Learning studio is a web portal for machine learning solutions in Azure, and it includes a wide range of features and capabilities that help data scientists prepare data, train models, publish predictive services, and also monitor their usage. To begin using the web portal, you need to assign the workspace you created in the Azure portal to Azure Machine Learning studio. At its core, Azure Machine Learning is a service for training and managing machine learning models, for which you need compute resources on which to run the training process. Compute targets are one of the basic concepts of Azure Machine Learning. They are cloud-based resources on which you can run model training and data exploration processes. In Azure Machine Learning studio, you can manage the compute targets for your data science activities, and there are four kinds of compute targets you can create. We have compute instances, which are virtual machines set up for running machine learning code during development, so they are not designed for production. Then we have compute clusters, which are sets of virtual machines that can scale up automatically based on traffic. We have inference clusters, which are similar to compute clusters, but they are designed for deployment, so they are deployment targets for predictive services that use trained models. And finally, we have attached compute, which is any compute target that you manage yourself outside of Azure ML, like, for example, virtual machines or Azure Databricks clusters. So we talked about Azure Machine Learning, but we also mentioned Azure Machine Learning designer. What is Azure Machine Learning designer? In Azure Machine Learning studio, there are several ways to author classification machine learning models. One way is to use a visual interface, and this visual interface is called the designer; you can use it to train, test, and also deploy machine learning models.
The drag-and-drop interface makes use of clearly defined inputs and outputs that can be shared, reused, and also version controlled. Using the designer, you can identify the building blocks, or components, needed for your model, place and connect them on your canvas, and run a machine learning job. Each project in the designer is known as a pipeline. In the designer, we have a left panel for navigation and a canvas on the right-hand side in which you build your pipeline visually. Pipelines let you organize, manage, and reuse complex machine learning workflows across projects and users. A pipeline starts with the dataset from which you want to train the model, because everything begins with data when talking about data science and machine learning. And each time you run a pipeline, the configuration of the pipeline and its results are stored in your workspace as a pipeline job. The second main concept of Azure Machine Learning is the component. Going down hierarchically from the pipeline, we can say that each building block of a pipeline is called a component. In other words, an Azure Machine Learning component encapsulates one step in a machine learning pipeline. It's a reusable piece of code with inputs and outputs, something very similar to a function in any programming language. In a pipeline project, you can access data assets and components from the left panel's Asset Library tab, as you can see here in the screenshot in the deck. You can create data assets using a dedicated page called the Data page. A data asset is a reference to a data source location. This data source location could be a local file, a datastore, a web file, or even Azure Open Datasets. And these data assets will appear along with the standard sample datasets in the designer's Asset Library. Another basic concept of Azure ML is the Azure Machine Learning job.
Basically, when you submit a pipeline, you create a job which will run all the steps in your pipeline. A job executes a task against a specified compute target. Jobs enable systematic tracking of your machine learning experimentation in Azure ML, and once a job is created, Azure ML maintains a run record for the job. But let's move to the classification steps. Let's introduce how to create a classification model in Azure ML; you will see it in more detail in the hands-on demo that John will guide us through in a few minutes. You can think of the process to train and evaluate a classification machine learning model as four main steps. First of all, you need to prepare your data: you need to identify the features and the label in your dataset, and you need to pre-process, so clean and transform, the data as needed. The second step, of course, is training the model. For training the model, you need to split the data into two groups: a training set and a validation set. Then you train a machine learning model using the training dataset, and you test the machine learning model for performance using the validation dataset. The third step is performance evaluation, which means comparing how close the model's predictions are to the known labels, and this leads us to compute some evaluation performance metrics. And then finally... well, these three steps are not performed every time in a linear manner; it's more of an iterative process. But once you achieve a performance with which you are satisfied, you are ready to, let's say, go into production, and you can deploy your trained model as a predictive service to a real-time endpoint. To do so, you need to convert the training pipeline into a real-time inference pipeline, and then you can deploy the model as an application on a server or device so that others can consume it.
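The train/validation split in step two can be sketched like this in Python. The column names and values below are illustrative, not the module's actual diabetes dataset:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# A tiny made-up dataset with two feature columns and one label column
df = pd.DataFrame({
    "PlasmaGlucose": [85, 168, 74, 190, 92, 145, 78, 155],
    "BMI":           [23.1, 34.2, 21.8, 40.1, 25.0, 33.7, 22.4, 36.9],
    "Diabetic":      [0, 1, 0, 1, 0, 1, 0, 1],
})

features = df[["PlasmaGlucose", "BMI"]]
label = df["Diabetic"]

# Hold back 30% of the rows for validation, with a fixed random seed
X_train, X_valid, y_train, y_valid = train_test_split(
    features, label, test_size=0.3, random_state=123
)
```

The model is then fitted on `X_train`/`y_train` and scored against `X_valid`/`y_valid`, exactly the comparison the evaluation step describes.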
So let's start with the first step, which is preparing the data. Real-world data can contain many different issues that can affect the utility of the data and our interpretation of the results, and therefore also the machine learning model that you train using this data. For example, real-world data can be affected by bad recordings or bad measurements, and it can also contain missing values for some parameters. Azure Machine Learning designer has several pre-built components that can be used to prepare data for training. These components enable you to clean data, normalize features, join tables, and more. Let's come to training. To train a classification model, you need a dataset that includes historical features, so the characteristics of the entity for which we want to make a prediction, and known label values. The label is the class indicator we want to train a model to predict. It's common practice to train a model using a subset of the data while holding back some data with which to test the trained model. This enables you to compare the labels that the model predicts with the actual known labels in the original dataset. This operation can be performed in the designer using the Split Data component, as shown by the screenshot here in the deck. There's also another component that you should use, which is the Score Model component, to generate the predicted class label value using the validation data as input. Once you connect all these components (the component specifying the model we are going to use, the Split Data component, the Train Model component, and the Score Model component), you can run a new experiment in Azure ML, which will use the dataset on the canvas to train and score a model. After training a model, it is important, as we said, to evaluate its performance, to understand how well our model is performing. And there are many performance metrics and methodologies for evaluating how well a model makes predictions.
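The "missing values" issue mentioned above can be sketched with pandas. This is only an illustration of one strategy (mean substitution) similar to what the designer's pre-built cleaning components offer; the numbers are made up:

```python
import numpy as np
import pandas as pd

# Two feature columns with missing readings (values invented for illustration)
df = pd.DataFrame({
    "PlasmaGlucose": [85.0, np.nan, 74.0, 190.0],
    "BMI":           [23.1, 34.2, np.nan, 40.1],
})

# Replace each missing value with its column mean, so no row has to be
# dropped and every feature still has a plausible numeric value
cleaned = df.fillna(df.mean())
```

Other common strategies are dropping the affected rows or substituting a constant; which one is appropriate depends on how much data is missing and why.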
The component used to perform evaluation in Azure ML designer is called, as intuitive as it is, Evaluate Model. Once the job of training and evaluating the model is completed, you can review the evaluation metrics on the completed job page by right-clicking on the component. In the evaluation results, you can also find the so-called confusion matrix, which you can see here on the right side of this deck. A confusion matrix shows cases where both the predicted and actual values were one (the so-called true positives, at the top left) and also cases where both the predicted and the actual values were zero (the so-called true negatives, at the bottom right), while the other cells show cases where the predicted and actual values differ, called false positives and false negatives. This is an example of a confusion matrix for a binary classifier, while for a multi-class classification model the same approach is used to tabulate each possible combination of actual and predicted value counts. So, for example, a model with three possible classes would result in a three-by-three matrix. The confusion matrix is also useful for the metrics that can be derived from it, like accuracy, recall, or precision. As we said, the last step is deploying the trained model to a real-time endpoint as a predictive service. In order to automate your model into a service that makes continuous predictions, you need, first of all, to create and then deploy an inference pipeline. The process of converting the training pipeline into a real-time inference pipeline removes the training components and adds web service inputs and outputs to handle requests. The inference pipeline performs the same data transformations as the first pipeline, but for new data. Then it uses the trained model to infer, or predict, label values based on its features.
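The metrics derived from the confusion matrix follow directly from its four cells. A small worked example, with invented counts for a binary classifier:

```python
# Cells of a hypothetical binary confusion matrix (counts are invented)
tp, fp = 80, 10   # predicted 1: actually 1 / actually 0
fn, tn = 5, 105   # predicted 0: actually 1 / actually 0

total = tp + fp + fn + tn
accuracy = (tp + tn) / total        # share of all predictions that were correct
precision = tp / (tp + fp)          # how trustworthy a "1" prediction is
recall = tp / (tp + fn)             # share of actual 1s that were found
f1 = 2 * precision * recall / (precision + recall)  # balance of the two
```

With these counts, accuracy is 185/200 = 0.925, precision is 80/90 and recall is 80/85, which is the kind of breakdown the Evaluate Model page reports.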
So, I think I've talked a lot for now. I would like to let John show us something in practice with the hands-on demo. So please, John, go ahead, share your screen and guide us through this demo of creating a classification model with the Azure Machine Learning designer. [JOHN]: Thank you so much, Carlotta, for this interesting explanation of the Azure ML designer. And now I'm going to start with you on the practical demo part, so if you want to follow along, go to the link that Carlotta sent in the chat so you can do the demo, the practical part, with me. I'm just going to share my screen... and... go here. So, where am I right now? I'm inside the Microsoft Learn documentation. This is the exercise part of this module, and we will start by setting up two things which are a prerequisite for us to work inside this module, which are the resource group and the Azure Machine Learning workspace, and something extra, which is the compute cluster that Carlotta talked about. I just want to make sure that you all have a resource group created inside your portal, inside your Microsoft Azure platform. So this is my resource group. Inside this resource group, I have created an Azure Machine Learning workspace. I'm just going to access the workspace that I have already created from this link, which is the studio web URL. I am going to open it, and I will follow the steps. So what is this? This is your machine learning workspace, or machine learning studio. You can do a lot of things here, but we are going to focus mainly on the designer, the data, and the compute. Another prerequisite here, as Carlotta told you: we need some resources to power the classification, the processes that will happen. So we have created this compute cluster, and we have set some presets for it. Where can you find these presets? You go here: under "Create compute", you'll find everything that you need to do.
So the size is the Standard DS11 Version 2, and it's a CPU, not a GPU, because we don't need a GPU. It is ready for us to use. The next thing we will look into is the designer. How can you access the designer? You can either click on this icon or click on the navigation menu and click on the designer. Now I am inside my designer. What we are going to do now is the pipeline that Carlotta told you about. And where can I learn these steps? If you follow along in the Learn module, you will find everything that I'm doing right now in detail, with screenshots of course. So I'm going to create a new pipeline, and I can do so by clicking on this plus button. It's going to redirect me to the designer's pipeline authoring page, where I can drag and drop the data and components that Carlotta told you the difference between. Here I am going to make some changes to the settings. I am going to connect this with the compute cluster that I created previously so I can utilize it. From here I'm going to choose this compute cluster demo that I showed you before in the clusters, and I am going to change the name to something more meaningful. Instead of "Pipeline" and today's date, I'm going to name it "Diabetes"... let's just check this... training. Let's say "Training 0.1", or "01", okay? And I am going to close this tab in order to have a bigger space to work in, because this is where we will work, where everything will happen. So I will click on close, and I will go to the data and create a new dataset. How can I create a new dataset? There are multiple options here you can find: from local files, from a datastore, from web files, from open datasets. I'm going to choose "From web files", as this is the way we're going to create our data. The information for my dataset I'm going to get from the Microsoft Learn module.
So if we go to the step that says "Create a dataset", under it, it illustrates that you can access the data from inside the Asset Library, and inside your Asset Library you'll find the data and the components. And I'm going to select this link, because this is where my data is stored. If you open this link, you will find this is a CSV file, I think. Yeah. And, like, all the data are here. Now let's get back. You are going to name it something meaningful, but because I have already created it twice before, I'm going to add a number to the name. The dataset is tabular; there is also the "File" option, but this is a table, so we're going to choose "Tabular" for the dataset type. Now we will click on "Next". That's going to preview, or display for you, the content of this file that you have imported into this workspace. And these settings are related to our file format. So this is a delimited file; it's not plain text, it's not JSON. The delimiter is comma, as we have seen that they [INDISTINGUISHABLE] So I'm choosing comma because only the first five... [INDISTINGUISHABLE] ...for example. Okay, if you have any doubts, if you have any problems, please don't hesitate to write in the chat what is blocking you, and Carlotta and I will try to help you whenever possible. And now this is the preview of my new dataset. I can see that I have an ID, the patient ID; I have pregnancies; I have the age of the people; I have the body mass index; and whether they have diabetes or not, as a zero or a one. Zero indicates a negative, the person doesn't have diabetes, and one indicates a positive, that this person has diabetes. Okay. Now I'm going to click on "Next". Here I am defining my schema: all the data types inside my columns, the column names, which columns to include, which to exclude. Here we will include everything except the Path column. And we are going to review the data types of each column.
So let's review this first one. These are numbers, numbers, numbers, so it's an integer. And this one is, um, a decimal... a dotted... decimal number, so we are going to choose that data type. And this one says diabetic, and it's a zero or a one, and we are going to keep it as an integer. Now we are going to click on "Next" and move to reviewing everything. This is everything that we have defined together. I will click on "Create". And... now the first step has ended. We have gotten our data ready. Now... what now? We're going to utilize the designer's... power. We're going to drag and drop our dataset to create the pipeline. So I have clicked on it and dragged it into this space; it's going to appear for you. And we can inspect it by right-clicking and choosing "Preview data" to see what we have created together. From here, you can see everything that we have seen previously, but in more detail. And we are just going to close this. Now what? Now we are going to do the processing that Carlotta mentioned. These are some instructions about the data, about how you can look at them and how you can open them, but we are going to move on to the transformation, or the processing. As Carlotta told you, for any data we work with, we have to do some processing to make it easier for the model to be trained and easier to work with. So we're going to do normalization. Normalization means scaling our data, either down or up, but we're going to scale them down: we are going to relatively decrease all the values, to work with smaller numbers. If we are working with larger numbers, it's going to take more time; if we're working with smaller numbers, it's going to take less time to calculate them, and that's it. So where can I find Normalize Data? I can find it inside my components. I will choose the components and search for "Normalize Data".
I will drag and drop it as usual, and I will connect these two things by clicking on this port, this little circle, and dragging onto the next circle. Now we are going to define our normalization method. I'm going to double-click on the Normalize Data component, and it's going to open the settings for the normalization. The transformation method is the mathematical way according to which our data is going to be scaled. We're going to choose MinMax; for "Use 0 for constant columns" we are going to choose "True"; and we are going to define which columns to normalize. We are not going to normalize the whole dataset; we are going to choose a subset of the dataset to normalize. We're going to choose everything except for the patient ID and the diabetic column, because the patient ID is a number, but it's categorical data. It describes a patient; it's not a number that I can sum. I can't say "patient ID number one plus patient ID number two". No, this is a patient and another patient; it's not a number that I can do mathematical operations on, so I'm not going to choose it. So we will choose everything, as I said, except for the diabetic column and the patient ID. I will click on "Save". And it's not showing me a warning anymore; everything is good. Now I can click on "Submit" and review my normalization output. So, if you click on "Submit" here, you will choose "Create new" and set the name that is mentioned here inside the Learn module. It tells you to create a job and name the experiment "mslearn-diabetes-training", because you will continue working on it and adding components later. I have already created it, so we can review it together. Let me just open this in another tab. I think I have it... here. Okay. So, these are all the jobs that I have created, all the jobs there. These are all the jobs that I have submitted previously.
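The MinMax transformation chosen here has a simple definition: each value is rescaled into the 0 to 1 range relative to its column's minimum and maximum. A small sketch in plain Python (the ages are invented for illustration):

```python
def min_max(values):
    """Rescale a list of numbers into the 0-1 range (min-max normalization)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

ages = [21, 35, 50, 77]
scaled = min_max(ages)  # smallest value maps to 0.0, largest to 1.0
```

This is why, after the job runs, every normalized column previews with values between zero and one.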
And I think this one is the normalization job, so let's see its output. As you can see, it has a check mark, yes, which means that it worked, and we can preview it. How can I do that? Right-click on it and choose "Preview data", and as you can see, all the data are scaled down, so everything is between zero and one, I think. So everything is good for us. Now we can move forward to the next step, which is to create the whole pipeline. So, Carlotta told you that we're going to use a classification model to train on this dataset, so let me just drag and drop everything to get runtime, and we're doing [INDISTINGUISHABLE] about everything by [INDISTINGUISHABLE] So, as a result, we are going to explain [INDISTINGUISHABLE] Yeah. So, I'm going to get the Split Data component. I'm going to take the transformed data into Split Data and connect it like that. I'm going to get the Train Model component, because I want to train my model, and I'm going to put it right here. Okay, let's just move it down there. Okay. And we are going to use a classification model, a Two-Class Logistic Regression model. So I'm going to give this algorithm to my model so it can work. This is the untrained model; this is... here. The left circle, the left port, I'm going to connect to the dataset, and the right one we are going to connect to Evaluate Model. Evaluate Model... so let's search for "Evaluate Model" here. Because we want to do what? We want to evaluate our model and see how it has been doing. Is it good, is it bad? Um, sorry... this comes down there, after the Score Model. So we have to get the Score Model first; let's get it. This will take the trained model and the dataset to score our model and see if it's performing well or badly. And after that, we have finished everything. Now we are going to do, what? The presets for everything. As a starter, we will be splitting our data. So how are we going to do this, according to what?
According to the split rules. So I'm going to double-click on it and choose "Split Rows". The percentage is 70 percent of the data for the training and 30 percent of the data for the validation, or for the scoring, okay? I'm going to use randomization, so I'm going to split the data randomly, and the seed is, uh, 123, I think... yeah. And I think that's it. The stratified split is "False", and that's good. Now for the next one, which is the Train Model component: we are going to connect it as mentioned here. And we have done that, and... then why do I have a warning here? Let's double-click on it... yeah. It needs the label column that I am trying to predict. So from here, I'm going to choose "Diabetic". I'm going to save, and I'm going to close this one. So here the label is diabetic, and the model will predict a zero or a one, because this is a binary classification algorithm, so it's going to predict either this or that. And... I think that's everything needed to run the pipeline. So everything is done, everything is good for this one. We're just going to leave it for now, because this is the next step. This will be put in after the Score Model, but let's delete it for now. Okay. Now we have to submit the job in order to see its output. So I can click on "Submit" and choose the previous experiment, which is the one that I showed you before. And then let's review its output together here. So if I go to the jobs, if I go to "mslearn"... maybe it is "training"? I think it's the one that lasted the longest, this one here. So here I can see the job output, what happened inside the model, as you can see. The normalization we have seen before; the Split Data I can preview, result one or result two, as it splits the data into 70 percent here and 30 percent here. I can see the Score Model, which is something that we need to review. Inside the Score Model, from here, we can see that... let's get back here.
This is the data on which the model has been scored, and this is the scoring output. So it says the scored label is true, but the person is not diabetic, so this is a wrong prediction, let's say. For this one it's true and true, and this is a good, like, what do you say, prediction. And the probability of this score means the certainty of our model that this is really true: it's 80 percent. For this one it's 75 percent. So these are some cool metrics that we can review to understand how our model is performing. It's performing well for now. Let's check our Evaluate Model. So this is the extra one that I told you about: instead of the Score Model only, we are going to add, what, the Evaluate Model after it. So here we're going to go to our Asset Library, we are going to choose the Evaluate Model, we are going to put it here and connect it, and we are going to submit the job using the same experiment name that we used previously. Let's review it. So, after it finishes, you will find it here. I have already done it before; this is how I'm able to see the output. So let's see what the output of this evaluation process is. Here it shows you that there are some metrics, like the confusion matrix, which Carlotta told you about; there is the accuracy, the precision, the recall, and the F1 score. Every metric gives us some insight about our model. It helps us to understand it more, and understand if it's overfitting, if it's good, if it's bad, and really, like, understand how it's working. Now I'm just waiting for the job to load. Until it loads, we can continue to work on our model. So I will go to my designer. I'm just going to confirm this. And I'm going to continue working on it from where we stopped. Where have we stopped? We stopped at the Evaluate Model. So I'm going to choose this one.
And it says here "select experiment", "create inference pipeline", so I am going to go to the jobs and select my experiment. I hope this works. Okay. Finally, now we have our Evaluate Model output. Let's preview the evaluation results and... come on. Finally. Now we can create our inference pipeline. So, I think it says... select the experiment, then select "mslearn". So I am just going to select it, and finally. Now we can see the ROC curve, with the true positive rate and the false positive rate: the false positive rate is increasing along the curve, and so is the true positive rate. A true positive is something that the model predicted as positive, that the person has diabetes, and it's really true: the person really has diabetes. And for a false positive, it predicted that someone has diabetes, but that person doesn't have it. This is what true positive and false positive mean. This is the precision-recall curve, so we can review the metrics of our model. This is the lift curve. I can change the threshold of my confusion matrix here, and if Carlotta wants to add anything about the graphs, you can do so. [CARLOTTA]: Um, yeah, I just wanted to... if you go... yeah. I just wanted to comment, for the ROC curve, that from this graph the metric which we usually compute is the area under the curve. This coefficient, or metric, is a value that can span from zero to one, and the higher, the better the score. So the closer to one, so the larger the area under this curve, the higher the performance we've got from our model. And another thing is what John is playing with: this threshold for the logistic regression is the threshold used by the model to predict whether the category is zero or one.
So if the probability score is above the threshold, the category will be predicted as one, while if the probability is below the threshold, in this case, for example, 0.5, the category is predicted as zero. That's why it's very important to choose the threshold, because the performance can really vary with this threshold value. [JOHN]: Thank you so much, Carlotta. As I mentioned, now we are going to create our inference pipeline. We are going to select the latest job, which I already have open here. This is the one that we were reviewing together, this is where we stopped, and we're going to create an inference pipeline. We are going to choose a real-time inference pipeline. Where can I find this? Here, as it says, "Real-time inference pipeline". It's going to add some things to my workspace: the web service input and the web service output, because we will be creating it as a web service so we can access it from the internet. What are we going to do? We're going to remove this diabetes data, and we are going to get a component called...let me check...it's "Enter Data Manually". We already have the web service input present. So we are going to get the Enter Data Manually component, and we're going to connect it as it was connected before, like that. Also, I am not going to connect the score model directly to the web service output like that. I'm going to delete this connection and execute a Python script before I display my result, so this will be connected the other way around. From here, I am going to connect this with that. There is some data that we will be getting from the explanation here, and this is the data that will be entered into our web service manually, instead of the data that we had been getting from the dataset we created.
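The threshold behavior Carlotta describes can be sketched in a few lines of Python. The probability scores and the alternative threshold below are illustrative, not values from the demo:

```python
# Minimal sketch of the classification threshold: the model outputs a
# probability score, and the chosen threshold turns it into a 0/1 label.
# Scores and the 0.6 alternative threshold are made-up examples.

def predict_label(probability, threshold=0.5):
    """Return 1 if the scored probability meets the threshold, else 0."""
    return 1 if probability >= threshold else 0

scores = [0.15, 0.48, 0.52, 0.91]
print([predict_label(p) for p in scores])                  # default threshold 0.5
print([predict_label(p, threshold=0.6) for p in scores])   # stricter threshold
```

Raising the threshold makes the model more conservative about predicting the positive class, which is exactly why moving the slider changes the confusion matrix John is showing.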
So I'm just going to double-click on it, choose CSV, choose "it has headers", and copy this content and put it there. Let's do it. I think I have to click on "Edit code"; now I can click on "Save", and I can close it. Another thing is the Python script that we will be executing. We are going to remove this, too: we don't need the Evaluate Model module anymore, so we are going to remove it. The Python script that I will be executing, I can find here. This is the script that we will execute, and it says that this code selects only the patient's ID, the scored label, and the scored probability, and returns them to the web service output. We don't want to return all the columns, as we saw previously when it returned everything; we want to return only the specific columns that we will use from our endpoint. So I'm just going to select everything, delete it, and paste the code that I got from the Microsoft Learn docs. Now I can click on "Save" and close this. Let me check something, I don't think it saved. It's saved, but the display is wrong, okay. Now I think everything is good to go; I'm just gonna double-check everything. We are gonna change the name of this pipeline and call it "Predict diabetes". Now let's close it, and I think that we are good to go. Okay, I think everything is good for us; I just want to make sure of something. Is the data correct? Yeah, it's correct. Okay, now I can run the pipeline. Let's submit. Select an existing experiment, and we're going to choose "ms-learn-diabetes-training", which is the experiment we have been working on from the beginning of this module. I don't think this is going to take much time. So we have submitted the job and it's running. Until the job ends, we are going to set everything up for deploying a service.
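The actual script in the Microsoft Learn docs runs inside the Execute Python Script module and works on pandas DataFrames; as a rough pure-Python illustration of the same idea, keeping only the columns the web service should return (the column names and row values here are assumptions for illustration, not the real pipeline output):

```python
# Rough illustration of what the demo's Execute Python Script does:
# reduce each scored row to just the columns the endpoint should expose.
# Column names and the sample row are hypothetical.

KEEP = ("PatientID", "DiabetesPrediction", "Probability")

def select_columns(rows, keep=KEEP):
    """Return scored rows reduced to only the columns the endpoint returns."""
    return [{col: row[col] for col in keep} for row in rows]

scored = [
    {"PatientID": 1882185, "Pregnancies": 9, "Age": 43,
     "DiabetesPrediction": 1, "Probability": 0.80},
]
print(select_columns(scored))
```

The point of the step is the same either way: the caller of the web service gets back only the prediction and its probability, not every feature column that flowed through the pipeline.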
In order to deploy a service, I have to have the job ready; until it's ready, you can't deploy. So let's go to the job details from here. And until it finishes, Carlotta, do you think we can take the questions, and then we can get back to the job and deploy it? [CARLOTTA]: Yeah, yeah. So, guys, if you have any questions on what you just saw here or on the introduction, feel free. This is a good moment; we can discuss now while we wait for this job to finish. [JOHN]: And...can we have knowledge check one? What do you think? [CARLOTTA]: Yeah, we can also go to the knowledge check. Okay, so let me share my screen. If you don't have any questions for us, we can propose some questions to you, so you can check your knowledge so far, and you can answer these questions via chat. Do you see my screen? Can you see my screen? [JOHN]: Yes. [CARLOTTA]: So, John, I think I will read this question aloud and ask it to you, okay? Are you ready to answer? [JOHN]: Yes I am. [CARLOTTA]: You're using Azure Machine Learning designer to create a training pipeline for a binary classification model, like what we were doing in our demo, right? And you have added a dataset containing features and labels, a Two-Class Decision Forest module (we used a logistic regression model in our example; here it's a Two-Class Decision Forest model), and, of course, a Train Model module. You now plan to use Score Model and Evaluate Model modules to test the trained model with the subset of the dataset that wasn't used for training. But what are we missing? What's another module you should add? We have three options: Join Data, Split Data, or Select Columns in Dataset. So while John thinks about the answer, go ahead and answer yourselves. Give us your guess: put it in the chat, or just come off mute and answer. "A", "B".
[JOHN]: Yeah, what do you think is the correct answer for this one? I have to score my model, and I have to evaluate it, so I need something to enable me to do these two things. [CARLOTTA]: I think it's something you showed us in your pipeline, right, John? [JOHN]: Of course I did. [CARLOTTA]: We have no guesses in the chat? [JOHN]: Does someone want to guess? [CARLOTTA]: We have a "B". [JOHN]: So, to do this, I mentioned the module that helps me divide my data into two parts: 70 percent for the training and 30 percent for the evaluation. So what did I use? I used Split Data, because this is what splits my data randomly into training data and validation data. So the correct answer is "B". Good job, and thank you for participating. Next question, please. [CARLOTTA]: Yes, "B" is the correct answer, so thanks, John, for explaining the correct one. Shall we go with question two? [JOHN]: Yeah. So I'm going to ask you now, Carlotta. You use Azure Machine Learning designer to create a training pipeline for your classification model. What must you do before you deploy this model as a service? You have to do something before you deploy it. What do you think is the correct answer: "A", "B", or "C"? Share your thoughts with us in the chat, and I'm going to give you a minute to think about it before I tell you the answer. [CARLOTTA]: Yeah, so let me go through the possible answers. We have A: "Create an inference pipeline from the training pipeline"; B: "Add an Evaluate Model module to the training pipeline"; and C: "Clone the training pipeline with a different name". So what do you think is the correct answer, "A", "B", or "C"? This time too, I think it's something we mentioned both in the deck and in the demo, right? [JOHN]: Yes it is, it's something I did about five minutes ago. It's real-time, real-time.
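Coming back to John's answer to question one, the 70/30 split the Split Data module performs can be sketched like this (a minimal illustration under the assumption of a simple seeded shuffle, not the module's actual implementation):

```python
# Minimal sketch of a Split Data-style 70/30 random split: shuffle the
# rows with a fixed seed for reproducibility, then cut the list in two.

import random

def split_data(rows, train_fraction=0.7, seed=123):
    """Randomly partition rows into training and validation subsets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)   # seeded, so the split is repeatable
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

train, valid = split_data(range(100))
print(len(train), len(valid))
```

Holding out the 30 percent the model never saw during training is what makes the later Score Model and Evaluate Model steps an honest test rather than a memorization check.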
[CARLOTTA]: Yeah, so, think about it: you need to deploy the model as a service. I cannot evaluate the model after deploying it, right, because I cannot go into production if I'm not sure my model is performing well, if I'm not satisfied with it. So that's why I would exclude "B" from my answer. Thinking about "C", I didn't see you, John, cloning the training pipeline with a different name, so I don't think this is the right answer. What I have seen you do is create an inference pipeline from the training pipeline, and you just converted it using a one-click button, right? [JOHN]: Yeah, that's correct. So this is the right answer. Good job. So I created a real-time inference pipeline, and it's done: the job is finished, so we can now deploy. [LAUGHS] Exactly on time: it finished three or four seconds ago [LAUGHS]. So this is the job review, the details of the job that I already submitted. It's just opening, and once it opens...I don't know why it's so slow today; it's not like that usually. [CARLOTTA]: Yeah, it's probably because you are also sharing your screen on Teams, so that's the bandwidth of your connection. [JOHN]: Let me do something here because...yeah, finally. I can switch to my mobile internet if it does it again. So I will click on "Deploy", it's that simple. I'll just click on "Deploy", and I am going to deploy a new real-time endpoint. What am I going to name it? The name, description, and compute type are all already given for me here, so I'm just gonna copy and paste them, because we are running out of time. So it's on Azure Container Instance, not Azure Kubernetes Service, which is also a containerization service. Both are for containers, but each gives you something different.
For the advanced options, it doesn't tell us to do anything, so we are just gonna click on "Deploy", and now we can test our endpoint from the endpoints that we can find here; it's in progress. If I go here under the assets, I can find something called "Endpoints", with the real-time endpoints and the batch endpoints. We have created a real-time endpoint, so we are going to find it under that title. If I click on it, I should be able to test it once it's ready. It's still loading, but this is the input, and this is the output that we will get back. So if I click on "Test", from here I will input some data to the endpoint: the patient information, the columns that we have already seen in our dataset, the patient ID, the pregnancies, and so on. And of course I'm not going to enter the label that I'm trying to predict, so I'm not going to tell it whether the patient is diabetic or not; this endpoint is there to tell me that. The endpoint, at that URL, is going to give me back this information: whether someone has diabetes or not. So if I input this data, I'm just going to copy it, go to my endpoint, and click on "Test", and I'm gonna get the result back, which is the three columns that we defined inside our Python script: the patient ID, the diabetes prediction, and the probability, the certainty of whether someone is diabetic or not based on the prediction. So that's it. I think this is a really simple step; you can do it on your own and test it. And I think that I have finished, so thank you. [CARLOTTA]: Yes, we are running out of time. I just wanted to thank you, John, for this demo, for going through all these steps to create and train a classification model, and also deploy it as a predictive service.
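For anyone who wants to call the deployed endpoint from code rather than the Test tab, here is a hedged sketch of building the REST request. The URL, API key, and exact payload schema below are placeholders and assumptions for illustration; copy the real values from the endpoint's Consume tab in Azure Machine Learning studio:

```python
# Hedged sketch: build (but do not send) a scoring request for the
# deployed real-time endpoint. SCORING_URI, API_KEY, and the payload
# shape are placeholders/assumptions, not values from the real service.

import json
import urllib.request

SCORING_URI = "http://<your-aci-endpoint>/score"   # placeholder
API_KEY = "<your-api-key>"                         # placeholder

def build_request(patient):
    """Assemble the POST request for one patient record (label omitted)."""
    body = json.dumps({"Inputs": {"WebServiceInput0": [patient]},
                       "GlobalParameters": {}}).encode("utf-8")
    headers = {"Content-Type": "application/json",
               "Authorization": "Bearer " + API_KEY}
    return urllib.request.Request(SCORING_URI, data=body, headers=headers)

req = build_request({"PatientID": 1882185, "Pregnancies": 9, "Age": 43})
# urllib.request.urlopen(req) would then return JSON containing the
# patient ID, the diabetes prediction, and its probability.
```

Note that, exactly as John says, the request carries only the feature columns; the label is what the service sends back.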
And I encourage you all to go back to the Learn module and deepen all these topics at your own pace, and also maybe do this demo on your own, on your Azure for Students subscription. I would also like to recall that this is part of a series of Cloud Skills Challenge study sessions, so you will have more in the following days. These are to help you prepare for the Cloud Skills Challenge, which collects very interesting Learn modules that you can use to skill up on various topics, some of them focused on AI and ML. So if you are interested in these topics, you can select these Learn modules. Let me also copy the short link to the challenge into the chat. Remember that you have until the 13th of September to take the challenge. And also remember that on the 7th of October you can join the Student Developer Summit, which will be a virtual, or in some cases hybrid, event, so stay tuned, because you will have some surprises in the following days. If you want to learn more about this event, you can check the Microsoft Imagine Cup Twitter page and stay tuned. So thank you everyone for joining this session today, and thank you very much, John, for co-hosting this session with me. It was a pleasure. [JOHN]: Thank you so much, Carlotta, for having me with you today, and thank you for giving me this opportunity to be here with you. [CARLOTTA]: Great, thank you. [JOHN]: Yeah, I hope that we work again in the future. [CARLOTTA]: Sure, I hope so as well. So, thank you everyone, and have a nice rest of your day. Bye-bye. Speak to you soon. [JOHN]: Bye.