-
[CARLOTTA]: Great, so I think we can start
-
since the meeting is recorded, so if
-
anyone, uh, jumps in later, they
-
can watch the recording.
-
So, hi everyone and welcome to this
-
um, Cloud Skill Challenge study session
-
around creating classification models
-
with Azure Machine Learning designer.
-
So today I'm thrilled to be here with
-
John. Uh, John, do you mind
-
briefly introducing yourself?
-
[JOHN]: Uh, thank you Carlotta.
-
Hello everyone.
-
Welcome to our workshop today. I hope
-
that you are all excited for it. I am
-
John Aziz, a Gold Microsoft Learn Student
-
Ambassador, and I will be here with, uh,
-
Carlotta to do the practical part
-
about this module of the Cloud Skills
-
Challenge. Thank you for having me.
-
[CARLOTTA]: Perfect, thanks John.
-
So for those who
-
don't know me, I'm Carlotta Castelluccio,
-
based in Italy and focused on AI and
-
machine learning technologies and
-
their use in education.
-
Um, so,
-
um this Cloud Skill Challenge study
-
session is based on a Learn module, a
-
dedicated Learn module. I sent you, uh,
-
the link to this module, uh, in the chat
-
so that you can follow along with the
-
module if you want, or just have a look at
-
the module later at your own pace.
-
Um...
-
So, before starting I would also like to
-
remind you of, uh, the code of
-
conduct and guidelines of our Student
-
Ambassadors community. So please during this
-
meeting be respectful and inclusive and
-
be friendly, open, and welcoming and
-
respectful of each other's
-
differences.
-
If you want to learn more about the code
-
of conduct, you can use this link in the
-
deck: aka.ms/SACoC.
-
And now we are,
-
um, we are ready to start our session.
-
So, as we mentioned, we are going to
-
focus on classification models and Azure ML,
-
uh, today. So, first of all, we are going
-
to, um, identify, uh, the kind of
-
um, of scenarios in which you should
-
choose to use a classification model.
-
We're going to introduce Azure Machine
-
Learning and Azure Machine Learning designer.
-
We're going to understand, uh, which are
-
the steps to follow, to create a
-
classification model in Azure Machine
-
Learning, and then John will,
-
um,
-
lead an amazing demo about training and
-
publishing a classification model in
-
Azure ML Designer.
-
So, let's start from the beginning. Let's
-
start from identifying classification
-
machine learning scenarios.
-
So, first of all, what is classification?
-
Classification is a form of machine
-
learning that is used to predict which
-
category or class an item belongs to. For
-
example, we might want to develop a
-
classifier able to identify if an
-
incoming email should be filtered or not
-
according to the style, the sender, the
-
length of the email, etc.
-
In this case, the
-
characteristics of the email are the
-
features.
-
And the label is a classification of
-
either a zero or one, representing spam
-
or non-spam for the incoming email. So
-
this is an example of a binary
-
classifier. If you want to assign
-
multiple categories to the incoming
-
email like work letters, love letters,
-
complaints, or other categories, in this
-
case a binary classifier is no longer
-
enough, and we should develop a
-
multi-class classifier. So classification
-
is an example of what is called
-
supervised machine learning
-
in which you train a model using data
-
that includes both the features and
-
known values for the label
-
so that the model learns to fit the
-
feature combinations to the label. Then,
-
after training has been completed, you
-
can use the trained model to predict
-
labels for new items for which the
-
label is unknown.
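To make the idea of training on features and known labels concrete, here is a minimal sketch of a supervised binary classifier. It is not what Azure Machine Learning runs internally: it swaps in a simple nearest-centroid rule, and the two email features (number of links, number of exclamation marks), the sample values, and the mapping 1 = spam, 0 = not spam are all assumptions made for illustration.

```python
from statistics import mean

# Hypothetical training data: ((num_links, num_exclamation_marks), label).
# Assumed label mapping for this sketch: 1 = spam, 0 = not spam.
training = [
    ((8.0, 5.0), 1),
    ((6.0, 7.0), 1),
    ((1.0, 0.0), 0),
    ((0.0, 1.0), 0),
]


def fit_centroids(rows):
    """Training step: compute the per-feature mean of each labeled class."""
    by_label = {}
    for features, label in rows:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*points))
        for label, points in by_label.items()
    }


def predict(centroids, features):
    """Prediction step: pick the label whose centroid is closest."""
    def squared_distance(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))
    return min(centroids, key=squared_distance)


centroids = fit_centroids(training)
# New items for which the label is unknown:
print(predict(centroids, (7.0, 6.0)))
print(predict(centroids, (0.5, 0.5)))
```

The same two phases, fitting on labeled rows and then predicting for unlabeled rows, are what the designer pipeline in the demo performs with its own components.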
-
But let's see some examples of scenarios
-
for classification machine learning
-
models. So, we already mentioned an
-
example of a solution in which we would
-
need a classifier, but let's explore
-
other scenarios for classification in
-
other industries. For example, you can use
-
a classification model for a health
-
clinic scenario, and use clinical data to
-
predict whether a patient will become sick
-
or not.
-
You can use, um...
-
[NO AUDIO]
-
[JOHN]: Carlotta, you are muted.
-
[CARLOTTA]: Oh, sorry.
So, when did I get muted? Was it a
-
long time, or?
-
[JOHN]: You can use-you can use, uh
-
some models for classification.
-
For example, you can use...
-
You were saying this.
-
[CARLOTTA]: Uh, so I was in this deck,
-
or the previous one?
-
[JOHN]: This one, you have been muted
-
for, uh, one second [LAUGHS].
-
[CARLOTTA]: Okay, okay perfect, perfect.
-
Uh, yeah I was talking...sorry for
-
that. So, I was talking about the possible
-
scenarios in which you,
-
you can use a classification model. Like
-
a health clinic scenario, a financial scenario,
-
or the third one is a business type of
-
scenario. You can use characteristics of
-
a small business to predict if a new
-
venture will succeed or not, for
-
example. And these are all types of
-
binary classification.
-
Uh, but today we are also going to talk
-
about Azure Machine Learning. So let's
-
see.
-
What is Azure Machine Learning? So
-
training and deploying an effective
-
machine learning model involves a lot of
-
work, much of it time-consuming and
-
resource intensive. So, Azure Machine
-
Learning is a cloud-based service that
-
helps simplify some of the tasks it
-
takes to prepare data, train a model, and
-
also deploy it as a predictive service.
-
So it helps data scientists increase
-
their efficiency by automating many of
-
the time-consuming tasks associated with
-
creating and training a model.
-
And it enables them also to use
-
cloud-based compute resources that scale
-
effectively to handle large volumes of
-
data while incurring costs only when
-
actually used.
-
To use Azure Machine Learning, you,
-
first things first, you need to create a
-
workspace resource in your Azure
-
subscription, and you can then use this
-
workspace to manage data, compute
-
resources, code, models, and other
-
artifacts. After you have created an
-
Azure Machine Learning workspace,
-
you can develop solutions with the
-
Azure Machine Learning service,
-
either with developer
-
tools or the Azure Machine Learning
-
studio web portal.
-
In particular,
Azure Machine Learning studio
-
is a web portal for machine
-
learning solutions in Azure, and it
-
includes a wide range of features and
-
capabilities that help data scientists
-
prepare data, train models, publish
-
predictive services, and monitor also
-
their usage.
-
So to begin using the web portal, you need
-
to assign the workspace
-
you created in the Azure portal
-
to the Azure Machine
-
Learning studio.
-
At its core, Azure Machine Learning is a
-
service for training and managing
-
machine learning models for which you
-
need compute resources on which to run
-
the training process.
-
Compute targets are, um, one of the main
-
basic concepts of Azure Machine Learning.
-
They are cloud-based resources on which
-
you can run model training and data
-
exploration processes.
-
So in Azure Machine Learning studio, you
-
can manage the compute targets for your
-
data science activities, and there are
-
four kinds of compute targets you can
-
create. We have the compute instances,
-
which are virtual machines set up for
-
running machine learning code during
-
development, so they are not designed for
-
production.
-
Then we have compute clusters, which are
-
a set of virtual machines that can scale
-
up automatically based on traffic.
-
We have inference clusters, which are
-
similar to compute clusters, but they are
-
designed for deployment, so they are
-
deployment targets for predictive
-
services that use trained models.
-
And finally, we have attached compute,
-
which is any compute target that you
-
manage yourself outside of Azure ML, like,
-
for example, virtual machines or Azure
-
Databricks clusters.
-
So we talked about Azure Machine
-
Learning, but we also mentioned-
-
mentioned Azure Machine Learning
-
designer. What is Azure Machine Learning
-
designer? So, in Azure Machine Learning
-
Studio, there are several ways to author
-
classification machine learning models.
-
One way is to use a visual interface, and
-
this visual interface is called designer,
-
and you can use it to train, test, and
-
also deploy machine learning models. And
-
the drag-and-drop interface makes use of
-
clearly defined inputs and outputs that
-
can be shared, reused, and also version
-
controlled.
-
And using the designer, you can identify
-
the building blocks or components needed
-
for your model, place and connect them on
-
your canvas, and run a machine learning
-
job.
-
So,
-
each designer project, so each project
-
in the designer is known as a pipeline.
-
And in the designer, we have a left panel
-
for navigation and a canvas on your
-
right hand side in which you build your
-
pipeline visually. So pipelines let you
-
organize, manage, and reuse complex
-
machine learning workflows across
-
projects and users.
-
A pipeline starts with the data set from
-
which you want to train the model
-
because it all begins with data when
-
talking about data science and machine
-
learning. And each time you run a
-
pipeline, the configuration of the
-
pipeline and its results are stored in
-
your workspace as a pipeline job.
-
So the second main concept of Azure
-
Machine Learning is a component. So, going
-
hierarchically from the pipeline, we can
-
say that each building block of a
-
pipeline is called a component.
-
In other words, an Azure Machine
-
Learning component encapsulates one step
-
in a machine learning pipeline. So, it's a
-
reusable piece of code with inputs and
-
outputs, something very similar to a
-
function in any programming language.
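The comparison between a component and a function can be sketched directly in code. This is only an illustration of the idea, not the designer's own API: the two components and the `run_pipeline` helper are hypothetical names, and the age rule is made up for the example.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[float]]
Component = Callable[[List[Row]], List[Row]]


def drop_incomplete_rows(rows: List[Row]) -> List[Row]:
    """Component 1: keep only rows in which no value is missing (None)."""
    return [row for row in rows if all(value is not None for value in row.values())]


def keep_adults(rows: List[Row]) -> List[Row]:
    """Component 2 (hypothetical rule): keep rows with an age of at least 18."""
    return [row for row in rows if row["age"] >= 18]


def run_pipeline(components: List[Component], rows: List[Row]) -> List[Row]:
    """Feed the output of each component into the input of the next one."""
    for component in components:
        rows = component(rows)
    return rows


data = [
    {"age": 25, "bmi": 22.5},
    {"age": 40, "bmi": None},
    {"age": 12, "bmi": 18.0},
]
result = run_pipeline([drop_incomplete_rows, keep_adults], data)
print(result)
```

Each function has a clear input and output, so it can be reused in another pipeline, which is the property the designer components share.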
-
And in a pipeline project, you can access
-
data assets and components from the left
-
panel's
-
Asset Library tab, as you can see
-
here in the screenshot in the deck.
-
So you can create data assets using
-
an ad hoc page called the Data page. And a data
-
asset is a reference to a data source
-
location.
-
So this data source location could be a
-
local file, a data store, a web file, or
-
even an Azure Open Dataset.
-
And these data assets will appear along
-
with standard sample data sets in the
-
designer's Asset Library.
-
Um.
-
Another basic concept of Azure ML is
-
Azure Machine Learning jobs.
-
So, basically, when you submit a pipeline,
-
you create a job which will run all the
-
steps in your pipeline. So a job executes
-
a task against a specified compute
-
target.
-
Jobs enable systematic tracking for your
-
machine learning experimentation in
-
Azure ML.
-
And once a job is created, Azure ML
-
maintains a run record, uh, for the
-
job.
-
Um, but, let's move to the classification
-
steps. So,
-
um, let's introduce how to create a
-
classification model in Azure ML, but you
-
will see it in more detail in a
-
hands-on demo that John will guide us
-
through in a few minutes.
-
So, you can think of the steps to train
-
and evaluate a classification machine
-
learning model as four main steps. So
-
first of all, you need to prepare your
-
data. So, you need to identify the
-
features and the label in your data set,
-
you need to pre-process, so you need to
-
clean and transform the data as needed.
-
Then, the second step, of course, is
-
training the model.
-
And for training the model, you need to
-
split the data into two groups: a
-
training and a validation set.
-
Then you train a machine learning model
-
using the training data set and you test
-
the machine learning model for
-
performance using the validation data
-
set.
-
The third step is performance evaluation,
-
which means comparing how close the
-
model's predictions are to the known
-
labels, and this leads us to compute some
-
evaluation performance metrics.
-
And then finally...
-
So, these three steps are not,
-
um, not performed every time in a
-
linear manner. It's more an iterative
-
process. But once you obtain, you achieve
-
a performance with which you are
-
satisfied, so you are ready to, let's say
-
go into production, and you can deploy
-
your trained model as a predictive service
-
into a real-time, uh, to a real-time
-
endpoint. And to do so, you need to
-
convert the training pipeline into a
-
real-time inference pipeline, and then
-
you can deploy the model as an
-
application on a server or device so
-
that others can consume this model.
-
So let's start with the first step, which
-
is prepare data. Real-world data can contain
-
many different issues that can affect
-
the utility of the data and our
-
interpretation of the results. So also
-
the machine learning model that you
-
train using this data. For example, real-
-
world data can be affected by a bad
-
recording or a bad measurement, and it
-
can also contain missing values for some
-
parameters. And Azure Machine Learning
-
designer has several pre-built
-
components that can be used to prepare
-
data for training. These components
-
enable you to clean data, normalize
-
features, join tables, and more.
-
Let's come to training. So, to train a
-
classification model you need a data set
-
that includes historical features, so the
-
characteristics of the entity for which
-
we want to make a prediction, and known label
-
values. The label is the class indicator
-
we want to train a model to predict.
-
And it's common practice to train a
-
model using a subset of the data while
-
holding back some data with which to
-
test the trained model. And this enables
-
you to compare the labels that the model
-
predicts with the actual known labels in
-
the original data set.
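Holding back part of the data can be sketched in a few lines of plain Python. This is an illustration of the idea only, not the Split Data component itself; the function name `split_rows`, the 70/30 fraction, and the seed value 0 are assumptions chosen for the example.

```python
import random


def split_rows(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows with a fixed seed, then cut it in two."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


rows = list(range(10))  # stand-in for ten labeled records
train, validation = split_rows(rows, train_fraction=0.7, seed=0)
print(len(train), len(validation))
```

Fixing the seed makes the split repeatable, so two runs of the same pipeline train and validate on the same rows.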
-
This operation can be performed in the
-
designer using the split data component
-
as shown by the screenshot here in the...
-
in the deck.
-
There's also another component that you
-
should use, which is the score model
-
component to generate the predicted
-
class label value using the validation
-
data as input. So once you connect all
-
these components,
-
the component specifying the
-
model we are going to use, the split data
-
component, the Train Model component,
-
and the score model component, you want
-
to run a new experiment in
-
Azure ML, which will use the data set
-
on the canvas to train and score a model.
-
After training a model, it is important,
-
we say, to evaluate its performance, to
-
understand how bad-how good sorry
-
our model is performing.
-
And there are many performance metrics
-
and methodologies for evaluating how
-
well a model makes predictions. The
-
component to use to perform evaluation
-
in Azure ML designer is called, as
-
intuitive as it is, Evaluate Model.
-
Once the job of training and evaluation
-
of the model is completed, you can review
-
evaluation metrics on the completed job
-
page by right clicking on the component.
-
In the evaluation results, you can also
-
find the so-called confusion matrix that
-
you can see here in the right side of
-
this deck.
-
A confusion matrix shows cases where
-
both the predicted and actual values
-
were one, the so-called true positives
-
at the top left and also cases where
-
both the predicted and the actual values
-
were zero, the so-called true negatives
-
at the bottom right. While the other
-
cells show cases where the predicted
-
and actual values differ,
-
called false positives and false
-
negatives, and this is an example of a
-
confusion matrix for a binary classifier.
-
While for a multi-class classification
-
model the same approach is used to
-
tabulate each possible combination of
-
actual and predicted value counts. So
-
for example, a model with three possible
-
classes would result in a three by
-
three matrix.
-
The confusion matrix is also useful for
-
the metrics that can be derived from it,
-
like accuracy, recall, or precision.
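As a worked example of how those metrics come out of the four cells of a binary confusion matrix, here is a small sketch. The actual and predicted label lists are made up for illustration, and the function names are hypothetical.

```python
def confusion_counts(actual, predicted):
    """Count the four cells of a binary confusion matrix (1 = positive)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # true positives
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # true negatives
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false positives
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # false negatives
    return tp, tn, fp, fn


def metrics(tp, tn, fp, fn):
    """Derive accuracy, precision, and recall from the four counts."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


# Made-up labels for eight items.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]

counts = confusion_counts(actual, predicted)
print(counts)
print(metrics(*counts))
```

Accuracy is the share of all predictions that were right, precision is the share of predicted positives that were really positive, and recall is the share of real positives that the model found.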
-
We said that the last step is
-
deploying the trained model to a real-time
-
endpoint as a predictive service. And in
-
order to automate your model into a
-
service that makes continuous
-
predictions, you need, first of all, to
-
create and then deploy an
-
inference pipeline. The process of
-
converting the training pipeline into a
-
real-time inference pipeline removes
-
training components and adds web service
-
inputs and outputs to handle requests.
-
And the inference pipeline performs the
-
same data transformations as the
-
first pipeline, but for new data. Then it
-
uses the trained model to infer or predict
-
label values based on its features.
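The relationship between the training pipeline and the inference pipeline can be sketched as follows. This is a simplified illustration under stated assumptions, not the code the designer generates: the single `bmi` feature, the training values, the stand-in `trained_model` rule, and the `handle_request` function are all hypothetical.

```python
def fit_min_max(values):
    """Training time: learn the scaling parameters from the training data."""
    return min(values), max(values)


def apply_min_max(value, low, high):
    """Training and inference time: apply the same learned scaling."""
    return (value - low) / (high - low)


# Training pipeline: the parameters are learned once from training data.
training_bmi = [18.0, 25.0, 38.0]  # made-up training values
LOW, HIGH = fit_min_max(training_bmi)


def trained_model(scaled_bmi):
    """Stand-in for a trained classifier (hypothetical rule, not a real model)."""
    return 1 if scaled_bmi >= 0.5 else 0


def handle_request(payload):
    """Inference pipeline: web service input, same transform, model, output."""
    scaled = apply_min_max(payload["bmi"], LOW, HIGH)
    return {"predicted_label": trained_model(scaled)}


print(handle_request({"bmi": 33.0}))
print(handle_request({"bmi": 20.0}))
```

The point of the sketch is that the request is scaled with the parameters learned during training, never refitted on the incoming data, and that no training step runs when a request arrives.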
-
So, I think I've talked a lot for now
-
I would like to let John show us
-
something in practice with
-
the hands-on demo, so please, John, go
-
ahead, share your screen and guide us
-
through this demo of creating a
-
classification model with
-
the Azure Machine Learning designer.
-
[JOHN]: Thank you so much Carlotta for
-
this interesting explanation of the
-
Azure ML designer. And now,
-
um, I'm going to start with you in the
-
practical demo part, so if you want to
-
follow along, go to the link that Carlotta
-
sent in the chat so you can do
-
the demo or the practical part with me.
-
I'm just going to share my screen...
-
and...
-
...go here. So, uh...
-
Where am I right now? I'm inside the
-
Microsoft Learn documentation. This is
-
the exercise part of this module, and we
-
will start by setting two things, which
-
are a prerequisite for us to work inside
-
this module, which are the resource group
-
and the Azure Machine Learning workspace,
-
and something extra which is the compute
-
cluster that Carlotta talked about. So I
-
just want to make sure that you all have
-
a resource group created inside your
-
portal inside your Microsoft Azure
-
platform. So this is my resource group.
-
Inside this resource group, I
-
have created an Azure Machine Learning
-
workspace. So I'm just going to access
-
the workspace that I have created
-
already from this link. I am going to
-
open it, which is the studio web URL, and
-
I will follow the steps. So what is this?
-
This is your machine learning workspace,
-
or machine learning studio. You can do a
-
lot of things here, but we are going to
-
focus mainly on the designer and the
-
data and the compute. So another
-
prerequisite here, as Carlotta told you,
-
we need some resources to power up the
-
classification, the processes that
-
will happen.
-
So, we have created this computing
-
cluster,
-
and we have set some presets for
-
it. So
-
where can you find this preset? You go
-
here. Under the create compute, you'll
-
find everything that you need to do. So
-
the size is the Standard DS11 Version 2,
-
and it's a CPU not GPU, because we don't
-
know the GPU, and we don't need a GPU.
-
Uh, it is ready for us to use.
-
The next thing which we will look into
-
is the designer. How can you access the
-
designer?
-
You can either click on this icon or
-
click on the navigation menu and click
-
on the designer for me.
-
Now I am inside my designer.
-
What we are going to do now is the
-
pipeline that Carlotta told you about.
-
And from where can I know these steps? If
-
you follow along in the learn module, you
-
will find everything that I'm doing
-
right now in detail, with screenshots
-
of course. So I'm going to create a new
-
pipeline, and I can do so by clicking on
-
this plus button.
-
It's going to redirect me to the
-
designer authoring the pipeline, uh, where
-
I can drag and drop data and components
-
that Carlotta told you the difference
-
between.
-
And here I am going to do some changes
-
to the settings. I am going to connect
-
this with my compute cluster that I
-
created previously so I can utilize it.
-
From here I'm going to choose this
-
compute cluster demo that I have shown
-
you before in the clusters here,
-
and I am going to change the name to
-
something more meaningful. Instead of
-
"Pipeline" and the date of today, I'm going
-
to name it Diabetes...
-
uh...
-
let's just check this training.
-
Let's say Training 0.1 or 01, okay?
-
And I am going to close this tab in
-
order to have a bigger place to work
-
inside because this is where we will
-
work, where everything will happen. So I
-
will click on close from here,
-
and I will go to the data and I will
-
create a new data set.
-
How can I create a new data set? There are
-
multiple options here you can find, from
-
local files, from data store, from web
-
files, from Open Datasets, but I'm going
-
to choose from web files, as this is the
-
way we're going to create our data.
-
From here, the information of my data set
-
I'm going to get them from the Microsoft
-
Learn module. So if we go to the step
-
that says "Create a dataset",
-
under it, it illustrates that you can
-
access the data from inside the asset
-
library, and inside your asset library,
-
you'll find the data and find the
-
component. And I'm going to select
-
this link because this is where my data
-
is stored. If you open this link, you will
-
find this is a CSV file, I think.
-
Yeah. And you can...like, all the data are
-
here.
-
Now let's get back..
-
Um...
-
And you are going to name it something
-
meaningful, but because I have already
-
created it twice before, I'm gonna
-
add a number to the name.
-
The data set type is tabular, and there is
-
the file type, but this is a table, so we're
-
going to choose tabular
-
for the data set type.
-
Now we will click on "Next". That's gonna
-
review, or display for you the content
-
of this file that you have
-
imported to this workspace.
-
And for these settings, these are
-
related to our file format.
-
So this is a delimited file, and it's not
-
plain text, it's not JSON. The delimiter
-
is comma, as we have seen that they
-
[INDISTINGUISHABLE]
-
So I'm choosing comma
-
errors because the only the first five...
-
[INDISTINGUISHABLE]
-
...for example. Okay, uh, if you have any
-
doubts, if you have any problems, please
-
don't hesitate to write me
-
in the chat,
-
like, what is blocking you, and
-
me and Carlotta will try to help you,
-
like whenever possible.
-
And now this is the new preview for my
-
data set. I can see that I have an ID, I
-
have patient ID, I have pregnancies, I
-
have the age of the people,
-
I have the body mass, I think
-
whether they have diabetes or not, as a
-
zero and one. Zero indicates a negative,
-
the person doesn't have diabetes, and one
-
indicates a positive, that this person
-
has diabetes. Okay.
-
Now I'm going to click on "Next". Here I am
-
defining my schema. All the data types
-
inside my columns, the column names, which
-
columns to include, which to exclude. And
-
here we will include everything except
-
the Path column. And we are
-
going to review the data types of each
-
column. So let's review this first one.
-
This is numbers, numbers, numbers, then it's the
-
integer. And this is,
-
um, like decimal..
-
...dotted...
-
decimal number. So we are going to choose
-
this data type.
-
And for this one
-
it says diabetic, and it's a zero or
-
one, and we are going to make it as
-
integers.
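The schema step, choosing a delimiter and a data type per column, can be sketched with Python's standard `csv` module. The column names below are assumptions based on the columns described in the preview, the two data rows are made up, and the real file from the module is not reproduced here.

```python
import csv
import io

# Assumed column names and types; the real data set may name them differently.
SCHEMA = {
    "PatientID": int,
    "Pregnancies": int,
    "BMI": float,      # decimal number
    "Age": int,
    "Diabetic": int,   # 0 = negative, 1 = positive
}

# Made-up sample standing in for the comma-delimited web file.
SAMPLE = """PatientID,Pregnancies,BMI,Age,Diabetic
1001,0,43.5,21,0
1002,8,21.2,23,1
"""


def load_rows(text, schema):
    """Parse comma-delimited text and cast each included column to its type."""
    reader = csv.DictReader(io.StringIO(text), delimiter=",")
    return [
        {name: cast(row[name]) for name, cast in schema.items()}
        for row in reader
    ]


rows = load_rows(SAMPLE, SCHEMA)
print(rows[0])
```

Columns that are left out of the schema dictionary are simply not carried over, which mirrors excluding a column in the schema screen.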
-
Now we are going to click on "Next" and
-
move to reviewing everything. This is
-
everything that we have defined together.
-
I will click on "Create".
-
And...
-
now the first step has ended. We have
-
gotten our data ready.
-
Now...what now? We're going to utilize the
-
designer...
-
um...power. We're going to drag and drop
-
our data set to create the pipeline.
-
So I have clicked on it and dragged it
-
to this space. It's gonna appear to you.
-
And we can inspect it by right clicking and
-
choosing "Preview data"
-
to see what we have created together.
-
From here, you can see everything that we
-
have seen previously, but in more
-
details. And we are just going to close
-
this. Now what? Now we are gonna do the
-
processing that Carlotta mentioned.
-
These are some instructions about the
-
data, about how you can look at them, how you
-
can open them but we are going to move
-
to the transformation or the processing.
-
So as Carlotta told you, like any data
-
for us to work on we have to do some
-
processing to it
-
to make it easier for the model to
-
be trained and easier to work with. So, uh,
-
we're gonna do the normalization. And
-
normalization means, uh,
-
to scale our data, either down or up, but
-
we're going to scale them down,
-
and we are going to decrease, uh,
-
relatively decrease
-
the values, all the values, to work
-
with lower numbers. And if we are working
-
with larger numbers, it's going to take
-
more time. If we're working with smaller
-
numbers, it's going to take less time to
-
calculate them, and that's it. So
-
where can I find Normalize Data? I
-
can find it inside my components.
-
So I will choose the components and
-
search for "Normalize Data".
-
I will drag and drop it as usual and I
-
will connect between these two things
-
by clicking on this spot, this, uh,
-
circle, and
-
drag and drop onto the next circle.
-
Now we are going to define our
-
normalization method.
-
So I'm going to double click on the
-
Normalize Data.
-
It's going to open the settings for the
-
normalization
-
and the transformation method, which is
-
the mathematical way
-
that our data is going to be
-
scaled.
-
We're going to choose min-max, and for
-
this one, we are going to choose "Use Zero",
-
for constant column we are going to
-
choose "True",
-
and we are going to define which columns
-
to normalize. So we are not going to
-
normalize the whole data set. We are
-
going to choose a subset from the data
-
set to normalize. So we're going to
-
choose everything except for the patient
-
ID and the diabetic, because the patient
-
ID is a number, but it's a categorical
-
data. It describes a patient, it's not a
-
number that I can sum. I can't say "patient
-
ID number one plus patient ID number two".
-
No, this is a patient and another
-
patient, it's not a number that I can do
-
mathematical operations on, so I'm not
-
going to choose it. So we will choose
-
everything as I said, except for the
-
diabetic and the patient ID. I will
-
click on "Save".
-
And it's not showing me a warning again,
-
everything is good.
-
Now I can click on "Submit"
-
and review my normalization output.
-
Um.
-
So, if you click on "Submit" here,
-
you will choose "Create new" and
-
set the name that is mentioned here
-
inside the notebook. So it tells you
-
to create a job and name it, name
-
the experiment "MS Learn Diabetes
-
Training", because you will continue
-
working on it and building components later.
-
I have it already created, I am the, uh,
-
we can review it together. So let
-
me just open this in another tab. I think
-
I have it...
-
here.
-
Okay.
-
So, these are all the jobs that I have
-
created.
-
All the jobs there. Let's do this over.
-
These are all the jobs that I have
-
submitted previously.
-
And I think this one is the
-
normalization job, so let's see the
-
output of it.
-
As you can see, it says, uh, "Check mark", yes,
-
which means that it worked, and we can
-
preview it. How can I do that? Right click
-
on it, choose "Preview data",
-
and as you can see all the data are
-
scaled down
-
so everything is between zero
-
and, uh, one I think.
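Min-max scaling is what puts every value between zero and one: each value x becomes (x - min) / (max - min) for its column. Here is a minimal sketch of that formula, not the Normalize Data component itself. Returning zeros for a column whose values are all the same is this sketch's reading of the "use zero for constant columns" setting, and the ages are made-up values.

```python
def min_max_scale(values):
    """Scale a column to the range [0, 1] using (x - min) / (max - min)."""
    low, high = min(values), max(values)
    if high == low:
        # Constant column: avoid dividing by zero and use 0 for every value.
        return [0.0 for _ in values]
    return [(value - low) / (high - low) for value in values]


ages = [21, 35, 49, 63]  # made-up values
scaled_ages = min_max_scale(ages)
print(scaled_ages)
print(min_max_scale([5, 5, 5]))
```

The smallest value in a column maps to 0 and the largest to 1, which matches what the preview of the normalized data shows.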
-
So everything is good for us. Now we
-
can move forward to the next step
-
which is to create the whole pipeline.
-
So, uh, Carlotta told you that
-
we're going to use a classification
-
model to create this data set, so let
-
me just drag and drop everything
-
to get runtime and we're doing
-
[INDISTINGUISHABLE]
-
about everything by
-
[INDISTINGUISHABLE]
-
So,
-
as a result, we are going to explain
-
[INDISTINGUISHABLE]
-
Yeah. So, I'm going to get this Split
-
Data. I'm going to take the
-
transformed data to Split Data and
-
connect it like that.
-
I'm going to get the Train Model
-
component because I want to train my
-
model,
-
and I'm going to put it right here.
-
Okay.
-
Let's just move it down there. Okay.
-
And we are going to use a classification
-
model,
-
a Two-Class
-
Logistic Regression model.
-
So I'm going to give this algorithm to
-
enable my model to work.
-
This is the untrained model, this is...
-
here.
-
The left...
-
the left, uh, circle, I'm going to
-
connect it to the data set, and the right
-
one, we are going to connect it to
-
evaluate model.
-
Evaluate model...so let's search for
-
"Evaluate model" here.
-
So because we want to do what...we want to
-
evaluate our model and see how it has
-
been doing. Is it good, is it bad?
-
Um, sorry...
-
This is...
-
this is down there
-
after the score model.
-
So we have to get the score model first,
-
so let's get it.
-
And this will take the trained model and
-
the data set
-
to score our model and see if it's
-
performing good or bad.
-
And...
-
um...
-
after that, we have finished
-
everything. Now, we are going to do the what?
-
The presets for everything.
-
As a starter, we will be splitting our
-
data. So
-
how are we going to do this, according to
-
what? To the split rules. So I'm going to
-
double-click on it and choose "Split rules".
-
And the percentage is
-
70 percent for the [INDISTINGUISHABLE]
-
and 30 percent of the
-
data for
-
the evaluation or for the scoring, okay?
-
I'm going to make it a randomization, so
-
I'm going to split data randomly and the
-
seed is, uh,
-
132, uh 23 I think...yeah.
-
And I think that's it.
-
The stratified split says false, and that's
-
good.
-
Now for the next one, which is the train
-
model we are going to connect it as
-
mentioned here.
-
And we have done that and...then why
-
am I having here? Let's double click
-
on it...yeah. It has...it needs the
-
label column that I am trying to predict.
-
So from here, I'm going to choose
-
diabetic. I'm going to save.
-
I'm going to close this one.
-
So it says here,
-
the diabetic label, the model, it will
-
predict the zero and one, because this is
-
a binary classification algorithm, so
-
it's going to predict either this or
-
that.
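To show what a two-class logistic regression does with a label column of zeros and ones, here is a minimal sketch trained with plain gradient descent on one already-normalized feature. It is not the algorithm implementation the designer uses: the data, learning rate, and number of epochs are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, ys, learning_rate=0.5, epochs=2000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Made-up normalized feature values and their known 0/1 labels.
xs = [0.0, 0.1, 0.2, 0.3, 0.7, 0.8, 0.9, 1.0]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(xs, ys)


def predict(x):
    """Return 1 when the predicted probability reaches 0.5, otherwise 0."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


print(predict(0.0), predict(1.0))
```

The model outputs a probability first, and the zero or one label comes from comparing that probability with a threshold.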
-
And...
-
um...
-
I think that's everything to run the
-
pipeline.
-
So everything is done, everything is good
-
for this one. We're just gonna leave it
-
for now, because this is the next
-
step.
-
Um, this will be put instead of the
-
score model, but let's...
-
let's delete it for now.
-
Okay.
-
Now we have to submit the job in order
-
to see the output of it. So I can click
-
on "Submit" and choose the previous job
-
which is the one that I have shown you
-
before.
-
And then let's review its output
-
together here.
-
So if I go to the jobs,
-
if I go to MS Learn, maybe it is training?
-
I think it's the one that lasted the
-
longest, this one here.
-
So here I can see
-
the job output, what happened inside
-
the model, as you can see.
-
So the normalization we have seen
-
before, the split data, I can preview it.
-
The result one or the result two as it
-
splits the data into 70 percent here and
-
30 percent here.
-
Um, I can see the score model, which is
-
something that we need
-
to review.
-
Inside the score model, uh, from
-
here,
-
we can see that...
-
let's get back here.
-
This is the data that the model has
-
been scored on, and this is the scoring output.
-
So it says "Scored Labels" true, and he is
-
not diabetic, so this is,
-
um,
-
a wrong prediction, let's say.
-
For this one it's true and true, and this
-
is a good, like, what do you say,
-
prediction, and the probabilities of this
-
score,
-
which means the certainty of our model
-
that this is really true. It's 80 percent.
-
For this one it's 75 percent.
-
So these are some cool metrics that we
-
can review to understand how our model
-
is performing. It's performing good for
-
now.
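The link between a scored probability and a scored label can be sketched like this. The three rows are made-up values, not the rows shown on screen, and the 0.5 default threshold and the function names are assumptions for the example.

```python
def scored_label(probability, threshold=0.5):
    """Turn a scored probability into a 0/1 label using a threshold."""
    return 1 if probability >= threshold else 0


def count_correct(rows, threshold):
    """Count rows whose scored label matches the actual label."""
    return sum(
        scored_label(row["scored_probability"], threshold) == row["actual"]
        for row in rows
    )


# Made-up scoring output: actual label and the model's probability of class 1.
scored_rows = [
    {"actual": 0, "scored_probability": 0.62},
    {"actual": 1, "scored_probability": 0.80},
    {"actual": 1, "scored_probability": 0.75},
]

for threshold in (0.5, 0.7):
    print(threshold, count_correct(scored_rows, threshold), "of", len(scored_rows))
```

With these made-up rows, the first row is a wrong prediction at a threshold of 0.5 and becomes a right one at 0.7, which is why moving the threshold changes the evaluation results.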
-
Let's check our evaluation model.
-
So this is the extra one that I told you
-
about. Instead of the
-
score model only, we are going to add
-
what? Evaluate Model
-
after it. So here
-
we're going to go to our Asset Library
-
and we are going to choose the evaluate
-
model,
-
and we are going to put it here, and we
-
are going to connect it, and we are going
-
to submit the job using the same name of
-
the job that we used previously.
-
Let's review it. Also, so, after it
-
finishes, you will find it here. So I have
-
already done it before, this is how I'm
-
able to see the output.
-
So let's see
-
what is the output of this
-
evaluation process.
-
Here it mentions to you that there are
-
some metrics,
-
like the confusion matrix, which Carlotta
-
told you about, there is the accuracy, the
-
precision, the recall, and F1 Score.
-
Every metric gives us some insight about
-
our model. It helps us to understand it
-
more, and, um,
-
understand if it's overfitting, if
-
it's good, if it's bad, and really really,
-
like, understand how it's working.
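-
As a rough illustration of the metrics John lists here, the sketch below computes the confusion matrix counts, accuracy, precision, recall and F1 score by hand. The function names and the small label lists are made up for this example; they are not taken from the designer or from the diabetes data set.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, TN, FN for binary labels (1 = diabetic, 0 = not diabetic)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(actual, predicted):
    """Derive accuracy, precision, recall and F1 from the confusion counts."""
    tp, fp, tn, fn = confusion_counts(actual, predicted)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative labels only: 8 made-up patients.
actual = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_counts(actual, predicted))
print(classification_metrics(actual, predicted))
```

With these made-up labels there are 3 true positives, 1 false positive, 3 true negatives and 1 false negative, so all four metrics come out at 0.75.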
-
Now I'm just waiting for the job to load.
-
Until it loads,
-
um,
-
we can continue
-
to work on our
-
model. So I will go to my designer. I'm
-
just going to confirm this.
-
And I'm going to continue working on it
-
from
-
where we have stopped. Where have we
-
stopped?
-
we have stopped on the evaluate model. So
-
I'm going to choose this one.
-
And it says here
-
"select experiment", "create inference
-
pipeline", so
-
I am going to go to the jobs,
-
I'm going to select my experiment.
-
I hope this works.
-
Okay. Finally, now we have our
-
evaluate model output.
-
Let's preview evaluation results
-
and, uh...
-
come on.
-
Finally. Now we can create our inference
-
pipeline. So,
-
I think it says that...
-
um...
-
select the experiment, then select MS
-
Learn. So,
-
I am just going to select it,
-
and finally. Now the ROC curve, we
-
can see it, with the true positive rate
-
and the false positive rate. The false
-
positive rate is increasing with time,
-
and also the true positive rate. True
-
positive is something that it predicted,
-
that it is, uh, positive it has diabetes,
-
and it's really...it's really true.
-
The person really has diabetes. Okay. And
-
for the false positive, it predicted that
-
someone has diabetes and that someone doesn't
-
have it. This is what true positive and
-
false positive mean. This is the recall
-
curve, so we can review the metrics
-
of our model. This is the lift curve. I
-
can change the threshold of my confusion
-
matrix here
-
and if Carlotta wants to add
-
anything about the...the graphs,
-
you can do so.
-
[CARLOTTA]: Um, yeah, so I just
-
wanted to...if you go...yeah.
-
I just wanted to comment for the
-
ROC curve, that actually from this
-
graph, the metric which usually we're
-
going to compute is the area under
-
the curve. And this coefficient or
-
metric,
-
it's a coefficient—
-
it's a value that could span from
-
zero to one, and the highest is...
-
...the highest is the score,
-
so the closest to one,
-
so the highest is the amount of
-
area under this curve,
-
the highest performance
-
we've got from our model.
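-
To make the area under the curve a bit more concrete, here is a minimal sketch that builds ROC points by sweeping the threshold over the predicted probabilities and then sums the area with the trapezoid rule. The function names and the example scores are assumptions for illustration; it also assumes both classes appear in the labels, and it is not how the designer computes the value internally.

```python
def roc_points(actual, scores):
    """Return (false positive rate, true positive rate) points.

    The threshold is swept from the highest score to the lowest, so the
    curve starts at (0, 0) and ends at (1, 1).
    Assumes `actual` contains at least one 1 and at least one 0.
    """
    positives = sum(actual)
    negatives = len(actual) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 1)
        fp = sum(1 for a, s in zip(actual, scores) if s >= threshold and a == 0)
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoid rule over consecutive ROC points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Illustrative data only: 1 = diabetic, 0 = not diabetic.
actual = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = area_under_curve(roc_points(actual, scores))
print(round(auc, 4))  # 0.8889
```

A model that ranks every diabetic patient above every non-diabetic one would give an area of 1.0, while random guessing lands around 0.5.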
-
And another thing is what John is
-
playing with. So this threshold for
-
the logistic
-
regression is the threshold used by the
-
model to, um,
-
to predict if the category is zero or
-
one. So if the probability—the
-
probability score is above the threshold,
-
then the category will be predicted as
-
one, while if the probability is
-
below the threshold, in this case, for
-
example, 0.5, the category is predicted
-
as zero. So that's why it's very
-
important to choose the threshold,
-
because the performance really can vary,
-
um,
-
with this threshold value.
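-
The thresholding Carlotta describes can be sketched in a few lines. The function name is hypothetical, the probabilities are made up (the first two echo the 80 and 75 percent seen in the scoring output), and treating a probability exactly equal to the threshold as class one is an assumption of this sketch.

```python
def predict_labels(probabilities, threshold=0.5):
    """Turn probability scores into 0/1 categories using a threshold.

    Assumption: a probability exactly equal to the threshold counts as 1.
    """
    return [1 if p >= threshold else 0 for p in probabilities]


# Illustrative probability scores for four patients.
probabilities = [0.80, 0.75, 0.45, 0.30]

print(predict_labels(probabilities, threshold=0.5))   # [1, 1, 0, 0]
print(predict_labels(probabilities, threshold=0.78))  # [1, 0, 0, 0]
```

Raising the threshold from 0.5 to 0.78 flips the second patient from one to zero, which is why the confusion matrix and the metrics change when the slider is moved.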
-
[JOHN]: Thank you so much, Carlotta, and
-
as I mentioned now, we are going to
-
create our inference pipeline. So we are
-
going to select the latest one, which I
-
already have it opened here. This is the
-
one that we were reviewing together. This
-
is where we have stopped, and we're going
-
to create an inference pipeline. We are
-
going to choose a real-time inference
-
pipeline, okay?
-
From where I can find this? Here, as it
-
says, "Real-time inference pipeline".
-
So it's gonna add some things to my
-
workspace. It's going to add the
-
web service input, it's gonna
-
have the web service output,
-
because we will be creating
-
it as a web service to access
-
it from the internet.
-
What are we going to do? We're going
-
to remove this diabetes data, okay?
-
And we are going to get a component
-
called "Web
-
input" and...let me check
-
it's "enter data manually".
-
We have...we already have the web input
-
present.
-
So we are going to get the Enter Data
-
Manually,
-
and we're going to collect it—to connect
-
it as it was connected before, like that.
-
And also, I am not going to directly take
-
the web service...sorry, the score model to
-
the web service output like that.
-
I'm going to delete this
-
and I'm going to execute a python script
-
before
-
I display my result.
-
So,
-
this will be connected like...
-
So...
-
the other way around.
-
And from here, I am going to connect this
-
with that and there is some data that
-
we will be getting from the node, or from
-
the explanation here, and this is the
-
data that will be entered to our
-
website manually. Okay? This is instead of
-
the data that we have been getting from
-
our data set that we created. So I'm just
-
going to double click on it and choose
-
CSV, and I will choose "it has headers",
-
and I will take or copy this content and
-
put it there, okay?
-
So let's do it.
-
I think I have to click on edit code, now
-
I can click on "Save", and I can close it.
-
Another thing which is the python script
-
that we will be executing.
-
Um, yeah. We
-
are going to remove this, also.
-
We don't need the evaluate model
-
anymore, so we are going to remove it.
-
The python script
-
that I will be executing,
-
I can find it here.
-
Um, yeah.
-
This is the python script that we will
-
execute. And it says to you that this
-
code selects only the patient's ID
-
the score label, the score
-
probability and return—returns them to
-
the web service output. So we don't want
-
to return all the columns, as we have
-
seen previously,
-
that it returns everything,
-
so
-
we want to return certain stuff, the
-
stuff that we will use inside our
-
endpoint. So I'm just going to select
-
everything and delete it, and
-
paste the code that I have gotten from
-
the, uh,
-
the Microsoft Learn docs.
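-
The script itself is not shown in the recording, so the following is only a sketch of the idea John describes: keep the patient ID, the scored label and the scored probability, and drop every other column. It uses plain Python dictionaries so it can run anywhere; in the designer, the Execute Python Script component passes the data to the script as a pandas DataFrame instead. The column name "PatientID" and the output names "DiabetesPrediction" and "Probability" are assumptions here.

```python
# Map of columns to keep -> name to return them under (assumed names).
OUTPUT_COLUMNS = {
    "PatientID": "PatientID",
    "Scored Labels": "DiabetesPrediction",
    "Scored Probabilities": "Probability",
}


def select_output_columns(rows):
    """Keep only the columns the endpoint should return, renaming them."""
    return [
        {new_name: row[old_name] for old_name, new_name in OUTPUT_COLUMNS.items()}
        for row in rows
    ]


# One made-up scored row, with extra feature columns that get dropped.
scored_rows = [
    {"PatientID": 1, "Pregnancies": 2, "Age": 43,
     "Scored Labels": 1, "Scored Probabilities": 0.8},
]

print(select_output_columns(scored_rows))
# [{'PatientID': 1, 'DiabetesPrediction': 1, 'Probability': 0.8}]
```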
-
Now I can click on "Save", and I can close
-
this.
-
Let me check something,
-
I don't think it saved.
-
It's saved, but the display is
-
wrong, okay.
-
And now I think everything is good to go.
-
I'm just gonna double-check everything.
-
So, uh, yeah. We are gonna change the name
-
of this
-
pipeline, and we are gonna call it
-
"Predict
-
diabetes", okay?
-
Now let's close it, and
-
I think that we are good to go. So,
-
um,
-
Okay, I think everything is good for us.
-
I just want to make sure of something.
-
Is the data...
-
it's correct, the data is...yeah,
-
it's correct.
-
Okay, now I can run the pipeline. Let's
-
submit.
-
Select an "existing" pipeline, and we're
-
going to choose
-
the "ms-learn-diabetes-training",
-
which is the pipeline
-
that we have been working on
-
from the beginning of this module.
-
I don't think that this is going to take
-
much time. So we have submitted the job
-
and it's running.
-
Until the job ends, we are going to set
-
everything
-
for deploying a service.
-
In order to deploy a service,
-
um,
-
I have to have the job ready, so
-
until it's ready, you can't deploy it. So
-
let's go to the job—the job details from
-
here, okay?
-
And until it finishes,
-
Carlotta, do you think that we can have
-
the questions, and then we can get back
-
to the job and deploying it?
-
[CARLOTTA]: Yeah, yeah, yeah.
-
So yeah, guys, if you
-
have any questions
-
on what you just saw here
-
or in the introduction, feel free. This is
-
a good moment, we can...we can discuss
-
now, while we wait for this job to
-
finish.
-
[JOHN]: Uh, and....
-
can...
-
we have the knowledge check one? Or, like,
-
what do you think?
-
[CARLOTTA]: Yeah, we can also go
-
to the knowledge check.
-
Um...
-
Yeah, okay. So let me share my screen.
-
Yeah, so if you don't have any questions
-
for us, we can maybe propose some
-
questions to you that you can,
-
um,
-
use to check your knowledge so far, and you
-
can maybe answer these questions
-
via chat.
-
So we have...do you see my screen, can
-
you see my screen?
-
[JOHN]: Yes.
-
[CARLOTTA]: So, John, I think I will
-
read this
-
question aloud and ask it to you, okay? So
-
are you ready to answer?
-
[JOHN]: Yes, I am.
-
[CARLOTTA]: So...
-
you're using Azure Machine Learning
-
designer to create a training pipeline
-
for a binary classification model, so
-
what we were doing in our demo,
-
right? And you have added a data set
-
containing features and labels, a Two-
-
Class Decision Forest module. So we used
-
a logistic regression model our...
-
um, in our example.
-
Here, we're using a Two-
-
Class Decision Forest model.
-
And, of course, a Train Model module. You
-
plan now to use score model and evaluate
-
model modules to test the trained model
-
with the subset of the data set that
-
wasn't used for training.
-
But what are we missing? So what's
-
another module you should add? We have
-
three options: we have Join Data, we have
-
Split Data, or we have Select Columns
-
in Dataset.
-
So
-
while John thinks about the answer,
-
go ahead and,
-
um,
-
answer yourself. So give us your
-
guess.
-
Put it in the chat, or just come off mute
-
and answer.
-
"A", "B".
-
[JOHN]: Yeah, what do you think
-
is the correct
-
answer for this one? I need something to
-
uh...I have to score my model, and I
-
have to evaluate it, so I need
-
something to enable me to do these two
-
things.
-
[CARLOTTA]: I think it's something
-
you showed us in your pipeline,
-
right John?
-
[JOHN]: Of course I did.
-
[CARLOTTA]: Uh, we have no guesses
-
in the chat?
-
[JOHN]: Can someone...
-
Someone want to guess?
-
[CARLOTTA]: We have a "B".
-
[JOHN]: Uh, maybe.
-
So, in order to do this,
-
I mentioned the
-
the module that is going to help me
-
to divide my data into two things:
-
70 percent for the
-
the training and 30 percent for the
-
evaluation. So what did I use? I used
-
split data, because this is what is going
-
to split my data randomly into training
-
data and validation data. So the correct
-
answer is "B", and good job. Thank you
-
for participating.
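-
The random 70/30 split that the Split Data module performs can be approximated as follows. This is a sketch under stated assumptions, not the module's actual implementation: the function name, the seed value and the use of row numbers in place of real patient records are all made up for illustration.

```python
import random


def split_rows(rows, train_fraction=0.7, seed=123):
    """Shuffle the rows and split them into training and validation sets."""
    shuffled = list(rows)                  # copy, so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed -> repeatable split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# 100 made-up row ids standing in for patient records.
rows = list(range(100))
train_rows, validation_rows = split_rows(rows)

print(len(train_rows), len(validation_rows))  # 70 30
```

The model is trained only on the first part, and Score Model and Evaluate Model then use the second part, which the model has never seen.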
-
Next question, please.
-
[CARLOTTA]: Yes, "B" is the correct
-
answer, so thanks, John,
-
for explaining to us the correct
-
one.
-
And we want to go with question two?
-
[JOHN]: Yeah, so,
-
I'm going to ask you now,
-
Carlotta. You use Azure Machine Learning
-
designer to create a training pipeline
-
for your classification model.
-
What must you do before you deploy this
-
model as a service?
-
You have to do
-
something before
-
you deploy it.
-
What do you think is the correct answer?
-
Is it "A", "B", or "C"?
-
Share your thoughts with—
-
with us in the chat and
-
and I'm also going to give you some
-
minutes to think of it before I
-
tell you about it.
-
[CARLOTTA]: Yeah so let me go
-
through the possible
-
answers, right? So we have A: "Create an
-
inference pipeline from the training
-
pipeline";
-
B: we have "Add an Evaluate Model
-
module to the training pipeline"; and then
-
C: we have "Clone the training
-
pipeline with a different name".
-
So what do you think is the correct
-
answer? "A", "B", or "C"?
-
Also this time, I think it's something
-
we mentioned both in the decks and in
-
the demo right?
-
[JOHN]: Yes it is,
-
it's something that I have done
-
like two, like five minutes ago.
-
It's real-time, real-time.
-
[CARLOTTA]: Um,
-
yeah, so, think about...you need to deploy
-
the model as a service. So if I'm
-
going to deploy the model,
-
I cannot evaluate the model
-
after deploying it, right, because I
-
cannot go into production if I'm not
-
sure, I'm not satisfied with my model, and
-
I'm not sure that my model is performing
-
well.
-
So that's why I would go with,
-
um,
-
I would...exclude "B" from my
-
answer.
-
While
-
thinking about "C", uh, I don't see you—I
-
didn't see you, John, cloning the
-
training pipeline with a different name,
-
so I don't think this is the
-
right answer.
-
While I've seen you creating an
-
inference pipeline from the
-
training pipeline, and you just converted
-
it using a one-click button, right?
-
[JOHN]: Yeah, that's correct.
-
So this is the right answer.
-
Good job. So I created an inference
-
real-time pipeline, and it is done.
-
It finished—it finished, the job is
-
finished. So we can now deploy.
-
And...
-
Yeah [LAUGHS].
-
Exactly, like, on time.
-
Like, it finished two seconds...
-
three, four seconds ago [LAUGHS].
-
So, uh,
-
until, um...
-
This is my job review, so
-
this is the job details that I
-
have already submitted, it's just opening,
-
and once it opens...
-
um...
-
I don't know why it's so heavy
-
today, it's not like that usually.
-
[CARLOTTA]: Yeah, it's probably because
-
you are also
-
showing your screen on Teams,
-
so that's the bandwidth of your
-
connection.
-
[JOHN]: Let me do something here
-
because...yeah finally.
-
I can switch to my mobile internet if it
-
did it again. So I will click on "Deploy",
-
it's that simple. I'll just click on
-
"Deploy" and...
-
I am going to deploy a new real-time
-
endpoint.
-
So what I'm going to name it?
-
Description and the compute type.
-
Everything is already mentioned
-
for me here,
-
so I'm just gonna copy and paste it,
-
because we...we are running
-
out of time.
-
So it's an Azure Container Instance,
-
not Azure Kubernetes Service,
-
which is a containerization service also.
-
Both are for containerization, but this
-
gives you something, and this gives you
-
something else.
-
For the advanced options,
-
it doesn't say for us to do anything, so
-
we are just gonna click on "Deploy",
-
and now we can test our endpoint from
-
the endpoints that we can find here, so
-
it's in progress. If I go here
-
under the assets, I can find something
-
called "Endpoints", and I can find the
-
real-time ones and the batch endpoints.
-
And we have created a real-time endpoint,
-
so we are going to find it under this
-
title. So if I click on it, I should
-
be able to test it once it's ready.
-
It's still loading, but this is the
-
input, and this is the output that we
-
will get back, so if I click on "Test"...
-
and from here,
-
I will input some data to the
-
endpoint,
-
which are: the patient information; the
-
columns that we have already seen in our
-
data set; the patient ID; the pregnancies.
-
And of course, of course I'm not gonna
-
enter the label that I'm trying to
-
predict, so I'm not going to give him if
-
the patient is diabetic or not. This
-
endpoint is to tell me this.
-
The endpoint, or the URL,
-
is going to give me
-
back this information, whether someone
-
has diabetes, or he doesn't. So if I input
-
this data, I'm just going to copy it,
-
and go to my endpoint, and click on
-
"Test", I'm gonna get the result back,
-
which are the three columns that we have
-
defined inside our python script: the
-
patient ID, the diabetic prediction, and
-
the probability—the certainty of whether
-
someone is diabetic or not based on the...
-
uh...based on the prediction.
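-
Outside the studio's "Test" tab, the same endpoint could be called over HTTP. The sketch below only builds the request and does not send it. Everything specific in it is an assumption: the URL and key are placeholders, the input name "input1" and the payload layout depend on how the web service input is named in the pipeline, and the feature names and values (apart from the patient ID and pregnancies columns mentioned in the demo) are made up.

```python
import json
import urllib.request


def build_scoring_request(url, api_key, patient):
    """Build (but do not send) a POST request for a real-time endpoint.

    Assumed payload layout: {"Inputs": {"input1": [row, ...]}}.
    """
    payload = {"Inputs": {"input1": [patient]}, "GlobalParameters": {}}
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Made-up patient; note that the label being predicted is NOT included.
EXAMPLE_PATIENT = {
    "PatientID": 1882185,
    "Pregnancies": 9,
    "PlasmaGlucose": 104,
    "DiastolicBloodPressure": 51,
    "TricepsThickness": 7,
    "SerumInsulin": 24,
    "BMI": 27.4,
    "DiabetesPedigree": 1.35,
    "Age": 43,
}

request = build_scoring_request(
    "https://example.invalid/score",  # placeholder URL, not a real endpoint
    "YOUR-KEY",                       # placeholder key
    EXAMPLE_PATIENT,
)
print(request.get_method(), request.full_url)

# To actually call a deployed endpoint, with a real URL and key:
# with urllib.request.urlopen(request) as response:
#     print(json.loads(response.read()))
```

The response would carry the three columns selected by the Python script: the patient ID, the diabetes prediction and the probability.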
-
So that's it.
-
And, uh, I think that this is a really
-
simple step to do, you can do it on your
-
own, you can test it.
-
And I think that I have finished, so
-
thank you.
-
[CARLOTTA]: Uh, yes,
-
we are running out of time
-
I just wanted to thank you, John, for
-
this demo, for going through all these
-
steps to
-
um, create, train a classification model,
-
and also deploy it as a predictive
-
service. And I encourage you all to go
-
back to the learn module
-
and, um, deepen all these topics
-
at your own pace, and also maybe
-
uh do this demo on your own, on your
-
subscription, on your Azure for Students
-
subscription. Um...
-
And I would also like to recall that
-
this is part of a series of study
-
sessions of Cloud Skill Challenge study
-
sessions,
-
so you will have more in the...
-
in the following days, and this is for
-
you to prepare, let's say, to help you
-
in taking the Cloud Skills Challenge,
-
which collects
-
very interesting learn modules that you
-
can use to skill up on various topics,
-
and some of them are focused on AI and
-
ML. So if you are interested in these
-
topics, you can select these learn
-
modules.
-
So let me also copy
-
the link, the short link to the
-
challenge in the chat. Remember that
-
you have time until the 13th of
-
September to take the challenge. And also
-
remember that in October, on the 7th of
-
October, you have the—you can join the
-
student—the Student Developer Summit,
-
which is, uh, which will be a virtual or
-
in...for some cases a hybrid
-
event, so stay tuned, because you will
-
have some surprises in the following
-
days. And if you want to learn more about
-
this event you can check the Microsoft
-
Imagine Cup Twitter page and stay tuned.
-
So thank you everyone for joining
-
this session today, and thank you very
-
much, John, for co-hosting this
-
session with me. It was a pleasure.
-
[JOHN]: Thank you so much,
-
Carlotta, for having me
-
with you today, and thank you for
-
giving me this opportunity to
-
be with you here.
-
[CARLOTTA]: Great, thank you.
-
[JOHN]: Yeah, I hope that we
-
work again in the future.
-
[CARLOTTA]: Sure, I hope so as well.
-
Um, so, thank you everyone.
-
And have a nice rest of your day.
-
Bye-bye. Speak to you soon.
-
[JOHN]: Bye.