WEBVTT

00:00:01.920 --> 00:00:04.680
Great, so I think we can start since the

00:00:04.680 --> 00:00:07.859
meeting is recorded, so if everyone, uh

00:00:07.859 --> 00:00:11.160
jump-jumps in later can-can watch the

00:00:11.160 --> 00:00:12.420
recording.

00:00:12.420 --> 00:00:15.780
So, hi everyone and welcome to this

00:00:15.780 --> 00:00:18.000
um, Cloud Skills Challenge study session

00:00:18.000 --> 00:00:20.880
around "Create classification models

00:00:20.880 --> 00:00:24.000
with Azure Machine Learning designer."

00:00:24.000 --> 00:00:27.240
So today I'm thrilled to be here with

00:00:27.240 --> 00:00:29.820
John. Uh, John do you mind introduce briefly

00:00:29.820 --> 00:00:31.619
yourself?

00:00:31.619 --> 00:00:34.160
Uh, thank you Carlotta. Hello everyone.

00:00:34.160 --> 00:00:38.160
Welcome to our workshop today. I hope

00:00:38.160 --> 00:00:40.559
that you are all excited for it. I am

00:00:40.559 --> 00:00:43.140
John Aziz, a gold Microsoft Learn student

00:00:43.140 --> 00:00:47.460
ambassador and I will be here with, uh,

00:00:47.460 --> 00:00:50.760
Carlotta to, like, do the practical part

00:00:50.760 --> 00:00:53.820
about this module of the Cloud Skills

00:00:53.820 --> 00:00:57.000
Challenge. Thank you for having me.

00:00:57.000 --> 00:00:59.219
Perfect, thanks John. So for those who

00:00:59.219 --> 00:01:03.440
don't know me I'm Carlotta Castelluccio,

00:01:03.440 --> 00:01:06.479
based in Italy and focused on AI

00:01:06.479 --> 00:01:08.760
machine learning technologies and about

00:01:08.760 --> 00:01:11.200
their use in education.

00:01:11.200 --> 00:01:12.340
Um, so,

00:01:12.737 --> 00:01:14.537
um, this Cloud Skills Challenge study

00:01:14.537 --> 00:01:17.117
session is based on a learn module, a

00:01:17.120 --> 00:01:21.080
dedicated learn module. I sent to you, uh

00:01:21.320 --> 00:01:23.939
the link to this module, uh, in the chat

00:01:23.939 --> 00:01:25.619
in a way that you can follow along the

00:01:25.619 --> 00:01:28.680
module if you want, or just have a look at

00:01:28.680 --> 00:01:32.470
the module later at your own pace.

00:01:32.470 --> 00:01:33.780
Um...

00:01:33.780 --> 00:01:37.020
So, before starting I would also like to

00:01:37.020 --> 00:01:40.619
remind you, uh, of the code of

00:01:40.619 --> 00:01:43.439
conduct and guidelines of our student

00:01:43.439 --> 00:01:47.510
ambassadors community. So please during this

00:01:47.510 --> 00:01:51.000
meeting be respectful and inclusive and

00:01:51.000 --> 00:01:53.579
be friendly, open, and welcoming and

00:01:53.579 --> 00:01:56.159
respectful of other-each other

00:01:56.159 --> 00:01:57.720
differences.

00:01:57.720 --> 00:02:01.200
If you want to learn more about the code

00:02:01.200 --> 00:02:03.390
of conduct, you can use this link in the

00:02:03.390 --> 00:02:08.880
deck: aka.ms/SACoC.

00:02:09.660 --> 00:02:11.730
And now we are,

00:02:11.730 --> 00:02:15.420
um, we are ready to start our session.

00:02:15.420 --> 00:02:18.959
So, as we mentioned, we are going to

00:02:18.959 --> 00:02:21.980
focus on classification models and Azure ML,

00:02:21.980 --> 00:02:24.900
uh, today. So, first of all, we are going

00:02:24.900 --> 00:02:28.430
to, um, identify, uh, the kind of

00:02:28.430 --> 00:02:31.080
um, of scenarios in which you should

00:02:31.080 --> 00:02:34.490
choose to use a classification model.

00:02:34.490 --> 00:02:36.660
We're going to introduce Azure Machine

00:02:36.660 --> 00:02:39.060
Learning and Azure Machine Learning designer.

00:02:39.060 --> 00:02:41.879
We're going to understand, uh, which are

00:02:41.879 --> 00:02:43.680
the steps to follow, to create a

00:02:43.680 --> 00:02:46.200
classification model in Azure Machine

00:02:46.200 --> 00:02:48.076
Learning, and then John will,

00:02:48.076 --> 00:02:49.500
um,

00:02:49.500 --> 00:02:52.219
lead an amazing demo about training and

00:02:52.219 --> 00:02:54.300
publishing a classification model in

00:02:54.300 --> 00:02:57.000
Azure ML Designer.

00:02:57.000 --> 00:02:59.819
So, let's start from the beginning. Let's

00:02:59.819 --> 00:03:02.640
start from identifying classification

00:03:02.640 --> 00:03:05.220
machine learning scenarios.

00:03:05.220 --> 00:03:07.640
So, first of all, what is classification?

00:03:07.640 --> 00:03:09.959
Classification is a form of machine

00:03:09.959 --> 00:03:12.120
learning that is used to predict which

00:03:12.120 --> 00:03:15.599
category or class an item belongs to. For

00:03:15.599 --> 00:03:17.340
example, we might want to develop a

00:03:17.340 --> 00:03:19.800
classifier able to identify if an

00:03:19.800 --> 00:03:22.200
incoming email should be filtered or not

00:03:22.200 --> 00:03:25.080
according to the style, the sender, the

00:03:25.080 --> 00:03:28.140
length of the email, etc. In this case, the

00:03:28.140 --> 00:03:30.060
characteristics of the email are the

00:03:30.060 --> 00:03:31.080
features.

00:03:31.080 --> 00:03:34.200
And the label is a classification of

00:03:34.200 --> 00:03:38.099
either a zero or one, representing a spam

00:03:38.099 --> 00:03:40.860
or non-spam for the incoming email. So

00:03:40.860 --> 00:03:42.360
this is an example of a binary

00:03:42.360 --> 00:03:44.400
classifier. If you want to assign

00:03:44.400 --> 00:03:46.260
multiple categories to the incoming

00:03:46.260 --> 00:03:48.959
email like work letters, love letters,

00:03:48.959 --> 00:03:52.080
complaints, or other categories, in this

00:03:52.080 --> 00:03:54.000
case a binary classifier is no longer

00:03:54.000 --> 00:03:55.739
enough, and we should develop a

00:03:55.739 --> 00:03:58.319
multi-class classifier. So classification

00:03:58.319 --> 00:04:00.599
is an example of what is called

00:04:00.599 --> 00:04:02.519
supervised machine learning

00:04:02.519 --> 00:04:05.280
in which you train a model using data

00:04:05.280 --> 00:04:07.080
that includes both the features and

00:04:07.080 --> 00:04:08.879
known values for the label

00:04:08.879 --> 00:04:11.099
so that the model learns to fit the

00:04:11.099 --> 00:04:13.560
feature combinations to the label. Then,

00:04:13.560 --> 00:04:15.420
after training has been completed, you

00:04:15.420 --> 00:04:17.040
can use the trained model to predict

00:04:17.040 --> 00:04:19.500
labels for new items for-for which the

00:04:19.500 --> 00:04:22.320
label is unknown.
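
NOTE
Editor's sketch, not part of the spoken session: a minimal supervised binary classifier in plain Python, assuming made-up email features (link count, exclamation marks) and the hypothetical convention 1 = spam, 0 = not spam. The nearest-centroid rule is an illustrative stand-in chosen for brevity, not the algorithm used in the Learn module.
```python
import math
def train(features, labels):
    """Return one centroid (mean feature vector) per known label."""
    grouped = {}
    for row, label in zip(features, labels):
        grouped.setdefault(label, []).append(row)
    return {label: [sum(col) / len(rows) for col in zip(*rows)]
            for label, rows in grouped.items()}
def predict(centroids, row):
    """Predict the label whose centroid is closest to the new item."""
    return min(centroids, key=lambda label: math.dist(centroids[label], row))
# Hypothetical training data: [links, exclamation marks] per email.
X = [[8, 5], [7, 6], [0, 1], [1, 0]]
y = [1, 1, 0, 0]
model = train(X, y)
```
Training uses features together with known labels; predict(model, row) then assigns a label to an item whose label is unknown.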

00:04:22.320 --> 00:04:25.440
But let's see some examples of scenarios

00:04:25.440 --> 00:04:27.120
for classification machine learning

00:04:27.120 --> 00:04:29.160
models. So, we already mentioned an

00:04:29.160 --> 00:04:31.020
example of a solution in which we would

00:04:31.020 --> 00:04:33.660
need a classifier, but let's explore

00:04:33.660 --> 00:04:35.699
other scenarios for classification in

00:04:35.699 --> 00:04:37.979
other industries. For example, you can use

00:04:37.979 --> 00:04:40.380
a classification model for a health

00:04:40.380 --> 00:04:43.680
clinic scenario, and use clinical data to

00:04:43.680 --> 00:04:45.720
predict whether a patient will become sick

00:04:45.720 --> 00:04:47.060
or not.

00:04:47.060 --> 00:04:49.553
You can use, um...

00:04:49.553 --> 00:04:59.250
[NO AUDIO]

00:04:59.250 --> 00:05:00.930
Carlotta, you are muted.

00:05:03.780 --> 00:05:07.860
Oh, sorry. So, when I became muted, it's a

00:05:07.860 --> 00:05:11.940
long time, or? You can use-you can use, uh

00:05:11.940 --> 00:05:13.560
some models for classification. For

00:05:13.560 --> 00:05:16.919
example, you can use...You were saying this.

00:05:16.919 --> 00:05:21.660
Uh, so I was in this deck, or the previous one?

00:05:21.660 --> 00:05:24.180
This one, like you have been muted

00:05:24.180 --> 00:05:27.060
for, uh, one second [laughs]. Okay, okay perfect,

00:05:27.060 --> 00:05:30.419
perfect. Uh, yeah I was talking...sorry for

00:05:30.419 --> 00:05:33.278
that. So, I was talking about the possible

00:05:33.278 --> 00:05:34.560
scenarios in which you,

00:05:34.560 --> 00:05:37.320
you can use a classification model. Like

00:05:37.320 --> 00:05:39.660
health clinic scenario, financial scenario,

00:05:39.660 --> 00:05:41.699
or the third one is business type of

00:05:41.699 --> 00:05:44.100
scenario. You can use characteristics of

00:05:44.100 --> 00:05:45.900
a small business to predict if a new

00:05:45.900 --> 00:05:47.880
venture will succeed or not, for

00:05:47.880 --> 00:05:49.560
example. And these are all types of

00:05:49.560 --> 00:05:52.160
binary classification.

00:05:52.160 --> 00:05:55.199
Uh, but today we are also going to talk

00:05:55.199 --> 00:05:57.240
about Azure Machine Learning. So let's

00:05:57.240 --> 00:05:58.139
see.

00:05:58.139 --> 00:06:00.660
What is Azure Machine Learning? So

00:06:00.660 --> 00:06:02.160
training and deploying an effective

00:06:02.160 --> 00:06:04.199
machine learning model involves a lot of

00:06:04.199 --> 00:06:06.539
work, much of it time-consuming and

00:06:06.539 --> 00:06:08.880
resource intensive. So, Azure Machine

00:06:08.880 --> 00:06:11.039
Learning is a cloud-based service that

00:06:11.039 --> 00:06:12.780
helps simplify some of the tasks it

00:06:12.780 --> 00:06:15.720
takes to prepare data, train a model, and

00:06:15.720 --> 00:06:18.060
also deploy it as a predictive service.

00:06:18.060 --> 00:06:20.220
So it helps data scientists increase

00:06:20.220 --> 00:06:22.380
their efficiency by automating many of

00:06:22.380 --> 00:06:24.660
the time-consuming tasks associated with

00:06:24.660 --> 00:06:27.539
creating and training a model.

00:06:27.539 --> 00:06:29.520
And it enables them also to use

00:06:29.520 --> 00:06:31.740
cloud-based compute resources that scale

00:06:31.740 --> 00:06:33.720
effectively to handle large volumes of

00:06:33.720 --> 00:06:36.300
data while incurring costs only when

00:06:36.300 --> 00:06:38.699
actually used.

00:06:38.699 --> 00:06:41.220
To use Azure Machine Learning, you,

00:06:41.220 --> 00:06:43.199
first things first, you need to create a

00:06:43.199 --> 00:06:44.940
workspace resource in your Azure

00:06:44.940 --> 00:06:47.520
subscription, and you can then use this

00:06:47.520 --> 00:06:50.220
workspace to manage data, compute

00:06:50.220 --> 00:06:52.440
resources, code, models, and other

00:06:52.440 --> 00:06:55.139
artifacts. After you have created an

00:06:55.139 --> 00:06:56.819
Azure Machine Learning workspace, you can

00:06:56.819 --> 00:06:58.560
develop solutions with the Azure Machine

00:06:58.560 --> 00:07:00.840
Learning service, either with developer

00:07:00.840 --> 00:07:02.580
tools or the Azure Machine Learning

00:07:02.580 --> 00:07:04.380
studio web portal.

00:07:04.380 --> 00:07:06.360
In particular, Azure Machine Learning

00:07:06.360 --> 00:07:07.800
studio is a web portal for Machine

00:07:07.800 --> 00:07:09.720
Learning Solutions in Azure, and it

00:07:09.720 --> 00:07:11.639
includes a wide range of features and

00:07:11.639 --> 00:07:13.800
capabilities that help data scientists

00:07:13.800 --> 00:07:16.259
prepare data, train models, publish

00:07:16.259 --> 00:07:18.479
predictive services, and monitor also

00:07:18.479 --> 00:07:19.680
their usage.

00:07:19.680 --> 00:07:22.139
So to begin using the web portal, you

00:07:22.139 --> 00:07:23.880
need to assign the workspace you created

00:07:23.880 --> 00:07:26.819
in the Azure portal to the Azure Machine

00:07:26.819 --> 00:07:29.419
Learning studio.

00:07:29.520 --> 00:07:31.800
At its core, Azure Machine Learning is a

00:07:31.800 --> 00:07:33.720
service for training and managing

00:07:33.720 --> 00:07:36.000
machine learning models for which you

00:07:36.000 --> 00:07:38.220
need compute resources on which to run

00:07:38.220 --> 00:07:39.919
the training process.

00:07:39.919 --> 00:07:44.280
Compute targets are, um, one of the main

00:07:44.280 --> 00:07:46.740
basic concepts of Azure Machine Learning.

00:07:46.740 --> 00:07:48.780
They are cloud-based resources on which

00:07:48.780 --> 00:07:50.639
you can run model training and data

00:07:50.639 --> 00:07:53.220
exploration processes.

00:07:53.220 --> 00:07:54.780
So in Azure Machine Learning studio, you

00:07:54.780 --> 00:07:56.759
can manage the compute targets for your

00:07:56.759 --> 00:07:58.740
data science activities, and there are

00:07:58.740 --> 00:08:03.240
four kinds of compute targets you can

00:08:03.240 --> 00:08:05.940
create. We have the compute instances,

00:08:05.940 --> 00:08:09.539
which are virtual machines set up for

00:08:09.539 --> 00:08:10.979
running machine learning code during

00:08:10.979 --> 00:08:13.319
development, so they are not designed for

00:08:13.319 --> 00:08:14.460
production.

00:08:14.460 --> 00:08:17.099
Then we have compute clusters, which are

00:08:17.099 --> 00:08:19.800
a set of virtual machines that can scale

00:08:19.800 --> 00:08:22.199
up automatically based on traffic.

00:08:22.199 --> 00:08:24.599
We have inference clusters, which are

00:08:24.599 --> 00:08:26.699
similar to compute clusters, but they are

00:08:26.699 --> 00:08:29.340
designed for deployment, so they are

00:08:29.340 --> 00:08:31.979
deployment targets for predictive

00:08:31.979 --> 00:08:35.820
services that use trained models.

00:08:35.820 --> 00:08:38.339
And finally, we have attached compute,

00:08:38.339 --> 00:08:41.339
which are any compute target that you

00:08:41.339 --> 00:08:44.159
manage yourself outside of Azure ML, like,

00:08:44.159 --> 00:08:46.560
for example, virtual machines or Azure

00:08:46.560 --> 00:08:49.700
Databricks clusters.

00:08:49.980 --> 00:08:52.800
So we talked about Azure Machine

00:08:52.800 --> 00:08:54.300
Learning, but we also mentioned-

00:08:54.300 --> 00:08:55.500
mentioned Azure Machine Learning

00:08:55.500 --> 00:08:57.540
designer. What is Azure Machine Learning

00:08:57.540 --> 00:09:00.120
designer? So, in Azure Machine Learning

00:09:00.120 --> 00:09:02.880
Studio, there are several ways to author

00:09:02.880 --> 00:09:04.560
classification machine learning models.

00:09:04.560 --> 00:09:08.100
One way is to use a visual interface, and

00:09:08.100 --> 00:09:10.260
this visual interface is called designer,

00:09:10.260 --> 00:09:13.140
and you can use it to train, test, and

00:09:13.140 --> 00:09:15.540
also deploy machine learning models. And

00:09:15.540 --> 00:09:17.940
the drag-and-drop interface makes use of

00:09:17.940 --> 00:09:20.279
clearly defined inputs and outputs that

00:09:20.279 --> 00:09:22.680
can be shared, reused, and also version

00:09:22.680 --> 00:09:23.880
controlled.

00:09:23.880 --> 00:09:25.920
And using the designer, you can identify

00:09:25.920 --> 00:09:28.080
the building blocks or components needed

00:09:28.080 --> 00:09:30.839
for your model, place and connect them on

00:09:30.839 --> 00:09:33.120
your canvas, and run a machine learning

00:09:33.120 --> 00:09:35.300
job.

00:09:35.399 --> 00:09:36.779
So,

00:09:36.779 --> 00:09:39.120
each designer project, so each project

00:09:39.120 --> 00:09:42.360
in the designer is known as a pipeline.

00:09:42.360 --> 00:09:45.600
And in the designer, we have a left panel

00:09:45.600 --> 00:09:48.360
for navigation and a canvas on your

00:09:48.360 --> 00:09:50.640
right hand side in which you build your

00:09:50.640 --> 00:09:53.940
pipeline visually. So pipelines let you

00:09:53.940 --> 00:09:56.100
organize, manage, and reuse complex

00:09:56.100 --> 00:09:58.260
machine learning workflows across

00:09:58.260 --> 00:10:00.480
projects and users.

00:10:00.480 --> 00:10:03.000
A pipeline starts with the data set from

00:10:03.000 --> 00:10:04.140
which you want to train the model

00:10:04.140 --> 00:10:05.880
because it all begins with data when

00:10:05.880 --> 00:10:07.380
talking about data science and machine

00:10:07.380 --> 00:10:09.540
learning. And each time you run a

00:10:09.540 --> 00:10:10.980
pipeline, the configuration of the

00:10:10.980 --> 00:10:12.959
pipeline and its results are stored in

00:10:12.959 --> 00:10:17.339
your workspace as a pipeline job.

00:10:17.339 --> 00:10:21.959
So the second main concept of Azure

00:10:21.959 --> 00:10:25.080
Machine Learning is a component. So, going

00:10:25.080 --> 00:10:28.440
hierarchically from the pipeline, we can

00:10:28.440 --> 00:10:30.540
say that each building block of a

00:10:30.540 --> 00:10:32.920
pipeline is called a component.

00:10:32.920 --> 00:10:34.120
In other words, an Azure Machine

00:10:34.120 --> 00:10:36.959
Learning component encapsulates one step

00:10:36.959 --> 00:10:39.420
in a machine learning pipeline. So, it's a

00:10:39.420 --> 00:10:41.640
reusable piece of code with inputs and

00:10:41.640 --> 00:10:44.100
outputs, something very similar to a

00:10:44.100 --> 00:10:46.500
function in any programming language.

00:10:46.500 --> 00:10:48.899
And in a pipeline project, you can access

00:10:48.899 --> 00:10:51.480
data assets and components from the left

00:10:51.480 --> 00:10:52.700
panel's

00:10:52.700 --> 00:10:56.279
Asset Library tab, as you can see

00:10:56.279 --> 00:11:00.200
here in the screenshot in the deck.

00:11:00.300 --> 00:11:03.360
So you can create data assets using

00:11:03.360 --> 00:11:08.339
an ad hoc page called the Data page. And a data

00:11:08.339 --> 00:11:11.160
asset is a reference to a data source

00:11:11.160 --> 00:11:12.480
location.

00:11:12.480 --> 00:11:15.720
So this data source location could be a

00:11:15.720 --> 00:11:18.779
local file, a data store, a web file or

00:11:18.779 --> 00:11:21.660
even an Azure Open Dataset.

00:11:21.660 --> 00:11:23.880
And these data assets will appear along

00:11:23.880 --> 00:11:26.459
with standard sample datasets in the

00:11:26.459 --> 00:11:30.019
designer's Asset Library.

00:11:30.079 --> 00:11:31.560
Um.

00:11:31.560 --> 00:11:36.959
Another basic concept of Azure ML is

00:11:36.959 --> 00:11:38.880
Azure Machine Learning jobs.

00:11:38.880 --> 00:11:43.519
So, basically, when you submit a pipeline,

00:11:43.519 --> 00:11:47.040
you create a job which will run all the

00:11:47.040 --> 00:11:49.920
steps in your pipeline. So a job executes

00:11:49.920 --> 00:11:52.800
a task against a specified compute

00:11:52.800 --> 00:11:53.760
target.

00:11:53.760 --> 00:11:56.640
Jobs enable systematic tracking for your

00:11:56.640 --> 00:11:58.560
machine learning experimentation in

00:11:58.560 --> 00:11:59.880
Azure ML.

00:11:59.880 --> 00:12:02.399
And once a job is created, Azure ML

00:12:02.399 --> 00:12:05.459
maintains a run record, uh, for the

00:12:05.459 --> 00:12:07.640
job.

00:12:07.877 --> 00:12:12.180
Um, but, let's move to the classification

00:12:12.180 --> 00:12:14.040
steps. So,

00:12:14.040 --> 00:12:17.160
um, let's introduce how to create a

00:12:17.160 --> 00:12:21.360
classification model in Azure ML, but you

00:12:21.360 --> 00:12:23.640
will see it in more detail in a

00:12:23.640 --> 00:12:26.339
hands-on demo that John will guide

00:12:26.339 --> 00:12:29.459
through in a few minutes.

00:12:29.459 --> 00:12:32.220
So, you can think of the steps to train

00:12:32.220 --> 00:12:33.720
and evaluate a classification machine

00:12:33.720 --> 00:12:36.660
learning model as four main steps. So

00:12:36.660 --> 00:12:38.459
first of all, you need to prepare your

00:12:38.459 --> 00:12:41.100
data. So, you need to identify the

00:12:41.100 --> 00:12:43.139
features and the label in your data set,

00:12:43.139 --> 00:12:46.139
you need to pre-process, so you need to

00:12:46.139 --> 00:12:48.839
clean and transform the data as needed.

00:12:48.839 --> 00:12:51.120
Then, the second step, of course, is

00:12:51.120 --> 00:12:52.740
training the model.

00:12:52.740 --> 00:12:54.600
And for training the model, you need to

00:12:54.600 --> 00:12:57.060
split the data into two groups: a

00:12:57.060 --> 00:12:59.519
training and a validation set.

00:12:59.519 --> 00:13:01.320
Then you train a machine learning model

00:13:01.320 --> 00:13:03.540
using the training data set and you test

00:13:03.540 --> 00:13:05.040
the machine learning model for

00:13:05.040 --> 00:13:06.889
performance using the validation data

00:13:06.889 --> 00:13:08.100
set.
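
NOTE
Editor's sketch, not part of the spoken session: one way to hold back validation data in plain Python. The function name, the 70/30 fraction, and the seed are assumptions for illustration; the designer performs this step with its own component.
```python
import random
def split_data(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of rows and split it into (training, validation)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]
rows = list(range(10))  # stand-in for ten labeled records
train_rows, validation_rows = split_data(rows)
```
The model is fitted on train_rows only; validation_rows stay unseen until the model is tested.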

00:13:08.100 --> 00:13:12.180
The third step is performance evaluation,

00:13:12.180 --> 00:13:14.519
which means comparing how close the

00:13:14.519 --> 00:13:16.139
model's predictions are to the known

00:13:16.139 --> 00:13:20.519
labels, and this leads us to compute some

00:13:20.519 --> 00:13:23.279
evaluation performance metrics.

00:13:23.279 --> 00:13:25.740
And then finally...

00:13:25.740 --> 00:13:29.051
So, these three steps are not,

00:13:29.051 --> 00:13:32.770
um, not performed every time in a

00:13:32.770 --> 00:13:35.459
linear manner. It's more an iterative

00:13:35.459 --> 00:13:39.420
process. But once you obtain, you achieve

00:13:39.420 --> 00:13:42.959
a performance with which you are

00:13:42.959 --> 00:13:45.779
satisfied, so you are ready to, let's say

00:13:45.779 --> 00:13:48.660
go into production, and you can deploy

00:13:48.660 --> 00:13:51.920
your trained model as a predictive service

00:13:51.920 --> 00:13:55.980
into a real-time, uh, to a real-time

00:13:55.980 --> 00:13:58.019
endpoint. And to do so, you need to

00:13:58.019 --> 00:14:00.240
convert the training pipeline into a

00:14:00.240 --> 00:14:02.820
real-time inference pipeline, and then

00:14:02.820 --> 00:14:04.260
you can deploy the model as an

00:14:04.260 --> 00:14:06.779
application on a server or device so

00:14:06.779 --> 00:14:11.420
that others can consume this model.

00:14:11.459 --> 00:14:14.279
So let's start with the first step, which

00:14:14.279 --> 00:14:17.700
is prepare data. Real-world data can contain

00:14:17.700 --> 00:14:19.920
many different issues that can affect

00:14:19.920 --> 00:14:22.320
the utility of the data and our

00:14:22.320 --> 00:14:24.959
interpretation of the results. So also

00:14:24.959 --> 00:14:26.579
the machine learning model that you

00:14:26.579 --> 00:14:29.279
train using this data. For example, real-

00:14:29.279 --> 00:14:31.440
world data can be affected by a bad

00:14:31.440 --> 00:14:34.079
recording or a bad measurement, and it

00:14:34.079 --> 00:14:36.480
can also contain missing values for some

00:14:36.480 --> 00:14:38.880
parameters. And Azure Machine Learning

00:14:38.880 --> 00:14:40.860
designer has several pre-built

00:14:40.860 --> 00:14:43.019
components that can be used to prepare

00:14:43.019 --> 00:14:46.079
data for training. These components

00:14:46.079 --> 00:14:48.300
enable you to clean data, normalize

00:14:48.300 --> 00:14:52.940
features, join tables, and more.

00:14:53.000 --> 00:14:57.120
Let's come to training. So, to train a

00:14:57.120 --> 00:14:59.220
classification model you need a data set

00:14:59.220 --> 00:15:02.160
that includes historical features, so the

00:15:02.160 --> 00:15:03.899
characteristics of the entity for which

00:15:03.899 --> 00:15:06.899
we want to make a prediction, and known label

00:15:06.899 --> 00:15:09.779
values. The label is the class indicator

00:15:09.779 --> 00:15:11.820
we want to train a model to predict.

00:15:11.820 --> 00:15:13.920
And it's common practice to train a

00:15:13.920 --> 00:15:16.199
model using a subset of the data while

00:15:16.199 --> 00:15:18.300
holding back some data with which to

00:15:18.300 --> 00:15:20.760
test the trained model. And this enables

00:15:20.760 --> 00:15:22.440
you to compare the labels that the model

00:15:22.440 --> 00:15:25.380
predicts with the actual known labels in

00:15:25.380 --> 00:15:27.420
the original data set.

00:15:27.420 --> 00:15:29.880
This operation can be performed in the

00:15:29.880 --> 00:15:32.100
designer using the Split Data component

00:15:32.100 --> 00:15:34.740
as shown by the screenshot here in the...

00:15:34.740 --> 00:15:36.660
in the deck.

00:15:36.660 --> 00:15:39.540
There's also another component that you

00:15:39.540 --> 00:15:40.980
should use, which is the Score Model

00:15:40.980 --> 00:15:43.139
component to generate the predicted

00:15:43.139 --> 00:15:45.360
class label value using the validation

00:15:45.360 --> 00:15:48.060
data as input. So once you connect all

00:15:48.060 --> 00:15:49.800
these components,

00:15:49.800 --> 00:15:52.440
the component specifying the

00:15:52.440 --> 00:15:54.959
model we are going to use, the Split Data

00:15:54.959 --> 00:15:57.060
component, the Train Model component,

00:15:57.060 --> 00:16:00.300
and the Score Model component, you want

00:16:00.300 --> 00:16:02.639
to run a new experiment in

00:16:02.639 --> 00:16:05.760
Azure ML, which will use the data set

00:16:05.760 --> 00:16:09.600
on the canvas to train and score a model.

00:16:09.600 --> 00:16:12.000
After training a model, it is important,

00:16:12.000 --> 00:16:14.639
we say, to evaluate its performance, to

00:16:14.639 --> 00:16:17.060
understand how bad-how good sorry

00:16:17.060 --> 00:16:20.760
our model is performing.

00:16:20.760 --> 00:16:22.680
And there are many performance metrics

00:16:22.680 --> 00:16:24.600
and methodologies for evaluating how

00:16:24.600 --> 00:16:27.000
well a model makes predictions. The

00:16:27.000 --> 00:16:29.160
component to use to perform evaluation

00:16:29.160 --> 00:16:32.220
in Azure ML designer is called, as

00:16:32.220 --> 00:16:35.060
intuitive as it is, Evaluate Model.

00:16:35.060 --> 00:16:38.339
Once the job of training and evaluation

00:16:38.339 --> 00:16:40.740
of the model is completed, you can review

00:16:40.740 --> 00:16:42.959
evaluation metrics on the completed job

00:16:42.959 --> 00:16:45.860
page by right clicking on the component.

00:16:45.860 --> 00:16:48.480
In the evaluation results, you can also

00:16:48.480 --> 00:16:51.000
find the so-called confusion matrix that

00:16:51.000 --> 00:16:53.399
you can see here in the right side of

00:16:53.399 --> 00:16:55.079
this deck.

00:16:55.079 --> 00:16:57.420
A confusion matrix shows cases where

00:16:57.420 --> 00:16:59.220
both the predicted and actual values

00:16:59.220 --> 00:17:01.980
were one, the so-called true positives

00:17:01.980 --> 00:17:04.500
at the top left and also cases where

00:17:04.500 --> 00:17:06.600
both the predicted and the actual values

00:17:06.600 --> 00:17:08.459
were zero, the so-called true negatives

00:17:08.459 --> 00:17:10.919
at the bottom right. While the other

00:17:10.919 --> 00:17:13.679
cells show cases where the predicted

00:17:13.679 --> 00:17:15.380
and actual values differ,

00:17:15.380 --> 00:17:17.939
called false positives and false

00:17:17.939 --> 00:17:19.919
negatives, and this is an example of a

00:17:19.919 --> 00:17:23.579
confusion matrix for a binary classifier.

00:17:23.579 --> 00:17:25.559
While for a multi-class classification

00:17:25.559 --> 00:17:28.079
model the same approach is used to

00:17:28.079 --> 00:17:30.120
tabulate each possible combination of

00:17:30.120 --> 00:17:32.940
actual and predicted value counts. So

00:17:32.940 --> 00:17:34.740
for example, a model with three possible

00:17:34.740 --> 00:17:37.559
classes would result in a three-by-three

00:17:37.559 --> 00:17:39.120
matrix.

00:17:39.120 --> 00:17:41.880
The confusion matrix is also useful for

00:17:41.880 --> 00:17:43.860
the metrics that can be derived from it,

00:17:43.860 --> 00:17:48.260
like accuracy, recall, or precision.
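
NOTE
Editor's sketch, not part of the spoken session: how the four cells of a binary confusion matrix and the derived metrics can be computed in plain Python. The label lists are hypothetical, and 1 is assumed to be the positive class.
```python
def confusion_counts(actual, predicted):
    """Count TP, FP, FN, TN for binary labels where 1 is the positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn
def metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, and recall from the four counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall
# Hypothetical known labels and model predictions for eight items.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(actual, predicted)
accuracy, precision, recall = metrics(*counts)
```
Accuracy is the share of all predictions that were correct, precision is the share of predicted positives that were truly positive, and recall is the share of actual positives the model found.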

00:17:49.320 --> 00:17:52.080
We say that the last step is

00:17:52.080 --> 00:17:55.620
deploying the trained model to a real-time

00:17:55.620 --> 00:17:59.280
endpoint as a predictive service. And in

00:17:59.280 --> 00:18:00.900
order to automate your model into a

00:18:00.900 --> 00:18:02.760
service that makes continuous

00:18:02.760 --> 00:18:04.980
predictions, you need, first of all, to

00:18:04.980 --> 00:18:08.039
create and then deploy an

00:18:08.039 --> 00:18:10.080
inference pipeline. The process of

00:18:10.080 --> 00:18:11.940
converting the training pipeline into a

00:18:11.940 --> 00:18:13.980
real-time inference pipeline removes

00:18:13.980 --> 00:18:16.260
training components and adds web service

00:18:16.260 --> 00:18:18.960
inputs and outputs to handle requests.

00:18:18.960 --> 00:18:21.240
And the inference pipeline performs the

00:18:21.240 --> 00:18:22.679
same data transformations as the

00:18:22.679 --> 00:18:26.160
first pipeline, but for new data. Then it

00:18:26.160 --> 00:18:28.679
uses the trained model to infer or predict

00:18:28.679 --> 00:18:32.539
label values based on its features.
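
NOTE
Editor's sketch, not part of the spoken session: why an inference pipeline reuses the training-time transformations. Min-max scaling is an assumed example transformation, and all names and values are hypothetical; the designer handles this internally when it converts the pipeline.
```python
def fit_min_max(rows):
    """Learn per-column (min, max) from the training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]
def apply_min_max(row, params):
    """Scale one row with the parameters learned during training."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, params)]
training_rows = [[20, 100], [40, 300], [60, 200]]  # hypothetical features
params = fit_min_max(training_rows)  # computed once, at training time
new_row = apply_min_max([30, 250], params)  # reused unchanged at inference time
```
Scaling new data with parameters fitted on the training data keeps the inputs on the scale the trained model expects.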

00:18:32.820 --> 00:18:36.120
So, I think I've talked a lot for now.

00:18:36.120 --> 00:18:40.380
I would like to let John show us

00:18:40.380 --> 00:18:44.340
something in practice with

00:18:44.340 --> 00:18:47.280
the hands-on demo, so please, John, go

00:18:47.280 --> 00:18:49.860
ahead, share your screen and guide us

00:18:49.860 --> 00:18:52.380
through this demo of creating a

00:18:52.380 --> 00:18:53.760
classification model with the Azure Machine

00:18:53.760 --> 00:18:55.860
Learning designer.

00:18:55.860 --> 00:18:58.919
Thank you so much Carlotta for this

00:18:58.919 --> 00:19:01.380
interesting explanation of the Azure ML

00:19:01.380 --> 00:19:03.810
designer. And now,

00:19:03.810 --> 00:19:07.500
um, I'm going to start with you in the

00:19:07.500 --> 00:19:10.200
practical demo part, so if you want to

00:19:10.200 --> 00:19:13.320
follow along, go to the link that Carlotta

00:19:13.320 --> 00:19:18.380
sent in the chat so you can do

00:19:18.380 --> 00:19:21.840
the demo or the practical part with me.

00:19:21.840 --> 00:19:25.260
I'm just going to share my screen...

00:19:25.260 --> 00:19:27.140
and...

00:19:27.140 --> 00:19:31.559
...go here. So, uh...

00:19:31.559 --> 00:19:34.320
Where am I right now? I'm inside the

00:19:34.320 --> 00:19:36.960
Microsoft Learn documentation. This is

00:19:36.960 --> 00:19:40.260
the exercise part of this module, and we

00:19:40.260 --> 00:19:43.080
will start by setting two things, which

00:19:43.080 --> 00:19:45.299
are a prerequisite for us to work inside

00:19:45.299 --> 00:19:49.919
this module, which are the resource group

00:19:49.919 --> 00:19:52.400
and the Azure Machine Learning workspace,

00:19:52.400 --> 00:19:55.620
and something extra which is the compute

00:19:55.620 --> 00:19:59.760
cluster that Carlotta talked about. So I

00:19:59.760 --> 00:20:02.100
just want to make sure that you all have

00:20:02.100 --> 00:20:05.660
a resource group created inside your

00:20:05.660 --> 00:20:08.039
portal inside your Microsoft Azure

00:20:08.039 --> 00:20:11.100
platform. So this is my resource group.

00:20:11.100 --> 00:20:14.640
Inside this resource group, I

00:20:14.640 --> 00:20:17.299
have created an Azure Machine Learning

00:20:17.299 --> 00:20:21.539
workspace. So I'm just going to access

00:20:21.539 --> 00:20:24.000
the workspace that I have created

00:20:24.000 --> 00:20:27.000
already from this link. I am going to

00:20:27.000 --> 00:20:30.240
open it, which is the studio web URL, and

00:20:30.240 --> 00:20:33.000
I will follow the steps. So what is this?

00:20:33.000 --> 00:20:35.760
This is your machine learning workspace,

00:20:35.760 --> 00:20:38.220
or machine learning studio. You can do a

00:20:38.220 --> 00:20:40.080
lot of things here, but we are going to

00:20:40.080 --> 00:20:42.419
focus mainly on the designer and the

00:20:42.419 --> 00:20:46.080
data and the compute. So another

00:20:46.080 --> 00:20:49.140
prerequisite here, as Carlotta told you,

00:20:49.140 --> 00:20:51.480
we need some resources to power up the

00:20:51.480 --> 00:20:54.299
classification, the processes that

00:20:54.299 --> 00:20:55.140
will happen.

00:20:55.140 --> 00:20:58.080
So, we have created this compute

00:20:58.080 --> 00:20:59.100
cluster,

00:20:59.100 --> 00:21:02.880
and we have set some presets for

00:21:02.880 --> 00:21:04.140
it. So

00:21:04.140 --> 00:21:07.080
where can you find this preset? You go

00:21:07.080 --> 00:21:10.200
here. Under the create compute, you'll

00:21:10.200 --> 00:21:13.220
find everything that you need to do. So

00:21:13.220 --> 00:21:16.740
the size is the Standard DS11 Version 2,

00:21:16.740 --> 00:21:19.799
and it's a CPU, not a GPU, because we don't

00:21:19.799 --> 00:21:22.500
know the GPU, and we don't need a GPU.

00:21:22.500 --> 00:21:25.799
Uh, it is ready for us to use.

00:21:25.799 --> 00:21:30.900
The next thing which we will look into

00:21:30.900 --> 00:21:33.600
is the designer. How can you access the

00:21:33.600 --> 00:21:35.100
designer?

00:21:35.100 --> 00:21:37.679
You can either click on this icon or

00:21:37.679 --> 00:21:40.020
click on the navigation menu and click

00:21:40.020 --> 00:21:42.299
on the designer from there.

00:21:42.900 --> 00:21:45.780
Now I am inside my designer.

00:21:45.780 --> 00:21:47.640
What we are going to do now is the

00:21:47.640 --> 00:21:50.280
pipeline that Carlotta told you about.

00:21:50.280 --> 00:21:54.360
And from where can I know these steps? If

00:21:54.360 --> 00:21:57.120
you follow along in the learn module, you

00:21:57.120 --> 00:21:58.740
will find everything that I'm doing

00:21:58.740 --> 00:22:02.340
right now in detail, with screenshots

00:22:02.340 --> 00:22:05.820
of course. So I'm going to create a new

00:22:05.820 --> 00:22:09.120
pipeline, and I can do so by clicking on

00:22:09.120 --> 00:22:10.980
this plus button.

00:22:10.980 --> 00:22:13.740
It's going to redirect me to the

00:22:13.740 --> 00:22:17.100
designer authoring the pipeline, uh, where

00:22:17.100 --> 00:22:19.500
I can drag and drop data and components

00:22:19.500 --> 00:22:21.780
that Carlotta told you the difference

00:22:21.780 --> 00:22:22.980
between.

00:22:22.980 --> 00:22:26.340
And here I am going to do some changes

00:22:26.340 --> 00:22:29.100
to the settings. I am going to connect

00:22:29.100 --> 00:22:31.860
this with my compute cluster that I

00:22:31.860 --> 00:22:35.120
created previously so I can utilize it.

00:22:35.120 --> 00:22:38.100
From here I'm going to choose this

00:22:38.100 --> 00:22:40.380
compute cluster demo that I have shown

00:22:40.380 --> 00:22:42.600
you before in the clusters here,

00:22:42.600 --> 00:22:45.900
and I am going to change the name to

00:22:45.900 --> 00:22:47.820
something more meaningful. Instead of

00:22:47.820 --> 00:22:50.580
Pipeline and today's date, I'm going

00:22:50.580 --> 00:22:53.760
to name it Diabetes...

00:22:53.760 --> 00:22:56.120
uh...

00:22:56.120 --> 00:23:00.020
let's just check this training.

00:23:00.020 --> 00:23:05.100
Let's say Training 0.1 or 01, okay?

00:23:05.100 --> 00:23:09.360
And I am going to close this tab in

00:23:09.360 --> 00:23:12.000
order to have a bigger place to work

00:23:12.000 --> 00:23:14.700
inside because this is where we will

00:23:14.700 --> 00:23:17.220
work, where everything will happen. So I

00:23:17.220 --> 00:23:19.559
will click on close from here,

00:23:19.559 --> 00:23:23.460
and I will go to the data and I will

00:23:23.460 --> 00:23:25.620
create a new data set.

00:23:25.620 --> 00:23:27.900
How can I create a new data set? There are

00:23:27.900 --> 00:23:29.880
multiple options here you can find, from

00:23:29.880 --> 00:23:31.799
local files, from data store, from web

00:23:31.799 --> 00:23:34.020
files, from open data set, but I'm going

00:23:34.020 --> 00:23:36.539
to choose from web files, as this is the

00:23:36.539 --> 00:23:40.280
way we're going to create our data.

00:23:40.280 --> 00:23:43.380
From here, the information of my data set

00:23:43.380 --> 00:23:47.340
I'm going to get them from the Microsoft

00:23:47.340 --> 00:23:50.820
Learn module. So if we go to the step

00:23:50.820 --> 00:23:52.860
that says "Create a dataset",

00:23:52.860 --> 00:23:55.020
under it, it illustrates that you can

00:23:55.020 --> 00:23:57.720
access the data from inside the asset

00:23:57.720 --> 00:23:59.760
library, and inside your asset library,

00:23:59.760 --> 00:24:01.679
you'll find the data and find the

00:24:01.679 --> 00:24:05.539
component. And I'm going to select

00:24:05.539 --> 00:24:09.000
this link because this is where my data

00:24:09.000 --> 00:24:12.000
is stored. If you open this link, you will

00:24:12.000 --> 00:24:14.820
find this is a CSV file, I think.

00:24:14.820 --> 00:24:17.400
Yeah. And you can...like, all the data are

00:24:17.400 --> 00:24:18.360
here.

00:24:18.360 --> 00:24:21.079
Now let's get back..

00:24:21.079 --> 00:24:22.149
Um...

00:24:26.880 --> 00:24:28.200
And you are going to do something

00:24:28.200 --> 00:24:29.880
meaningful, but because I have already

00:24:29.880 --> 00:24:31.820
created it before twice, so I'm gonna

00:24:31.820 --> 00:24:34.980
add a number to the name

00:24:34.980 --> 00:24:37.559
The data set is tabular and there is

00:24:37.559 --> 00:24:39.360
the file, but this is a table, so we're

00:24:39.360 --> 00:24:40.760
going to choose the table.

00:24:40.760 --> 00:24:42.240
Data type

00:24:42.240 --> 00:24:43.740
for data set type.

00:24:43.740 --> 00:24:46.260
Now we will click on "Next". That's gonna

00:24:46.260 --> 00:24:51.179
review, or display for you the content

00:24:51.179 --> 00:24:54.020
of this file that you have

00:24:54.020 --> 00:24:57.419
imported to this workspace.

00:24:57.419 --> 00:25:01.559
And for these settings, these are

00:25:01.559 --> 00:25:03.720
related to our file format.

00:25:03.720 --> 00:25:08.280
So this is a delimited file, and it's not

00:25:08.280 --> 00:25:11.400
plain text, it's not JSON. The delimiter

00:25:11.400 --> 00:25:14.159
is comma, as we have seen that they

00:25:14.159 --> 00:25:26.700
[INDISTINGUISHABLE]

00:25:26.700 --> 00:25:29.039
So I'm choosing comma

00:25:29.039 --> 00:25:32.900
errors because the only the first five...

00:25:32.900 --> 00:25:34.880
[INDISTINGUISHABLE]

00:25:34.880 --> 00:25:38.159
...for example. Okay, uh, if you have any

00:25:38.159 --> 00:25:39.960
doubts, if you have any problems, please

00:25:39.960 --> 00:25:42.960
don't hesitate to write me

00:25:42.960 --> 00:25:45.020
in the chat,

00:25:45.020 --> 00:25:48.480
like, what is blocking you, and

00:25:48.480 --> 00:25:50.940
me and Carlotta will try to help you,

00:25:50.940 --> 00:25:53.220
like whenever possible.

00:25:53.220 --> 00:25:55.659
And now this is the new preview for my

00:25:55.659 --> 00:25:57.840
data set. I can see that I have an ID, I

00:25:57.840 --> 00:25:59.700
have patient ID, I have pregnancies, I

00:25:59.700 --> 00:26:02.220
have the age of the people,

00:26:02.220 --> 00:26:05.720
I have the body mass, I think

00:26:05.720 --> 00:26:08.460
whether they have diabetes or not, as a

00:26:08.460 --> 00:26:10.679
zero and one. Zero indicates a negative,

00:26:10.679 --> 00:26:14.159
the person doesn't have diabetes, and one

00:26:14.159 --> 00:26:16.080
indicates a positive, that this person

00:26:16.080 --> 00:26:18.299
has diabetes. Okay.

00:26:18.299 --> 00:26:20.520
Now I'm going to click on "Next". Here I am

00:26:20.520 --> 00:26:23.400
defining my schema. All the data types

00:26:23.400 --> 00:26:25.380
inside my columns, the column names, which

00:26:25.380 --> 00:26:28.760
columns to include, which to exclude. And

00:26:28.760 --> 00:26:31.500
here we will include everything except

00:26:31.500 --> 00:26:35.580
the Path column. And we are

00:26:35.580 --> 00:26:37.860
going to review the data types of each

00:26:37.860 --> 00:26:40.440
column. So let's review this first one.

00:26:40.440 --> 00:26:43.320
This is numbers, numbers, numbers, then it's the

00:26:43.320 --> 00:26:45.779
integer. And this is,

00:26:45.779 --> 00:26:48.679
um, like decimal..

00:26:48.679 --> 00:26:50.900
...dotted...

00:26:50.900 --> 00:26:53.580
decimal number. So we are going to choose

00:26:53.580 --> 00:26:55.020
this data type.

00:26:55.020 --> 00:26:57.200
And for this one

00:26:57.200 --> 00:27:01.200
it says diabetic, and it's a zero or

00:27:01.200 --> 00:27:02.460
one, and we are going to make it as

00:27:02.460 --> 00:27:04.460
integers.
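
NOTE A minimal sketch, not part of the session: the wizard steps above (comma delimiter, header row, a data type per column) can be mimicked in plain Python. The sample rows, the reduced set of column names and the helper load_rows are hypothetical and only illustrate the idea.
```python
import csv
import io
# Hypothetical sample: column names and values are illustrative, not the real file.
SAMPLE_CSV = """PatientID,Pregnancies,BMI,Age,Diabetic
1354778,0,43.51,21,0
1147438,8,21.24,23,0
1640031,7,41.51,46,1
"""
# Schema as chosen in the wizard: integers, one decimal column, and an integer label.
SCHEMA = {"PatientID": int, "Pregnancies": int, "BMI": float, "Age": int, "Diabetic": int}
def load_rows(text, schema):
    """Parse comma-delimited text with a header row and cast each column."""
    reader = csv.DictReader(io.StringIO(text), delimiter=",")
    return [{name: cast(row[name]) for name, cast in schema.items()} for row in reader]
rows = load_rows(SAMPLE_CSV, SCHEMA)
print(rows[0])
```
Any column left out of the schema dictionary is simply dropped, which loosely mirrors excluding a column in the schema step.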

00:27:04.460 --> 00:27:07.980
Now we are going to click on "Next" and

00:27:07.980 --> 00:27:09.780
move to reviewing everything. This is

00:27:09.780 --> 00:27:11.569
everything that we have defined together.

00:27:11.569 --> 00:27:13.500
I will click on "Create".

00:27:13.500 --> 00:27:15.179
And...

00:27:15.179 --> 00:27:17.940
now the first step has ended. We have

00:27:17.940 --> 00:27:19.919
gotten our data ready.

00:27:19.919 --> 00:27:22.440
Now...what now? We're going to utilize the

00:27:22.440 --> 00:27:23.468
designer...

00:27:23.468 --> 00:27:26.820
um...power. We're going to drag and drop

00:27:26.820 --> 00:27:29.820
our data set to create the pipeline.

00:27:29.820 --> 00:27:33.179
So I have clicked on it and dragged it

00:27:33.179 --> 00:27:35.640
to this space. It's gonna appear to you.

00:27:35.640 --> 00:27:39.659
And we can inspect it by right clicking and

00:27:39.659 --> 00:27:42.179
choosing "Preview data"

00:27:42.179 --> 00:27:46.200
to see what we have created together.

00:27:46.200 --> 00:27:48.900
From here, you can see everything that we

00:27:48.900 --> 00:27:50.700
have seen previously, but in more

00:27:50.700 --> 00:27:53.100
detail. And we are just going to close

00:27:53.100 --> 00:27:56.580
this. Now what? Now we are gonna do the

00:27:56.580 --> 00:28:00.799
processing that Carlotta mentioned.

00:28:00.799 --> 00:28:03.659
These are some instructions about the

00:28:03.659 --> 00:28:05.460
data, about how you can look at them, how you

00:28:05.460 --> 00:28:07.140
can open them but we are going to move

00:28:07.140 --> 00:28:09.720
to the transformation or the processing.

00:28:09.720 --> 00:28:13.500
So as Carlotta told you, like any data

00:28:13.500 --> 00:28:15.480
for us to work on we have to do some

00:28:15.480 --> 00:28:17.299
processing to it

00:28:17.299 --> 00:28:20.159
to make it easier for the model to

00:28:20.159 --> 00:28:23.279
be trained and easier to work with. So, uh,

00:28:23.279 --> 00:28:25.860
we're gonna do the normalization. And

00:28:25.860 --> 00:28:29.159
normalization means, uh,

00:28:29.159 --> 00:28:33.539
to scale our data, either down or up, but

00:28:33.539 --> 00:28:35.400
we're going to scale them down,

00:28:35.400 --> 00:28:38.820
and we are going to decrease, uh,

00:28:38.820 --> 00:28:40.799
relatively decrease

00:28:40.799 --> 00:28:44.640
the values, all the values, to work

00:28:44.640 --> 00:28:48.120
with lower numbers. And if we are working

00:28:48.120 --> 00:28:49.559
with larger numbers, it's going to take

00:28:49.559 --> 00:28:52.500
more time. If we're working with smaller

00:28:52.500 --> 00:28:54.779
numbers, it's going to take less time to

00:28:54.779 --> 00:28:59.159
calculate them, and that's it. So

00:28:59.159 --> 00:29:02.159
where can I find Normalize Data? I

00:29:02.159 --> 00:29:04.260
can find it inside my component.

00:29:04.260 --> 00:29:06.720
So I will choose the component and

00:29:06.720 --> 00:29:09.659
search for "Normalize Data".

00:29:09.659 --> 00:29:12.360
I will drag and drop it as usual and I

00:29:12.360 --> 00:29:14.820
will connect between these two things

00:29:14.820 --> 00:29:18.360
by clicking on this spot, this, uh,

00:29:18.360 --> 00:29:20.159
circle, and

00:29:20.159 --> 00:29:23.159
drag and drop onto the next circle.

00:29:23.159 --> 00:29:24.899
Now we are going to define our

00:29:24.899 --> 00:29:27.419
normalization method.

00:29:27.419 --> 00:29:31.080
So I'm going to double click on the

00:29:31.080 --> 00:29:32.640
normalized data.

00:29:32.640 --> 00:29:34.860
It's going to open the settings for the

00:29:34.860 --> 00:29:36.480
normalization

00:29:36.480 --> 00:29:38.820
and the transformation method, which is

00:29:38.820 --> 00:29:40.500
the mathematical way

00:29:40.500 --> 00:29:42.299
that our data is going to be scaled

00:29:42.299 --> 00:29:44.520
according to.

00:29:44.520 --> 00:29:47.760
We're going to choose min-max, and for

00:29:47.760 --> 00:29:51.539
this one, we are going to choose "Use Zero",

00:29:51.539 --> 00:29:53.100
for constant column we are going to

00:29:53.100 --> 00:29:54.480
choose "True",

00:29:54.480 --> 00:29:56.880
and we are going to define which columns

00:29:56.880 --> 00:29:58.860
to normalize. So we are not going to

00:29:58.860 --> 00:30:01.080
normalize the whole data set. We are

00:30:01.080 --> 00:30:02.760
going to choose a subset from the data

00:30:02.760 --> 00:30:04.559
set to normalize. So we're going to

00:30:04.559 --> 00:30:07.020
choose everything except for the patient

00:30:07.020 --> 00:30:09.000
ID and the diabetic, because the patient

00:30:09.000 --> 00:30:10.919
ID is a number, but it's categorical

00:30:10.919 --> 00:30:13.740
data. It describes a patient, it's not a

00:30:13.740 --> 00:30:17.460
number that I can sum. I can't say "patient

00:30:17.460 --> 00:30:20.159
ID number one plus patient ID number two".

00:30:20.159 --> 00:30:21.720
No, this is a patient and another

00:30:21.720 --> 00:30:23.399
patient, it's not a number that I can do

00:30:23.399 --> 00:30:25.740
mathematical operations on, so I'm not

00:30:25.740 --> 00:30:28.200
going to choose it. So we will choose

00:30:28.200 --> 00:30:30.539
everything as I said, except for the

00:30:30.539 --> 00:30:33.480
diabetic and the patient ID. I will

00:30:33.480 --> 00:30:34.860
click on "Save".

00:30:34.860 --> 00:30:37.740
And it's not showing me a warning anymore,

00:30:37.740 --> 00:30:39.480
everything is good.

00:30:39.480 --> 00:30:41.880
Now I can click on "Submit"

00:30:41.880 --> 00:30:46.799
and review my normalization output.

00:30:46.799 --> 00:30:48.240
Um.

00:30:48.240 --> 00:30:51.659
So, if you click on "Submit" here,

00:30:51.659 --> 00:30:54.659
you will choose "Create new" and

00:30:54.659 --> 00:30:56.460
set the name that is mentioned here

00:30:56.460 --> 00:30:59.899
inside the notebook. So it tells you

00:30:59.899 --> 00:31:03.419
to create a job and name it, name

00:31:03.419 --> 00:31:05.460
the experiment "MS Learn Diabetes

00:31:05.460 --> 00:31:06.720
Training", because you will continue

00:31:06.720 --> 00:31:10.160
working on it and building components later.

00:31:10.160 --> 00:31:13.020
I have it already created, I am the, uh,

00:31:13.020 --> 00:31:16.919
we can review it together. So let

00:31:16.919 --> 00:31:19.860
me just open this in another tab. I think

00:31:19.860 --> 00:31:21.000
I have it...

00:31:21.000 --> 00:31:23.659
here.

00:31:25.679 --> 00:31:28.220
Okay.

00:31:30.720 --> 00:31:34.740
So, these are all the jobs that I have

00:31:34.740 --> 00:31:37.340
created.

00:31:37.860 --> 00:31:40.119
All the jobs there. Let's do this over.

00:31:40.119 --> 00:31:42.059
These are all the jobs that I have

00:31:42.059 --> 00:31:43.679
submitted previously.

00:31:43.679 --> 00:31:45.840
And I think this one is the

00:31:45.840 --> 00:31:48.360
normalization job, so let's see the

00:31:48.360 --> 00:31:50.100
output of it.

00:31:50.100 --> 00:31:54.120
As you can see, it says, uh, "Check mark", yes,

00:31:54.120 --> 00:31:56.640
which means that it worked, and we can

00:31:56.640 --> 00:31:59.399
preview it. How can I do that? Right click

00:31:59.399 --> 00:32:02.539
on it, choose "Preview data",

00:32:02.539 --> 00:32:06.659
and as you can see all the data are

00:32:06.659 --> 00:32:08.399
scaled down

00:32:08.399 --> 00:32:10.980
so everything is between zero

00:32:10.980 --> 00:32:15.860
and, uh, one I think.
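
NOTE A minimal sketch, not part of the session: assuming the min-max option rescales each selected column as (value - min) / (max - min), the zero-to-one result John previews can be reproduced in plain Python. The function name and the sample ages are hypothetical, and mapping a constant column to 0 is an assumption modelled on the "Use Zero" setting chosen above.
```python
def min_max_scale(values):
    """Rescale a list of numbers to [0, 1]; a constant column maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Every value is identical, so there is no range to scale by.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]
ages = [21, 23, 46, 60]  # hypothetical values
scaled = min_max_scale(ages)
print(scaled)
```
The smallest value always lands on 0 and the largest on 1, which is why identifier columns such as the patient ID are left out: their scaled values would carry no meaning.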

00:32:15.860 --> 00:32:18.899
So everything is good for us. Now we

00:32:18.899 --> 00:32:21.840
can move forward to the next step

00:32:21.840 --> 00:32:26.939
which is to create the whole pipeline.

00:32:26.939 --> 00:32:30.840
So, uh, Carlotta told you that

00:32:30.840 --> 00:32:33.179
we're going to use a classification

00:32:33.179 --> 00:32:37.260
model to create this data set, so let

00:32:37.260 --> 00:32:40.620
me just drag and drop everything

00:32:40.620 --> 00:32:43.140
to get runtime and we're doing

00:32:43.140 --> 00:32:46.489
[INDISTINGUISHABLE]

00:32:46.489 --> 00:32:48.469
about everything by

00:32:48.469 --> 00:32:51.419
[INDISTINGUISHABLE]

00:32:51.419 --> 00:32:52.919
So,

00:32:52.919 --> 00:32:55.593
as a result, we are going to explain

00:32:55.593 --> 00:32:59.760
[INDISTINGUISHABLE]

00:32:59.760 --> 00:33:03.600
Yeah. So, I'm going to give this split

00:33:03.600 --> 00:33:06.070
data. I'm going to take the

00:33:06.070 --> 00:33:08.880
transformed data to Split Data and

00:33:08.880 --> 00:33:10.380
connect it like that.

00:33:10.380 --> 00:33:12.299
I'm going to get the Train Model

00:33:12.299 --> 00:33:15.240
component because I want to train my

00:33:15.240 --> 00:33:16.679
model,

00:33:16.679 --> 00:33:19.740
and I'm going to put it right here.

00:33:19.740 --> 00:33:21.740
Okay.

00:33:21.740 --> 00:33:24.419
Let's just move it down there. Okay.

00:33:24.419 --> 00:33:27.059
And we are going to use a classification

00:33:27.059 --> 00:33:28.620
model,

00:33:28.620 --> 00:33:31.880
a two class

00:33:32.240 --> 00:33:35.399
logistic regression model.

00:33:35.399 --> 00:33:38.159
So I'm going to give this algorithm to

00:33:38.159 --> 00:33:41.480
enable my model to work

00:33:41.820 --> 00:33:45.960
This is the untrained model, this is...

00:33:45.960 --> 00:33:48.059
here.

00:33:48.059 --> 00:33:51.120
The left...

00:33:51.120 --> 00:33:52.860
the left, uh, circle, I'm going to

00:33:52.860 --> 00:33:54.819
connect it to the data set, and the right

00:33:54.819 --> 00:33:56.940
one, we are going to connect it to

00:33:56.940 --> 00:33:59.700
evaluate model.

00:33:59.700 --> 00:34:02.640
Evaluate model...so let's search for

00:34:02.640 --> 00:34:05.220
"Evaluate model" here.

00:34:05.220 --> 00:34:07.440
So because we want to do what...we want to

00:34:07.440 --> 00:34:10.800
evaluate our model and see how it has

00:34:10.800 --> 00:34:13.790
been doing. Is it good, is it bad?

00:34:13.790 --> 00:34:18.200
Um, sorry...

00:34:19.980 --> 00:34:22.820
This is...

00:34:23.460 --> 00:34:25.560
this is down there

00:34:25.560 --> 00:34:28.139
after the score model.

00:34:28.139 --> 00:34:31.320
So we have to get the score model first,

00:34:31.320 --> 00:34:33.960
so let's get it.

00:34:33.960 --> 00:34:36.119
And this will take the trained model and

00:34:36.119 --> 00:34:37.260
the data set

00:34:37.260 --> 00:34:39.419
to score our model and see if it's

00:34:39.419 --> 00:34:42.179
performing good or bad.

00:34:42.179 --> 00:34:44.409
And...

00:34:44.409 --> 00:34:47.159
um...

00:34:47.159 --> 00:34:49.080
after that, we have finished

00:34:49.080 --> 00:34:51.920
everything. Now, we are going to do the what?

00:34:52.139 --> 00:34:54.359
The presets for everything.

00:34:54.359 --> 00:34:56.820
As a starter, we will be splitting our

00:34:56.820 --> 00:34:58.920
data. So

00:34:58.920 --> 00:35:01.140
how are we going to do this, according to

00:35:01.140 --> 00:35:03.780
what? To the split rules. So I'm going to

00:35:03.780 --> 00:35:05.940
double-click on it and choose "Split rules".

00:35:05.940 --> 00:35:09.420
And the percentage is

00:35:09.420 --> 00:35:11.780
70 percent for the [INDISTINGUISHABLE]

00:35:11.780 --> 00:35:12.780
and 30 percent of the

00:35:12.780 --> 00:35:14.820
data for

00:35:14.820 --> 00:35:18.420
the evaluation or for the scoring, okay?

00:35:18.420 --> 00:35:20.880
I'm going to make it a randomization, so

00:35:20.880 --> 00:35:22.980
I'm going to split data randomly and the

00:35:22.980 --> 00:35:26.060
seed is, uh,

00:35:26.060 --> 00:35:29.339
132, uh 23 I think...yeah.

00:35:29.339 --> 00:35:32.520
And I think that's it.
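
NOTE A minimal sketch, not part of the session: a seeded random 70/30 split like the one configured above can be written in plain Python. The helper split_rows and the seed value 123 are placeholders, and the designer's own shuffling will not produce the same row order.
```python
import random
def split_rows(rows, fraction=0.7, seed=123):
    """Shuffle a copy of rows with a fixed seed, then cut at the given fraction."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]
rows = list(range(100))  # stand-in for 100 data rows
train, held_out = split_rows(rows)
print(len(train), len(held_out))
```
Fixing the seed makes the split repeatable, so two runs of the same pipeline train and evaluate on the same rows.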

00:35:32.520 --> 00:35:35.040
Stratified split is false, and that's

00:35:35.040 --> 00:35:36.240
good.

00:35:36.240 --> 00:35:39.540
Now for the next one, which is the train

00:35:39.540 --> 00:35:42.000
model we are going to connect it as

00:35:42.000 --> 00:35:43.500
mentioned here.

00:35:43.500 --> 00:35:48.660
And we have done that and...then why

00:35:48.660 --> 00:35:50.700
am I having here? Let's double click

00:35:50.700 --> 00:35:54.660
on it...yeah. It has...it needs the

00:35:54.660 --> 00:35:57.180
label column that I am trying to predict.

00:35:57.180 --> 00:35:58.680
So from here, I'm going to choose

00:35:58.680 --> 00:36:01.380
diabetic. I'm going to save.

00:36:01.380 --> 00:36:05.180
I'm going to close this one.

00:36:05.520 --> 00:36:07.380
So it says here,

00:36:07.380 --> 00:36:10.619
the diabetic label, the model, it will

00:36:10.619 --> 00:36:12.300
predict the zero and one, because this is

00:36:12.300 --> 00:36:14.700
a binary classification algorithm, so

00:36:14.700 --> 00:36:16.260
it's going to predict either this or

00:36:16.260 --> 00:36:17.520
that.

00:36:17.520 --> 00:36:18.460
And...

00:36:18.460 --> 00:36:20.160
um...

00:36:20.160 --> 00:36:23.880
I think that's everything to run the

00:36:23.880 --> 00:36:25.859
pipeline.

00:36:25.859 --> 00:36:29.040
So everything is done, everything is good

00:36:29.040 --> 00:36:31.200
for this one. We're just gonna leave it

00:36:31.200 --> 00:36:34.140
for now, because this is the next

00:36:34.140 --> 00:36:35.620
step.

00:36:35.620 --> 00:36:39.839
Um, this will be put instead of the

00:36:39.839 --> 00:36:43.520
score model, but let's...

00:36:44.099 --> 00:36:46.920
let's delete it for now.

00:36:46.920 --> 00:36:49.500
Okay.

00:36:49.500 --> 00:36:52.920
Now we have to submit the job in order

00:36:52.920 --> 00:36:55.680
to see the output of it. So I can click

00:36:55.680 --> 00:36:59.280
on "Submit" and choose the previous job

00:36:59.280 --> 00:37:01.200
which is the one that I have shown you

00:37:01.200 --> 00:37:02.460
before.

00:37:02.460 --> 00:37:05.460
And then let's review its output

00:37:05.460 --> 00:37:06.960
together here.

00:37:06.960 --> 00:37:09.960
So if I go to the jobs,

00:37:09.960 --> 00:37:15.119
if I go to MS Learn Diabetes Training,

00:37:15.119 --> 00:37:18.180
I think it's the one that lasted the

00:37:18.180 --> 00:37:20.640
longest, this one here.

00:37:20.640 --> 00:37:23.700
So here I can see

00:37:23.700 --> 00:37:27.079
the job output, what happened inside

00:37:27.079 --> 00:37:30.420
the model, as you can see.

00:37:30.420 --> 00:37:33.839
So the normalization we have seen

00:37:33.839 --> 00:37:36.540
before, the split data, I can preview it.

00:37:36.540 --> 00:37:39.359
The result one or the result two as it

00:37:39.359 --> 00:37:41.760
splits the data to 70 here and

00:37:41.760 --> 00:37:43.639
thirty percent here.

00:37:43.639 --> 00:37:46.859
Um, I can see the score model, which is

00:37:46.859 --> 00:37:49.140
something that we need

00:37:49.140 --> 00:37:51.530
to review.

00:37:51.530 --> 00:37:56.820
Inside the score model, uh, from

00:37:56.820 --> 00:37:57.960
here,

00:37:57.960 --> 00:38:00.960
we can see that...

00:38:00.960 --> 00:38:04.460
let's get back here.

00:38:05.940 --> 00:38:08.220
This is the data that the model has

00:38:08.220 --> 00:38:11.579
been scored on, and this is the scoring output.

00:38:11.579 --> 00:38:15.300
So it says scored label true, and he is

00:38:15.300 --> 00:38:17.370
not diabetic, so this is,

00:38:17.370 --> 00:38:19.200
um,

00:38:19.200 --> 00:38:21.839
a wrong prediction, let's say.

00:38:21.839 --> 00:38:23.880
For this one it's true and true, and this

00:38:23.880 --> 00:38:26.880
is a good, like, what do you say,

00:38:26.880 --> 00:38:29.460
prediction, and the probabilities of this

00:38:29.460 --> 00:38:30.420
score,

00:38:30.420 --> 00:38:33.119
which means the certainty of our model

00:38:33.119 --> 00:38:36.620
that this is really true. It's 80 percent.

00:38:36.620 --> 00:38:38.780
For this one it's 75 percent.

00:38:38.780 --> 00:38:42.599
So these are some cool metrics that we

00:38:42.599 --> 00:38:45.359
can review to understand how our model

00:38:45.359 --> 00:38:47.580
is performing. It's performing good for

00:38:47.580 --> 00:38:48.540
now.

00:38:48.540 --> 00:38:53.180
Let's check our evaluation model.

00:38:53.180 --> 00:38:56.700
So this is the extra one that I told you

00:38:56.700 --> 00:38:59.579
about. Instead of the

00:38:59.579 --> 00:39:01.800
score model only, we are going to add

00:39:01.800 --> 00:39:04.260
the Evaluate Model

00:39:04.260 --> 00:39:06.900
after it. So here

00:39:06.900 --> 00:39:09.420
we're going to go to our Asset Library

00:39:09.420 --> 00:39:12.180
and we are going to choose the evaluate

00:39:12.180 --> 00:39:14.940
model,

00:39:14.940 --> 00:39:17.760
and we are going to put it here, and we

00:39:17.760 --> 00:39:20.220
are going to connect it, and we are going

00:39:20.220 --> 00:39:23.099
to submit the job using the same name of

00:39:23.099 --> 00:39:25.140
the job that we used previously.

00:39:25.140 --> 00:39:29.520
Let's review it also. So, after it

00:39:29.520 --> 00:39:33.300
finishes, you will find it here. So I have

00:39:33.300 --> 00:39:35.280
already done it before, this is how I'm

00:39:35.280 --> 00:39:37.380
able to see the output.

00:39:37.380 --> 00:39:40.320
So let's see

00:39:40.320 --> 00:39:43.280
what is the output of this

00:39:43.280 --> 00:39:45.660
evaluation process.

00:39:45.660 --> 00:39:49.800
Here it mentioned to you that there are

00:39:49.800 --> 00:39:51.480
some metrics,

00:39:51.480 --> 00:39:54.839
like the confusion matrix, which Carlotta

00:39:54.839 --> 00:39:57.060
told you about, there is the accuracy, the

00:39:57.060 --> 00:39:59.760
precision, the recall, and F1 Score.

00:39:59.760 --> 00:40:02.339
Every metric gives us some insight about

00:40:02.339 --> 00:40:04.920
our model. It helps us to understand it

00:40:04.920 --> 00:40:08.579
more, and, um,

00:40:08.579 --> 00:40:10.560
understand if it's overfitting, if

00:40:10.560 --> 00:40:12.240
it's good, if it's bad, and really really,

00:40:12.240 --> 00:40:16.339
like, understand how it's working.
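
NOTE A minimal sketch, not part of the session: the accuracy, precision, recall and F1 score named above all derive from the four confusion-matrix counts. The function names and the eight sample labels are hypothetical.
```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for 0/1 labels."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tp, tn, fp, fn
def metrics(actual, predicted):
    """Derive accuracy, precision, recall and F1 from the confusion counts."""
    tp, tn, fp, fn = confusion_counts(actual, predicted)
    accuracy = (tp + tn) / len(actual)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
actual = [1, 0, 1, 1, 0, 0, 1, 0]     # hypothetical true labels
predicted = [1, 0, 0, 1, 0, 1, 1, 0]  # hypothetical model output
result = metrics(actual, predicted)
print(confusion_counts(actual, predicted), result)
```
Precision asks how many predicted positives were right, recall asks how many real positives were found, and F1 balances the two.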

00:40:17.060 --> 00:40:20.400
Now I'm just waiting for the job to load.

00:40:20.400 --> 00:40:22.710
Until it loads,

00:40:22.710 --> 00:40:23.640
um,

00:40:23.640 --> 00:40:26.040
we can continue

00:40:26.040 --> 00:40:28.740
to work on our

00:40:28.740 --> 00:40:31.800
model. So I will go to my designer. I'm

00:40:31.800 --> 00:40:34.740
just going to confirm this.

00:40:34.740 --> 00:40:38.280
And I'm going to continue working on it

00:40:38.280 --> 00:40:39.780
from

00:40:39.780 --> 00:40:42.119
where we have stopped. Where have we

00:40:42.119 --> 00:40:43.560
stopped?

00:40:43.560 --> 00:40:46.440
We have stopped on the evaluate model. So

00:40:46.440 --> 00:40:48.960
I'm going to choose this one.

00:40:48.960 --> 00:40:53.420
And it says here

00:40:54.180 --> 00:40:56.940
"select experiment", "create inference

00:40:56.940 --> 00:40:58.200
pipeline", so

00:40:58.200 --> 00:41:01.079
I am going to go to the jobs,

00:41:01.079 --> 00:41:04.680
I'm going to select my experiment.

00:41:04.680 --> 00:41:06.660
I hope this works.

00:41:06.660 --> 00:41:09.720
Okay. Finally, now we have our

00:41:09.720 --> 00:41:12.180
evaluate model output.

00:41:12.180 --> 00:41:15.480
Let's preview evaluation results

00:41:15.480 --> 00:41:18.660
and, uh...

00:41:18.660 --> 00:41:22.220
come on.

00:41:25.500 --> 00:41:28.020
Finally. Now we can create our inference

00:41:28.020 --> 00:41:31.020
pipeline. So,

00:41:31.020 --> 00:41:33.510
I think it says that...

00:41:33.510 --> 00:41:35.280
um...

00:41:35.280 --> 00:41:38.160
select the experiment, then select MS

00:41:38.160 --> 00:41:39.359
Learn. So,

00:41:39.359 --> 00:41:43.320
I am just going to select it,

00:41:43.320 --> 00:41:48.300
and finally. Now we can, the ROC curve, we

00:41:48.300 --> 00:41:51.000
can see it, that the true positive rate

00:41:51.000 --> 00:41:53.760
and the force was integrate. The false

00:41:53.760 --> 00:41:56.660
positive rate is increasing with time,

00:41:56.660 --> 00:42:01.020
and also the true positive rate. True

00:42:01.020 --> 00:42:03.540
positive is something that it predicted,

00:42:03.540 --> 00:42:06.960
that it is, uh, positive it has diabetes,

00:42:06.960 --> 00:42:09.480
and it's really...it's really true.

00:42:09.480 --> 00:42:12.599
The person really has diabetes. Okay. And

00:42:12.599 --> 00:42:14.760
for the false positive, it predicted that

00:42:14.760 --> 00:42:17.579
someone has diabetes, but that person doesn't

00:42:17.579 --> 00:42:20.960
have it. This is what true positive and

00:42:20.960 --> 00:42:24.960
false positive mean. This is the recall

00:42:24.960 --> 00:42:28.020
curve, so we can review the metrics

00:42:28.020 --> 00:42:32.160
of our model. This is the lift curve. I

00:42:32.160 --> 00:42:36.000
can change the threshold of my confusion

00:42:36.000 --> 00:42:37.740
matrix here

00:42:37.740 --> 00:42:39.119
and this could [...] don't want to add

00:42:39.119 --> 00:42:43.920
anything about the...the graphs,

00:42:43.920 --> 00:42:47.000
you can do so.

00:42:50.460 --> 00:42:51.000
Um,

00:42:51.000 --> 00:42:54.720
yeah, so I just wanted to, if you go...yeah, I

00:42:54.720 --> 00:42:57.119
just wanted to comment, for the

00:42:57.119 --> 00:43:00.480
ROC curve, uh, that actually from this

00:43:00.480 --> 00:43:03.900
graph the metric which, uh, usually we're

00:43:03.900 --> 00:43:06.960
going to compute is the area under

00:43:06.960 --> 00:43:09.900
the curve, and this coefficient or

00:43:09.900 --> 00:43:12.240
metric,

00:43:12.240 --> 00:43:15.060
um, it's a coefficient,

00:43:15.060 --> 00:43:18.420
um, is a value that could span from

00:43:18.420 --> 00:43:22.920
zero to one, and the highest is,

00:43:22.920 --> 00:43:23.480
um,

00:43:23.480 --> 00:43:26.700
this...the highest is the score, so the

00:43:26.700 --> 00:43:29.220
closest to one,

00:43:29.220 --> 00:43:32.760
um, so the higher the amount of

00:43:32.760 --> 00:43:35.280
area under this curve,

00:43:35.280 --> 00:43:40.500
um, the higher the performance, uh,

00:43:40.500 --> 00:43:43.319
we've got from our model. And

00:43:43.319 --> 00:43:46.440
another thing is what John is

00:43:46.440 --> 00:43:49.680
um playing with so this threshold for

00:43:49.680 --> 00:43:51.380
the logistic

00:43:51.380 --> 00:43:55.920
regression is the threshold used by the

00:43:55.920 --> 00:43:57.180
model

00:43:57.180 --> 00:43:58.740
um to

00:43:58.740 --> 00:43:59.520
um

00:43:59.520 --> 00:44:02.940
to predict uh if the category is zero or

00:44:02.940 --> 00:44:05.220
one so if the probability the

00:44:05.220 --> 00:44:08.599
probability score is above the threshold

00:44:08.599 --> 00:44:11.579
then the category will be predicted as

00:44:11.579 --> 00:44:15.359
one while if the the probability is

00:44:15.359 --> 00:44:17.460
below the threshold in this case for

00:44:17.460 --> 00:44:21.300
example 0.5 the category is predicted as

00:44:21.300 --> 00:44:23.579
as zero so that's why it's very

00:44:23.579 --> 00:44:26.099
important to um to choose the the

00:44:26.099 --> 00:44:27.839
threshold because the performance really

00:44:27.839 --> 00:44:29.520
can vary

00:44:29.520 --> 00:44:30.560
um

00:44:30.560 --> 00:44:34.380
with this threshold value
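
NOTE
A minimal sketch of the two ideas just described, not the designer's internal code: a threshold turns a probability score into category 0 or 1, and the area under the ROC curve is approximated with the trapezoidal rule. The labels and scores are made-up illustration values, the function names are hypothetical, and treating a score exactly at the threshold as category 1 is an assumption.
```python
def predict(probability, threshold=0.5):
    """Return category 1 when the score reaches the threshold, otherwise 0."""
    return 1 if probability >= threshold else 0
def roc_points(labels, scores):
    """Sweep the threshold over every score and collect (FPR, TPR) points."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        predicted = [predict(s, threshold) for s in scores]
        tp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 1)
        fp = sum(1 for p, y in zip(predicted, labels) if p == 1 and y == 0)
        points.append((fp / negatives, tp / positives))
    return points
def auc(points):
    """Area under the curve by the trapezoidal rule; it lies between 0 and 1."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area
# Made-up labels (1 = diabetic) and probability scores, for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print([predict(s) for s in scores])      # categories at the default 0.5 threshold
print(auc(roc_points(labels, scores)))   # area under the ROC curve
```
Moving the threshold changes which patients land in category 1, which is why the evaluation metrics shift as the slider moves.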

00:44:34.380 --> 00:44:41.099
Uh, thank you, uh, so much, uh, Carlotta. And

00:44:41.400 --> 00:44:44.400
as I mentioned now we are going to like

00:44:44.400 --> 00:44:46.560
create our inference pipeline so we are

00:44:46.560 --> 00:44:48.540
going to select the latest one which I

00:44:48.540 --> 00:44:50.819
already have it opened here this is the

00:44:50.819 --> 00:44:52.859
one that we were reviewing together this

00:44:52.859 --> 00:44:55.500
is where we have stopped and we're going

00:44:55.500 --> 00:44:57.599
to create an inference pipeline we are

00:44:57.599 --> 00:44:59.520
going to choose a real-time inference

00:44:59.520 --> 00:45:02.520
pipeline okay

00:45:02.520 --> 00:45:05.160
Um, where can I find this? Here, as it

00:45:05.160 --> 00:45:08.099
says real-time inference pipeline

00:45:08.099 --> 00:45:10.680
so it's gonna add some things to my

00:45:10.680 --> 00:45:12.420
workspace it's going to add the web

00:45:12.420 --> 00:45:13.980
service input, it's going to add the

00:45:13.980 --> 00:45:15.780
web service output because we will be

00:45:15.780 --> 00:45:18.180
creating it as a web service to access

00:45:18.180 --> 00:45:19.740
it from the internet

00:45:19.740 --> 00:45:21.900
uh what are we going to do we're going

00:45:21.900 --> 00:45:24.720
to remove this diabetes data okay

00:45:24.720 --> 00:45:27.540
and we are going to get a component

00:45:27.540 --> 00:45:29.359
called Web

00:45:29.359 --> 00:45:33.180
input and what's up let me check

00:45:33.180 --> 00:45:35.940
it's Enter Data Manually.

00:45:35.940 --> 00:45:38.400
We already have the web input

00:45:38.400 --> 00:45:39.540
present

00:45:39.540 --> 00:45:42.119
so we are going to get the Enter Data

00:45:42.119 --> 00:45:43.200
Manually

00:45:43.200 --> 00:45:45.420
and we're going to connect

00:45:45.420 --> 00:45:49.560
it as it was connected before like that

00:45:49.560 --> 00:45:53.040
and also I am not going to directly take

00:45:53.040 --> 00:45:55.260
the web service, sorry, the Score Model to

00:45:55.260 --> 00:45:57.839
the web service output like that

00:45:57.839 --> 00:46:00.240
I'm going to delete this

00:46:00.240 --> 00:46:03.960
and I'm going to execute a python script

00:46:03.960 --> 00:46:05.880
before

00:46:05.880 --> 00:46:09.500
I display my result

00:46:10.680 --> 00:46:12.060
so

00:46:12.060 --> 00:46:17.480
this will be connected like okay but

00:46:19.260 --> 00:46:20.400
so

00:46:20.400 --> 00:46:23.599
the other way around

00:46:23.599 --> 00:46:27.660
and from here I am going to connect this

00:46:27.660 --> 00:46:30.960
with that and there is some data uh that

00:46:30.960 --> 00:46:33.480
we will be getting from the node or from

00:46:33.480 --> 00:46:37.680
the examination here, and this is the

00:46:37.680 --> 00:46:40.740
data that will be entered like to our

00:46:40.740 --> 00:46:44.400
website manually okay this is instead of

00:46:44.400 --> 00:46:47.460
the data that we have been getting from

00:46:47.460 --> 00:46:49.740
our data set that we created so I'm just

00:46:49.740 --> 00:46:51.960
going to double click on it and choose

00:46:51.960 --> 00:46:55.579
CSV and I will choose it has headers

00:46:55.579 --> 00:47:00.839
and I will take or copy this content and

00:47:00.839 --> 00:47:02.819
put it there okay

00:47:02.819 --> 00:47:05.700
so let's do it

00:47:05.700 --> 00:47:07.920
I think I have to click on edit code now

00:47:07.920 --> 00:47:10.680
I can click on Save and I can close it

00:47:10.680 --> 00:47:13.079
another thing which is the python script

00:47:13.079 --> 00:47:16.700
that we will be executing

00:47:17.099 --> 00:47:19.380
um yeah we are going to remove this also

00:47:19.380 --> 00:47:21.140
we don't need the evaluate model anymore

00:47:21.140 --> 00:47:24.319
so we are going to remove

00:47:24.319 --> 00:47:28.579
script that I will be executing okay

00:47:28.579 --> 00:47:32.599
I can find it here

00:47:33.540 --> 00:47:34.619
um

00:47:34.619 --> 00:47:35.760
yeah

00:47:35.760 --> 00:47:38.640
this is the python script that we will

00:47:38.640 --> 00:47:41.520
execute and it says to you that this

00:47:41.520 --> 00:47:43.619
code selects only the patient ID,

00:47:43.619 --> 00:47:45.000
the scored label, and the scored

00:47:45.000 --> 00:47:47.700
probability and returns them to

00:47:47.700 --> 00:47:49.980
the web service output so we don't want

00:47:49.980 --> 00:47:51.960
to return all the columns as we have

00:47:51.960 --> 00:47:53.339
seen previously

00:47:53.339 --> 00:47:55.560
uh, that returns everything.

00:47:55.560 --> 00:47:56.940
so

00:47:56.940 --> 00:47:59.040
we want to return certain stuff the

00:47:59.040 --> 00:48:02.940
stuff that we will use inside our

00:48:02.940 --> 00:48:05.640
endpoint so I'm just going to select

00:48:05.640 --> 00:48:07.980
everything and delete it and

00:48:07.980 --> 00:48:11.060
paste the code that I have gotten from

00:48:11.060 --> 00:48:14.280
the uh

00:48:14.280 --> 00:48:16.500
the Microsoft Learn docs.
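
NOTE
A sketch of what such a script could look like, not the exact code pasted in the demo. It assumes the designer's Execute Python Script entry point `azureml_main`, that Score Model emits columns named "Scored Labels" and "Scored Probabilities", and that the output names "DiabetesPrediction" and "Probability" are acceptable; all of these are assumptions to check against the Learn module.
```python
import pandas as pd
def azureml_main(dataframe1=None, dataframe2=None):
    """Entry point that the Execute Python Script component calls."""
    # Keep only the three columns the web service should return.
    scored = dataframe1[["PatientID", "Scored Labels", "Scored Probabilities"]]
    # Rename them to friendlier output names (these names are assumptions).
    renamed = scored.rename(columns={
        "Scored Labels": "DiabetesPrediction",
        "Scored Probabilities": "Probability",
    })
    # The component expects a sequence of DataFrames, hence the one-item tuple.
    return renamed,
# Made-up scored row, only to show the shape of the result.
example = pd.DataFrame([
    {"PatientID": 1, "Pregnancies": 2, "Scored Labels": 1, "Scored Probabilities": 0.83},
])
print(azureml_main(example)[0])
```
Every other input column is dropped, so the endpoint returns only the ID, the prediction, and its probability.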

00:48:16.500 --> 00:48:19.079
now I can click on Save and I can close

00:48:19.079 --> 00:48:20.280
this

00:48:20.280 --> 00:48:21.960
let me check something I don't think

00:48:21.960 --> 00:48:25.020
it's saved. It's saved, but the display is

00:48:25.020 --> 00:48:26.160
wrong okay

00:48:26.160 --> 00:48:30.300
and now I think everything is good to go

00:48:30.300 --> 00:48:32.640
I'm just gonna double check everything

00:48:32.640 --> 00:48:36.359
so uh yeah we are gonna change the name

00:48:36.359 --> 00:48:38.640
of this uh

00:48:38.640 --> 00:48:40.800
Pipeline and we are gonna call it

00:48:40.800 --> 00:48:42.780
predict

00:48:42.780 --> 00:48:46.319
diabetes okay

00:48:46.319 --> 00:48:50.339
now let's close it and

00:48:50.339 --> 00:48:57.119
I think that we are good to go so

00:48:57.119 --> 00:48:59.300
um

00:48:59.720 --> 00:49:04.460
okay I think everything is good for us

00:49:06.420 --> 00:49:08.339
I just want to make sure of something is

00:49:08.339 --> 00:49:12.420
the data is correct the data is uh yeah

00:49:12.420 --> 00:49:13.560
it's correct

00:49:13.560 --> 00:49:16.319
okay now I can run the pipeline let's

00:49:16.319 --> 00:49:17.640
submit

00:49:17.640 --> 00:49:21.000
select an existing pipeline, and we're

00:49:21.000 --> 00:49:22.740
going to choose mslearn-diabetes-training,

00:49:22.740 --> 00:49:24.599
which is the pipeline

00:49:24.599 --> 00:49:27.060
that we have been working on

00:49:27.060 --> 00:49:31.619
from the beginning of this module

00:49:31.680 --> 00:49:33.839
I don't think that this is going to take

00:49:33.839 --> 00:49:36.060
much time so we have submitted the job

00:49:36.060 --> 00:49:37.319
and it's running

00:49:37.319 --> 00:49:40.140
until the job ends we are going to set

00:49:40.140 --> 00:49:41.720
everything

00:49:41.720 --> 00:49:45.599
and for deploying a service

00:49:45.599 --> 00:49:49.560
in order to deploy a service okay

00:49:49.560 --> 00:49:50.520
um

00:49:50.520 --> 00:49:54.000
I have to have the job ready so

00:49:54.000 --> 00:49:56.040
until it's ready you can't deploy it. So

00:49:56.040 --> 00:49:58.319
let's go to the job the job details from

00:49:58.319 --> 00:50:01.319
here okay

00:50:01.319 --> 00:50:05.119
and until it finishes

00:50:05.119 --> 00:50:07.260
Carlotta do you think that we can have

00:50:07.260 --> 00:50:09.240
the questions and then we can get back

00:50:09.240 --> 00:50:12.859
to the job and deploying it?

00:50:13.700 --> 00:50:17.579
Yeah, yeah, so, guys, if you

00:50:17.579 --> 00:50:18.980
have any questions

00:50:18.980 --> 00:50:24.119
uh, on what you just, uh, saw here

00:50:24.119 --> 00:50:26.940
or in the introduction, feel free. This is

00:50:26.940 --> 00:50:30.300
a good moment we can uh we can discuss

00:50:30.300 --> 00:50:33.900
now while we wait for this job to

00:50:33.900 --> 00:50:36.260
finish

00:50:36.300 --> 00:50:38.760
uh and the

00:50:38.760 --> 00:50:40.220
can can

00:50:40.220 --> 00:50:45.000
we have the knowledge check one or, like,

00:50:45.000 --> 00:50:47.700
what do you think uh yeah we can also go

00:50:47.700 --> 00:50:49.680
to the knowledge check

00:50:49.680 --> 00:50:50.940
um

00:50:50.940 --> 00:50:56.339
yeah okay so let me share my screen

00:50:56.339 --> 00:50:58.980
Yeah, so if you don't have any questions

00:50:58.980 --> 00:51:01.619
for us we can maybe propose some

00:51:01.619 --> 00:51:05.339
questions to you that you can

00:51:05.339 --> 00:51:06.240
um

00:51:06.240 --> 00:51:09.660
uh, to check your knowledge so far, and you

00:51:09.660 --> 00:51:12.900
can, uh, maybe answer these questions

00:51:12.900 --> 00:51:15.420
uh via chat

00:51:15.420 --> 00:51:18.300
um so we have do you see my screen can

00:51:18.300 --> 00:51:19.859
you see my screen

00:51:19.859 --> 00:51:22.020
yes

00:51:22.020 --> 00:51:25.440
um so John I think I will read this

00:51:25.440 --> 00:51:29.040
question aloud and ask it to you, okay? So

00:51:29.040 --> 00:51:32.040
are you ready to answer?

00:51:32.040 --> 00:51:33.660
yes I am

00:51:33.660 --> 00:51:35.460
so

00:51:35.460 --> 00:51:37.260
um you're using Azure machine learning

00:51:37.260 --> 00:51:39.780
designer to create a training pipeline

00:51:39.780 --> 00:51:42.540
for a binary classification model so

00:51:42.540 --> 00:51:45.300
what what we were doing in our demo

00:51:45.300 --> 00:51:48.059
right and you have added a data set

00:51:48.059 --> 00:51:51.660
containing features and labels, uh, a Two-

00:51:51.660 --> 00:51:54.359
Class Decision Forest module. So we used

00:51:54.359 --> 00:51:56.819
a logistic regression model in our,

00:51:56.819 --> 00:51:59.099
um, in our example; here we're using a Two-

00:51:59.099 --> 00:52:01.260
Class Decision Forest model,

00:52:01.260 --> 00:52:04.500
and of course a Train Model module. You

00:52:04.500 --> 00:52:07.200
plan now to use Score Model and Evaluate

00:52:07.200 --> 00:52:09.480
Model modules to test the trained model

00:52:09.480 --> 00:52:11.640
with a subset of the dataset that

00:52:11.640 --> 00:52:13.500
wasn't used for training

00:52:13.500 --> 00:52:15.960
but what are we missing so what's

00:52:15.960 --> 00:52:18.780
another module you should add? And we have

00:52:18.780 --> 00:52:21.660
three options: we have Join Data, we have

00:52:21.660 --> 00:52:25.200
Split Data, or we have Select Columns in

00:52:25.200 --> 00:52:26.819
Dataset.

00:52:26.819 --> 00:52:28.260
so

00:52:28.260 --> 00:52:32.040
um while John thinks about the answer uh

00:52:32.040 --> 00:52:33.839
go ahead and

00:52:33.839 --> 00:52:34.800
um

00:52:34.800 --> 00:52:37.800
answer yourself, so give us your

00:52:37.800 --> 00:52:39.540
guess

00:52:39.540 --> 00:52:41.940
put in the chat or just come off mute

00:52:41.940 --> 00:52:44.900
and answer.

00:52:46.740 --> 00:52:48.960
a b yes

00:52:48.960 --> 00:52:50.579
yeah what do you think is the correct

00:52:50.579 --> 00:52:53.579
answer for this one I need something to

00:52:53.579 --> 00:52:56.579
uh like I have to score my model and I

00:52:56.579 --> 00:53:00.359
have to evaluate it, so I, like, I need

00:53:00.359 --> 00:53:03.119
something to enable me to do these two

00:53:03.119 --> 00:53:05.359
things

00:53:06.660 --> 00:53:09.119
I think it's something you showed us in

00:53:09.119 --> 00:53:12.980
in your pipeline right John

00:53:13.260 --> 00:53:16.819
of course I did

00:53:23.460 --> 00:53:28.020
Uh, we have no guesses, yeah?

00:53:28.020 --> 00:53:32.280
can someone like someone want to guess

00:53:32.280 --> 00:53:35.579
uh we have a b yeah

00:53:35.579 --> 00:53:38.760
uh maybe

00:53:38.760 --> 00:53:43.260
so uh in order to do this in order to do

00:53:43.260 --> 00:53:46.200
this I mentioned the

00:53:46.200 --> 00:53:49.380
the module that is going to help me to

00:53:49.380 --> 00:53:53.819
to divide my data into two things: 70 percent for

00:53:53.819 --> 00:53:56.220
the training and 30 percent for the

00:53:56.220 --> 00:53:59.339
evaluation so what did I use I used

00:53:59.339 --> 00:54:01.859
split data because this is what is going

00:54:01.859 --> 00:54:05.280
to split my data randomly into training

00:54:05.280 --> 00:54:08.579
data and validation data so the correct

00:54:08.579 --> 00:54:12.240
answer is B and good job eek thank you

00:54:12.240 --> 00:54:13.980
for participating
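
NOTE
A minimal sketch of the random 70/30 split that the Split Data module was used for in the demo. This is plain Python rather than the module's own implementation; the function name, the seed value, and the stand-in rows are hypothetical.
```python
import random
def split_data(rows, fraction=0.7, seed=123):
    """Randomly split rows into a training part and a validation part."""
    shuffled = list(rows)
    # A fixed seed makes the random split repeatable between runs.
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]
# Stand-in for 100 patient records.
rows = list(range(100))
train, validation = split_data(rows)
print(len(train), len(validation))
```
The validation part is never seen during training, so scoring and evaluating on it gives a fairer estimate of performance.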

00:54:13.980 --> 00:54:17.400
next question please

00:54:17.400 --> 00:54:19.339
yes

00:54:19.339 --> 00:54:22.559
answer so thanks John

00:54:22.559 --> 00:54:26.040
uh, for, uh, explaining to us the correct

00:54:26.040 --> 00:54:26.940
one

00:54:26.940 --> 00:54:30.420
and we want to go with question two

00:54:30.420 --> 00:54:33.180
yeah so uh I'm going to ask you now

00:54:33.180 --> 00:54:35.880
Carlotta: you use Azure Machine Learning

00:54:35.880 --> 00:54:38.280
designer to create a training pipeline

00:54:38.280 --> 00:54:40.500
for your classification model

00:54:40.500 --> 00:54:44.099
what must you do before you deploy this

00:54:44.099 --> 00:54:45.960
model as a service you have to do

00:54:45.960 --> 00:54:47.579
something before you deploy it what do

00:54:47.579 --> 00:54:49.740
you think is the correct answer

00:54:49.740 --> 00:54:52.740
is it a b or c

00:54:52.740 --> 00:54:55.020
share your thoughts without in touch

00:54:55.020 --> 00:54:58.380
with us in the chat and

00:54:58.380 --> 00:55:00.180
um and I'm also going to give you some

00:55:00.180 --> 00:55:02.940
like minutes to think of it before I

00:55:02.940 --> 00:55:06.020
like tell you about

00:55:06.599 --> 00:55:09.000
yeah so let me go through the possible

00:55:09.000 --> 00:55:12.359
answers, right? So we have A, uh, create an

00:55:12.359 --> 00:55:14.940
inference pipeline from the training

00:55:14.940 --> 00:55:16.020
pipeline

00:55:16.020 --> 00:55:19.260
Uh, B, we have add an Evaluate Model

00:55:19.260 --> 00:55:22.380
module to the training pipeline, and then

00:55:22.380 --> 00:55:25.079
three we have uh clone the training

00:55:25.079 --> 00:55:29.480
Pipeline with a different name

00:55:29.520 --> 00:55:31.559
so what do you think is the correct

00:55:31.559 --> 00:55:33.960
answer a b or c

00:55:33.960 --> 00:55:36.660
uh also this time I think it's something

00:55:36.660 --> 00:55:39.300
we mentioned both in the deck and in

00:55:39.300 --> 00:55:41.960
the demo right

00:55:42.599 --> 00:55:44.819
yes it is

00:55:44.819 --> 00:55:48.720
it's something that I have done like two

00:55:48.720 --> 00:55:51.800
like five minutes ago

00:55:51.800 --> 00:55:57.200
it's real time real time what's

00:55:58.020 --> 00:55:58.760
um

00:55:58.760 --> 00:56:02.040
yeah so think about you need to deploy

00:56:02.040 --> 00:56:05.460
uh the model as a service so uh if I'm

00:56:05.460 --> 00:56:07.980
going to deploy a model,

00:56:07.980 --> 00:56:10.380
um I cannot like evaluate the model

00:56:10.380 --> 00:56:12.839
after deploying it right because I

00:56:12.839 --> 00:56:14.940
cannot go into production if I'm not

00:56:14.940 --> 00:56:17.579
sure, I'm not satisfied with my model, and

00:56:17.579 --> 00:56:19.500
I'm not sure that my model is performing

00:56:19.500 --> 00:56:20.280
well

00:56:20.280 --> 00:56:23.460
so that's why I would go with

00:56:23.460 --> 00:56:24.319
um

00:56:24.319 --> 00:56:30.480
I would, like, exclude B from

00:56:30.480 --> 00:56:31.520
my answer,

00:56:31.520 --> 00:56:33.599
uh while

00:56:33.599 --> 00:56:36.960
um thinking about C uh I don't see you I

00:56:36.960 --> 00:56:39.480
didn't see you John cloning uh the

00:56:39.480 --> 00:56:41.420
training Pipeline with a different name

00:56:41.420 --> 00:56:44.640
uh so I I don't think this is the the

00:56:44.640 --> 00:56:46.920
right answer

00:56:46.920 --> 00:56:49.619
um while I've seen you creating an

00:56:49.619 --> 00:56:52.859
inference pipeline uh yeah from the

00:56:52.859 --> 00:56:55.020
training Pipeline and you just converted

00:56:55.020 --> 00:56:59.280
it using uh a one-click button right

00:56:59.280 --> 00:57:03.300
yeah that's correct so uh this is the

00:57:03.300 --> 00:57:04.280
right answer

00:57:04.280 --> 00:57:07.460
Uh, good job. So I created a real-time

00:57:07.460 --> 00:57:11.280
inference pipeline, and it has done it,

00:57:11.280 --> 00:57:13.440
like, it finished. The job is

00:57:13.440 --> 00:57:18.000
finished, so, uh, we can now

00:57:18.000 --> 00:57:19.400
deploy.

00:57:19.400 --> 00:57:21.500
yeah

00:57:21.500 --> 00:57:25.339
exactly like on time

00:57:25.380 --> 00:57:27.839
Like, it finished, like, two seconds,

00:57:27.839 --> 00:57:30.859
three, four seconds ago.

00:57:30.859 --> 00:57:33.119
so uh

00:57:33.119 --> 00:57:36.480
until like um

00:57:36.480 --> 00:57:39.839
this is my job review so

00:57:39.839 --> 00:57:43.260
uh like this is the job details that I

00:57:43.260 --> 00:57:45.540
have already submitted it's just opening

00:57:45.540 --> 00:57:48.119
and once it opens

00:57:48.119 --> 00:57:50.180
um

00:57:50.400 --> 00:57:52.740
like I don't know why it's so heavy

00:57:52.740 --> 00:57:56.780
today it's not like that usually

00:57:58.740 --> 00:58:01.020
yeah it's probably because you are also

00:58:01.020 --> 00:58:06.000
showing your screen on Teams.

00:58:06.000 --> 00:58:08.160
okay so that's the bandwidth of your

00:58:08.160 --> 00:58:10.740
connection is exactly do something here

00:58:10.740 --> 00:58:13.740
because yeah finally

00:58:13.740 --> 00:58:16.440
I can switch to my mobile internet if it

00:58:16.440 --> 00:58:18.599
does it again. So I will click on Deploy,

00:58:18.599 --> 00:58:20.700
it's that simple I'll just click on

00:58:20.700 --> 00:58:23.040
deploy and

00:58:23.040 --> 00:58:25.619
I am going to deploy a new real-time

00:58:25.619 --> 00:58:27.960
endpoint

00:58:27.960 --> 00:58:30.300
So what I'm going to name it, the

00:58:30.300 --> 00:58:31.740
description, and the compute type,

00:58:31.740 --> 00:58:33.720
everything is already mentioned for me

00:58:33.720 --> 00:58:36.240
here so I'm just gonna copy and paste it

00:58:36.240 --> 00:58:38.940
because we like we have we are running

00:58:38.940 --> 00:58:41.280
out of time

00:58:41.280 --> 00:58:45.680
so it's Azure Container Instance,

00:58:45.680 --> 00:58:48.720
which is a containerization service also

00:58:48.720 --> 00:58:51.059
both are for containerization but this

00:58:51.059 --> 00:58:52.440
gives you something and this gives you

00:58:52.440 --> 00:58:54.960
something else for the advanced options

00:58:54.960 --> 00:58:57.420
it doesn't say for us to do anything so

00:58:57.420 --> 00:59:00.420
we are just gonna click on deploy

00:59:00.420 --> 00:59:05.220
and now we can test our endpoint from

00:59:05.220 --> 00:59:07.859
the endpoints that we can find here so

00:59:07.859 --> 00:59:11.460
it's in progress if I go here

00:59:11.460 --> 00:59:13.799
under the assets I can find something

00:59:13.799 --> 00:59:16.680
called endpoints and I can find the

00:59:16.680 --> 00:59:18.599
real-time ones and the batch endpoints

00:59:18.599 --> 00:59:22.020
and we have created a real-time endpoint

00:59:22.020 --> 00:59:25.260
so we are going to find it under this uh

00:59:25.260 --> 00:59:29.760
title so if I like click on it I should

00:59:29.760 --> 00:59:32.640
be able to test it once it's ready

00:59:32.640 --> 00:59:37.200
it's still like loading but this is the

00:59:37.200 --> 00:59:40.980
input and this is the output that we

00:59:40.980 --> 00:59:45.200
will get back so if I click on test and

00:59:45.200 --> 00:59:49.920
from here I will input some data to the

00:59:49.920 --> 00:59:50.900
endpoint

00:59:50.900 --> 00:59:54.599
which is the patient information, the

00:59:54.599 --> 00:59:57.119
columns that we have already seen in our

00:59:57.119 --> 01:00:00.380
data set the patient ID the pregnancies

01:00:00.380 --> 01:00:03.960
and of course of course I'm not gonna

01:00:03.960 --> 01:00:05.940
enter the label that I'm trying to

01:00:05.940 --> 01:00:08.099
predict, so I'm not going to give it whether

01:00:08.099 --> 01:00:10.680
the patient is diabetic or not. This

01:00:10.680 --> 01:00:13.200
endpoint is to tell me; this

01:00:13.200 --> 01:00:15.599
endpoint, or the URL, is going to give me

01:00:15.599 --> 01:00:17.640
back this information whether someone

01:00:17.640 --> 01:00:22.680
has diabetes or doesn't. So if I input

01:00:22.680 --> 01:00:24.780
this data, I'm just going to copy

01:00:24.780 --> 01:00:27.780
it and go to my endpoint and click on

01:00:27.780 --> 01:00:30.180
Test, I'm gonna get the result back,

01:00:30.180 --> 01:00:32.359
which are the three columns that we have

01:00:32.359 --> 01:00:35.520
defined inside our python script the

01:00:35.520 --> 01:00:37.859
patient ID the diabetic prediction and

01:00:37.859 --> 01:00:41.040
the probability the certainty of whether

01:00:41.040 --> 01:00:45.720
someone is diabetic or not based on the

01:00:45.720 --> 01:00:50.660
uh based on the prediction so that's it
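
NOTE
A hedged sketch of the request and response shapes involved in testing the endpoint, with no network call. The key names "Inputs", "WebServiceInput0", "GlobalParameters", "Results" and "WebServiceOutput0" are assumptions about the designer's default schema, the output column names are assumed, the values are made up, and the remaining patient columns are omitted.
```python
import json
def build_request(patient):
    """Wrap one patient record in the JSON body sent to the endpoint."""
    return json.dumps({
        "Inputs": {"WebServiceInput0": [patient]},
        "GlobalParameters": {},
    })
def read_prediction(response_text):
    """Pull the first returned row out of the endpoint's JSON response."""
    return json.loads(response_text)["Results"]["WebServiceOutput0"][0]
# Made-up patient values; the label (diabetic or not) is deliberately left out.
patient = {"PatientID": 1, "Pregnancies": 2}
body = build_request(patient)
# Hypothetical response holding the three columns the Python script keeps.
fake_response = json.dumps({"Results": {"WebServiceOutput0": [
    {"PatientID": 1, "DiabetesPrediction": 1, "Probability": 0.83},
]}})
print(body)
print(read_prediction(fake_response))
```
The Test tab in the studio does this wrapping for you; the sketch only shows what travels over the wire under these assumptions.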

01:00:50.660 --> 01:00:54.359
And, like, uh, I think that this is a really

01:00:54.359 --> 01:00:56.819
simple step to do you can do it on your

01:00:56.819 --> 01:00:58.380
own you can test it

01:00:58.380 --> 01:01:01.140
and I think that I have finished so

01:01:01.140 --> 01:01:03.020
thank you

01:01:03.020 --> 01:01:06.599
Uh, yes, we are running out of time. I

01:01:06.599 --> 01:01:09.780
just wanted to, uh, thank you, John, for

01:01:09.780 --> 01:01:12.299
this demo for going through all these

01:01:12.299 --> 01:01:14.099
steps to

01:01:14.099 --> 01:01:16.740
um, create and train a classification model

01:01:16.740 --> 01:01:19.680
and also deploy it as a predictive

01:01:19.680 --> 01:01:23.040
service and I encourage you all to go

01:01:23.040 --> 01:01:25.079
back to the learn module

01:01:25.079 --> 01:01:28.260
um, and, uh, like, deepen all these topics

01:01:28.260 --> 01:01:31.760
at your own pace, and also maybe,

01:01:31.760 --> 01:01:34.799
uh do this demo on your own on your

01:01:34.799 --> 01:01:37.140
subscription, on your Azure for Students

01:01:37.140 --> 01:01:39.359
subscription

01:01:39.359 --> 01:01:43.200
um and I would also like to recall that

01:01:43.200 --> 01:01:46.260
this is part of a series of study

01:01:46.260 --> 01:01:49.500
sessions of cloud skill challenge study

01:01:49.500 --> 01:01:51.059
sessions

01:01:51.059 --> 01:01:54.059
um, so you will have more in the

01:01:54.059 --> 01:01:57.540
following days, and this is for

01:01:57.540 --> 01:02:00.480
you to prepare let's say to to help you

01:02:00.480 --> 01:02:04.880
in taking the Cloud Skills Challenge,

01:02:04.880 --> 01:02:07.040
which collects

01:02:07.040 --> 01:02:10.799
very interesting Learn modules that you

01:02:10.799 --> 01:02:14.540
can use to skill up on various topics,

01:02:14.540 --> 01:02:18.359
and some of them are focused on AI and

01:02:18.359 --> 01:02:20.819
ML, so if you are interested in these

01:02:20.819 --> 01:02:23.099
topics you can select these Learn

01:02:23.099 --> 01:02:24.780
modules

01:02:24.780 --> 01:02:27.660
um so let me also copy

01:02:27.660 --> 01:02:29.819
um the link the short link to the

01:02:29.819 --> 01:02:32.700
challenge in the chat uh remember that

01:02:32.700 --> 01:02:34.980
you have time until the 13th of

01:02:34.980 --> 01:02:37.980
September to take the challenge and also

01:02:37.980 --> 01:02:40.440
remember that in October on the 7th of

01:02:40.440 --> 01:02:43.020
October, you can join the

01:02:43.020 --> 01:02:46.619
Student Developer Summit,

01:02:46.619 --> 01:02:50.640
which is, uh, which will be a virtual or,

01:02:50.640 --> 01:02:53.220
in some cases, a hybrid

01:02:53.220 --> 01:02:56.040
event so stay tuned because you will

01:02:56.040 --> 01:02:58.559
have some surprises in the following

01:02:58.559 --> 01:03:01.260
days and if you want to learn more about

01:03:01.260 --> 01:03:03.480
this event you can check the Microsoft

01:03:03.480 --> 01:03:08.099
Imagine Cup Twitter page and stay tuned.

01:03:08.099 --> 01:03:11.460
so thank you everyone for uh for joining

01:03:11.460 --> 01:03:13.079
this session today and thank you very

01:03:13.079 --> 01:03:16.500
much, John, for co-hosting this

01:03:16.500 --> 01:03:20.359
session with me it was a pleasure

01:03:21.839 --> 01:03:24.119
thank you so much Carlotta for having me

01:03:24.119 --> 01:03:26.579
with you today and thank you like for

01:03:26.579 --> 01:03:28.079
giving me this opportunity to be with

01:03:28.079 --> 01:03:30.180
you here

01:03:30.180 --> 01:03:33.480
great I hope that uh yeah I hope that we

01:03:33.480 --> 01:03:36.480
work again in the future sure I I hope

01:03:36.480 --> 01:03:38.160
so as well

01:03:38.160 --> 01:03:40.760
um so

01:03:44.099 --> 01:03:46.500
bye bye speak to you soon

01:03:46.500 --> 01:03:48.920
bye