Machine Learning for Predictive Maintenance: End-to-End Workflow in a Jupyter Notebook
0:01 Hello everyone, my name is Victor. I'm your friendly neighborhood data scientist from DreamCatcher. In this presentation, I would like to talk about a specific industry use case of AI and machine learning: predictive maintenance. I will be covering these topics, and feel free to jump forward to the specific part of the video where I talk about each of them. I'm going to start off with a general overview of AI and machine learning. Then I'll discuss the use case, which is predictive maintenance. I'll talk about the basics of machine learning and the machine learning workflow, and then we will come to the meat of this presentation, which is essentially a demonstration of the machine learning workflow from end to end on a real-life predictive maintenance domain problem. All right, so without any further ado, let's jump into it.
0:57 Let's start off with a quick overview of AI and machine learning. AI is a very general term: it encompasses the entire area of science and engineering related to creating software programs and machines capable of performing tasks that would normally require human intelligence. But AI is a catch-all term, so when we talk about applied AI, how we use AI in our daily work, we are really talking about machine learning. Machine learning is the design and application of software algorithms that are capable of learning on their own, without explicit human intervention. The primary purpose of these algorithms is to optimize performance in a specific task, and the main task you want to optimize is the ability to make accurate predictions about future outcomes based on the analysis of historical data. So essentially machine learning is about making predictions about the future, or what we call predictive analytics.
2:09 There are many different kinds of algorithms available in machine learning, under the three primary categories of supervised learning, unsupervised learning, and reinforcement learning. Here we can see some of the different kinds of algorithms and their use cases in various areas of industry. We have various domain use cases for all these different kinds of algorithms, and we can see that different algorithms are suited to different use cases.
2:38 Deep learning is an advanced form of machine learning that's based on something called an artificial neural network, or ANN for short, which essentially simulates the structure of the human brain, whereby neurons interconnect and work together to process and learn new information. DL is the foundational technology for most of the popular AI tools you have probably heard of today. I'm sure you have heard of ChatGPT, if you haven't been living in a cave for the past two years. ChatGPT is an example of what we call a large language model, and that's based on this technology called deep learning. Also, all the modern computer vision applications, where a computer program can classify, detect, or recognize images on its own, use this particular form of machine learning called deep learning.

3:32 So this is an example of an artificial neural network. Here I have an image of a bird that's fed into the network, and the output from the network is a classification of the image into one of three potential categories. In this case, if the ANN has been trained properly, when we feed in this image, it should correctly classify it as a bird. This is an image classification problem, which is a classic use case for an artificial neural network in the field of computer vision. And just as in machine learning generally, there is a variety of algorithms available for deep learning under the categories of supervised learning and unsupervised learning.
4:17 All right, so this is how we can categorize all of this. You can think of AI as the general area of smart systems and machines. Machine learning is basically applied AI, and deep learning is a subspecialization of machine learning using a particular architecture called an artificial neural network. As for generative AI, so ChatGPT, Google Gemini, Microsoft Copilot: all these examples of generative AI are basically large language models, and they are a further subcategory within the area of deep learning.
4:55 And there are many applications of machine learning in industry right now, so pick whichever industry you're involved in, and these are the specific areas of application. I'm going to guess that the vast majority of you watching this video are probably coming from the manufacturing industry, and in manufacturing some of the standard use cases for machine learning and deep learning are predicting potential problems, which we sometimes call predictive maintenance: you want to predict when a problem is going to happen and address it before it happens. Then there are monitoring systems, automating your manufacturing assembly or production line, smart scheduling, and detecting anomalies on your production line.
5:42 Okay, so let's talk about the use case here, which is predictive maintenance. What is predictive maintenance? Here's the long definition: it is an equipment maintenance strategy that relies on real-time monitoring of equipment conditions and data to predict equipment failures in advance. It uses advanced data models, analytics, and machine learning so that we can reliably assess when failures are more likely to occur, including which components are more likely to be affected on your production or assembly line.
6:14 So where does predictive maintenance fit into the overall scheme of things? Let's talk about the standard way that factories, or the production and assembly lines in factories, tended to handle maintenance issues, say, 10 or 20 years ago. You would probably start off with the most basic mode, which is reactive maintenance: you just wait until your machine breaks down and then you repair it. The simplest, but of course, if you have worked on a production line for any period of time, you know that reactive maintenance can give you a whole bunch of headaches, especially if the machine breaks down just before a critical delivery deadline. Then you're going to have a backlog of orders and run into a lot of problems. So we move on to preventive maintenance, where you regularly schedule maintenance of your production machines to reduce the failure rate. You might do maintenance once every month, once every two weeks, whatever. This is great, but the problem then is that sometimes you're doing too much maintenance that isn't really necessary, and it still doesn't totally prevent a machine failure that occurs outside of your planned maintenance. So, a bit of an improvement, but not that much better.

7:31 And then these last two categories are where we bring in AI and machine learning. With machine learning, we're going to use sensors to do real-time monitoring of the data, and then, using that data, we're going to build a machine learning model which helps us predict, with a reasonable level of accuracy, when the next failure is going to happen on your assembly or production line, for a specific component or a specific machine. You want to predict, to a high level of accuracy, maybe down to the specific day, even the specific hour or minute, when you expect that particular product or machine to fail.
8:11 All right, so these are the advantages of predictive maintenance: it minimizes the occurrence of unscheduled downtime, gives you a real-time overview of the current condition of your assets, ensures minimal disruption to productivity, optimizes the time you spend on maintenance work, optimizes the use of spare parts, and so on. And of course there are some disadvantages, the primary one being that you need a specialized set of skills among your engineers to understand and create machine learning models that can work on the real-time data that you're getting.
8:44 Okay, so we're going to take a look at some real-life use cases. These are a bunch of links; if you navigate to them, you'll be able to look at some real-life use cases of machine learning in predictive maintenance. The IBM website gives you a look at five use cases, and you can click on these links and follow up with them if you want to read more: waste management, manufacturing, building services, renewable energy, and mining. So if you want to know more about these use cases, you can read up on them from this website.

9:24 And this next website is a pretty good one; I would really encourage you to look through it if you're interested in predictive maintenance. It tells you about an industry survey of predictive maintenance. We can see that a large portion of the manufacturing industry agreed that predictive maintenance is a real need to stay competitive, and that predictive maintenance is essential for the manufacturing industry and will gain additional strength in the future. This survey was done quite some time ago, and these were the results that came back: the vast majority of key industry players in the manufacturing sector consider predictive maintenance to be a very important activity that they want to incorporate into their workflow. And we can see here the kind of ROI expected on investment in predictive maintenance: a 45% reduction in downtime, 25% growth in productivity, 75% fault elimination, and a 30% reduction in maintenance cost.

10:23 And best of all, if you really want to look at examples, there are all these different companies that have significantly invested in predictive maintenance technology in their manufacturing processes: PepsiCo, Frito-Lay, General Motors, Mondi, Ecoplant. You can jump over here and take a look at some of these use cases. Let me try to open one up, for example, Mondi. You can see Mondi has used this particular piece of software called MATLAB, from MathWorks, to do predictive maintenance for their manufacturing processes using machine learning. You can study how they have used it: what their challenge was, the problems they were facing, the solution they built with MathWorks Consulting, and the data they collected in an Oracle database. Using MATLAB from MathWorks, they were able to create a deep learning model to solve this particular issue for their domain. So if you're interested, I strongly encourage you to read up on all these real-life customer stories that showcase use cases for predictive maintenance. Okay, so that's it for real-life use cases for predictive maintenance.
11:54 Now in this topic, I'm going to talk about machine learning basics: what is actually involved in machine learning. I'm going to give a very quick, conceptual, high-level overview. There are several categories of machine learning: supervised, unsupervised, semi-supervised, reinforcement, and deep learning. Let's talk about the most common and widely used category of machine learning, which is called supervised learning.
12:31 - 12:33So how does supervised learning work?
-
12:33 - 12:35Well in supervised learning, you're going
-
12:35 - 12:37to create a machine learning model by
-
12:37 - 12:39providing what is called a labelled data
-
12:39 - 12:42set as a input to a machine learning
-
12:42 - 12:45program or algorithm. And this dataset
-
12:45 - 12:46is going to contain what is called an
-
12:46 - 12:49independent or feature variables, all
-
12:49 - 12:51right, so this will be a set of variables.
-
12:51 - 12:53And there will be one dependent or
-
12:53 - 12:55target variable which we also call the
-
12:55 - 12:58label, and the idea is that the
-
12:58 - 13:00independent or the feature variables are
-
13:00 - 13:02the attributes or properties of your
-
13:02 - 13:04dataset that influence the dependent or
-
13:04 - 13:08the target variable, okay? So this process
-
13:08 - 13:09that I've just described is called
-
13:09 - 13:12training the machine learning model, and
-
13:12 - 13:14the model is fundamentally a
-
13:14 - 13:16mathematical function that best
-
13:16 - 13:18approximates the relationship between
-
13:18 - 13:21the independent variables and the
-
13:21 - 13:23dependent variable. All right, so that's
-
13:23 - 13:24quite a bit of a mouthful, so let's jump
-
13:24 - 13:26into a diagram that maybe illustrates
-
13:26 - 13:28this more clearly. So let's say you have
-
13:28 - 13:30a dataset here, an Excel spreadsheet,
-
13:30 - 13:32right? And this Excel spreadsheet has a
-
13:32 - 13:34bunch of columns here and a bunch of
-
13:34 - 13:37rows, okay? So these rows here represent
-
13:37 - 13:39observations, or these rows are what
-
13:39 - 13:41we call observations or samples or data
-
13:41 - 13:43points in our dataset, okay? So let's
-
13:43 - 13:47assume this dataset is gathered by a
-
13:47 - 13:50marketing manager at a mall, at a retail
-
13:50 - 13:52mall, all right? So they've got all this
-
13:52 - 13:55information about the customers who
-
13:55 - 13:57purchase products at this mall, all right?
-
13:57 - 13:59So some of the information they've
-
13:59 - 14:00gotten about the customers are their
-
14:00 - 14:02gender, their age, their income, and the
-
14:02 - 14:04number of children. So all this
-
14:04 - 14:06information about the customers, we call
-
14:06 - 14:07this the independent or the feature
-
14:07 - 14:10variables, all right? And based on all
-
14:10 - 14:13this information about the customer, we
-
14:13 - 14:16also managed to get some or we record
-
14:16 - 14:18the information about how much the
-
14:18 - 14:20customer spends, all right? So this
-
14:20 - 14:22information or these numbers here, we call
-
14:22 - 14:24this the target variable or the
-
14:24 - 14:27dependent variable, right? So on the
-
14:27 - 14:30single row, the data point, one single sample, one
-
14:30 - 14:33single data point, contains all the data
-
14:33 - 14:35for the feature variables and one single
-
14:35 - 14:38value for the label or the target
-
14:41 And the primary purpose of the machine learning model is to create a mapping from all your feature variables to your target variable. There's going to be a mathematical function that maps the values of your feature variables to the value of your target variable; in other words, this function represents the relationship between your feature variables and your target variable. This whole training process, we call fitting the model. The target variable or label, the values in this column here, is critical for providing the context to do the fitting or training of the model. And once you've got a trained, fitted model, you can then use it to make an accurate prediction of the target values corresponding to new feature values that the model has yet to encounter. This, as I've already said earlier, is called predictive analytics.
15:38 So let's see what's actually happening here. You take your training data, this whole dataset here consisting of a thousand rows of data, 10,000 rows of data; you feed this entire dataset into your machine learning algorithm, and a couple of hours later the algorithm comes up with a model. The model is essentially a function that maps all your feature variables, these four columns here, to your target variable, this one single column here.

16:14 Once you have the model, you can put in a new data point. The new data point represents data about a new customer, one you have never seen before. Let's say you've already got information about 10,000 customers who have visited this mall and how much each of them spent there. Now a totally new customer comes into the mall; this customer has never been here before, and what we know about him is that he is male, age 50, with an income of 18, and nine children. When you take this data and feed it into your model, your model is going to make a prediction. It's going to say: hey, based on everything I have been trained on and the model I've developed, I predict that a male customer of age 50, with an income of 18 and nine children, is going to spend 25 ringgit at the mall. And this is it, this is what you want. Right here, can you see? That is the final output of your machine learning model: it makes a prediction about something it has never seen before. That is the core of machine learning: predictive analytics, making predictions about the future based on a historical dataset.
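In code, the fit-and-predict cycle just described might look something like this minimal sketch. The data and column names here are invented for illustration, not the presenter's actual dataset:

```python
# A minimal sketch of training a regression model and predicting for a new,
# unseen customer. All data and column names are made up for illustration.
import pandas as pd
from sklearn.linear_model import LinearRegression

df = pd.DataFrame({
    "gender":   [0, 1, 0, 1],        # 0 = female, 1 = male (already encoded)
    "age":      [25, 50, 33, 41],
    "income":   [30, 18, 55, 42],
    "children": [0, 9, 2, 1],
    "spend":    [40, 25, 80, 60],    # target variable (the label)
})

X = df[["gender", "age", "income", "children"]]  # feature variables
y = df["spend"]                                  # target variable

model = LinearRegression().fit(X, y)             # "fitting" / training the model

# Predict the spend of a new customer: male, age 50, income 18, nine children.
new_customer = pd.DataFrame([[1, 50, 18, 9]], columns=X.columns)
print(model.predict(new_customer))
```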
17:44 Okay, so there are two areas of supervised learning: regression and classification. Regression is used to predict a numerical target variable, such as the price of a house or the salary of an employee, whereas classification is used to predict a categorical target variable, or class label. Classification can be either binary or multiclass. Binary is just true or false, zero or one: is your machine going to fail or is it not going to fail? Just two classes, two possible outcomes. Or: is the customer going to make a purchase or not? We call this binary classification. And then multiclass is when there are more than two classes or types of values.

18:33 For example, this here would be a classification problem. You have a dataset with information about your customers: the gender of the customer, the age, the salary, and you also have a record of whether the customer made a purchase or not. You can take this dataset to train a classification model, and the model can then make a prediction about a new customer: it will predict zero, which means the customer doesn't make a purchase, or one, which means the customer makes a purchase.

19:06 And this is regression. Let's say you want to predict the wind speed, and you've got historical data on four other independent or feature variables: you have recorded the temperature, the pressure, the relative humidity, and the wind direction for the past 10 or 15 days. Now you train your machine learning model using this dataset, and the target variable column, the label, is basically a number. So this is a regression model, and you can now put in a new data point, meaning a new set of values for temperature, pressure, relative humidity, and wind direction, and your machine learning model will predict the wind speed for that new data point. So that's a regression model.
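As a companion to the regression sketch above, here is what the binary classification case might look like, again with invented data and column names; the model predicts the class zero (no purchase) or one (purchase):

```python
# A minimal sketch of binary classification: predict whether a customer makes
# a purchase (1) or not (0). Data and column names are made up for illustration.
import pandas as pd
from sklearn.linear_model import LogisticRegression

df = pd.DataFrame({
    "gender":    [0, 1, 1, 0, 1, 0],
    "age":       [22, 45, 31, 52, 38, 27],
    "salary":    [28, 90, 45, 75, 60, 33],
    "purchased": [0, 1, 0, 1, 1, 0],   # categorical target: two classes
})

X, y = df[["gender", "age", "salary"]], df["purchased"]
clf = LogisticRegression().fit(X, y)

# Predict the class (0 = no purchase, 1 = purchase) for a new customer.
print(clf.predict(pd.DataFrame([[1, 40, 70]], columns=X.columns)))
```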
19:59 All right. In this particular topic, I'm going to talk about the workflow that's involved in machine learning. In the previous slides, I talked about developing the model, but that's just one part of the entire workflow. In real life, when you use machine learning, there's an end-to-end workflow involved. The first thing, of course, is that you need to get your data, then you need to clean your data, and then you need to explore it; you need to see what's going on in your dataset. Real-life datasets are not trivial: they are hundreds of rows, thousands of rows, sometimes millions or billions of rows, so we're talking about millions or billions of data points, especially if you're using IoT sensors to get data in real time. So you've got all these super large datasets; you need to clean them and explore them, and then you need to prepare them into the right format so that you can put them into the training process to create your machine learning model.

21:05 Then, subsequently, you check how good the model is: how accurate is the model in terms of its ability to generate predictions for the future? How accurate are the predictions coming out of your machine learning model? That's validating, or evaluating, your model. Then, if you determine that your model is of adequate accuracy to meet your domain use case requirements, you deploy it. Say the accuracy required for your domain use case is 85%: if my machine learning model can give an 85% accuracy rate, I think it's good enough, so I'm going to deploy it into a real-world use case.

21:43 So here the machine learning model gets deployed on a server, data from other sources is captured from somewhere and pumped into the machine learning model, the model generates predictions, and those predictions are then used to make decisions on the factory floor in real time, or in any other particular scenario. And then you constantly monitor and update the model, you get more new data, and the entire cycle repeats itself. So that's your machine learning workflow in a nutshell.
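Compressed into code, that cycle might look roughly like the skeleton below. This is a sketch only: the file name, the "failure" column, and the 85% threshold are assumptions standing in for your own data and requirements:

```python
# A rough end-to-end skeleton of the workflow: get, clean, explore, prepare,
# train, evaluate, and (if good enough) deploy. All names are placeholders.
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("sensor_data.csv")             # 1. get the data
df = df.drop_duplicates().dropna()              # 2. clean the data
print(df.describe())                            # 3. explore the data

X = df.drop(columns=["failure"])                # 4. prepare the features...
y = df["failure"]                               #    ...and the target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

model = RandomForestClassifier().fit(X_train, y_train)   # 5. train the model
acc = accuracy_score(y_test, model.predict(X_test))      # 6. evaluate the model

if acc >= 0.85:                                 # 7. meets the domain requirement?
    print("Deploy, monitor, gather new data, and repeat the cycle.")
```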
22:17 Here's another example of the same thing, maybe in a slightly different format. Again, you have your data collection and preparation. Here we talk more about the different kinds of algorithms that are available to create a model, and I'll cover this in more detail when we look at the real-world example of an end-to-end machine learning workflow for the predictive maintenance use case. You're probably going to develop multiple models from multiple algorithms and evaluate them all, and then you say: hey, after I've evaluated and tested them, I've chosen the best model, and I'm going to deploy it for real-life production use. Real-life sensor data gets pumped into my model, my model generates predictions, the predicted data is used immediately in real time for real-life decision making, and then I monitor the results. Somebody is using the predictions from my model; if the predictions are lousy, the monitoring system captures that, and if the predictions are fantastic, that is also captured by the monitoring system, and it all gets fed back into the next cycle of my machine learning pipeline.
23:36 Okay, so that's the overall view, and here are the key phases of your workflow. One of the important phases is called EDA, exploratory data analysis, and in this particular phase you're going to do a lot of stuff, primarily just to understand your dataset. Like I said, real-life datasets tend to be very complex, and they tend to have various statistical properties; statistics is a very important component of machine learning. So an EDA helps you get an overview of your dataset and of any problems in it, like missing data, as well as the statistical properties of your dataset, its distribution, the statistical correlation of variables in your dataset, and so on.
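In pandas, a typical first EDA pass boils down to a handful of calls like these (a sketch; "data.csv" is a placeholder for your own file):

```python
# Common first-look EDA calls: shape, dtypes, summary statistics, missing
# values, and correlations between numeric variables.
import pandas as pd

df = pd.read_csv("data.csv")

df.info()                            # rows, column names, dtypes, non-null counts
print(df.describe())                 # basic statistical properties per column
print(df.isnull().sum())             # count of missing values in each column
print(df.corr(numeric_only=True))    # pairwise correlation of numeric variables
```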
24:23 Okay, then we have data cleaning, or sometimes you call it data cleansing. In this phase, what you primarily want to do is things like removing duplicate records or rows in your table, making sure that your data points or samples have appropriate IDs, and most importantly, making sure there are not too many missing values in your dataset. What I mean by missing values is things like this: you've got a dataset, and for some reason there are some cells or locations in your dataset where values are missing. If you have a lot of these missing values, you've got a poor quality dataset, and you're not going to be able to train a good machine learning model from it. So you have to figure out whether there are a lot of missing values in your dataset and how to handle them. Another thing that's important in data cleansing is figuring out the outliers in your dataset. Outliers are data points that are very far from the general trend of data points in your dataset. There are several ways to detect outliers in your dataset, and several ways to handle them; similarly, there are several ways to handle missing values. Handling missing values and handling outliers are really the two key concerns of data cleansing, and there are many, many techniques to handle them, so a data scientist needs to be acquainted with all of this.
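To make those two concerns concrete, here is a sketch of common ways to handle them in pandas; "data.csv" and the "torque" column are placeholders, and these are options, not the only approaches:

```python
# Handling missing values and outliers: two common options for each.
import pandas as pd

df = pd.read_csv("data.csv")

# Missing values: either drop the affected rows...
df_dropped = df.dropna()
# ...or fill them in, e.g. with the mean of each numeric column.
df_filled = df.fillna(df.mean(numeric_only=True))

# Outliers: one standard detector is the IQR (interquartile range) rule.
q1, q3 = df["torque"].quantile([0.25, 0.75])
iqr = q3 - q1
in_range = df["torque"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
df_no_outliers = df[in_range]        # keep only the rows inside the fences
```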
25:55 All right, why do I need to do data cleansing? Well, here is the key point: if you have a very poor quality dataset, which means you've got a lot of outliers that are errors, or a lot of missing values, then even though you've got a fantastic algorithm and a fantastic model, the predictions your model gives are going to be absolute rubbish. It's kind of like putting water into the tank of a Mercedes-Benz. The Mercedes-Benz is a great car, but if you put water into it, it will just die; your car can't run on water. On the other hand, a Myvi is just a lousy little car, but if you take good, high-octane petrol and put it into the Myvi, the Myvi will go at 100 miles an hour; it will completely destroy the Mercedes-Benz in terms of performance. So it doesn't really matter what model you're using: you can be using the most fantastic model, the Mercedes-Benz of machine learning, but if your data is lousy quality, your predictions are also going to be rubbish. Cleansing the dataset is, in fact, probably the most important thing that data scientists need to do, and it's what they spend most of their time doing. Building the model, training the model, getting the right algorithms, and so on is really a small portion of the actual machine learning workflow; the vast majority of the time goes into cleaning and organizing your data.
27:33 Then you have something called feature engineering, where you preprocess the feature variables of your original dataset prior to using them to train the model, either through addition, deletion, combination, or transformation of these variables. The idea is to improve the predictive accuracy of the model, and also, because some models can only work with numeric data, you need to transform categorical data into numeric data. In the earlier slides, I showed you that you take your original dataset, pump it into an algorithm, and a couple of hours later you get a machine learning model; you didn't do anything to the feature variables in your dataset before pumping it into the machine learning algorithm. But that's not what generally happens in real life. In real life, you take all the original feature variables from your dataset and transform them in some way. You can see here these are the columns of data from my original dataset, and before I actually put all these data points into my algorithm to train and get my model, I will transform them. This transformation of the feature variable values is what we call feature engineering, and there are many, many techniques for it: one-hot encoding, scaling, log transformation, discretization, date extraction, boolean logic, and so on.
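Here is a sketch of the first two techniques on that list, using invented columns loosely modelled on the dataset we'll see later:

```python
# One-hot encoding a categorical column and scaling a numeric one.
import pandas as pd
from sklearn.preprocessing import StandardScaler

df = pd.DataFrame({
    "type": ["L", "M", "H", "M"],                 # categorical feature
    "rotational_speed": [1400, 1550, 1310, 1620], # numeric feature
})

# One-hot encoding: turn "type" into separate 0/1 indicator columns.
df = pd.get_dummies(df, columns=["type"])

# Scaling: standardize the numeric column to zero mean and unit variance.
df[["rotational_speed"]] = StandardScaler().fit_transform(df[["rotational_speed"]])
print(df)
```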
29:12 Okay, then finally we do something called a train-test split, where we take our original dataset and break it into two parts: one is called the training dataset and the other is called the test dataset. The primary purpose of this is that when we feed and train the machine learning model, we use the training dataset, and when we want to evaluate the accuracy of the model, we use the test dataset. This is a key part of your machine learning life cycle, because you are not only going to have one possible model; there is a vast range of algorithms that you can use to create a model. Fundamentally you have a wide range of choices, like a wide range of cars: if you want to buy a car, you can buy a Myvi, a Perodua, a Honda, a Mercedes-Benz, an Audi, a beamer; many, many different cars are available to you. Same thing with a machine learning model: there is a vast variety of algorithms you can choose from to create a model, and once you create a model from a given algorithm, you need to ask: how accurate is this model that I've created from this algorithm? Different algorithms are going to create different models with different rates of accuracy. So the primary purpose of the test dataset is to evaluate the accuracy of the model, to see: is this model that I've created using this algorithm adequate for me to use in a real-life production use case?

30:52 So this is my original dataset. I break it into my feature variable columns and my target variable column, and then I further break it into a training dataset and a test dataset. The training dataset is used to train, that is, to create, the machine learning model. Once the machine learning model is created, I then use the test dataset to evaluate its accuracy.
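In scikit-learn, the split-train-evaluate sequence looks something like this sketch; the file name, the "target" column, and the choice of classifier are placeholders:

```python
# Split the dataset, train on the training part, and score on the test part.
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("data.csv")
X, y = df.drop(columns=["target"]), df["target"]

# Hold out 20% of the rows as the test dataset.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

model = DecisionTreeClassifier().fit(X_train, y_train)  # train on training set
print(accuracy_score(y_test, model.predict(X_test)))    # evaluate on test set
```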
31:17 All right. And then finally, we can see the different parts or aspects that go into a successful model: EDA is about 10%, data cleansing about 20%, feature engineering about 25%, selecting a specific algorithm about 10%, training the model from that algorithm about 15%, and finally evaluating the models and deciding which is the best one with the highest accuracy rate, about 20%.
31:54 All right, so we have reached the most interesting part of this presentation, which is the demonstration of an end-to-end machine learning workflow on a real-life dataset that demonstrates the use case of predictive maintenance. For the dataset for this particular use case, I've used a dataset from Kaggle. For those of you who are not aware of it, Kaggle is the world's largest open-source community for data science and AI. They have a large collection of datasets from various areas of industry and human endeavor, and they also have a large collection of models that have been developed using these datasets. So here we have a dataset for our particular use case, predictive maintenance, and this is some information about the dataset. In case you do not know how to get there, this is the URL to click on to get to that dataset. Once you're at the page for this dataset, you can see all the information about it, and you can download the dataset in CSV format.
33:14 Okay, so let's take a look at the dataset. This dataset has a total of 10,000 samples, and these are the feature variables: the type, the product ID, the air temperature, the process temperature, the rotational speed, the torque, and the tool wear. And this is the target variable. The target variable is what we are interested in: it's what we use to train the machine learning model, and also what we want to predict. The feature variables describe or provide information about a particular machine on the production or assembly line. Let's say you've got an IoT sensor system that's capturing all this data about a product or a machine on your production or assembly line, and you've also captured, for each specific sample, whether that sample experienced a failure or not. A target value of zero indicates that there's no failure, and we can see that the vast majority of data points in this dataset are no-failure cases.

34:33 And here we can see an example of a failure: a failure is marked as a one, positive, and no failure is marked as zero, negative. Here we have one type of failure, called a power failure, and if you scroll down the dataset, you see there are also other kinds of failures, like a tool wear failure, an overstrain failure here, another power failure, and so on. So if you scroll down through these 10,000 data points, or if you're familiar with using Excel to filter values in a column, you can see that in this particular column, the so-called target variable column, the vast majority of values are zero, which means no failure, and some of the rows or data points have a value of one. For the rows that have a value of one, you are going to have different types of failures: like I said just now, power failure, tool wear failure, and so on. We are going to go through the entire machine learning workflow process with this dataset.
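If you want to follow along, loading the downloaded CSV and taking a first look needs only a couple of lines; the file name below is an assumption, so use whatever name your download has:

```python
# Load the Kaggle predictive-maintenance CSV and take a first look at it.
import pandas as pd

df = pd.read_csv("predictive_maintenance.csv")   # path/name of your download
print(df.shape)      # expect 10,000 rows of samples
print(df.head())     # feature columns plus the two target columns
```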
35:44 To see an example of that, we're going to go to the code section here; if I click on the code section, right down here we see what is called a dataset notebook. This is basically a Jupyter notebook. Jupyter is a Python application which allows you to create a Python machine learning program that builds your machine learning model, assesses or evaluates its accuracy, and generates predictions from it. So here we have a whole bunch of Jupyter notebooks that are available, and you can select any one of them; all these notebooks essentially process the data from this particular dataset. If I go to this code page, I've actually selected a specific notebook that I'm going to run through to demonstrate an end-to-end machine learning workflow using various machine learning libraries from the Python programming language. The particular notebook I'm going to use is this one here, and you can also get the URL for that particular notebook from here.
37:00 Okay, so let's quickly do a quick revision again. What are we trying to do here? We're trying to build a machine learning classification model. We said there are two primary areas of supervised learning: one is regression, which is used to predict a numerical target variable, and the second is classification, which is what we're doing here: we're trying to predict a categorical target variable. In this particular example, we actually have two ways we can classify: either binary classification or multiclass classification. For binary classification, we are only going to classify the product or machine as either it failed or it did not fail. If we go back to the dataset that I showed you just now and look at this target variable column, there are only two possible values here: zero or one. Zero means there's no failure; one means there's a failure. So this is an example of binary classification: only two possible outcomes, zero or one, didn't fail or failed.

38:13 And then, for the same dataset, we can extend it and make it a multiclass classification problem. If we want to drill down further, we can say that not only is there a failure, there are different types of failures. So we have one class that is basically no failure, and then we have a class for each of the different types of failures: you can have a power failure, you could have a tool wear failure, and, if we go down here, you could have an overstrain failure, and so on. So you can have multiple classes of failure in addition to the overall majority class of no failure, and that would be a multiclass classification problem. With this dataset, we are going to see how to treat it as a binary classification problem and also as a multiclass classification problem.
39:15 Okay, so let's look at the workflow. Let's say we've already got the data; right now we do have the dataset. Let's assume we've somehow managed to get it from some IoT sensors that are monitoring real-time data in our production environment: on the assembly line, on the production line, we've got sensors reading data that gives us everything we have in this CSV file. So we've already got the data, we've retrieved it, and now we're going to move on to the cleaning and exploration part of the machine learning life cycle. Let's look at the data cleaning part first. For data cleaning, we're interested in checking for missing values and maybe removing the rows with missing values. The kinds of things we can do about missing values: we can remove the rows with missing values, or we can put in some replacement values, which could be the average of all the values in that particular column, and so on. We could also try to identify outliers in our dataset, and there is a variety of ways to deal with those as well. So this is called data cleansing, which is a really important part of your machine learning workflow. That's where we are now: we're doing cleansing, and then we're going to follow up with exploration.
40:31 So let's look at the actual code that does the cleansing. Here we are, right at the start of the machine learning life cycle, in a Jupyter notebook. Here we have a brief description of the problem statement: this dataset reflects real-life predictive maintenance encountered in industry, with measurements from real equipment, and the feature descriptions are taken directly from the dataset source. So here we have a description of the six key features in our dataset: the type, which is the quality of the product, the air temperature, the process temperature, the rotational speed, the torque, and the tool wear. Those are the six feature variables, and then there are the two target variables. I showed you just now that there's one target variable which only has two possible values, either zero or one, meaning failure or no failure; that's this column here. Let me go all the way back up to here: this column, as we already saw, has only two possible values, zero or one. And then we also have this other column here, which is basically the failure type. As I demonstrated just now, we have several categories of failure types, and this is what we use for multiclass classification. So we can either build a binary classification model for this problem domain, or we can build a multiclass classification model.
41:58 - 42:00Jupyter notebook is going to demonstrate
-
42:00 - 42:02both approaches to us. So first step, we
-
42:02 - 42:05are going to write all this Python code
-
42:05 - 42:07that's going to import all the libraries
-
42:07 - 42:09that we need to use, okay? So this is
-
42:09 - 42:12basically Python code, okay, and it's
-
42:12 - 42:15importing the relevant machine learn-
-
42:15 - 42:18oops. We are importing the relevant
-
42:18 - 42:21machine learning libraries related to
-
42:21 - 42:24our domain use case, okay? Then we load in
-
42:24 - 42:26our dataset, okay, so this our dataset.
-
42:26 - 42:28We describe it, we have some quick
-
42:28 - 42:31insights into the dataset. And then
-
42:31 - 42:33we just take a look at all the variables
-
42:33 - 42:36of the feature variables, etc, and so on.
-
42:36 - 42:38What we're doing now is just
-
42:38 - 42:40doing a quick overview of the dataset,
-
42:40 - 42:42so all this Python code here that
-
42:42 - 42:44we're writing is allowing us, the data
-
42:44 - 42:45scientist, to get a quick overview of our
-
42:45 - 42:48dataset, right, like
-
42:48 - 42:50how many rows are there, how many columns
-
42:50 - 42:52are there, what are the data types of the
-
42:52 - 42:53columns, what are the names of the columns,
-
42:53 - 42:57etc, etc. Okay, then we zoom in on the
-
42:57 - 42:59target variables. So we look at the
-
42:59 - 43:02target variables, how many counts
-
43:02 - 43:05there are of this target variable, and
-
43:05 - 43:06so on. How many different types of
-
43:06 - 43:08failures there are. Then you want to
-
43:08 - 43:09check whether there are any
-
43:09 - 43:11inconsistencies between the target and
-
43:11 - 43:14the failure type, etc. Okay, so when you do
-
43:14 - 43:15all this checking, you're going to
-
43:15 - 43:17discover there are some discrepancies in
-
43:17 - 43:20your dataset, so using specific Python
-
43:20 - 43:22code to do checking, you're going to say
-
43:22 - 43:23hey, you know what? There are some errors
-
43:23 - 43:25here, right? There are nine values that
-
43:25 - 43:27are classified as failure in the target variable,
-
43:27 - 43:28but as no failure in the failure type
-
43:28 - 43:30variable, so that means there's a
-
43:30 - 43:33discrepancy in your data points, right?
-
43:33 - 43:35So these are all the ones that
-
43:35 - 43:36are discrepancies because the target
-
43:36 - 43:39variable says one, and we already know
-
43:39 - 43:41that target variable one is supposed to
-
43:41 - 43:47mean there is a failure, so we are kind of expecting to
-
43:47 - 43:50see the failure classification, but some
-
43:50 - 43:51rows actually say there's no failure
-
43:51 - 43:54although the target type is one. Well here
-
43:54 - 43:56is a classic example of an error that
-
43:56 - 43:59can very well occur in a dataset, so now
-
43:59 - 44:01the question is what do you do with
-
44:01 - 44:05these errors in your dataset, right? So
-
44:05 - 44:06here the data scientist says, I think it
-
44:06 - 44:08would make sense to remove those
-
44:08 - 44:10instances, and so they write some code
-
44:10 - 44:13then to remove those instances or those
-
44:13 - 44:15rows or data points from the overall
-
44:15 - 44:17dataset, and same thing we can, again,
-
44:17 - 44:19check for other issues. So we find there's
-
44:19 - 44:21another issue here with our dataset which
-
44:21 - 44:24is another warning, so, again, we can
-
44:24 - 44:26possibly remove them. So you're going to
-
44:26 - 44:31remove 27 instances or rows from your
-
44:31 - 44:34overall dataset. So your dataset has
-
44:34 - 44:3710,000 rows or data points. You're
-
44:37 - 44:40removing 27, which is only 0.27% of the
-
44:40 - 44:42entire dataset. And these were the
-
44:42 - 44:46reasons why you removed them, okay? So if
-
44:46 - 44:48you're just removing 0.27% of the
-
44:48 - 44:51entire dataset, no big deal, right? Still
-
44:51 - 44:53okay, but you needed to remove them
-
44:53 - 45:03because these 27 erroneous data points in your dataset could really affect the
-
45:03 - 45:05training of your machine learning model.
-
45:05 - 45:09So we need to do our data cleansing,
-
45:09 - 45:12right? So we are now actually cleansing
-
45:12 - 45:15data that is
-
45:15 - 45:18incorrect or erroneous in the original
-
45:18 - 45:21dataset. Okay, so then we go on to the
-
45:21 - 45:24next part which is called EDA, exploratory data analysis, right? So
-
45:24 - 45:29EDA is where we kind of explore our data,
-
45:29 - 45:32and we want to, kind of, get a visual
-
45:32 - 45:34overview of our data as a whole, and also
-
45:34 - 45:36take a look at the statistical
-
45:36 - 45:38properties of our data. The statistical
-
45:38 - 45:40distribution of the data in all the
-
45:40 - 45:43various columns, the correlation between
-
45:43 - 45:45the variables, between the feature
-
45:45 - 45:47variables different columns, and also the
-
45:47 - 45:49feature variable and the target variable.
-
45:49 - 45:52So all of this is called EDA, and EDA in
-
45:52 - 45:54a machine learning workflow is typically
-
45:54 - 45:57done through visualization,
-
45:57 - 45:59all right? So let's go back here and take
-
45:59 - 46:01a look, right? So, for example, here we are
-
46:01 - 46:03looking at correlation, so we plot the
-
46:03 - 46:06values of all the various feature
-
46:06 - 46:08variables against each other and look
-
46:08 - 46:11for potential correlations and patterns
-
46:11 - 46:13and so on. And all the different shapes
-
46:13 - 46:17that you see here in this pair plot, okay,
-
46:17 - 46:18will have different meaning,
-
46:18 - 46:20statistical meaning, and so the data
-
46:20 - 46:22scientist has to, kind of, visually
-
46:22 - 46:24inspect this pair plot, make some
-
46:24 - 46:26interpretations of these different
-
46:26 - 46:28patterns that they see here, all right. So
-
46:28 - 46:30these are some of the insights that
-
46:30 - 46:33can be deduced from looking at these
-
46:33 - 46:34patterns, so, for example, the torque and
-
46:34 - 46:36rotational speed are highly correlated,
-
46:36 - 46:38the process temperature and air
-
46:38 - 46:40temperature are also highly correlated, and
-
46:40 - 46:42failures occur for extreme values of
-
46:42 - 46:45some features, etc, etc. Then you can plot
-
46:45 - 46:46certain kinds of charts. This is called a
-
46:46 - 46:48violin plot to, again, get new insights.
-
46:48 - 46:50For example, regarding the torque and
-
46:50 - 46:51rotational speed, we can see, again, that
-
46:51 - 46:53most failures are triggered for much
-
46:53 - 46:55lower or much higher values than the
-
46:55 - 46:57mean when they're not failing. So all
-
46:57 - 47:01these visualizations, they are there, and
-
47:01 - 47:02a trained data scientist can look at
-
47:02 - 47:05them, inspect them, and make some kind of
-
47:05 - 47:08insightful deductions from them, okay?
-
47:08 - 47:11Percentage of failure, right? The
-
47:11 - 47:14correlation heat map, okay, between all
-
47:14 - 47:16these different feature variables, and
-
47:16 - 47:16also the target
-
47:16 - 47:20variable, okay? The product types,
-
47:20 - 47:21percentage of product types, percentage
-
47:21 - 47:23of failure with respect to the product
-
47:23 - 47:26type, so we can also kind of visualize
-
47:26 - 47:28that as well. So certain products have a
-
47:28 - 47:30higher ratio of failure compared to other
-
47:30 - 47:33product types, etc. For example, M products
-
47:33 - 47:36tend to fail more than H products, etc,
-
47:36 - 47:39etc. So we can create a vast variety of
-
47:39 - 47:41visualizations in the EDA stage, so you
-
47:41 - 47:44can see here. And, again, the idea of this
-
47:44 - 47:46visualization is just to give us some
-
47:46 - 47:50insight, some preliminary insight into
-
47:50 - 47:53our dataset that helps us to model it
-
47:53 - 47:54more correctly. So some more insights
-
47:54 - 47:56that we get into our dataset from all
-
47:56 - 47:58this visualization.
-
47:58 - 48:00Then we can plot the distributions so we
-
48:00 - 48:01can see whether it's a normal
-
48:01 - 48:03distribution or some other kind of
-
48:03 - 48:06distribution. We can have a box plot
-
48:06 - 48:08to see whether there are any outliers in
-
48:08 - 48:10your dataset and so on, right? So from
-
48:10 - 48:12the box plots, we can see that
-
48:12 - 48:15rotational speed has outliers. So we
-
48:15 - 48:17already saw outliers are basically a
-
48:17 - 48:19problem that you may need to kind of
-
48:19 - 48:23tackle, right? So outliers are an issue,
-
48:23 - 48:25it's a part of data cleansing. And
-
48:25 - 48:27so you may need to tackle this, so we may
-
48:27 - 48:29have to check okay, well where are the
-
48:29 - 48:31potential outliers so we can analyze
-
48:31 - 48:35them from the box plot, okay? But then
-
48:35 - 48:37we can say well they are outliers, but
-
48:37 - 48:39maybe they're not really horrible
-
48:39 - 48:41outliers so we can tolerate them or
-
48:41 - 48:43maybe we want to remove them. So we can
-
48:43 - 48:45see what our mean and maximum values for
-
48:45 - 48:47all these with respect to product type,
-
48:47 - 48:50how many of them are above or highly
-
48:50 - 48:51correlated with the product type in
-
48:51 - 48:54terms of the maximum and minimum, okay,
-
48:54 - 48:57and then so on. So the insight is well we
-
48:57 - 49:00got 4.87% of the instances as outliers,
-
49:00 - 49:03so maybe 4.87% is not really that much,
-
49:03 - 49:05the outliers are not horrible, so we just
-
49:05 - 49:07leave them in the dataset. Now for a
-
49:07 - 49:09different dataset, the data scientist
-
49:09 - 49:10could come to a different conclusion, so
-
49:10 - 49:12then they would do whatever they deem
-
49:12 - 49:15appropriate to, kind of, cleanse
-
49:15 - 49:18the dataset. Okay, so now that we have
-
49:18 - 49:20done all the EDA, the next thing we're
-
49:20 - 49:23going to do is we are going to do what
-
49:23 - 49:26is called feature engineering. So we are
-
49:26 - 49:29going to transform our original feature
-
49:29 - 49:31variables and these are our original
-
49:31 - 49:33feature variables, right? These are our
-
49:33 - 49:35original feature variables, and we are
-
49:35 - 49:38going to transform them, all right? We're
-
49:38 - 49:40going to transform them in some sense
-
49:40 - 49:44into some other form before we fit this
-
49:44 - 49:46for training into our machine learning
-
49:46 - 49:49algorithm, all right? So these are
-
49:49 - 49:52examples of, let's say, an
-
49:52 - 49:57original dataset, right? And these are some of the examples,
-
49:57 - 49:58you don't have to use all of them, but
-
49:58 - 49:59these are examples of what we
-
49:59 - 50:01call feature engineering, with which you can
-
50:01 - 50:04then transform your original values in
-
50:04 - 50:05your feature variables to all these
-
50:05 - 50:08transformed values here. So we're going to
-
50:08 - 50:10pretty much do that here, so we have an
-
50:10 - 50:13ordinal encoding, and we do scaling of the
-
50:13 - 50:18data, so the dataset is scaled using MinMax scaling.
-
50:18 - 50:22And then finally, we come to do the modeling. So we have to split our
-
50:22 - 50:24dataset into a training dataset and a
-
50:24 - 50:29test dataset. So coming back to here again,
-
50:29 - 50:34we said that before you train your model,
-
50:34 - 50:36you have to take your original dataset,
-
50:36 - 50:37now this is a feature-engineered dataset.
-
50:37 - 50:39We're going to break it into two or
-
50:39 - 50:41more subsets, okay. So one is called the
-
50:41 - 50:42training dataset that we use to feed
-
50:42 - 50:44and train a machine learning model. The
-
50:44 - 50:46second is the test dataset, to evaluate the
-
50:46 - 50:48accuracy of the model, okay? So we got
-
50:48 - 50:51this training dataset, your test dataset,
-
50:51 - 50:53and we also need
-
50:53 - 50:56to sample. So from our original dataset
-
50:56 - 50:57we need to sample some points
-
50:57 - 50:59that go into the training dataset, and some
-
50:59 - 51:01points that go into the test dataset. So
-
51:01 - 51:03there are many ways to do sampling. One
-
51:03 - 51:05way is to do stratified sampling where
-
51:05 - 51:07we ensure the same proportion of data
-
51:07 - 51:09from each strata or class, because right
-
51:09 - 51:11now we have a multiclass classification
-
51:11 - 51:12problem, so you want to make sure the
-
51:12 - 51:14same proportion of data from each strata or
-
51:14 - 51:16class appears in the
-
51:16 - 51:18training and test datasets as in the
-
51:18 - 51:20original dataset, which is very useful
-
51:20 - 51:22for dealing with what is called an
-
51:22 - 51:24imbalanced dataset. So here we have an
-
51:24 - 51:26example of what is called an imbalanced
-
51:26 - 51:30dataset in the sense that you have the
-
51:30 - 51:33vast majority of data points in your
-
51:33 - 51:35dataset, they are going to have the
-
51:35 - 51:37value of zero for their target variable
-
51:37 - 51:40column. So only an extremely small
-
51:40 - 51:43minority of the data points in your dataset
-
51:43 - 51:45will actually have the value of one
-
51:45 - 51:49for their target variable column, okay? So
-
51:49 - 51:51a situation where you have your class or
-
51:51 - 51:53your target variable column where the
-
51:53 - 51:54vast majority of values are from one
-
51:54 - 51:58class and a tiny minority are from
-
51:58 - 52:01another class, we call this an imbalanced
-
52:01 - 52:03dataset. And for an imbalanced dataset,
-
52:03 - 52:04typically we will have a specific
-
52:04 - 52:06technique to do the train test split
-
52:06 - 52:08which is called stratified sampling, and
-
52:08 - 52:10so that's exactly what's happening here.
-
52:10 - 52:18We're doing a train test split here, and it is a stratified split.
-
52:18 - 52:20And then now we actually develop the
-
52:20 - 52:23models. So now we've got the train test
-
52:23 - 52:25split, now here is where we actually
-
52:25 - 52:27train the models.
-
52:27 - 52:30Now in terms of classification there are
-
52:30 - 52:31a whole bunch of
-
52:31 - 52:35possibilities, right, that you can use.
-
52:35 - 52:38There are many, many different algorithms
-
52:38 - 52:41that we can use to create a
-
52:41 - 52:43classification model. So these are an
-
52:43 - 52:45example of some of the more common ones.
-
52:45 - 52:47Logistic regression, support vector machines, decision
-
52:47 - 52:50trees, random forest, bagging, balanced
-
52:50 - 52:53bagging, boosting, ensembles. So all
-
52:53 - 52:55these are different algorithms which
-
52:55 - 52:58will create different kinds of models
-
52:58 - 53:02which will result in different accuracy
-
53:02 - 53:05measures, okay? So it's the goal of the
-
53:05 - 53:09data scientist to find the best model
-
53:09 - 53:12that gives the best accuracy for the
-
53:12 - 53:14given dataset, for training on that
-
53:14 - 53:17given dataset. So let's head back, again,
-
53:17 - 53:20to our machine learning workflow. So
-
53:20 - 53:22here basically what I'm doing is I'm
-
53:22 - 53:24creating a whole bunch of models here,
-
53:24 - 53:26all right? So one is a random forest, one
-
53:26 - 53:27is balanced bagging, one is a boosting
-
53:27 - 53:30classifier, one's an ensemble classifier,
-
53:30 - 53:33and using all of these, I am going to
-
53:33 - 53:37basically feed and train a separate model with each of these algorithms.
-
53:37 - 53:40And then I'm going to evaluate them, okay? I'm going to
-
53:40 - 53:42evaluate how good each of these models
-
53:42 - 53:46are. And here you can see your
-
53:46 - 53:49evaluation data, right? Okay and this is
-
53:49 - 53:51the confusion matrix which is another
-
53:51 - 53:54way of evaluating. So now we come to the,
-
53:54 - 53:56kind of, the key part here which
-
53:56 - 53:59is how do I distinguish between
-
53:59 - 54:00all these models, right? I've got all
-
54:00 - 54:01these different models which are built
-
54:01 - 54:03with different algorithms which I'm
-
54:03 - 54:05using to train on the same dataset, how
-
54:05 - 54:07do I distinguish between all these
-
54:07 - 54:10models, okay? And for
-
54:10 - 54:14that we actually have a whole bunch of
-
54:14 - 54:16common evaluation metrics for
-
54:16 - 54:18classification, right? So these evaluation
-
54:18 - 54:22metrics tell us how good a model is in
-
54:22 - 54:24terms of its accuracy in
-
54:24 - 54:27classification. So in terms of
-
54:27 - 54:32accuracy, we actually have many different measures,
-
54:32 - 54:33right? You might think, well, accuracy is
-
54:33 - 54:37just accuracy: either it's accurate or it's not
-
54:37 - 54:39accurate, right? But actually it's not
-
54:39 - 54:41that simple. There are many different
-
54:41 - 54:44ways to measure the accuracy of a
-
54:44 - 54:45classification model, and these are some
-
54:45 - 54:48of the more common ones. So, for example,
-
54:48 - 54:51the confusion matrix tells us how many
-
54:51 - 54:54true positives, that means the value is
-
54:54 - 54:56positive, the prediction is positive, how
-
54:56 - 54:58many false positives which means the
-
54:58 - 54:59value is negative but the machine learning
-
54:59 - 55:02model predicts positive. How many false
-
55:02 - 55:04negatives which means that the machine
-
55:04 - 55:06learning model predicts negative, but
-
55:06 - 55:07it's actually positive. And how many true
-
55:07 - 55:09negatives there are which means that the
-
55:09 - 55:11machine learning model
-
55:11 - 55:13predicts negative and the true value is
-
55:13 - 55:15also negative. So this is called a
-
55:15 - 55:17confusion matrix. This is one way we
-
55:17 - 55:19assess or evaluate the performance of a
-
55:19 - 55:21classification model,
-
55:21 - 55:23okay? This is for binary
-
55:23 - 55:25classification, we can also have
-
55:25 - 55:27multiclass confusion matrix,
-
55:27 - 55:29and then we can also measure things like
-
55:29 - 55:32accuracy. So accuracy is the true
-
55:32 - 55:34positives plus the true negatives which
-
55:34 - 55:35is the total number of correct
-
55:35 - 55:38predictions made by the model divided by
-
55:38 - 55:40the total number of data points in your
-
55:40 - 55:43dataset. And then you have also other
-
55:43 - 55:43kinds of
-
55:43 - 55:47measures such as recall. And this is a
-
55:47 - 55:49formula for recall, this is a formula for
-
55:49 - 55:51the F1 score, okay? And then there's
-
55:51 - 55:56something called the ROC curve, right? So
-
55:56 - 55:57without going too much into the detail of
-
55:57 - 55:59what each of these entails, essentially
-
55:59 - 56:01these are all different ways, these are
-
56:01 - 56:03different KPIs, right? Just like if you
-
56:03 - 56:06work in a company, you have different KPIs,
-
56:06 - 56:08right? Certain employees have certain KPIs
-
56:08 - 56:11that measures how good or how, you
-
56:11 - 56:13know, efficient or how effective a
-
56:13 - 56:16particular employee is, right? So the
-
56:16 - 56:20KPIs for your machine learning models
-
56:20 - 56:24are the ROC curve, F1 score, recall, accuracy,
-
56:24 - 56:27okay, and your confusion matrix. So
-
56:27 - 56:30fundamentally after I have built, right,
-
56:30 - 56:33so here I've built my four different
-
56:33 - 56:35models. So after I built these four
-
56:35 - 56:38different models, I'm going to check and
-
56:38 - 56:40evaluate them using all those different
-
56:40 - 56:42metrics like, for example, the F1 score,
-
56:42 - 56:45the precision score, the recall score, all
-
56:45 - 56:47right. So for this model, I can check out
-
56:47 - 56:50the ROC score, the F1 score, the precision
-
56:50 - 56:52score, the recall score. Then for this
-
56:52 - 56:55model, this is the ROC score, the F1 score,
-
56:55 - 56:57the precision score, the recall score.
-
56:57 - 57:00Then for this model and so on. So for
-
57:00 - 57:03every single model I've created using my
-
57:03 - 57:06training dataset, I will have all my set
-
57:06 - 57:08of evaluation metrics that I can use to
-
57:08 - 57:12evaluate how good this model is, okay?
-
57:12 - 57:13Same thing here, I've got a confusion
-
57:13 - 57:15matrix here, right, so I can use that,
-
57:15 - 57:18again, to evaluate between all these four
-
57:18 - 57:20different models, and then I, kind of,
-
57:20 - 57:22summarize it up here. So we can see from
-
57:22 - 57:25this summary here that actually the top
-
57:25 - 57:29two models stand out, so, as a data scientist, I'm now
-
57:29 - 57:31going to just focus on these two models.
-
57:31 - 57:33So these two models are bagging
-
57:33 - 57:36classifier and random forest classifier.
-
57:36 - 57:38They have the highest values of F1 score,
-
57:38 - 57:40and the highest values of the ROC curve
-
57:40 - 57:43score, okay? So we can say these are the
-
57:43 - 57:46top two models in terms of accuracy, okay,
-
57:46 - 57:49using the F1 evaluation metric and the
-
57:49 - 57:54ROC AUC evaluation metric, okay? So these
-
57:54 - 57:57results are, kind of, summarized here, and
-
57:57 - 57:59then we use different sampling
-
57:59 - 58:01techniques, okay, so just now I talked
-
58:01 - 58:04about different kinds of sampling
-
58:04 - 58:06techniques, and so the idea of different
-
58:06 - 58:08kinds of sampling techniques is to just
-
58:08 - 58:11get a different feel for different
-
58:11 - 58:14distributions of the data in different
-
58:14 - 58:16areas of your dataset, so that you want
-
58:16 - 58:20to just, kind of, make sure that your
-
58:20 - 58:23evaluation of accuracy is actually
-
58:23 - 58:27statistically correct, right? So we can
-
58:27 - 58:30do what is called oversampling and under-
-
58:30 - 58:31sampling which is very useful when
-
58:31 - 58:32you're working with an imbalanced data
-
58:32 - 58:35set. So this is an example of doing that, and
-
58:35 - 58:37then here we, again, check out the
-
58:37 - 58:39results for all these different
-
58:39 - 58:42techniques we use. The F1 score, the AUC
-
58:42 - 58:44score, all right, these are the two key
-
58:44 - 58:47measures of accuracy, right? So and then
-
58:47 - 58:48we can check out the scores for the
-
58:48 - 58:50different approaches. Okay so we can see,
-
58:50 - 58:53oh well, overall the models have lower
-
58:53 - 58:56ROC AUC score, but they have a much
-
58:56 - 58:58higher F1 score. The bagging classifier
-
58:58 - 59:01had the highest ROC AUC score,
-
59:01 - 59:04but its F1 score was too low, okay. Then, in
-
59:04 - 59:07the data scientist's opinion, the random
-
59:07 - 59:09forest with this particular technique of
-
59:09 - 59:11sampling has a balance between the F1
-
59:11 - 59:14and ROC AUC scores. So the first takeaway
-
59:14 - 59:17is the macro F1 score improves
-
59:17 - 59:18dramatically using these sampling
-
59:18 - 59:20techniques, so these models might be better
-
59:20 - 59:22compared to the balanced ones, all right.
-
59:22 - 59:26So based on all this evaluation, the
-
59:26 - 59:28data scientist says they're going to
-
59:28 - 59:30continue to work with these two models,
-
59:30 - 59:31all right, and the balanced bagging one,
-
59:31 - 59:33and then continue to make further
-
59:33 - 59:35comparisons, all right. So then, we
-
59:35 - 59:37continue to keep refining on our
-
59:37 - 59:39evaluation work here. We're going to
-
59:39 - 59:41train the models one more time, so
-
59:41 - 59:43we, again, do a training test split, and
-
59:43 - 59:45then we do that for this particular
-
59:45 - 59:47approach or model. And then we
-
59:47 - 59:48print out what is called a
-
59:48 - 59:51classification report, and this is
-
59:51 - 59:53basically a summary of all those metrics
-
59:53 - 59:55that I talked about just now. So,
-
59:55 - 59:58remember I said there were
-
59:58 - 60:00several evaluation metrics, right? So
-
60:00 - 60:01we had the confusion matrix, the
-
60:01 - 60:04accuracy, the precision, the recall, the AUC
-
60:04 - 60:08ROC score. So here with the classification
-
60:08 - 60:10report, I can get a summary of all of
-
60:10 - 60:12that, so I can see all the values here,
-
60:12 - 60:17okay, for this particular model, bagging with Tomek links.
-
60:17 - 60:19And then, I can do that for another model, the random forest
-
60:19 - 60:21borderline SMOTE, and then I can do that
-
60:21 - 60:22for another model which is the balanced
-
60:22 - 60:25bagging. So, again, we do a lot of
-
60:25 - 60:27comparison between different models
-
60:27 - 60:29trying to figure out what all these
-
60:29 - 60:31evaluation metrics are telling us, all
-
60:31 - 60:33right? Then, again, we have a confusion
-
60:33 - 60:36matrix. So we generate a confusion matrix
-
60:36 - 60:39for the bagging with the Tomek links
-
60:39 - 60:41undersampling, for the random forest
-
60:41 - 60:43with the borderline SMOTE oversampling,
-
60:43 - 60:45and just balanced bagging by itself. Then,
-
60:45 - 60:48again, we compare between these three
-
60:48 - 60:51models using the confusion matrix,
-
60:51 - 60:53as the evaluation metric, and then we can kind
-
60:53 - 60:56of come to some conclusions. All right, so,
-
60:56 - 60:58right, so now we look at all the data,
-
60:58 - 61:01then we move on and look at another
-
61:01 - 61:03kind of evaluation metric, which
-
61:03 - 61:07is the ROC score, right? So this is one of
-
61:07 - 61:09the other evaluation metrics I talked
-
61:09 - 61:11about. So this one is a kind of a curve,
-
61:11 - 61:13you look at it to see the area
-
61:13 - 61:14underneath the curve; this is called the
-
61:14 - 61:20AUC ROC, the area under the ROC curve. All right, so the
-
61:20 - 61:21area under the curve
-
61:21 - 61:24score will give us some idea about the
-
61:24 - 61:26threshold that we're going to use for
-
61:26 - 61:28classification, so we can examine this
-
61:28 - 61:29for the bagging classifier, for the
-
61:29 - 61:31random forest classifier, for the balanced
-
61:31 - 61:34bagging classifier, okay?
-
61:34 - 61:36Then, finally, we can also check
-
61:36 - 61:38the classification report of this
-
61:38 - 61:40particular model. So we keep doing this
-
61:40 - 61:43over and over again, evaluating
-
61:43 - 61:46the metrics, the accuracy metrics, the
-
61:46 - 61:47evaluation metrics for all these
-
61:47 - 61:49different models. So we keep doing this
-
61:49 - 61:51over and over again for different
-
61:51 - 61:53classification thresholds, and so
-
61:53 - 61:57as we keep drilling into these, we kind
-
61:57 - 62:01of get more and more understanding of
-
62:01 - 62:03all these different models, which one is
-
62:03 - 62:05the best one that gives the best
-
62:05 - 62:09performance for our dataset, okay? So
-
62:09 - 62:11finally, we come to this conclusion, this
-
62:11 - 62:14particular model is not able to reduce
-
62:14 - 62:15the recall on failures below
-
62:15 - 62:1895.18%. On the other hand, balanced bagging
-
62:18 - 62:19with a decision threshold of 0.6 is able
-
62:19 - 62:22to have a better recall,
-
62:22 - 62:25etc. So finally, after having done all of
-
62:25 - 62:27these evaluations,
-
62:27 - 62:31okay, this is the conclusion.
-
62:31 - 62:34So right now we
-
62:34 - 62:35have gone through all the steps of the
-
62:35 - 62:38machine learning life cycle, which
-
62:38 - 62:40means we have right now, or the data
-
62:40 - 62:42scientist right now has gone through all
-
62:42 - 62:43these steps,
-
62:44 - 62:47up to and including
-
62:47 - 62:49validation. So we have done the cleaning,
-
62:49 - 62:51exploration, preparation, transformation,
-
62:51 - 62:53the feature engineering, we have developed
-
62:53 - 62:54and trained multiple models, we have
-
62:54 - 62:56evaluated all these different models, so
-
62:56 - 62:59right now we have reached this stage, so
-
62:59 - 63:03at this stage we as the data scientist,
-
63:03 - 63:05kind of, have completed our job. So we've
-
63:05 - 63:08come to some very useful conclusions
-
63:08 - 63:10which we now can share with our
-
63:10 - 63:13colleagues, all right? And based on these
-
63:13 - 63:15conclusions or recommendations,
-
63:15 - 63:17somebody is going to choose an
-
63:17 - 63:19appropriate model, and that model is
-
63:19 - 63:23going to get deployed for real-time use
-
63:23 - 63:25in a real life production environment,
-
63:25 - 63:27okay? And that decision is going to be
-
63:27 - 63:29made based on the recommendations coming
-
63:29 - 63:31from the data scientist at the end of
-
63:31 - 63:33this phase, okay? So at the end of this
-
63:33 - 63:35phase, the data scientist is going to
-
63:35 - 63:37come up with these conclusions. So
-
63:37 - 63:42the conclusions are: if the engineering
-
63:42 - 63:49team is looking for the highest
-
63:49 - 63:52failure detection rate possible, then
-
63:52 - 63:54they should go with this particular
-
63:54 - 63:57model, okay?
-
63:57 - 63:59And if they want a balance between
-
63:59 - 64:01precision and recall, then they should
-
64:01 - 64:03choose between the bagging model with a
-
64:03 - 64:060.4 decision threshold or the random
-
64:06 - 64:10forest model with a 0.5 threshold, but if
-
64:10 - 64:12they don't care so much about predicting
-
64:12 - 64:14every failure, and they want the highest
-
64:14 - 64:17precision possible, then they should opt
-
64:17 - 64:20for the bagging tomek links classifier
-
64:20 - 64:23with a bit higher decision threshold. And
-
64:23 - 64:26so this is the key thing that the data
-
64:26 - 64:28scientist is going to give, right? This is
-
64:28 - 64:31the key takeaway. This is the, kind of, the
-
64:31 - 64:33end result of the entire machine
-
64:33 - 64:35learning life cycle. Right now the data
-
64:35 - 64:36scientist is going to tell the
-
64:36 - 64:39engineering team, all right you guys,
-
64:39 - 64:41which is more important for you, point A,
-
64:41 - 64:45point B, or point C. Make your decision. So
-
64:45 - 64:47the engineering team will then discuss
-
64:47 - 64:49among themselves and say, hey you know
-
64:49 - 64:52what? What we want is we want to get the
-
64:52 - 64:55highest failure detection possible
-
64:55 - 64:58because any kind of failure of that
-
64:58 - 65:00machine or the product or the assembly
-
65:00 - 65:03line is really going to screw us up big
-
65:03 - 65:06time. So what we're looking for is the
-
65:06 - 65:08model that will give us the highest
-
65:08 - 65:11failure detection rate. We don't care
-
65:11 - 65:13about precision, but we want to make
-
65:13 - 65:15sure that if there's a failure, we are
-
65:15 - 65:18going to catch it, right? So that's what
-
65:18 - 65:20they want, and so the data scientist will
-
65:20 - 65:22say, hey you go for the balanced bagging
-
65:22 - 65:25model, okay? Then, the data scientist saves
-
65:25 - 65:28this, all right. And then, once you have
-
65:28 - 65:37saved it, you can go right ahead and deploy it to production.
-
65:37 - 65:39Okay, and so if you want to continue, we can actually further
-
65:39 - 65:41continue this modeling problem. So just
-
65:41 - 65:43now, I modeled this problem as a binary
-
65:43 - 65:50classification problem, which means it's either
-
65:50 - 65:52zero or one, either fail or not fail, but
-
65:52 - 65:54we can also model it as a multiclass
-
65:54 - 65:56classification problem, right, because
-
65:56 - 65:58as I said earlier just now, for
-
65:58 - 66:03the failure type column, you actually
-
66:03 - 66:05have multiple kinds of failures, right?
-
66:05 - 66:08For example, you may have a power failure,
-
66:08 - 66:10you may have a tool wear failure, you
-
66:10 - 66:13may have an overstrain failure. So now we
-
66:13 - 66:15can model the problem slightly
-
66:15 - 66:17differently, so we can model it as a
-
66:17 - 66:20multiclass classification problem, and
-
66:20 - 66:21then we go through the entire same
-
66:21 - 66:23process that we went through just now, so
-
66:23 - 66:25we create different models, we test this
-
66:25 - 66:27out, but now the confusion matrix is for
-
66:27 - 66:30a multiclass classification issue, right?
-
66:30 - 66:31So we're going
-
66:31 - 66:34to check them out. We're going to, again,
-
66:34 - 66:36try different algorithms or models.
-
66:36 - 66:38Again, train and test on our dataset, do the
-
66:38 - 66:40train test split, and try these
-
66:40 - 66:42different models. All right, so we have
-
66:42 - 66:43like, for example, we have balanced random
-
66:43 - 66:46forest, balanced random forest grid search,
-
66:46 - 66:48then you train the models using what is
-
66:48 - 66:50called hyperparameter tuning, then you
-
66:50 - 66:51get the scores. All right, so you get the
-
66:51 - 66:53same evaluation scores again. You check
-
66:53 - 66:55out the evaluation scores, compare
-
66:55 - 66:57between them, generate a confusion matrix,
-
66:57 - 67:00so this is a multiclass confusion matrix.
-
67:00 - 67:02And then, you come to the final
-
67:02 - 67:06conclusion. So now if you are interested
-
67:06 - 67:09in framing your problem domain as a
-
67:09 - 67:11multiclass classification problem, all
-
67:11 - 67:14right, then these are the recommendations
-
67:14 - 67:15from the data scientist. So the data
-
67:15 - 67:17scientist will say, you know what, I'm
-
67:17 - 67:20going to pick this particular model, the
-
67:20 - 67:22balanced bagging classifier, and these are
-
67:22 - 67:25all the reasons that the data scientist
-
67:25 - 67:27is going to give as a rationale for
-
67:27 - 67:29selecting this particular
-
67:29 - 67:32model. And then once that's done, you save
-
67:32 - 67:35the model, and that's it.
-
67:35 - 67:39So that's all done now, and so then the
-
67:39 - 67:41model, the machine learning model,
-
67:41 - 67:44now you can put it live, run it on the
-
67:44 - 67:45server, and now the machine learning
-
67:45 - 67:47model is ready to work which means it's
-
67:47 - 67:49ready to generate predictions, right?
-
67:49 - 67:50That's the main job of the machine
-
67:50 - 67:52learning model. You have picked the best
-
67:52 - 67:54machine learning model with the best
-
67:54 - 67:56evaluation metrics for whatever accuracy
-
67:56 - 67:58goal you're trying to achieve. And
-
67:58 - 68:00now you're going to run it on a server,
-
68:00 - 68:01and now you're going to get all this
-
68:01 - 68:03real-time data that's coming from your
-
68:03 - 68:05sensors, you're going to pump that into
-
68:05 - 68:06your machine learning model, your machine
-
68:06 - 68:08learning model will pump out a whole
-
68:08 - 68:10bunch of predictions, and we're going to
-
68:10 - 68:13use those predictions in real time to
-
68:13 - 68:15make real-time, real-world decision
-
68:15 - 68:18making, right? You're going to say, okay
-
68:18 - 68:20I'm predicting that that machine is
-
68:20 - 68:23going to fail on Thursday at 5:00 p.m.,
-
68:23 - 68:26so you better get your service folks in
-
68:26 - 68:29to service it on Thursday at 2 p.m. or, you
-
68:29 - 68:32know, whatever. So you can, you know,
-
68:32 - 68:33make decisions on when you want to do
-
68:33 - 68:35your maintenance, you know, and make
-
68:35 - 68:38the best decisions to optimize the cost
-
68:38 - 68:41of maintenance, etc, etc. And then based on
-
68:41 - 68:42the
-
68:42 - 68:45results that are coming up from the
-
68:45 - 68:47predictions, so the predictions may be
-
68:47 - 68:49good, the predictions may be lousy, the
-
68:49 - 68:51predictions may be average, right? So
-
68:51 - 68:54we're constantly monitoring how good
-
68:54 - 68:55or how useful are the predictions
-
68:55 - 68:58generated by this real-time model that's
-
68:58 - 69:00running on the server, and based on our
-
69:00 - 69:03monitoring, we will then take some new
-
69:03 - 69:05data and then repeat this entire life
-
69:05 - 69:07cycle again, so this is basically a
-
69:07 - 69:09workflow that's iterative, and we are
-
69:09 - 69:11constantly or the data scientist is
-
69:11 - 69:13constantly getting in all these new data
-
69:13 - 69:15points and then refining the model,
-
69:15 - 69:18picking maybe a new model, deploying the
-
69:18 - 69:22new model onto the server, and so on. All
-
69:22 - 69:24right, and so that's it. So that is
-
69:24 - 69:26basically your machine learning workflow
-
69:26 - 69:29in a nutshell. Okay so for this
-
69:29 - 69:32particular approach we have used a bunch
-
69:32 - 69:35of data science libraries from Python,
-
69:35 - 69:37so we have used Pandas, the most
-
69:37 - 69:39basic data science library, which
-
69:39 - 69:40provides all the tools to work with raw
-
69:40 - 69:43data. We have used Numpy, which is a high-
-
69:43 - 69:44performance library for implementing
-
69:44 - 69:46complex array and matrix operations. We have
-
69:46 - 69:50used Matplotlib and Seaborn, which are used
-
69:50 - 69:52for doing the EDA, the
-
69:52 - 69:56exploratory data analysis phase of machine
-
69:56 - 69:57learning where you visualize all your
-
69:57 - 69:59data. We have used Scikit-learn, which is
-
69:59 - 70:01the machine learning library that implements all
-
70:01 - 70:03the core
-
70:03 - 70:06machine learning algorithms. We
-
70:06 - 70:08have not used these because this is not a
-
70:08 - 70:11deep learning problem, but if you are
-
70:11 - 70:13working with a deep learning problem
-
70:13 - 70:15like image classification, image
-
70:15 - 70:18recognition, object detection, okay,
-
70:18 - 70:20natural language processing, text
-
70:20 - 70:22classification, well then you're going to
-
70:22 - 70:24use these libraries from Python, which are
-
70:24 - 70:29TensorFlow, okay, and also PyTorch.
-
70:29 - 70:33And then lastly, that whole thing, that
-
70:33 - 70:35whole data science project that you saw
-
70:35 - 70:37just now, this entire data science
-
70:37 - 70:39project is actually developed in
-
70:39 - 70:41something called a Jupyter notebook. So
-
70:41 - 70:44all this Python code along with all the
-
70:44 - 70:46observations from the data
-
70:46 - 70:49scientists, okay, for this entire data
-
70:49 - 70:50science project was actually run in
-
70:50 - 70:53that same Jupyter notebook. So
-
70:53 - 70:56that is the
-
70:56 - 70:59most widely used tool for interactively
-
70:59 - 71:02developing and presenting data science
-
71:02 - 71:05projects. Okay so that brings me to the
-
71:05 - 71:07end of this entire presentation. I hope
-
71:07 - 71:10that you find it useful for you, and that
-
71:10 - 71:13you can appreciate the importance of
-
71:13 - 71:15machine learning, and how it can be
-
71:15 - 71:20applied in a real life use case in a
-
71:20 - 71:23typical production environment. All right,
-
71:23 - 71:27thank you all so much for watching!