What is Supervised Learning? | Lecture 3 | Machine Learning Course


What is Supervised Learning?

Key Points:

  • In this video, I am going to define what is probably the most common type of Machine Learning problem, which is Supervised Learning. The term Supervised Learning refers to the fact that we give the algorithm a data set in which the so-called "right answers" are given.
  • To introduce a bit more terminology: when we try to predict a continuous-valued output, the problem is called a regression problem. When we instead try to predict a discrete-valued output, such as zero or one, malignant or benign, the problem is called a classification problem.
  • In other Machine Learning problems, we will often have more features. It turns out that for some learning problems, what you really want is not to use three or five features, but an infinite number of features, an infinite number of attributes, so that your learning algorithm has lots of attributes, features, or cues with which to make predictions. So, how do you deal with an infinite number of features? It turns out that when we discuss an algorithm called the Support Vector Machine, there will be a neat mathematical trick that allows a computer to deal with an infinite number of features.

How does Supervised Learning work?

In supervised learning, we are given a data set and already know what our correct output should look like, with the idea that there is a relationship between the input and the output. Supervised learning problems are categorized into "regression" and "classification" problems. In a regression problem, we are trying to predict results within a continuous output, meaning that we are trying to map input variables to some continuous function. In a classification problem, we are instead trying to predict results in a discrete output. In other words, we are trying to map input variables into discrete categories.
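The regression/classification distinction can be made concrete with a toy predictor. Below is a minimal sketch, not from the lecture: the same nearest-neighbour idea produces a continuous output in the regression case and a discrete label in the classification case. All data values are made up for illustration.

```python
# Minimal sketch: the same nearest-neighbour idea used for regression
# (continuous output) and classification (discrete output).
# Toy data values are illustrative, not from the lecture.

def nearest(x, data):
    """Return the training pair (xi, yi) whose input is closest to x."""
    return min(data, key=lambda pair: abs(pair[0] - x))

# Regression: house size (sq ft) -> price (continuous, in $1000s)
prices = [(1000, 200.0), (1500, 300.0), (2000, 400.0)]
print(nearest(1600, prices)[1])   # continuous output: 300.0

# Classification: tumor size -> label (discrete, 0 = benign, 1 = malignant)
labels = [(1.0, 0), (2.0, 0), (5.0, 1), (6.0, 1)]
print(nearest(5.4, labels)[1])    # discrete output: 1
```

In both cases the mechanics are identical; only the type of the output (a real number versus a category label) changes, which is exactly the regression/classification split.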

Example 1:

Given data about the size of houses on the real estate market, try to predict their price. Price as a function of size is a continuous output, so this is a regression problem. We could turn this example into a classification problem by instead making our output about whether the house "sells for more or less than the asking price." Here we are classifying the houses, based on price, into two discrete categories.
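The conversion described above, from predicting a price to predicting an above-or-below-asking label, can be sketched as follows. The prices are invented for illustration.

```python
# Sketch: turning the house-price regression target into a classification
# label by comparing each sale price to the asking price.
# The price values here are made up for illustration.

def above_asking(sale_price, asking_price):
    """1 if the house sold for more than asking, else 0 (discrete output)."""
    return 1 if sale_price > asking_price else 0

# (sale price, asking price) in $1000s
sales = [(310.0, 300.0), (280.0, 300.0), (455.0, 450.0)]
labels = [above_asking(s, a) for s, a in sales]
print(labels)   # [1, 0, 1]
```

The continuous quantity (price) still exists in the data; thresholding it is what makes the output discrete and the problem a classification problem.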

Example 2:

(a) Regression - Given a picture of a person, we have to predict their age based on the given picture.

(b) Classification - Given a patient with a tumor, we have to predict whether the tumor is malignant or benign.


Supervised Learning algorithms:

Logistic Regression

 
Neural Networks


Linear Discriminant Analysis


Decision Trees


Bayesian Learning


Random Forests

In this video, I am going to define what is probably the most common type of Machine Learning problem, which is Supervised Learning. I will define Supervised Learning more formally later, but it is probably best to start with an example of what it is, and we will do the formal definition later. Let's say you want to predict housing prices. A while back, a student collected data sets from the City of Portland, Oregon, and let's say you plot the data set and it looks like this. Here on the horizontal axis is the size of different houses in square feet, and on the vertical axis is the price of different houses in thousands of dollars.

So, given this data, let's say you have a friend who owns a house that is, say, 750 square feet, and they are hoping to sell the house, and they want to know how much they can get for it. So, how can a learning algorithm help? One thing a learning algorithm might do is put a straight line through the data, that is, fit a straight line to the data. Based on that, it looks like maybe their house can be sold for about $150,000. But maybe this is not the only learning algorithm you can use, and there might be a better one. For example, rather than fitting a straight line to the data, we might decide that it is better to fit a quadratic function, or a second-order polynomial, to this data. If you do that and make a prediction here, then it looks like maybe they could get a somewhat better price for the house.

One of the things we will talk about later is how to choose, and how to decide, whether you want to fit a straight line to the data or a quadratic function to the data. There is no fair way of just picking whichever one lets your friend sell the house for more. But each of these would be a fine example of a learning algorithm. So, this is an example of a Supervised Learning algorithm. The term Supervised Learning refers to the fact that we gave the algorithm a data set in which the so-called "right answers" were given. That is, we gave it a data set of houses in which, for every example in the data set, we told it the right price, namely the actual price that the house sold for, and the algorithm's task was to produce more of these right answers, such as for this new house that your friend may be trying to sell. To define a bit more terminology, this is also called a regression problem. By regression problem, I mean we are trying to predict a continuous-valued output, namely the price. Technically, I guess prices can be rounded off to the nearest cent, so maybe prices are actually discrete values. But usually, we think of the price of a house as a real number, a scalar value, a continuous-valued number, and the term regression refers to the fact that we are trying to predict this sort of continuous-valued attribute.
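The straight-line fit described above can be sketched with ordinary least squares in closed form. The five (size, price) pairs below are made-up stand-ins, not the actual Portland data; with these particular values the fitted line predicts exactly $150,000 for a 750 square foot house, matching the figure in the text.

```python
# Sketch of the straight-line fit described above: ordinary least squares
# for a single input feature, computed in closed form.
# The (size, price) pairs are invented stand-ins for the Portland data.

def fit_line(xs, ys):
    """Return (slope, intercept) minimizing squared error for y = slope*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
            sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

sizes  = [500, 1000, 1500, 2000, 2500]    # square feet
prices = [100, 200, 300, 400, 500]        # thousands of dollars

slope, intercept = fit_line(sizes, prices)
print(slope * 750 + intercept)            # predicted price for 750 sq ft: 150.0
```

Fitting a quadratic instead would just mean minimizing the same squared error over a second-order polynomial; choosing between the two models is the question the lecture defers to later.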

Here is another Supervised Learning example. Some friends and I were actually working on this earlier. Let's say you want to look at medical records and try to predict whether a breast cancer is malignant or benign. If someone discovers a breast tumor, a lump in their breast, a malignant tumor is a tumor that is harmful and dangerous. So obviously, people care a lot about this. Let's see the collected data set. Suppose in your dataset you have, on your horizontal axis, the size of the tumor, and on the vertical axis I am going to plot one or zero, yes or no, whether or not these are examples of tumors we have seen before that are malignant, which is one, or zero if not malignant, that is, benign. So, let's say your dataset looks like this, where we saw a tumor of this size that turned out to be benign, one of this size, one of this size, and so on. Unfortunately, we also saw a few malignant tumors, one of that size, one of that size, one of that size, and so on.

So in this example, I have five examples of benign tumors shown down here, and five examples of malignant tumors shown with a vertical axis value of one. Let's say a friend tragically has a breast tumor, and let's say her breast tumor size is maybe somewhere around this value. The Machine Learning question is: can you estimate the probability, the chance, that the tumor is malignant versus benign? To introduce a bit more terminology, this is an example of a classification problem. The term classification refers to the fact that here we are trying to predict a discrete-valued output: zero or one, malignant or benign. It turns out that in classification problems, sometimes you can have more than two possible values for the output. As a concrete example, maybe there are three types of breast cancers. So you may try to predict a discrete-valued output of zero, one, two, or three, where zero may mean benign, a benign tumor, so no cancer; one may mean type one cancer, whatever type one means; two may mean a second type of cancer; and three may mean a third type of cancer. But this would also be a classification problem, because this discrete set of output values corresponds to no cancer, or cancer type one, or cancer type two, or cancer type three. In classification problems, there is another way to plot this data. Let me show you what I mean. I am going to use a slightly different set of symbols to plot this data.
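As the paragraph above notes, a classification output need not be binary. A minimal sketch of the four-way label set described there; the class names come from the text, while the mapping itself is purely illustrative.

```python
# Sketch: a multiclass classification output is still discrete.
# The discrete label set {0, 1, 2, 3} encodes the four outcomes above.

LABELS = {0: "benign (no cancer)", 1: "cancer type 1",
          2: "cancer type 2", 3: "cancer type 3"}

def describe(prediction):
    """Translate a discrete model output into its class name."""
    if prediction not in LABELS:
        raise ValueError("unknown class")
    return LABELS[prediction]

print(describe(0))   # benign (no cancer)
print(describe(2))   # cancer type 2
```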

So, if tumor size is going to be the attribute that I am going to use to predict malignancy or benignness, I can also draw my data like this. I am going to use different symbols to denote my benign and malignant, or my negative and positive, examples. So, rather than drawing crosses, I am now going to draw O's for the benign tumors, like so, and I am going to keep using X's to denote my malignant tumors. I hope this figure makes sense. All I did was take my data set on top, map it down to this real line like so, and start using different symbols, circles and crosses, to denote malignant versus benign examples. Now, in this example, we used only one feature or one attribute, namely the tumor size, to predict whether a tumor is malignant or benign. In other machine learning problems, we may have more than one feature, more than one attribute. Here is an example: let's say that rather than just knowing the tumor size, we know both the age of the patients and the tumor size.

In that case, maybe your data set would look like this, where I may have a set of patients with those ages and tumor sizes, and they look like this; and a different set of patients who look a little different, whose tumors turn out to be malignant, as denoted by the crosses. So, let's say you have a friend who tragically has a tumor, and maybe their tumor size and age fall around there. Given a data set like this, what the learning algorithm might do is fit a straight line to the data to try to separate out the malignant tumors from the benign ones, and so the learning algorithm may decide to put a straight line like that to separate out the two classes of tumors. With this, hopefully, you can decide what your friend's tumor is more likely to be: if it falls over there, hopefully your learning algorithm will say that it falls on the benign side and is therefore more likely to be benign than malignant. In this example, we had two features, namely the age of the patient and the size of the tumor. In other Machine Learning problems, we will often have more features.
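The idea of separating the two classes with a straight line can be sketched with one of the simplest linear classifiers, nearest class mean, whose decision boundary is the perpendicular bisector between the two class centroids. The lecture does not name a specific algorithm, and the (age, tumor size) points below are invented.

```python
# Sketch: a straight-line decision boundary on two features
# (patient age, tumor size), using nearest-class-mean classification.
# Toy data; the lecture does not specify an algorithm or values.

def mean(points):
    """Centroid of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

def classify(x, benign_mean, malignant_mean):
    """Assign x to the closer class mean (a linear decision boundary)."""
    d_b = (x[0] - benign_mean[0]) ** 2 + (x[1] - benign_mean[1]) ** 2
    d_m = (x[0] - malignant_mean[0]) ** 2 + (x[1] - malignant_mean[1]) ** 2
    return "benign" if d_b < d_m else "malignant"

# (age, tumor size) pairs, made-up toy data
benign    = [(30, 1.0), (35, 1.5), (40, 2.0)]
malignant = [(55, 5.0), (60, 6.0), (65, 5.5)]

b_mean, m_mean = mean(benign), mean(malignant)
print(classify((58, 5.2), b_mean, m_mean))   # malignant
```

Comparing squared distances to the two centroids reduces to a linear inequality in the features, which is why this classifier's boundary is a straight line, just as the lecture describes.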

The friends who worked on this problem actually used other features like these: clump thickness of the breast tumor, uniformity of cell size of the tumor, uniformity of cell shape of the tumor, and so on, and other features as well. It turns out one of the most interesting learning algorithms we will see in this course is a learning algorithm that can deal with not just two, three, or five features, but an infinite number of features. On this slide, I have listed a total of five different features: two on the axes and three more up here. But it turns out that for some learning problems, what you really want is not to use three or five features, but rather an infinite number of features, an infinite number of attributes, so that your learning algorithm has lots of attributes, features, or cues with which to make predictions.

So, how do you deal with an infinite number of features?

How do you even store an infinite number of things in the computer, when your computer is going to run out of memory? It turns out that when we talk about an algorithm called the Support Vector Machine, there will be a neat mathematical trick that allows a computer to deal with an infinite number of features. Imagine that I did not just write down two features here and three features on the right, but an infinitely long list; I just kept writing more and more features, an infinitely long list of features. It turns out we will come up with an algorithm that can deal with that. So, just to recap: in this course, we will talk about Supervised Learning, and the idea is that in Supervised Learning, for every example in our data set, we are told the correct answer that we would have quite liked the algorithm to have predicted on that example, such as the price of the house, or whether a tumor is malignant or benign. We also talked about the regression problem, and by regression we mean that our goal is to predict a continuous-valued output.
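The "neat mathematical trick" mentioned above is the kernel trick: a kernel function evaluates an inner product in a very high dimensional feature space, infinite-dimensional in the case of the Gaussian (RBF) kernel, without ever storing those features. A minimal sketch, with gamma and the sample points chosen arbitrarily for illustration:

```python
# Sketch of the kernel trick: the Gaussian (RBF) kernel corresponds to an
# inner product of infinite-dimensional feature expansions, yet it is
# computed in constant memory. Gamma and the points are arbitrary choices.

import math

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian (RBF) kernel between two points of equal dimension."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

a, b = (1.0, 2.0), (2.0, 3.0)
print(rbf_kernel(a, a))   # a point compared with itself: 1.0
print(rbf_kernel(a, b))   # similarity decays smoothly with distance
```

An SVM never needs the infinite feature vectors themselves; it only ever needs kernel values like these between pairs of training points, which is what makes an infinite number of features tractable.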

We also talked about the classification problem, where the goal is to predict a discrete-valued output. Just a quick wrap-up question. Suppose you are running a company and you want to develop learning algorithms to address each of two problems. In the first problem, you have a large inventory of identical items. So, imagine that you have thousands of copies of some identical item to sell, and you want to predict how many of these items you will sell over the next three months. In the second problem, you have lots of users, and you want to write software to examine each of your customers' accounts, and for each account decide whether or not the account has been hacked or compromised. So, for each of these problems, should they be treated as a classification problem or as a regression problem? When the video pauses, please use your mouse to select whichever of these four options on the left you think is the correct answer.

Conclusion:

So hopefully, you got that. This is the answer. For problem one, I would treat this as a regression problem, because if I have thousands of items, well, I would probably just treat this as a real value, as a continuous value; therefore, the number of items I sell is a continuous value. For the second problem, I would treat that as a classification problem, because I might set the value I want to predict to zero to denote an account that has not been hacked, and set the value to one to denote an account that has been hacked. So, just like the breast cancer example, where zero is benign and one is malignant, I might set this to be zero or one depending on whether it has been hacked, and have an algorithm try to predict each of these two discrete values. Because there is a small number of discrete values, I would therefore treat it as a classification problem. So, that is it for Supervised Learning. In the next video, I will talk about Unsupervised Learning, which is the other major category of learning algorithms.

Share this article and comment your thoughts below. Thanks :)
