What is Model Representation? | Linear Regression:
Model Representation:
To establish notation for future use, we’ll use
x(i) to denote the “input” variables (living area in this example), also called input features, and y(i) to denote the “output” or target variable that we are trying to predict (price). A pair (x(i),y(i)) is called a training example, and the dataset that we’ll be using to learn—a list of m training examples (x(i),y(i));i=1,...,m—is called a training set. Note that the superscript “(i)” in the notation is simply an index into the training set, and has nothing to do with exponentiation. We will also use X to denote the space of input values, and Y to denote the space of output values. In this example, X = Y = ℝ.
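The notation above can be made concrete with a tiny sketch in Python. The housing numbers below are made up for illustration; the point is only that a training set is a list of m pairs (x(i), y(i)), indexed by i.

```python
import numpy as np

# Hypothetical training set: m = 4 examples of (living area, price).
# x[i] holds the i-th input feature x(i); y[i] holds the i-th target y(i).
x = np.array([2104.0, 1600.0, 2400.0, 1416.0])  # living area (sq ft)
y = np.array([400.0, 330.0, 369.0, 232.0])      # price (in $1000s)

m = len(x)  # number of training examples
print(m)            # 4
print(x[0], y[0])   # the first training example (x(1), y(1))
```

Note that Python arrays are 0-indexed, so the training example written (x(1), y(1)) in the notes lives at index 0 here.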
To describe the supervised learning problem slightly more formally, our goal is, given a training set, to learn a function h: X → Y so that h(x) is a “good” predictor for the corresponding value of y. For historical reasons, this function h is called a hypothesis. Seen pictorially, the process is therefore like this:
When the target variable that we’re trying to predict is continuous, such as in our housing example, we call the learning problem a regression problem. When y can take on only a small number of discrete values (such as if, given the living area, we wanted to predict if a dwelling is a house or an apartment, say), we call it a classification problem.
Linear Regression:
Our first learning algorithm will be linear regression. In this video, you will see what the model looks like and, more importantly, what the whole supervised learning process looks like. We will use a dataset of housing prices from the city of Portland, Oregon. Given this dataset, suppose you have a friend who is trying to sell a house: say the friend's house is 1250 square feet, and you want to tell them how much it might sell for.
Well, one thing you could do is fit a model, perhaps a straight line, to this data. That is, for each house in our dataset we are told the actual price it was sold for. This is an example of a regression problem, so called because we are predicting a real-valued output, namely the price. And just to remind you, the other most common type of supervised learning problem is called a classification problem, where we predict discrete-valued outputs, such as looking at cancerous tumors and trying to decide whether a tumor is malignant or benign.
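Fitting a straight line to data like this can be sketched in a few lines of Python. The housing figures below are made up for illustration, and `np.polyfit` (ordinary least squares on a degree-1 polynomial) stands in for whatever fitting procedure the course develops later.

```python
import numpy as np

# Hypothetical Portland housing data: living area (sq ft) vs price ($1000s).
areas  = np.array([2104.0, 1600.0, 2400.0, 1416.0, 3000.0])
prices = np.array([400.0, 330.0, 369.0, 232.0, 540.0])

# Fit a straight line (degree-1 polynomial) by least squares.
theta1, theta0 = np.polyfit(areas, prices, 1)  # slope, intercept

# Predict a selling price for the friend's 1250 sq ft house.
predicted = theta0 + theta1 * 1250.0
print(round(predicted, 1))
```

The fitted line then serves as the predictor: plug in any living area and read off an estimated price.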
More formally, in supervised learning we have a dataset, and that dataset is called the training set. We will define a number of symbols; throughout this course, I use the lowercase m to denote the number of training examples.
Example:
Let me use the lowercase x to denote the input variables, which are also often referred to as features. I'm going to use y to denote my output variable, or the target variable that I'm going to predict, and that's the second column here. As a piece of notation, I'll use (x, y) to denote a single training example, and (x(i), y(i)), where the superscript (i) in parentheses is just an index into my training set, to refer to the i-th row of this table, ok?
So h is a function that maps from x to y. In designing a learning algorithm, we next have to decide how to represent this hypothesis h. For this and the next videos, I will choose our first option for representing the hypothesis, which is the following. We will represent h as hθ(x) = θ0 + θ1x, that is, h of x equals theta-zero plus theta-one times x. And when we plot that, it means we're going to predict that y is a linear function of x.
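The hypothesis hθ(x) = θ0 + θ1x is simple enough to write directly as a function. The parameter values below are made up purely to show how a prediction is computed; choosing good values of θ0 and θ1 is the subject of the upcoming videos.

```python
def h(x, theta0, theta1):
    """Hypothesis h_theta(x) = theta0 + theta1 * x: a straight line in x."""
    return theta0 + theta1 * x

# Illustrative (made-up) parameters: intercept 50, slope 0.1 ($1000s per sq ft).
price = h(1250, 50.0, 0.1)
print(price)  # 50 + 0.1 * 1250 = 175.0
```

Different choices of θ0 and θ1 give different lines, and learning amounts to picking the line that fits the training set well.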
Conclusion:
And why a linear function? Well, sometimes we will want to fit more complicated, perhaps nonlinear, functions too. But since the linear case is the simple building block, let's start with this linear fitting example first and build on it toward more complex models and more complex learning algorithms. This model is called linear regression; more precisely, it is linear regression with one variable, where the variable is x. Another name for this model is univariate linear regression, and "univariate" is just a fancy way of saying "one variable". So this is linear regression. In the next video, we start talking about how to implement this model.