Part 4: Machine Learning & Python – Linear Regression Part: Sum of Squared Errors

Welcome back, this is your no-nonsense teacher, Kalu Kaul. In this video we continue with the part we left unfinished last time. But first I want to point you to a YouTube video that explains simple linear regression. The author is Brandon Foltz. He has done good work and I suggest you take a look. You can also subscribe to his channel. I thought I had subscribed already; oh no, I hadn't, so let me subscribe now. If you are going to look at his material, watch the videos carefully.

We are still trying to guess the height of a mystery NBA player. Guessing like this is the core of machine learning, which is artificial intelligence, or at least a subset of it: specific, or narrow, AI. It is basically function approximation, or prediction, and statistics is very good at it. When we predict with only one variable (our only variable is the height of NBA players), the best prediction is the mean of the dependent variable.

Think about it this way. Several factors could inform our guess of the mystery player's height. One might be the average height of men in general. It can of course depend on the average height of NBA players, or on other factors such as weight. In general, the heavier you are, the taller you are. The relationship does not always hold, but it is undeniable: the two are positively correlated. The taller you are, the more you generally weigh; I think we all agree on that. So if we want to guess his height, the dependent variable is height, and the independent variable could be weight: we can use an NBA player's weight to guess his height. That sounds pretty good, right? If you do not understand these terms, look them up, or watch this part a couple more times.

I want to open up my terminal and go through the setup again, until we have the process firmly memorized. I close the terminal and open a new one. As usual, we first go to the Desktop, then into the linear regression folder, then activate the virtual environment. If we type ls, we see the nba_ht_wt.csv file.

Start python and type our usual imports: import pandas as pd and import matplotlib.pyplot as plt. We have used pd, but we still have not used numpy. We will, though. I keep promising it in these videos and it has been useless so far; maybe we will use it today. Keep it in mind anyway.

We want to pull the data in and call it data. The data comes from the csv file. pandas has a function for this called read_csv (with an underscore), and its argument is the path to the csv file: pd.read_csv('nba_ht_wt.csv'). Written like that, we get an error, as you may remember. It is a decode error, and you can find the fix by searching for it. The encoding I remember is iso-8859-1, so do it again with encoding='iso-8859-1'. No error this time, so press Enter.

We can type data to print everything, but we only want the heights, so type heights = data['Height']. We know the column is called Height because data.head() shows it: after the player name there are four columns, position, weight, height and age. Right now we just want height, so write heights = data['Height'] directly. If you want weight, it works the same way.

Now we have all the heights. We know the average height is about 79. It is a little silly to assume that every player is 79, and we understand that this is not the best approach. Remember that the unit is inches, so 79 is equivalent to 6'7". By the way, in the last video I kept using the word "estimate"; think of it as the noun. I noticed that when I rewatched the video. So this is not the best guess, and we can do better. To judge how good this guess is, we can use the, uh, what is it called... I really should not have forgotten this, how embarrassing. It is in this video.
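The loading steps described above can be sketched roughly as follows. This is a minimal sketch, not the exact session from the video: the real nba_ht_wt.csv is replaced by a tiny made-up sample held in memory, only the 'Height' column name comes from the video, and the 'Player' column name and the "José Example" row are assumptions added so that the encoding problem shows up.

```python
import io

import pandas as pd

# Made-up stand-in for nba_ht_wt.csv, encoded as ISO-8859-1.
# 'Height' is the column name used in the video; 'Player' is an assumed name.
raw = (
    "Player,Height\n"
    "Nate Robinson,69\n"
    "Isaiah Thomas,69\n"
    "José Example,79\n"
    "Hasheem Thabeet,87\n"
).encode("iso-8859-1")

# These bytes are not valid UTF-8, which is the kind of decode error the video runs into.
try:
    raw.decode("utf-8")
except UnicodeDecodeError as err:
    print("decoding as utf-8 fails:", err.reason)

# With the real file you would pass its path instead of the BytesIO object.
data = pd.read_csv(io.BytesIO(raw), encoding="iso-8859-1")

print(data.head())         # first rows, including the column names
heights = data["Height"]   # select a single column
print(heights.mean())      # average height of the sample, in inches
```

With the real file the mean would come out near 79 rather than the value of this small sample.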

Oh, the sum of squared errors. It is right here in this video. Watch it, it is really useful. It will teach you the sum of squared errors, which is very important.

Do you remember how we drew our plot last time? We use plt to plot heights again. We can also call plt.axhline to draw the average, 79.06; let me write heights.mean() instead of typing the number. Then plt.show(). And there it is: this line is our average, and there are a bunch of data points here and here. We have about 500 data points, 505 in total, numbered from 1 to 505. Number 505 is Thabeet and number 1 is Nate Robinson. We have the data, and this line represents the average. Players below the line are shorter than the average height; players above it are taller. That is what this plot shows.

We want to know the sum of squared errors. It tells us how good our model really is. A model is a method of prediction, the way we predict height. The model we have chosen has only one variable, the height of NBA players, which means we guess with the average: the mystery player's height should be somewhere in the vicinity of the average height.
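The plot described above might look roughly like this. It is only a sketch under stated assumptions: the four heights are a made-up stand-in for the 505 real ones, the exact plotting call is not legible in the transcript so drawing points with ax.plot(..., "o") is a guess, and the Agg backend is selected only so the code runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

# Made-up stand-in for the Height column (inches).
heights = [69, 69, 79, 87]
mean_height = sum(heights) / len(heights)

fig, ax = plt.subplots()
# One point per player: x is the player's position in the list, y is the height.
ax.plot(range(1, len(heights) + 1), heights, "o")
# Horizontal line at the average height, as plt.axhline does in the video.
mean_line = ax.axhline(mean_height, color="red")
ax.set_xlabel("player number")
ax.set_ylabel("height (inches)")

# In an interactive session, plt.show() would open the figure window here.
```

Points under the red line are players shorter than the average, points over it are taller.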

What we really want is to compare models by their sum of squared errors. We calculate the sum of squared errors for each model, and we want to minimize it: the lower, the better. What does all this mean?

Let me say it again: we follow a bottom-up rather than a top-down approach to learning. In the kalukalu learning process we use inductive reasoning rather than deductive reasoning. We learn by example, rather than insisting on figuring out every detail, every rule and every step of the process first. Before asking about this and that, we listen to the whole thing. We do not stop at the very beginning just because we have a question. Keep watching the video and you will understand; if you do not, watch it again, or find additional resources or help to build a more intuitive understanding, or watch more tutorials. That is my advice.

Let's see what the sum of squared errors is. To find it, we only need to calculate the difference between our guess for an NBA player's height and his real height. Our guess is the average height. We have a real NBA player database (let me close this plot and look at the data again), so we do not actually have to guess; I just want to say that our guess is 79. For each player, let's see how far off our guess is. First take Nate. His height is 69 inches, so I compute 69 - 79 (the average height). To be more accurate I use heights.mean(), and I get about -10. So the gap between Nate's real height and the average NBA height is about 10 inches. That is a pretty big gap. We do not want the negative sign, so we square it, which means multiplying it by itself. Write difference = 69 - heights.mean(); this should be about -10. The square is difference * difference, which gives about 101, larger than I expected. This is the number we want, but we do not want it only for Nate.

We want it for every NBA player. So type nate_sum_of_squared_errors = difference * difference (I got that wrong the first time). Strictly speaking, this is just Nate's own squared error, and it is about 101. Now Isaiah Thomas. His value is the same, because he is the same height as Nate. (How do you spell Isaiah?) Now we have two, and we want a third: Thabeet. He stands 87 inches, which is pretty tall. Our guess for a random NBA player is the average height of 79 inches, and Thabeet is 87, so the difference between the two is about 8 inches. Then we square it: 87 - heights.mean() multiplied by... no, that is definitely the wrong order. Let's write diff = 87 - heights.mean() and then diff * diff. We store that as hasheem_sum_of_squared_errors = diff * diff. (How do you spell Hasheem?)

Now we have these squared errors. Let's calculate the remaining 502 players. We have only done three, and we are not going to go through them one by one by hand. We need to calculate every player in one pass over all the data. So why don't we first compute the height differences, the difference between the average height and each player's height? Set mean = heights.mean(), which is about 79. Then we want mean minus each player's height (69 for Nate). To do this for every player we can use a loop: diff = [mean - height for height in heights]. Print diff to see the values. Let's see how many there are: ok, 505.

But we do not want just the differences. We want the squares, each value multiplied by itself. So we write diff_squared = [num * num for num in diff]. Print diff_squared and run it: they are all squared. Looks good. The last squared value is about 62; before squaring it was about 7.9, and 7.9 * 7.9 is about 62. Nothing wrong there, looks pretty good.

Now we want their sum, so type sum(diff_squared). That gives us the sum of all the squared errors. Why is this so important? Because we want to compare the sum of squared errors of this one-variable model with the sum of squared errors of a model with two variables, one of which is height. Should I hold numpy back and make you wait for the next video?
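The whole calculation from this part can be sketched in a few lines of plain Python. The list comprehensions mirror the ones typed in the video; the four heights are a made-up stand-in for the real 505 values, so the resulting number is not the SSE of the real data set.

```python
# Made-up stand-in for the Height column (inches).
heights = [69, 69, 79, 87]

# The model with no independent variable guesses the mean for every player.
mean = sum(heights) / len(heights)

# Error of that guess for each player (negative when the player is taller).
diff = [mean - height for height in heights]

# Squaring removes the sign and penalizes large misses more heavily.
diff_squared = [num * num for num in diff]

# Sum of squared errors for the mean-only model.
sse = sum(diff_squared)

print(mean)          # 76.0
print(diff)          # [7.0, 7.0, -3.0, -11.0]
print(diff_squared)  # [49.0, 49.0, 9.0, 121.0]
print(sse)           # 228.0
```

With a pandas Series, the same quantity could also be written as ((heights - heights.mean()) ** 2).sum(), though the video sticks to list comprehensions here.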
Yes, I will make you wait for it and keep up the suspense. So all we had to do in this video was figure out what the sum of squared errors is. You will understand why it is important in the next video, if you have the patience to finish this one. If you want to understand why right now, I'm sorry, there is one more video to go; after that you will understand.

If you watch his video, at about the 20-minute mark he says that simple linear regression is in fact a comparison between two models. One is the model in which the independent variable does not exist. Remember, our dependent variable is height. The mystery player's height depends on several factors, which are our independent variables. When we do not know what these factors are, we use the average height as our guess. That is one model. The other model uses the best-fit regression line. Note that we will use the best-fit regression line in the next video. The model we are using now has no independent variable. If there is only one variable, the best predictor of other values is the mean of the dependent variable. That is what we did here. I hope you understood at least some of this. If you did not, do not worry too much. The more examples we see and the more we repeat, the more you will understand.
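The claim that the mean is the best single-number guess can be made precise if "best" is taken to mean "smallest sum of squared errors". A short derivation, where the symbols y_i (the observed heights), n (the number of players) and c (a constant guess) are notation introduced here rather than taken from the video:

```latex
\mathrm{SSE}(c) = \sum_{i=1}^{n} (y_i - c)^2
\qquad
\frac{d\,\mathrm{SSE}}{dc} = -2 \sum_{i=1}^{n} (y_i - c) = 0
\;\Longrightarrow\;
c = \frac{1}{n} \sum_{i=1}^{n} y_i = \bar{y}
```

Since the second derivative is 2n, which is positive, this stationary point is a minimum: no other constant guess gives a lower SSE than the mean.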

So thank you for your patience in watching. Let me stress one thing again: remember this number. Let's call it sse = sum(diff_squared). Remember this number, SSE. It is our first SSE, for the model with only one variable. Well, see you in the next video.