Linear Regression in Python: Finding the Relationship Between Two Columns (Train and Test Explained)
Suppose you have a dataset of 1,000 rows with two columns, "Study Hours" and "Exam Score".
You split the rows into two groups: 800 rows and 200 rows.
The 800 rows (80%) are used to train the linear regression model, and the 200 rows (20%) are used to test it.
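The 80/20 split described above can be sketched as follows. This is a minimal example that assumes scikit-learn and NumPy are installed. The data is made up for illustration, and the variable names (study_hours, exam_score, X_train and so on) are not from the original dataset. Note that train_test_split shuffles the rows before splitting, which is a common choice but only one way to do it.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up stand-in data: 1000 rows of study hours and exam scores.
rng = np.random.default_rng(seed=0)
study_hours = rng.uniform(0, 10, size=1000)
exam_score = 40 + 5 * study_hours + rng.normal(0, 3, size=1000)

# scikit-learn expects X as a 2-D array (rows x features),
# so the single Study Hours column is reshaped to 1000 rows x 1 column.
X = study_hours.reshape(-1, 1)
y = exam_score

# test_size=0.2 holds back 20% of the rows for testing.
# random_state makes the shuffle repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(X_train.shape, X_test.shape)  # (800, 1) (200, 1)
```

With real data, the two columns would typically be read from a file instead of generated.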
What is Training Data?
Training Data is the data we use to teach the model.
We give the model both the Study Hours column (X) and the Exam Score column (Y) so it can find the relationship between them, expressed by the formula
Y = M × X + C
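To see what the formula does, here is a small worked example. The values of M and C below are hypothetical, picked only to make the arithmetic easy to follow. They are not learned from any data, and predict_score is an illustrative helper name.

```python
# Hypothetical values, chosen only for illustration (not learned from real data).
M = 5.0   # slope: extra exam points per additional study hour
C = 40.0  # intercept: predicted score at 0 study hours


def predict_score(study_hours):
    """Apply Y = M * X + C to one value of X."""
    return M * study_hours + C


print(predict_score(6))  # 5.0 * 6 + 40.0 = 70.0
```

With these made-up values, a student who studies 6 hours would be predicted to score 70.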
This is the Python code we use to train the model: model.fit(x, y)
Here x holds the Study Hours values and y holds the Exam Score values from the 800 training rows.
After training, the model learns:
The value of M, also called the slope
The value of C, also called the intercept
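The training step can be sketched like this, assuming the model is scikit-learn's LinearRegression (the post does not name the library, so this is an assumption). The training data is made up, generated from a known slope of 5 and intercept of 40 plus some noise, so the learned values should land close to those numbers.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training rows: 800 study-hour values and matching exam scores.
rng = np.random.default_rng(seed=0)
x_train = rng.uniform(0, 10, size=(800, 1))  # 2-D: 800 rows x 1 column
y_train = 40 + 5 * x_train[:, 0] + rng.normal(0, 3, size=800)

# Train the model: it finds the M and C that best fit the training rows.
model = LinearRegression()
model.fit(x_train, y_train)

# After fitting, the learned values are stored on the model.
slope = model.coef_[0]        # M
intercept = model.intercept_  # C
print(f"M (slope) = {slope:.2f}, C (intercept) = {intercept:.2f}")
```

coef_ is an array because a model can have several input columns. With one column, the slope is its first (and only) element.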
What is Testing Data?
Testing Data is data that the model did not see during training: here, the 200 rows that were held back.
We give the model only Study Hours (X) and ask it to predict Exam Score (Y).
Then we compare the Predicted Y with the Real Y to see how well our model works.
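The testing step can be sketched as below, again assuming scikit-learn and using made-up data. The two scores used for the comparison, mean absolute error and R², are common choices rather than something the post prescribes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split

# Made-up data: 1000 rows of study hours (X) and exam scores (y).
rng = np.random.default_rng(seed=0)
X = rng.uniform(0, 10, size=(1000, 1))
y = 40 + 5 * X[:, 0] + rng.normal(0, 3, size=1000)

# 800 rows for training, 200 rows for testing.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train on the training rows only.
model = LinearRegression().fit(X_train, y_train)

# Give the model only the test Study Hours and ask for predicted scores.
y_pred = model.predict(X_test)

# Compare Predicted Y with Real Y.
mae = mean_absolute_error(y_test, y_pred)  # average size of the error, in score points
r2 = r2_score(y_test, y_pred)              # 1.0 would be a perfect fit
print(f"Mean absolute error: {mae:.2f}")
print(f"R^2: {r2:.3f}")
```

A low mean absolute error and an R² close to 1 suggest the model predicts the unseen rows well.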