

1. Generate dataset
Write Python code to generate a regression dataset that contains 250 examples. In this dataset, each input x is
drawn uniformly at random from (0, 2) and the corresponding output is y = x^3 - 3x^2 + 2x + ε, where ε is Gaussian noise
with zero mean and standard deviation 0.04.
You don't need to define a function, but your code should create variables that contain all the inputs and outputs.
You should also fix a random seed so that your results are reproducible.
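One possible sketch of this step, assuming NumPy's `Generator` API. The seed value 0 and the variable names `x`, `eps`, and `y` are arbitrary choices made here, not part of the question.

```python
import numpy as np

# Fixed seed for reproducibility (the value 0 is an arbitrary choice).
rng = np.random.default_rng(0)

n_examples = 250
# Inputs drawn uniformly from the interval between 0 and 2.
x = rng.uniform(0.0, 2.0, size=n_examples)
# Gaussian noise with zero mean and standard deviation 0.04.
eps = rng.normal(loc=0.0, scale=0.04, size=n_examples)
# Noisy cubic target: y = x^3 - 3x^2 + 2x + eps.
y = x**3 - 3 * x**2 + 2 * x + eps
```

`x` and `y` are one-dimensional arrays of length 250; scikit-learn estimators expect a two-dimensional input, so `x.reshape(-1, 1)` is needed before fitting a model.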
2. Split dataset into train/test sets
Write code to randomly split your dataset above into a train set and a test set. Your train set should contain 150
examples and your test set should contain 100 examples.
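A minimal sketch of the split, assuming scikit-learn's `train_test_split` is acceptable for this step. The dataset is regenerated so the block runs on its own; the seed and variable names are choices made here.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Regenerate the dataset from Question 1 (seed 0 is an arbitrary choice).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=250)
y = x**3 - 3 * x**2 + 2 * x + rng.normal(0.0, 0.04, size=250)

# scikit-learn expects inputs of shape (n_samples, n_features).
X = x.reshape(-1, 1)

# Integer sizes request exact counts: 150 train and 100 test examples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=150, test_size=100, random_state=0
)
```

Passing `random_state` keeps the shuffle reproducible in the same way the seed does for data generation.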
3. Define a K-NN model
Write code to define a K-NN regression model with K = 5 in which the neighbors are weighted by the inverse of their
distance.
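With scikit-learn, this definition could look like the sketch below. `weights="distance"` is the option that weights each neighbor by the inverse of its distance; the variable name `knn` is a choice made here.

```python
from sklearn.neighbors import KNeighborsRegressor

# K = 5 neighbors, each weighted by the inverse of its distance to the query point.
knn = KNeighborsRegressor(n_neighbors=5, weights="distance")
```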
4. Train your K-NN model
Write code to train your K-NN model on your train set.
5. Make predictions with your K-NN model
Write code to predict the outputs of your K-NN model on the test set.
6. Compute the MSE
Write code to compute and print out the mean squared error (MSE) of your K-NN model on the test set.
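Questions 4 to 6 can be sketched together, since training, prediction, and scoring are each a single call in scikit-learn. The block rebuilds the data and split so it is self-contained; the seed and the names `y_pred_knn` and `mse_knn` are choices made here, and the printed value depends on that seed.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Rebuild the dataset and split so this block runs on its own (seed 0 is arbitrary).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=250)
y = x**3 - 3 * x**2 + 2 * x + rng.normal(0.0, 0.04, size=250)
X_train, X_test, y_train, y_test = train_test_split(
    x.reshape(-1, 1), y, train_size=150, test_size=100, random_state=0
)

# Question 4: train the K-NN model on the train set.
knn = KNeighborsRegressor(n_neighbors=5, weights="distance")
knn.fit(X_train, y_train)

# Question 5: predict the outputs for the test inputs.
y_pred_knn = knn.predict(X_test)

# Question 6: mean squared error on the test set.
mse_knn = mean_squared_error(y_test, y_pred_knn)
print(f"K-NN test MSE: {mse_knn:.5f}")
```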
7. Visualize your K-NN model
Write code to plot the predictions of your K-NN model on [0, 2]. You should generate 100 evenly spaced points on [0,
2], use your K-NN model to predict their outputs, and plot them as a line graph. Your plot must also contain the train
set, the test set, and the legend. Play with the settings to make your plot clear and readable.
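A sketch of one way to build this plot with Matplotlib. Marker sizes, transparency, colors, and labels are styling choices made here, and the data setup is repeated so the block runs standalone.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Rebuild the dataset, split, and fitted model (seed 0 is arbitrary).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=250)
y = x**3 - 3 * x**2 + 2 * x + rng.normal(0.0, 0.04, size=250)
X_train, X_test, y_train, y_test = train_test_split(
    x.reshape(-1, 1), y, train_size=150, test_size=100, random_state=0
)
knn = KNeighborsRegressor(n_neighbors=5, weights="distance").fit(X_train, y_train)

# 100 evenly spaced points on [0, 2], as a column for scikit-learn.
x_grid = np.linspace(0.0, 2.0, 100).reshape(-1, 1)
y_grid = knn.predict(x_grid)

fig, ax = plt.subplots(figsize=(8, 5))
# Small, semi-transparent markers keep the prediction line visible.
ax.scatter(X_train[:, 0], y_train, s=12, alpha=0.6, label="train set")
ax.scatter(X_test[:, 0], y_test, s=12, alpha=0.6, marker="x", label="test set")
ax.plot(x_grid[:, 0], y_grid, color="black", linewidth=1.5, label="K-NN prediction")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.set_title("K-NN regression (K = 5, distance weights)")
ax.legend()
```

Call `plt.show()` afterwards in an interactive session, or `fig.savefig(...)` with a file name of your choice, to see the figure.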
8. Train and test a decision tree model
Write code to train a decision tree model on your train set above and then print out its MSE on the test set. Use
sklearn's default settings for your model.
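A minimal sketch, assuming `DecisionTreeRegressor` is the intended model since the task is regression. The estimator is constructed with no arguments so that every hyperparameter keeps its scikit-learn default; the seed and variable names are choices made here.

```python
import numpy as np
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Rebuild the dataset and split so this block runs on its own (seed 0 is arbitrary).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=250)
y = x**3 - 3 * x**2 + 2 * x + rng.normal(0.0, 0.04, size=250)
X_train, X_test, y_train, y_test = train_test_split(
    x.reshape(-1, 1), y, train_size=150, test_size=100, random_state=0
)

# Default settings: no arguments passed to the constructor.
tree = DecisionTreeRegressor()
tree.fit(X_train, y_train)

# MSE of the tree on the test set.
y_pred_tree = tree.predict(X_test)
mse_tree = mean_squared_error(y_test, y_pred_tree)
print(f"Decision tree test MSE: {mse_tree:.5f}")
```

With default settings the tree has no depth limit, so it can keep splitting until it fits the training points, noise included.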
9. Visualize your decision tree model
Repeat Question 7 with your trained decision tree model. Your code should generate a plot similar to that of
Question 7. Do not plot the actual tree with graphviz.
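Because this plot mirrors Question 7, one option is to wrap the plotting code in a small helper and pass it the fitted tree. The helper `plot_fit` is a hypothetical name introduced here for illustration, and the styling is again a matter of taste.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor


def plot_fit(model, label, X_train, y_train, X_test, y_test):
    """Hypothetical helper: plot a fitted model's predictions on [0, 2] with the data."""
    x_grid = np.linspace(0.0, 2.0, 100).reshape(-1, 1)  # 100 evenly spaced points
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.scatter(X_train[:, 0], y_train, s=12, alpha=0.6, label="train set")
    ax.scatter(X_test[:, 0], y_test, s=12, alpha=0.6, marker="x", label="test set")
    ax.plot(x_grid[:, 0], model.predict(x_grid), color="black", linewidth=1.5, label=label)
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    ax.legend()
    return fig, ax


# Rebuild the dataset, split, and fitted tree (seed 0 is arbitrary).
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 2.0, size=250)
y = x**3 - 3 * x**2 + 2 * x + rng.normal(0.0, 0.04, size=250)
X_train, X_test, y_train, y_test = train_test_split(
    x.reshape(-1, 1), y, train_size=150, test_size=100, random_state=0
)
tree = DecisionTreeRegressor().fit(X_train, y_train)

fig, ax = plot_fit(tree, "decision tree prediction", X_train, y_train, X_test, y_test)
```

The same helper accepts the K-NN model from Question 7, which makes it straightforward to compare the two curves with identical styling.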
10. Compare decision tree and K-NN
According to the above results, which model is better on your dataset (K-NN or decision tree)? Why?