7.1 Calculating Distance with Categorical Predictors. This exercise with a tiny dataset illustrates the calculation of Euclidean distance and the creation of binary dummies. The online education company Statistics.com segments its customers and prospects into three main categories: IT professionals (IT), statisticians (Stat), and other (Other). It also tracks, for each customer, the number of years since first contact (years). Consider the following customers; information about whether they have taken a course or not (the outcome to be predicted) is included:

Customer 1: Stat, 1 year, did not take course
Customer 2: Other, 1.1 years, took course
a. Consider now the following new prospect:
Prospect 1: IT, 1 year
Using the information above on the two customers and one prospect, create one
dataset for all three with the categorical predictor variable transformed into 2 binaries,
and a similar dataset with the categorical predictor variable transformed into 3 binaries.
b. For each derived dataset, calculate the Euclidean distance between the prospect and
each of the other two customers. (Note: While it is typical to normalize data for
k-NN, this is not an iron-clad rule and you may proceed here without normalization.)
c. Using k-NN with k = 1, classify the prospect as taking or not taking a course using
each of the two derived datasets. Does it make a difference whether you use 2 or 3
dummies?
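
Parts a through c can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the textbook's solution: the helper names (encode, euclidean, nearest_neighbor) are hypothetical, no normalization is applied (as the note in part b permits), and for the 2-binary dataset "Other" is assumed to be the dropped reference level, a choice the exercise does not specify.

```python
import math

# Records from the exercise: (name, category, years since first contact, outcome).
CUSTOMERS = [
    ("Customer 1", "Stat", 1.0, "did not take course"),
    ("Customer 2", "Other", 1.1, "took course"),
]
PROSPECT = ("Prospect 1", "IT", 1.0)


def encode(category, years, levels):
    """Return one 0/1 dummy per entry in `levels`, followed by years.

    A category that is not listed in `levels` becomes all zeros, which is
    how the dropped reference level is represented in the 2-binary dataset.
    """
    return [1.0 if category == level else 0.0 for level in levels] + [years]


def euclidean(u, v):
    """Plain Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_neighbor(levels):
    """Return (distances by customer name, outcome of the single nearest customer)."""
    target = encode(PROSPECT[1], PROSPECT[2], levels)
    distances = {
        name: euclidean(target, encode(category, years, levels))
        for name, category, years, _ in CUSTOMERS
    }
    nearest = min(distances, key=distances.get)
    outcome = next(o for name, _, _, o in CUSTOMERS if name == nearest)
    return distances, outcome


# 3 binaries: one dummy per category.
three = nearest_neighbor(["IT", "Stat", "Other"])
# 2 binaries: "Other" is the reference level (an assumption of this sketch).
two = nearest_neighbor(["IT", "Stat"])

for label, (distances, outcome) in (("3 dummies", three), ("2 dummies", two)):
    rounded = {name: round(d, 4) for name, d in distances.items()}
    print(label, rounded, "-> 1-NN prediction:", outcome)
```

With 3 binaries, the prospect is about 1.414 from Customer 1 and about 1.418 from Customer 2, so 1-NN predicts "did not take course". With the 2 binaries assumed here, the distance to Customer 2 drops to about 1.005, so the prediction flips to "took course". Passing a different pair of levels to nearest_neighbor (for example ["Stat", "Other"]) shows that the 2-binary result depends on which category is dropped.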