Data Mining for Business Analytics: Concepts, Techniques, and Applications with XLMiner
3rd Edition
ISBN: 9781118729274
Author: Galit Shmueli, Peter C. Bruce, Nitin R. Patel
Publisher: WILEY
Expert Solution & Answer
Chapter 2, Problem 3P
Explanation of Solution
Given: Refer to the table 2.5 as the sample of the
To find: Whether the data were sampled randomly, and whether the data in the table are likely to be a useful sample.
Solution:
In the given database table 2.5, the selection of data is linearly ordered...
Students have asked these similar questions
Question 5:
A random sample of 15 patients yielded the following data on the length of stay (in days) in the hospital:
5, 6, 9, 10, 15, 10, 14, 12, 10, 13, 13, 9, 8, 10, 12.
Find the mean, median, and mode.
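As a quick check on Question 5, the three statistics can be computed with Python's standard `statistics` module. This is a minimal sketch; the variable names (`stays`, `stay_mean`, and so on) are illustrative and not part of the original question.

```python
from statistics import mean, median, multimode

# Length of stay (days) for the 15 sampled patients, as listed in the question.
stays = [5, 6, 9, 10, 15, 10, 14, 12, 10, 13, 13, 9, 8, 10, 12]

stay_mean = mean(stays)        # arithmetic average: sum of values / number of values
stay_median = median(stays)    # middle value of the sorted data (the 8th of 15)
stay_modes = multimode(stays)  # list of the most frequent value(s)

print(f"mean   = {stay_mean}")
print(f"median = {stay_median}")
print(f"mode   = {stay_modes}")
```

`multimode` is used rather than `mode` so that ties, if any, would all be reported instead of only the first one found.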
1. Suppose that the data for analysis includes the attribute age. The age values for the data tuples are
(in increasing order) 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36,
40, 45, 46, 52, 70.
Using the data above, please answer the following questions:
b. Use "normalization by decimal scaling" to transform the age value 25.
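Decimal scaling divides every value by 10^j, where j is the smallest integer that brings the largest absolute value below 1. The sketch below applies that definition to the age data; the helper name `decimal_scaling_exponent` is hypothetical and chosen only for illustration.

```python
# Age values from the question, in increasing order.
ages = [13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30,
        33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70]


def decimal_scaling_exponent(values):
    """Return the smallest integer j such that max(|v|) / 10**j < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10**j >= 1:
        j += 1
    return j


j = decimal_scaling_exponent(ages)  # the largest age is 70, so j = 2
scaled_25 = 25 / 10**j              # 25 / 100

print(f"j = {j}, scaled value of 25 = {scaled_25}")
```

Because the largest age is 70, every value is divided by 100, and 25 maps to 0.25.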
First, perform the following tasks:
• Make a linear regression model with all the features in the dataset. Use train_test_split to keep 20% of the data for testing.
• Use your model to predict values for the test set, print the predictions for the first 10 instances of the test data, and compare them with the actual values.
• Print the coefficient values and their corresponding feature names (e.g. age 43, bmi 200, ...).
• Note that you can access feature_names from the diabetes dataset directly.
• Calculate the training MSE, testing MSE, and R-squared value.
Compare the two models. Did using all available features improve the performance?
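One possible way to approach the all-features model is sketched below. It assumes the "diabetes dataset" is the one shipped with scikit-learn (`load_diabetes`), which the question does not state explicitly. The `random_state` value and the choice to report R-squared on the test set are also assumptions. The earlier model that the question asks you to compare against is not shown in the text, so it is not reproduced here.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Assumption: the assignment refers to scikit-learn's bundled diabetes dataset.
diabetes = load_diabetes()
X, y = diabetes.data, diabetes.target

# Hold out 20% of the rows for testing; random_state is fixed only for repeatability.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit ordinary least squares on all available features.
model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# Predictions versus actual values for the first 10 test instances.
for predicted, actual in zip(y_pred[:10], y_test[:10]):
    print(f"predicted = {predicted:8.2f}   actual = {actual:8.2f}")

# Coefficient for each feature, paired with its name.
for name, coef in zip(diabetes.feature_names, model.coef_):
    print(f"{name:>4}  {coef:10.3f}")

# Error metrics on the training and test sets.
train_mse = mean_squared_error(y_train, model.predict(X_train))
test_mse = mean_squared_error(y_test, y_pred)
test_r2 = r2_score(y_test, y_pred)
print(f"training MSE = {train_mse:.2f}")
print(f"testing MSE  = {test_mse:.2f}")
print(f"R-squared    = {test_r2:.3f}")
```

To answer the comparison question, the same three metrics would be computed for the earlier model on the same split and set side by side with these.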
Similar questions
- 1. Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order) 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70. Using the data above, please answer the following questions: a. Use "min-max normalization" to transform the age value 25 onto the range [0.0, 1.0].
- How would you know if the data showed heteroscedasticity?
- Convert your NFA from Homework No. 2 into a DFA. Identify if this DFA is (already) minimized. If it is not, minimize it. Hint: Use the "table filling method" for minimizing your DFA (if needed).
- In a database describing 100 examples of printer failures, 75 are hardware failures and 25 are driver failures. Of the hardware failures, 15 had Windows. Of the driver failures, 15 had Windows. If the probability of a driver failure is 25/100, the probability that the system where a failure occurred was Windows is 30/100, and the probability of a failure in the Windows system given it has been caused by the driver is 15/25, what is the probability that a failure has been caused by a driver knowing that the system is Windows?
- What is a stepwise regression?
- Answer the following questions: Assume that in cone-shaped structures, the measurements for the height and radius of 6 cones are given as 8.28, 8.04, 9.06, 8.70, 7.58, 8.34 and 2.27, 1.98, 1.69, 1.88, 1.64, 2.14 respectively. Write an R program for the scenario given below. (a) Make vectors with the given values. (b) The volume of a cone with radius R and height H is given by (1/3)πR²H. Make a vector with the volumes of the 6 cones. (c) Compute the mean, median, and standard deviation of the cone volumes. (d) Compute also the mean volume for the cones with a height less than 8.5.
- Suppose that the data for analysis includes the attribute age. The age values for the data tuples are (in increasing order) 13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70. Show a boxplot of the data and explain it fully.
- Write the objective function that can be used to determine the regression model parameters. How will this objective function be used to find the model parameters?
- Convert the NFA from the picture to a DFA. Identify if this DFA is (already) minimized. If it is not, minimize it. Hint: Use the "table filling method" for minimizing your DFA (if needed).
- Fit the following data below using Cubic Regression. Terminate if Ea ≤ 0.0001.
- Suppose you wish to study the relationship between the number of 'likes' on Facebook and the number of 'friends' one has on Facebook. Describe the statistical technique you would use.
- Given the following table of students, assignments, and grades for a single class, examine the data in the table and identify any columns that contain data inconsistencies. Place an X in the Consistent or Inconsistent column.
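The printer-failure question in the list above is a direct application of Bayes' theorem, P(driver | Windows) = P(Windows | driver) · P(driver) / P(Windows). A minimal sketch using the probabilities given in that question follows; exact fractions are used to avoid rounding, and the variable names are illustrative.

```python
from fractions import Fraction

# Probabilities stated in the question.
p_driver = Fraction(25, 100)                # P(driver failure)
p_windows = Fraction(30, 100)               # P(system is Windows)
p_windows_given_driver = Fraction(15, 25)   # P(Windows | driver failure)

# Bayes' theorem: P(driver | Windows) = P(Windows | driver) * P(driver) / P(Windows)
p_driver_given_windows = p_windows_given_driver * p_driver / p_windows

print(p_driver_given_windows)         # 1/2
print(float(p_driver_given_windows))  # 0.5
```

The result agrees with a direct count: 30 of the failures involved Windows, and 15 of those were driver failures.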
Recommended textbooks for you
Database System Concepts
Computer Science
ISBN:9780078022159
Author:Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher:McGraw-Hill Education
Starting Out with Python (4th Edition)
Computer Science
ISBN:9780134444321
Author:Tony Gaddis
Publisher:PEARSON
Digital Fundamentals (11th Edition)
Computer Science
ISBN:9780132737968
Author:Thomas L. Floyd
Publisher:PEARSON
C How to Program (8th Edition)
Computer Science
ISBN:9780133976892
Author:Paul J. Deitel, Harvey Deitel
Publisher:PEARSON
Database Systems: Design, Implementation, & Manag...
Computer Science
ISBN:9781337627900
Author:Carlos Coronel, Steven Morris
Publisher:Cengage Learning
Programmable Logic Controllers
Computer Science
ISBN:9780073373843
Author:Frank D. Petruzella
Publisher:McGraw-Hill Education