Data Mining for Business Analytics: Concepts, Techniques, and Applications with XLMiner
3rd Edition
ISBN: 9781118729274
Author: Galit Shmueli, Peter C. Bruce, Nitin R. Patel
Publisher: WILEY
Students have asked these similar questions
Use the R data set “anscombe” and answer the following questions.
a. Save a subset of all the columns of the data except x4 and y4. Use this subset for the rest of the questions below.
b. Add two new columns to the subset from (a) above as follows.
i. Directly (i.e. without using the cbind() function) add a column of random values from a normal distribution with a mean of 100 and a standard deviation of 5. Call set.seed(1) on the line before, for reproducibility.
ii. Using the cbind() function add a column consisting of the following as the data: 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0.
c. Remove column y3.
d. Using the subset function, remove columns x2, y1 and y2.
e. Add the following two new rows to the data frame using the appropriate bind function.
f. Change the data type of the new column from (b)(i) to integer and that from (b)(ii) to binary.
g. Change the names of the first two columns to 'q1x1' and 'q1x3'.
Tables:
1. Brands (sample csv enclosed)
a. Brand ID
b. Brand Name
c. Holding Company ID
d. Holding Company Name
2. Store Order Items (sample csv enclosed)
a. Order ID
b. User ID
c. Product ID
d. Brand ID
e. Unit Price
f. Quantity
g. Date (of sale)
h. Store ID (of sale)
2. For each Product (ID), determine the “store of sale” that made the first sale of the product in 2019.
Write the answer in SQL.
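One possible shape for the "first sale in 2019" query is sketched below. It runs the SQL through Python's built-in sqlite3 module so that it can be tried without a database server. The table name store_order_items, the snake_case column names, the ISO date format and the sample rows are all assumptions made for illustration; the enclosed csv files are not shown on this page.

```python
import sqlite3

# Hypothetical schema and rows, loosely following the column list in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE store_order_items (
        order_id   INTEGER,
        user_id    INTEGER,
        product_id INTEGER,
        brand_id   INTEGER,
        unit_price REAL,
        quantity   INTEGER,
        sale_date  TEXT,    -- assumed ISO format: YYYY-MM-DD
        store_id   INTEGER
    )
    """
)
rows = [
    (1, 10, 100, 1, 9.99, 1, "2018-12-30", 7),  # before 2019, must be ignored
    (2, 11, 100, 1, 9.99, 2, "2019-01-05", 3),
    (3, 12, 100, 1, 9.99, 1, "2019-01-02", 5),
    (4, 13, 200, 2, 4.50, 1, "2019-03-01", 3),
    (5, 14, 200, 2, 4.50, 3, "2019-06-15", 5),
]
conn.executemany(
    "INSERT INTO store_order_items VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows
)

# Find each product's earliest 2019 sale date, then join back to get the store.
FIRST_SALE_2019 = """
    SELECT s.product_id, s.store_id, s.sale_date
    FROM store_order_items AS s
    JOIN (
        SELECT product_id, MIN(sale_date) AS first_date
        FROM store_order_items
        WHERE sale_date >= '2019-01-01' AND sale_date < '2020-01-01'
        GROUP BY product_id
    ) AS f
      ON f.product_id = s.product_id AND f.first_date = s.sale_date
    ORDER BY s.product_id, s.store_id
"""
first_sales = conn.execute(FIRST_SALE_2019).fetchall()
for product_id, store_id, sale_date in first_sales:
    print(product_id, store_id, sale_date)
```

If two stores sold the same product on its first 2019 date, this query returns both rows. Whether that is the desired tie-breaking rule depends on the assignment, which does not say.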
Write a command to generate binomial data with the number of observations (n) being 50, the number of trials (size) being 5 and the probability of success being 0.5, then store the results in a matrix named binomial data with 5 columns.
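The question reads like an R exercise, where something along the lines of matrix(rbinom(50, 5, 0.5), ncol = 5) would likely be the expected answer. As a language-neutral illustration of what that command does, here is a minimal Python sketch using only the standard library. The helper names binomial_draw and binomial_matrix and the seed value are hypothetical, and the random stream will not match R's.

```python
import random


def binomial_draw(size, prob, rng):
    """One binomial draw: the number of successes in `size` independent trials."""
    return sum(rng.random() < prob for _ in range(size))


def binomial_matrix(n, size, prob, ncol, seed=None):
    """Generate n binomial draws and arrange them in a matrix with ncol columns."""
    if n % ncol != 0:
        raise ValueError("n must be divisible by ncol")
    rng = random.Random(seed)
    draws = [binomial_draw(size, prob, rng) for _ in range(n)]
    nrow = n // ncol
    # Fill column by column, which is also R's default for matrix():
    # column j holds draws[j * nrow : (j + 1) * nrow].
    return [[draws[col * nrow + row] for col in range(ncol)] for row in range(nrow)]


binomial_data = binomial_matrix(n=50, size=5, prob=0.5, ncol=5, seed=1)
for row in binomial_data:
    print(row)
```

With 50 observations and 5 columns the result has 10 rows, and every entry is an integer between 0 and 5.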
Chapter 4 Solutions
Similar questions
- Give an example of when you would want to analyze the mean of a dataset.
- MatLab: Load the data flu.mat (you can do this by typing load flu in your script). This data is the flu trends seen in the United States 2005-2006, divided by region. We will use regressions to look at the data during flu season in the Pacific region. Create your x data: have x equal to 1:30. These represent 30 weeks between Oct. 2005 and May 2006. Create your y data: have y equal to flu.Pac(1:30)'. This is the flu trend for each week. Make sure you have an apostrophe after the last parenthesis. Fit the data below with a straight line and with a 2nd order polynomial. Use least-squares regression. Calculate the coefficient of determination (r^2) and the correlation coefficient (r) for each regression. Plot the two regression curves against the data. Which regression is better? Is there a polynomial you think would work better? Describe the data: what does it mean to you?
- Can you explain the Dataset object in detail?
- Use Python code. Your task is: visualization of confirmed cases of COVID-19 by country. (Use pandas to read the data table csv, for example df = pd.read_csv('COVID19.csv'), then make a visualization using that csv table.) The table has the columns Province/State, Country, Lat and Long, followed by one column per date up to 4/7/2020 (most date headers appear only as ######## in the source). The first row begins: Afghanistan 33 65 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0…
- Using the data frame juul from the package ISwR in R studio: 1) Plot histograms of insulin-like growth factor for each sex. Plot the histograms under one another in a single column. Do not include NAs.
- Explain the Clark-Wilson Model.
- We created some models for a dataset and, for each model, we computed its R2 score. The results are presented in the table below:
Model   m1     m2     m3     m4     m5
R2      0.85   0.76   0.87   0.68   0.79
Which model should we use from the ones presented in the table? Justify your answer.
- Please answer all questions.
C. The data below shows results for three treatments and weight results obtained from an experiment on beetles.
Weights:
Treatment 1 (52, 46, 62, 48, 57, 54)
Treatment 2 (66, 49, 64, 53, 68)
Treatment 3 (63, 65, 58, 70, 71, 73)
1. Create a data-frame for the results above.
2. Is there any significant difference between the average weights of beetles in the 3 experimental conditions?
D. The studies below show results for three plant extract treatments (extracts 1-3) and biomass results obtained from an experiment on bacteria.
Biomass:
Extract1 (64, 66, 68, 75, 78, 94, 98, 79, 71, 80)
Extract2 (91, 92, 93, 85, 87, 84, 82, 88, 95, 96)
Extract3 (79, 78, 88, 94, 92, 85, 83, 85, 82, 81)
1. Create a data-frame for the results above.
2. Is there any significant difference between the average biomass of bacteria treated with the three extracts?
- The following table (Training Dataset D) shows the midterm and final exam grades obtained for students in a database course.
Midterm exam (x)   Final exam (y)
72                 84
50                 63
81                 77
74                 78
94                 90
86                 75
59                 49
83                 79
65                 77
33                 52
88                 74
81                 90
a) Plot the data. Do x and y seem to have a linear relationship?
b) Find an equation (F(x)) for the prediction of a student's final exam grade based on the student's midterm grade in the course. Predict the final exam grade of a student who received an 86 on the midterm exam.
c) Find all error measures for the model F(x).
d) Use MS-Excel or SPSS to find the model. Compare it with the model you found in (b).
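For the midterm/final exam question above, part (b) asks for a prediction equation, which is usually taken to mean a simple least-squares line. The sketch below shows one way to compute it with plain Python. The pairing of the numbers into (midterm, final) rows is an assumed reading of the flattened table, and the choice of MAE, MSE and RMSE as "error measures" is also an assumption, since the question does not name them.

```python
# Assumed reading of the flattened table as (midterm, final) pairs.
GRADES = [
    (72, 84), (50, 63), (81, 77), (74, 78), (94, 90), (86, 75),
    (59, 49), (83, 79), (65, 77), (33, 52), (88, 74), (81, 90),
]


def fit_line(points):
    """Least-squares fit of y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def error_measures(points, intercept, slope):
    """Mean absolute error, mean squared error and its root on the training data."""
    residuals = [y - (intercept + slope * x) for x, y in points]
    n = len(residuals)
    mse = sum(r * r for r in residuals) / n
    mae = sum(abs(r) for r in residuals) / n
    return {"MAE": mae, "MSE": mse, "RMSE": mse ** 0.5}


intercept, slope = fit_line(GRADES)
predicted_final = intercept + slope * 86  # student with 86 on the midterm
errors = error_measures(GRADES, intercept, slope)

print(f"F(x) = {intercept:.3f} + {slope:.4f} * x")
print(f"Predicted final for midterm 86: {predicted_final:.1f}")
print(errors)
```

Under that reading of the data, the fitted line is roughly F(x) = 32.03 + 0.58x, which gives a predicted final grade of about 82 for a midterm grade of 86.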
- MatLab: Create a function that calculates the coefficients of the least-squares regression for a 2nd order polynomial. Use LU Decomposition to find the coefficients. The function should have: Input: X and Y. Output: a_0, a_1, a_2, r^2, r. Please test your function on the data below and plot the results. Be sure to save your function and include it in your homework submission!
- Case Study: This case study considers the Credit approval dataset. This file concerns credit card applications. All attribute names and values have been changed to meaningless symbols to protect confidentiality of the data. This dataset is interesting because there is a good mix of attributes: continuous, nominal with small numbers of values, and nominal with larger numbers of values. There are also a few missing values.
Attribute Information:
A1: b, a.
A2: continuous.
A3: continuous.
A4: u, y, l, t.
A5: g, p, gg.
A6: c, d, cc, i, j, k, m, r, q, w, x, e, aa, ff.
A7: v, h, bb, j, n, z, dd, ff, o.
A8: continuous.
A9: t, f.
A10: t, f.
A11: continuous.
A12: t, f.
A13: g, p, s.
A14: continuous.
A15: continuous.
A16: +,- (class attribute)
Question: Apply the KNN algorithm to predict the acceptance or rejection of a credit card application.
- Case Study: This case study considers the Credit approval dataset. This file concerns credit card applications. All attribute names and values have been changed to meaningless symbols to protect confidentiality of the data. This dataset is interesting because there is a good mix of attributes: continuous, nominal with small numbers of values, and nominal with larger numbers of values. There are also a few missing values.
Attribute Information:
A1: b, a.
A2: continuous.
A3: continuous.
A4: u, y, l, t.
A5: g, p, gg.
A6: c, d, cc, i, j, k, m, r, q, w, x, e, aa, ff.
A7: v, h, bb, j, n, z, dd, ff, o.
A8: continuous.
A9: t, f.
A10: t, f.
A11: continuous.
A12: t, f.
A13: g, p, s.
A14: continuous.
A15: continuous.
A16: +,- (class attribute)
Question: Apply the KNN algorithm to predict the acceptance or rejection of a credit card application. Follow the same strategy used in chapter 7. Please solve it using RStudio.
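The credit approval case study mixes continuous and nominal attributes and contains missing values, so a KNN distance has to handle all three. The following Python sketch shows one common way to do that. It is not the chapter 7 strategy the question refers to, and it is written in Python rather than R. The toy records, the "?" missing-value marker, the attribute ranges and the function names are all hypothetical.

```python
from collections import Counter

MISSING = "?"  # assumed marker for a missing value


def attribute_distance(a, b, value_range):
    """Distance in [0, 1] for one attribute.

    value_range is None for a nominal attribute, otherwise the (max - min)
    span used to scale a continuous attribute.
    """
    if a == MISSING or b == MISSING:
        return 1.0  # treat a missing value as maximally different
    if value_range is None:
        return 0.0 if a == b else 1.0
    return abs(a - b) / value_range if value_range > 0 else 0.0


def knn_predict(train, query, ranges, k=3):
    """Majority vote among the k training records closest to `query`."""
    scored = []
    for features, label in train:
        dist = sum(
            attribute_distance(a, b, r)
            for a, b, r in zip(features, query, ranges)
        )
        scored.append((dist, label))
    scored.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy records: (nominal, continuous, nominal) -> class label.
TRAIN = [
    (("b", 30.0, "t"), "+"),
    (("a", 45.0, "t"), "+"),
    (("b", 52.0, "t"), "+"),
    (("a", 22.0, "f"), "-"),
    (("b", 25.0, "f"), "-"),
    (("a", MISSING, "f"), "-"),
]
RANGES = [None, 30.0, None]  # span of the continuous attribute: 52 - 22

prediction = knn_predict(TRAIN, ("b", 40.0, "t"), RANGES, k=3)
print(prediction)
```

On the real dataset the ranges would be computed from the training partition only, and k would normally be chosen by comparing accuracy on a validation partition.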
Recommended textbooks for you
Database System Concepts
Computer Science
ISBN: 9780078022159
Author: Abraham Silberschatz Professor, Henry F. Korth, S. Sudarshan
Publisher: McGraw-Hill Education
Starting Out with Python (4th Edition)
Computer Science
ISBN: 9780134444321
Author: Tony Gaddis
Publisher: PEARSON
Digital Fundamentals (11th Edition)
Computer Science
ISBN: 9780132737968
Author: Thomas L. Floyd
Publisher: PEARSON
C How to Program (8th Edition)
Computer Science
ISBN: 9780133976892
Author: Paul J. Deitel, Harvey Deitel
Publisher: PEARSON
Database Systems: Design, Implementation, & Manag...
Computer Science
ISBN: 9781337627900
Author: Carlos Coronel, Steven Morris
Publisher: Cengage Learning
Programmable Logic Controllers
Computer Science
ISBN: 9780073373843
Author: Frank D. Petruzella
Publisher: McGraw-Hill Education