Weekly Quiz - Hierarchical Clustering and PCA

School: University of Texas
Course: DSBA
Subject: Computer Science
Date: Feb 20, 2024
Type: pdf
Pages: 5
Uploaded by BrigadierRainCat57

Q No: 1 (Correct Answer) Marks: 2/2
Which of the following statements is/are true about the difference between PCA and Hierarchical Clustering?
1. Cluster analysis groups observations while PCA is used for dimensionality reduction.
2. PCA groups observations while cluster analysis is used for dimensionality reduction.
3. PCA can be used to reduce the number of variables in the data whereas cluster analysis cannot.
4. Cluster analysis can be used to reduce the number of variables in the data whereas PCA cannot.

1 and 3 (You Selected)

PCA extracts principal components which capture the highest variance in the data, while clustering forms clusters to maximize homogeneity within the clusters and heterogeneity between the clusters. PCA works column-wise whereas clustering works row-wise.

Q No: 2 (Correct Answer) Marks: 1/1
PCA is used for

reducing dimensions (You Selected)

PCA is a dimensionality reduction technique. From a given set of variables, we compute principal components, each of which captures a share of the variance in the data. By keeping only the principal components that capture most of the variance, we reduce the number of dimensions in the data.
Q No: 3 (Correct Answer) Marks: 2/2
In the case of a dataset with multiple numeric variables with different units of measurement, which of the two statements below hold true?
I. It is necessary to scale data before applying PCA
II. It is necessary to scale data before applying Hierarchical clustering

Both are false
Both are true (You Selected)

Since PCA depends on the variances of the variables and hierarchical clustering depends on distance calculations, both are sensitive to the units of measurement. We need to scale the data so that variables measured on larger scales do not dominate the result.

Q No: 4 (Correct Answer) Marks: 1/1
Covariance matrix is a mathematical representation of

Variance of individual dimensions and covariance between pairs of dimensions (You Selected)

In a covariance matrix, the diagonal values represent the variances of individual attributes, and the off-diagonal values represent the covariance of the attributes corresponding to the respective row and column.
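To make the explanations of Q3 and Q4 concrete, the sketch below builds a covariance matrix by hand for a small made-up dataset and then repeats the calculation after z-score standardization. It uses only the Python standard library. The data (heights in cm, incomes in dollars) and the helper names `covariance_matrix` and `standardize` are invented for this illustration and are not part of the quiz or of any library.

```python
from statistics import mean, stdev

def covariance_matrix(columns):
    """Sample covariance matrix (n - 1 denominator) for a list of columns."""
    n = len(columns[0])
    means = [mean(col) for col in columns]
    return [
        [
            sum((x - mi) * (y - mj) for x, y in zip(ci, cj)) / (n - 1)
            for cj, mj in zip(columns, means)
        ]
        for ci, mi in zip(columns, means)
    ]

def standardize(column):
    """Z-score scaling: subtract the mean, divide by the sample std deviation."""
    m, s = mean(column), stdev(column)
    return [(x - m) / s for x in column]

# Hypothetical data: two variables measured in very different units.
height_cm = [150.0, 160.0, 170.0, 180.0, 190.0]
income_usd = [30000.0, 42000.0, 39000.0, 58000.0, 61000.0]

raw_cov = covariance_matrix([height_cm, income_usd])
scaled_cov = covariance_matrix([standardize(height_cm), standardize(income_usd)])

# Diagonal entries are variances; off-diagonal entries are covariances.
print("raw variances:   ", raw_cov[0][0], raw_cov[1][1])
print("scaled variances:", scaled_cov[0][0], scaled_cov[1][1])
print("scaled covariance (equals the correlation):", scaled_cov[0][1])
```

On the raw data the income variance is many orders of magnitude larger than the height variance, so income would dominate both the first principal component and any Euclidean distance. After standardization both variances are 1 and the off-diagonal entry is the correlation between the two variables.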
Q No: 5 (Incorrect Answer) Marks: 0/2
If we have 4 components in PCA and the percentages of variance explained by them are 10%, 15%, 25%, and 50%, what percentage of variance will be explained by the first principal component?

The magnitude of the eigenvalue corresponding to a principal component determines the percentage of variance it explains in the data. The principal components are ordered by descending eigenvalue. Hence, the first principal component has the highest eigenvalue and correspondingly explains the highest amount of variation in the data, which here is 50%.

Q No: 6 (Correct Answer) Marks: 1/1
Feature elimination techniques reduce dimensionality by creating a few new variables using the original variables.

False (You Selected)

Feature extraction techniques reduce dimensionality by creating a few new variables using the original variables, while feature elimination techniques involve dropping one or more of the original variables.
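The reasoning in Q5 can be checked with a few lines of arithmetic. The sketch below assumes four hypothetical eigenvalues, chosen only so that their shares come out to the 10%, 15%, 25%, and 50% given in the question. The function name `explained_variance_ratio` is an illustrative helper defined here, written in plain Python.

```python
def explained_variance_ratio(eigenvalues):
    """Share of total variance per component, largest eigenvalue first."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    return [value / total for value in ordered]

# Hypothetical eigenvalues, listed in arbitrary order on purpose.
eigenvalues = [0.4, 0.6, 1.0, 2.0]

ratios = explained_variance_ratio(eigenvalues)
first_pc_share = ratios[0]

for rank, share in enumerate(ratios, start=1):
    print(f"PC{rank}: {share:.0%}")
```

The order in which the percentages are listed in the question does not matter. The first principal component is by definition the one with the largest eigenvalue, so it takes the largest share.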
Q No: 7 (Correct Answer) Marks: 2/2
What does measuring the distance between clusters A and B mean in the case of complete linkage?

Minimum distance between pair of records in cluster A and B respectively
Maximum distance between pair of records in cluster A and B respectively (You Selected)
Distance between centroids of cluster A and B
Average distance between pair of records in cluster A and B

In the case of complete linkage, the distance between two clusters is measured as the maximum possible distance between the points in the two different clusters.

Q No: 8 (Correct Answer) Marks: 1/1
Which of the following linkage methods involves analysis of variance of clusters while combining clusters using the agglomerative approach of clustering?

Ward Linkage (You Selected)

The Ward linkage analyzes the variance of clusters. It measures how much the within-cluster sum of squares (WCSS) will increase when one cluster is merged with another, and merges the two clusters for which the increase in WCSS is minimum.
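The linkage definitions in Q7 and Q8 can be compared side by side on a toy example. The sketch below is a minimal illustration using only the Python standard library (`math.dist` requires Python 3.8 or later). The two clusters and all function names are made up for this example; a real analysis would normally rely on a clustering library rather than these helpers.

```python
from itertools import product
from math import dist

def single_linkage(a, b):
    """Minimum distance over all cross-cluster pairs of points."""
    return min(dist(p, q) for p, q in product(a, b))

def complete_linkage(a, b):
    """Maximum distance over all cross-cluster pairs of points."""
    return max(dist(p, q) for p, q in product(a, b))

def average_linkage(a, b):
    """Mean distance over all cross-cluster pairs of points."""
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))

def centroid(points):
    """Coordinate-wise mean of a list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def centroid_linkage(a, b):
    """Distance between the two cluster centroids."""
    return dist(centroid(a), centroid(b))

def wcss(points):
    """Within-cluster sum of squared distances to the centroid."""
    c = centroid(points)
    return sum(dist(p, c) ** 2 for p in points)

def ward_increase(a, b):
    """Increase in total WCSS caused by merging clusters a and b."""
    return wcss(a + b) - wcss(a) - wcss(b)

# Hypothetical clusters in two dimensions.
cluster_a = [(0.0, 0.0), (0.0, 2.0)]
cluster_b = [(3.0, 0.0), (3.0, 2.0)]

print("single:  ", single_linkage(cluster_a, cluster_b))
print("complete:", complete_linkage(cluster_a, cluster_b))
print("average: ", average_linkage(cluster_a, cluster_b))
print("centroid:", centroid_linkage(cluster_a, cluster_b))
print("Ward WCSS increase:", ward_increase(cluster_a, cluster_b))
```

For these points the complete-linkage distance is the diagonal between opposite corners, while single and centroid linkage both give the horizontal gap of 3. Ward's criterion would compare the WCSS increase across every candidate pair of clusters and merge the pair with the smallest increase.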
Q No: 9 (Correct Answer) Marks: 1/1
The angle between any two symmetric eigenvectors for a given matrix is

90 degrees (You Selected)

Eigenvectors of a symmetric matrix (such as a covariance matrix) that correspond to distinct eigenvalues are orthogonal to each other. So, the angle between them is 90 degrees.

Q No: 10 (Correct Answer) Marks: 2/2
For a data matrix X with n rows and p columns, the number of eigenvalues possible for the covariance matrix of X is

p (You Selected)

The covariance matrix of an n x p data matrix is a p x p matrix, so it has p eigenvalues.
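Both Q9 and Q10 can be verified by hand for the smallest interesting case, p = 2. The sketch below uses the closed-form eigen-decomposition of a 2 x 2 symmetric matrix, so it needs only the standard library. The matrix entries are hypothetical and the function `symmetric_2x2_eigen` is an illustrative helper that assumes a non-zero off-diagonal entry; for larger matrices a numerical linear algebra routine would be used instead.

```python
from math import acos, degrees, hypot, sqrt

def symmetric_2x2_eigen(a, b, d):
    """Eigenvalues and unit eigenvectors of [[a, b], [b, d]], assuming b != 0."""
    mid = (a + d) / 2
    radius = sqrt(((a - d) / 2) ** 2 + b ** 2)
    eigenvalues = [mid + radius, mid - radius]
    eigenvectors = []
    for lam in eigenvalues:
        # The first row of (A - lam * I) v = 0 reads (a - lam) * x + b * y = 0,
        # which v = (b, lam - a) satisfies.
        x, y = b, lam - a
        norm = hypot(x, y)
        eigenvectors.append((x / norm, y / norm))
    return eigenvalues, eigenvectors

# Hypothetical 2 x 2 covariance matrix [[4.0, 1.5], [1.5, 2.0]], i.e. p = 2.
values, vectors = symmetric_2x2_eigen(4.0, 1.5, 2.0)

# Angle between the two eigenvectors, from their dot product.
dot = vectors[0][0] * vectors[1][0] + vectors[0][1] * vectors[1][1]
angle = degrees(acos(max(-1.0, min(1.0, dot))))

print("number of eigenvalues:", len(values))
print("eigenvalues:", values)
print("angle between eigenvectors (degrees):", round(angle, 6))
```

The matrix has p = 2 eigenvalues, their sum equals the trace (the total variance), and the angle between the two eigenvectors comes out as 90 degrees up to floating-point error.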