Solve the machine learning problem on Scaled Dot-Product Attention using the given dataset (the Titanic dataset for ML).
1. You need to center and scale (standardize) each of the features. From each column, subtract the mean of that column. Then divide each column by the standard deviation of that column.
2. There is no training step in attention.
3. In the testing section, take any test data point a.
4. Let’s assume that we have n features.
5. We have to figure out whether a survives or not.
Steps 6 and 7 are given in the picture.
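Step 1 can be sketched with NumPy as follows. This is a minimal sketch, not a definitive implementation: the function name `standardize` and the toy numbers are hypothetical, and reusing the training-set mean and standard deviation for the test rows is an assumption that the question does not state.

```python
import numpy as np

def standardize(X_train, X_test):
    """Subtract each column's mean, then divide by its standard deviation.

    ASSUMPTION: the mean and standard deviation come from the training
    columns and are reused for the test rows.
    """
    X_train = np.asarray(X_train, dtype=float)
    X_test = np.asarray(X_test, dtype=float)
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # guard against division by zero for constant columns
    return (X_train - mean) / std, (X_test - mean) / std

# Toy numbers for illustration only (not taken from the Titanic data).
X_train = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
X_test = np.array([[3.0, 30.0]])
Z_train, Z_test = standardize(X_train, X_test)
```

After this step every training column has mean 0 and standard deviation 1, so no single feature (for example fare versus age) dominates the dot product just because of its units.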
The scoring formula you've provided appears to compute a similarity between each feature vector x_i in your training dataset and a test feature vector a. The goal is to calculate a score for every training data point x_i, apply softmax to turn these scores into weights that sum to 1, and use those weights for binary classification (where the two classes are "survived" and "did not survive").
Here's how the code below works:
calculate_score computes the similarity score between a single training data point x_i and the test data point a based on the formula you provided.
The softmax function converts the similarity scores into probabilities.
predict_survival takes the training dataset and a test data point as input. It calculates the scores for each training data point, applies softmax to obtain probabilities, and then classifies the test data point as "survived" (1) or "did not survive" (0) based on a probability threshold (0.5 in this case).
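The three functions described above could look roughly like the sketch below. The formula from the picture is not reproduced in the text, so the score used here, the dot product of x_i and a divided by sqrt(n), is an assumption based on the standard scaled dot-product form. Passing the training labels as `y_train` and taking the softmax-weighted average of those labels as the survival probability are also assumptions, and the toy arrays are made up for illustration.

```python
import numpy as np

def calculate_score(x_i, a):
    # ASSUMPTION: standard scaled dot product, (x_i . a) / sqrt(n),
    # where n is the number of features.
    n = len(a)
    return float(np.dot(x_i, a) / np.sqrt(n))

def softmax(scores):
    scores = np.asarray(scores, dtype=float)
    shifted = scores - scores.max()  # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

def predict_survival(X_train, y_train, a, threshold=0.5):
    # One score per training point, turned into weights that sum to 1.
    scores = np.array([calculate_score(x_i, a) for x_i in X_train])
    weights = softmax(scores)
    # ASSUMPTION: the survival probability is the attention-weighted
    # average of the training labels (1 = survived, 0 = did not survive).
    p_survive = float(np.dot(weights, y_train))
    return int(p_survive >= threshold), p_survive

# Toy standardized data for illustration only.
X_train = np.array([[1.0, 1.0], [-1.0, -1.0], [1.2, 0.8]])
y_train = np.array([1, 0, 1])
a = np.array([1.0, 1.0])
label, p = predict_survival(X_train, y_train, a)
```

Because there is no training step, all the work happens at prediction time: training points that resemble a receive larger weights, so their labels count more in the weighted average.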
You can repeat this process for each test data point and calculate the overall accuracy by comparing the predicted labels with the true labels in your test set.
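The accuracy calculation can be sketched as below. The names `accuracy`, `evaluate`, and `dummy_predict` are hypothetical; `dummy_predict` is only a stand-in for whatever per-point prediction function you use, and the test data are made up.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def evaluate(predict_fn, X_test, y_test):
    # Predict a label for every test point, then compare with the truth.
    y_pred = [predict_fn(a) for a in X_test]
    return accuracy(y_test, y_pred)

# Stand-in predictor for illustration only: predicts 1 when the first
# feature is positive. Replace it with your own prediction function.
def dummy_predict(a):
    return int(a[0] > 0)

X_test = np.array([[0.5], [-0.2], [1.0], [-1.0]])
y_test = np.array([1, 0, 0, 0])
acc = evaluate(dummy_predict, X_test, y_test)
```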