
Question
Suppose that when performing attention, we have the following keys and values:
Keys: {[-3 0 1], [1 1 -1], [0 1 1], [1 0 0]}
Values: {[2 1 1], [1 2 2], [0 3 1], [-1 0 2]}
We want to compute the attention embedding using these keys and values for the following query:
[2 1 -1]
Which of the following is the correct attention embedding?
To simplify calculations, replace softmax with argmax. For example, softmax([-1, 1, 0]) would instead be argmax([-1, 1, 0]) = [0, 1, 0].
[2 1 1]
[1 2 2]
[0 3 1]
[-3 0 1]
Expert Solution
Step 1

Solution:

To compute the attention embedding, we first calculate the attention scores as the dot product between the query and each key vector. Normally these scores are passed through a softmax to obtain normalized weights, and the embedding is the weighted sum of the value vectors using those weights. With the argmax simplification given in the question, the weight vector is one-hot at the highest-scoring key, so the attention embedding is simply the value vector paired with that key.

Step 2

Dot products of the query [2 1 -1] with each key:

[-3 0 1]: (2)(-3) + (1)(0) + (-1)(1) = -7
[1 1 -1]: (2)(1) + (1)(1) + (-1)(-1) = 4
[0 1 1]: (2)(0) + (1)(1) + (-1)(1) = 0
[1 0 0]: (2)(1) + (1)(0) + (-1)(0) = 2

argmax([-7, 4, 0, 2]) = [0, 1, 0, 0], which selects the second value vector. The attention embedding is therefore [1 2 2].
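As a sanity check, here is a minimal NumPy sketch of this argmax-attention computation (the variable names are illustrative, not part of the original question):

```python
import numpy as np

# Keys and values from the question, one row per vector.
keys = np.array([[-3, 0, 1],
                 [1, 1, -1],
                 [0, 1, 1],
                 [1, 0, 0]])
values = np.array([[2, 1, 1],
                   [1, 2, 2],
                   [0, 3, 1],
                   [-1, 0, 2]])
query = np.array([2, 1, -1])

# Attention scores: dot product of the query with each key.
scores = keys @ query            # -> [-7, 4, 0, 2]

# Argmax simplification: a one-hot weight vector replaces softmax.
weights = np.zeros_like(scores)
weights[np.argmax(scores)] = 1   # -> [0, 1, 0, 0]

# Weighted sum of values; with one-hot weights this picks one row.
embedding = weights @ values     # -> [1, 2, 2]
print(embedding)
```

Running this prints [1 2 2], matching the hand computation above.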
