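The einsum operation gives a single, consistent way to express inner products, outer products, transposes, and matrix multiplications of matrices and vectors.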

The code here is largely adapted from the following post:

"Einsum 사용하기" (Using Einsum), baekyeongmin.github.io

einsum examples

Einsum¶

Einstein Summation Convention¶

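A notation for writing summation (sigma) over a set of indices concisely.

It can express inner products, outer products, transposes, and matrix products of matrices and vectors.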

Einstein Notation¶

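For $A \in \mathbb{R}^{I\times K}$ and $B \in \mathbb{R}^{K\times J}$, multiplying $A_{ik} \cdot B_{kj}$ (summing over the shared index $k$) gives an output of dimension $[I,J]$.
Summing that result over $i$ then leaves a vector indexed by $j$: $C_j = \sum_i \sum_k A_{ik} B_{kj} = A_{ik} B_{kj}$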

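In Einstein notation (the right-hand side), the sigma symbol is omitted in the following cases¶

The sigma over an index that is repeated across operands and summed over (here $k$)
The sigma over an index that does not appear in the final result $C_j$ (here $i$)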
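As a rough illustration of the two rules above, the sketch below computes $C_j = \sum_i \sum_k A_{ik} B_{kj}$ twice: once with explicit loops and once with an einsum equation. The sizes `I`, `K`, `J` and the use of `np.einsum` are assumptions made only for this example.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
I, K, J = 2, 3, 4
A = np.arange(I * K).reshape(I, K)
B = np.arange(K * J).reshape(K, J)

# Explicit double sum: C_j = sum_i sum_k A[i, k] * B[k, j]
C_loop = np.zeros(J, dtype=A.dtype)
for j in range(J):
    for i in range(I):
        for k in range(K):
            C_loop[j] += A[i, k] * B[k, j]

# Einstein notation: k is repeated and i is absent from the output,
# so both are summed without writing any sigma.
C_einsum = np.einsum("ik,kj->j", A, B)

print(C_loop)
print(C_einsum)
```

Both prints should show the same vector of length `J`.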

Simple vector operations¶

Inner (dot) product of $A, B \in \mathbb{R}^I$: $c = \sum_i A_i B_i = A_i B_i$

Outer product of $A, B \in \mathbb{R}^I$: $C_{ij} = A_i B_j$

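More complex matrix operations¶

Projecting the last dimension $K$ of a 3-D tensor $T \in \mathbb{R}^{N\times T\times K}$ with $W\in \mathbb{R}^{K\times Q}$: $C_{ntq} = \sum_k T_{ntk} W_{kq} = T_{ntk} W_{kq}$
e.g. with batch size $N$, sequence length $T$, and word-embedding dimension $K$, projecting the embeddings to another dimension $Q$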
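The projection above might be written as the equation `"ntk,kq->ntq"`. The following is a minimal sketch using `np.einsum`; the concrete sizes and the names `X` (standing in for the 3-D tensor) and `T_len` (standing in for the sequence length) are assumptions for illustration.

```python
import numpy as np

# Hypothetical sizes: batch N, sequence length T_len, embedding K, target Q.
N, T_len, K, Q = 2, 5, 4, 3
rng = np.random.default_rng(0)
X = rng.standard_normal((N, T_len, K))  # the 3-D tensor from the text
W = rng.standard_normal((K, Q))         # projection matrix

# k appears in both operands but not in the output, so it is summed over.
Y = np.einsum("ntk,kq->ntq", X, W)

print(Y.shape)  # (2, 5, 3)
```

For this particular case the result should match the broadcasted matrix product `X @ W`.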

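For a 4-D tensor $T \in \mathbb{R}^{N\times T\times K\times M}$:
1) Project the 3rd dimension of $T$ with the $W$ defined above
2) Sum over the 2nd dimension
3) Transpose the 1st and last dimensions
Combined: $C_{mqn} = \sum_t \sum_k T_{ntkm} W_{kq} = T_{ntkm} W_{kq}$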
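The three steps above can be folded into one equation, `"ntkm,kq->mqn"`. The sketch below compares that single call against doing the steps one at a time; the sizes and the names `X`, `T_len`, `step1` to `step3` are hypothetical.

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
N, T_len, K, M, Q = 2, 3, 4, 5, 6
rng = np.random.default_rng(0)
X = rng.standard_normal((N, T_len, K, M))  # the 4-D tensor from the text
W = rng.standard_normal((K, Q))

# All three steps in a single equation.
C = np.einsum("ntkm,kq->mqn", X, W)

# The same computation, one step at a time.
step1 = np.einsum("ntkm,kq->ntqm", X, W)  # 1) project the 3rd dimension K -> Q
step2 = step1.sum(axis=1)                 # 2) sum over the 2nd dimension -> (N, Q, M)
step3 = step2.transpose(2, 1, 0)          # 3) swap 1st and last -> (M, Q, N)

print(C.shape)  # (5, 6, 2)
```

The order of the letters to the right of `->` is what performs the transpose, so no separate transpose call is needed in the one-line version.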

Applying einsum in NumPy, PyTorch, and TensorFlow¶

numpy : np.einsum
torch : torch.einsum
tensorflow : tf.einsum
Each takes an equation and the operands as arguments.

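equation (string)¶

An expression made of letters (lowercase by convention), one for each index of each operand
The meaning differs on the left and right of "->"
The left side lists the indices of each operand, with operands separated by ","
The right side lists the indices of the output (any index that does not appear in the output is summed over after the operands are multiplied)

operands (Tensor)¶

The tensors on which the operation is performed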
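To make the roles of the two sides of `->` concrete, here is a small sketch with `np.einsum`. The arrays are arbitrary illustration values; the only thing that changes between the two calls is which indices are kept on the right-hand side.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)   # indices "ij"
B = np.arange(12).reshape(3, 4)  # indices "jk"

# j is missing from the output, so it is summed over: a matrix product.
full = np.einsum("ij,jk->ik", A, B)

# j and k are both missing from the output, so both are summed over.
rows = np.einsum("ij,jk->i", A, B)

print(full.shape)  # (2, 4)
print(rows.shape)  # (2,)
```

Dropping `k` from the output is the same as summing each row of the full product.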

Example¶

In [11]:

import torch

Transpose¶

In [13]:

# 2-D Matrix Transpose
A = torch.randint(1, 10, (2,2))
print(A)
B = torch.einsum("ij->ji",A)
print(B)

tensor([[5, 2],
        [1, 6]])
tensor([[5, 1],
        [2, 6]])

Sum¶

In [14]:

# Sum of all elements of a 2-D matrix
A = torch.randint(1, 10, (2,2))
print(A)
B = torch.einsum("ij->",A)
print(B)

tensor([[4, 7],
        [3, 1]])
tensor(15)

Column/Row Sum¶

In [16]:

# Column sum: the row index i is dropped from the output, so it is summed over
A = torch.randint(1,10,(2,2))
print(A)
B = torch.einsum('ij->j',A)
print(B)

tensor([[3, 4],
        [6, 5]])
tensor([9, 9])

In [17]:

# Row sum: the column index j is dropped from the output, so it is summed over
A = torch.randint(1,10,(2,2))
print(A)
B = torch.einsum('ij->i', A)
print(B)

tensor([[9, 3],
        [6, 8]])
tensor([12, 14])
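Which equation yields column sums and which yields row sums is easy to mix up. As a quick cross-check, the sketch below compares each equation with the matching `sum(axis=...)` call, using `np.einsum` and a small fixed matrix chosen for illustration.

```python
import numpy as np

A = np.array([[3, 4],
              [6, 5]])

col_sum = np.einsum("ij->j", A)  # sum over i (down each column)
row_sum = np.einsum("ij->i", A)  # sum over j (across each row)
total = np.einsum("ij->", A)     # sum over both indices

print(col_sum)  # [9 9]
print(row_sum)  # [ 7 11]
print(total)    # 18
```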

Matrix-Vector and Matrix-Matrix Multiplication¶

In [30]:

# Matrix-vector product: the shared index j is summed over
A = torch.randint(0,10,(2,3))
B = torch.randint(0,10,(3,))
print(A)
print(B)

C = torch.einsum('ij,j->i',A,B)
print(C)

tensor([[1, 4, 6],
        [6, 2, 1]])
tensor([6, 6, 4])
tensor([54, 52])

In [31]:

# Matrix-matrix product: the shared index j is summed over
A = torch.randint(0,10,(2,3))
B = torch.randint(0,10,(3,2))
print(A)
print(B)

C = torch.einsum('ij,jk->ik',A,B)
print(C)

tensor([[7, 1, 0],
        [2, 2, 2]])
tensor([[6, 6],
        [0, 0],
        [4, 9]])
tensor([[42, 42],
        [20, 30]])

Dot/Outer/Hadamard Product¶

In [36]:

# Dot product: i is shared and absent from the output, so the result is a scalar
A = torch.arange(0,4)
B = torch.arange(2,6)
print(A)
print(B)

C = torch.einsum('i,i->',A,B)
print(C)

tensor([0, 1, 2, 3])
tensor([2, 3, 4, 5])
tensor(26)

In [44]:

# Outer product: no index is shared, and both are kept in the output
A = torch.arange(6)
B = torch.arange(6)
print(A)
print(B)

C = torch.einsum('i,j->ij',A,B)
print(C)

tensor([0, 1, 2, 3, 4, 5])
tensor([0, 1, 2, 3, 4, 5])
tensor([[ 0,  0,  0,  0,  0,  0],
        [ 0,  1,  2,  3,  4,  5],
        [ 0,  2,  4,  6,  8, 10],
        [ 0,  3,  6,  9, 12, 15],
        [ 0,  4,  8, 12, 16, 20],
        [ 0,  5, 10, 15, 20, 25]])

In [45]:

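# Hadamard (element-wise) product: every index is kept, so nothing is summed
# reshape is used here because Tensor.resize is deprecated
A = torch.arange(6).reshape(2,3)
B = torch.arange(6).reshape(2,3)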
print(A)
print(B)

C = torch.einsum('ij,ij->ij',A,B)
print(C)

tensor([[0, 1, 2],
        [3, 4, 5]])
tensor([[0, 1, 2],
        [3, 4, 5]])
tensor([[ 0,  1,  4],
        [ 9, 16, 25]])
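The three equations in this section differ only in which indices are shared and which are kept. As a hedged cross-check, the sketch below compares each one with the corresponding built-in operation, using `np.einsum` and small illustrative vectors `a` and `b`.

```python
import numpy as np

a = np.arange(0, 4)  # [0 1 2 3]
b = np.arange(2, 6)  # [2 3 4 5]

dot = np.einsum("i,i->", a, b)        # shared index, summed: scalar
outer = np.einsum("i,j->ij", a, b)    # distinct indices, both kept: matrix
hadamard = np.einsum("i,i->i", a, b)  # shared index, kept: element-wise product

print(dot)          # 26
print(outer.shape)  # (4, 4)
print(hadamard)     # [ 0  3  8 15]
```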

Batch Matrix Multiplication¶

Perform a matrix multiplication for each batch and check the result.

In [47]:

i,j,k,l = 2,1,2,3

A = torch.randint(0,10,(i,j,k))
print(A.shape)
print(A)

B = torch.randint(0,10,(i,k,l))
print(B.shape)
print(B)

C = torch.einsum('ijk,ikl->ijl',A,B)

torch.Size([2, 1, 2])
tensor([[[3, 2]],

        [[3, 6]]])
torch.Size([2, 2, 3])
tensor([[[0, 9, 7],
         [5, 8, 0]],

        [[3, 0, 0],
         [4, 7, 8]]])
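The cell above computes `C` but does not print it. As a check, the sketch below feeds the same `A` and `B` values shown in the output into `np.einsum` (used here as a stand-in that accepts the same equation string) and compares the result with a batched matrix product.

```python
import numpy as np

# Same values as the printed tensors above.
A = np.array([[[3, 2]],
              [[3, 6]]])            # shape (2, 1, 2): indices "ijk"
B = np.array([[[0, 9, 7],
               [5, 8, 0]],
              [[3, 0, 0],
               [4, 7, 8]]])         # shape (2, 2, 3): indices "ikl"

# i is the batch index: it appears in both operands and in the output,
# so it is kept; only k is summed over.
C = np.einsum("ijk,ikl->ijl", A, B)

print(C.shape)  # (2, 1, 3)
print(C)
# [[[10 43 21]]
#
#  [[33 42 48]]]
```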

Bilinear Transformation¶

einsum is not limited to two operands; it can take more.
A bilinear transformation is one example.

In [53]:

i, j, k, l= 2,3,2,2
A = torch.randint(0,10, (i,k))
X = torch.randint(0,10, (j,k,l))
B = torch.randint(0,10, (i,l))

print(A.shape)
print(A)
print(X.shape)
print(X)
print(B.shape)
print(B)

D = torch.einsum('ik,jkl,il->ij',A,X,B)

torch.Size([2, 2])
tensor([[6, 7],
        [0, 3]])
torch.Size([3, 2, 2])
tensor([[[5, 3],
         [6, 0]],

        [[6, 5],
         [2, 3]],

        [[4, 8],
         [1, 4]]])
torch.Size([2, 2])
tensor([[3, 5],
        [0, 3]])
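The cell above does not print `D`. The sketch below reuses the printed `A`, `X`, and `B` values with `np.einsum` and compares the result against an explicit loop computing $D_{ij} = \sum_k \sum_l A_{ik} X_{jkl} B_{il}$. The name `D_ref` is introduced only for this comparison.

```python
import numpy as np

# Same values as the printed tensors above.
A = np.array([[6, 7],
              [0, 3]])              # indices "ik"
X = np.array([[[5, 3], [6, 0]],
              [[6, 5], [2, 3]],
              [[4, 8], [1, 4]]])    # indices "jkl"
B = np.array([[3, 5],
              [0, 3]])              # indices "il"

# k and l are absent from the output, so both are summed over.
D = np.einsum("ik,jkl,il->ij", A, X, B)

# Reference: for each (i, j), compute A[i]^T X[j] B[i].
D_ref = np.zeros((2, 3), dtype=D.dtype)
for i in range(2):
    for j in range(3):
        D_ref[i, j] = A[i] @ X[j] @ B[i]

print(D)
# [[306 405 473]
#  [  0  27  36]]
```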

