Let's Do Machine Learning, Chapter 6: TensorFlow Basics

What Is a Tensor?

A tensor is a mathematical object used in linear algebra.

It generalizes scalars, vectors, matrices, and n-dimensional arrays.

Tensors were first introduced in 19th-century differential geometry and are used in physics, engineering, and many other fields.

Rank

Every tensor has a rank, which is the number of dimensions (axes) of the tensor.

Rank is also called order.

This is because operations are defined starting from the lower ranks and then extended to higher ranks.

Rank	Data type
0	Scalar (0-order-tensor)
1	Vector (1-order-tensor)
2	Matrix (2-order-tensor)
3	(3-order-tensor)
…	…
n	(n-order-tensor)
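
To make the table concrete, the following is a minimal sketch that builds one tensor per row and asks TensorFlow for its rank. The tensor values and the dictionary keys are arbitrary choices for illustration, not part of the original text.

```python
import tensorflow as tf

# One tensor per row of the rank table; the values themselves are arbitrary.
tensors = {
    "scalar": tf.constant(3.0),
    "vector": tf.constant([1.0, 2.0]),
    "matrix": tf.constant([[1.0, 2.0], [3.0, 4.0]]),
    "rank-3": tf.zeros((2, 3, 4)),
}

# tf.rank returns the number of axes as a scalar tensor; int() extracts it.
ranks = {name: int(tf.rank(t)) for name, t in tensors.items()}

for name, t in tensors.items():
    print(f"{name}: rank={ranks[name]}, shape={tuple(t.shape)}")
```

Note that the rank is simply the length of the shape tuple, so a scalar has the empty shape `()`.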

Tensor Example

To experiment with tensors, let's use TensorFlow, a machine learning framework.

The code below expresses a linear function with TensorFlow + Keras.

import tensorflow as tf
from tensorflow.keras.layers import Dense 

x = tf.constant([[10., 20.], [30., 40.], [50., 60.]])

dense = Dense(units = 1) # Linear Function

y = dense(x) # Initialize W & Feed Forward
W, b = dense.get_weights()

print(f'y = x﹒{W} + {b}')
print(f'x.shape: {x.shape} W.shape: {W.shape} B.shape: {b.shape}')
print(y)

y = x﹒[[0.08496201]
 [0.19183493]] + [0.]
x.shape: (3, 2) W.shape: (2, 1) B.shape: (1,)
tf.Tensor(
[[ 4.6863184]
 [10.222258 ]
 [15.758196 ]], shape=(3, 1), dtype=float32)

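The x (input), W (weights), and b (bias) that appear in machine learning are all matrices or vectors, and the computation combines them with matrix and vector operations.

In other words, these are tensor operations.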
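
As a rough check of that claim, the sketch below writes the linear function y = xW + b out by hand with a matrix product and an addition. The values of `W` and `b` here are hypothetical, fixed numbers chosen so the arithmetic is easy to follow; they are not the randomly initialized weights printed above.

```python
import tensorflow as tf

x = tf.constant([[10., 20.], [30., 40.], [50., 60.]])  # shape (3, 2)

# Hypothetical fixed parameters (assumption: not the Dense layer's real weights).
W = tf.constant([[0.5], [0.25]])  # shape (2, 1)
b = tf.constant([1.0])            # shape (1,)

# The linear function y = xW + b as explicit tensor operations.
y = tf.linalg.matmul(x, W) + b    # (3, 2) @ (2, 1) -> (3, 1), then b is added to each row

print(y.numpy())
```

The first row works out to 10 * 0.5 + 20 * 0.25 + 1 = 11, and the result has shape (3, 1), matching the shape of the Dense layer's output.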

Declaring Constant Tensors

A constant tensor can be declared with the tf.constant function.

For a constant tensor, the value of the tensor object does not change during computation.

Consider a = tf.constant(10).

What cannot change here is the tf.Tensor object. The name a is just a variable, so it can be reassigned to something else.

a does not become the tf.Tensor itself. Rather, a is a reference variable that points to the tf.Tensor (the pointer concept).

If you want to understand this in more detail, try C or C++, which use pointers directly.
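
The distinction between the name and the object can be sketched in a few lines. The names `alias`, `t`, and `assignment_rejected` are introduced only for this illustration.

```python
import tensorflow as tf

a = tf.constant(10)
alias = a              # a second name pointing at the same tf.Tensor object

a = a + 1              # creates a NEW tensor and rebinds the name a to it

print(alias.numpy())   # the original tensor object still holds its old value
print(a.numpy())       # the name a now refers to the new tensor
print(a is alias)      # the two names no longer refer to the same object

# Writing into an existing tensor is rejected, because tf.Tensor is immutable.
t = tf.constant([1, 2, 3])
try:
    t[0] = 99
    assignment_rejected = False
except TypeError:
    assignment_rejected = True
print(assignment_rejected)
```

So "changing a" always means pointing the name at a different tensor; the tensor that was created first is never modified.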

import tensorflow as tf

# RANK-0
a = tf.constant(10.)
b = tf.constant(-5.)
c = a + b
d = a * b
e = a - b
f = a / b
print('rank-0')
print('a', a)
print('b', b)
print('c', c)
print('d', d)
print('e', e)
print('f', f)
print()

# RANK-1
a = tf.constant([5., -4.])
b = tf.constant([-2., -3.])
c = a + b
d = a * b
e = a - b
f = a / b
g = tf.tensordot(a, b, axes=1)
print('rank-1')
print('a', a)
print('b', b)
print('c', c)
print('d', d)
print('e', e)
print('f', f)
print('g', g)
print()

# RANK-2
a = tf.constant([[1., 2.], [3., 4.]])
b = tf.constant([[5., 6.], [7., 8.]])
c = a + b
d = a * b # Hadamard (element-wise) product
e = a - b
f = a / b
g = tf.linalg.matmul(a, b) # matrix product
print('rank-2')
print('a', a)
print('b', b)
print('c', c)
print('d', d)
print('e', e)
print('f', f)
print('g', g)
print()

rank-0
a tf.Tensor(10.0, shape=(), dtype=float32)
b tf.Tensor(-5.0, shape=(), dtype=float32)
c tf.Tensor(5.0, shape=(), dtype=float32)
d tf.Tensor(-50.0, shape=(), dtype=float32)
e tf.Tensor(15.0, shape=(), dtype=float32)
f tf.Tensor(-2.0, shape=(), dtype=float32)

rank-1
a tf.Tensor([ 5. -4.], shape=(2,), dtype=float32)
b tf.Tensor([-2. -3.], shape=(2,), dtype=float32)
c tf.Tensor([ 3. -7.], shape=(2,), dtype=float32)
d tf.Tensor([-10.  12.], shape=(2,), dtype=float32)
e tf.Tensor([ 7. -1.], shape=(2,), dtype=float32)
f tf.Tensor([-2.5        1.3333334], shape=(2,), dtype=float32)
g tf.Tensor(2.0, shape=(), dtype=float32)

rank-2
a tf.Tensor(
[[1. 2.]
 [3. 4.]], shape=(2, 2), dtype=float32)
b tf.Tensor(
[[5. 6.]
 [7. 8.]], shape=(2, 2), dtype=float32)
c tf.Tensor(
[[ 6.  8.]
 [10. 12.]], shape=(2, 2), dtype=float32)
d tf.Tensor(
[[ 5. 12.]
 [21. 32.]], shape=(2, 2), dtype=float32)
e tf.Tensor(
[[-4. -4.]
 [-4. -4.]], shape=(2, 2), dtype=float32)
f tf.Tensor(
[[0.2        0.33333334]
 [0.42857143 0.5       ]], shape=(2, 2), dtype=float32)
g tf.Tensor(
[[19. 22.]
 [43. 50.]], shape=(2, 2), dtype=float32)

A TensorFlow Tensor holds a NumPy-compatible array and has .shape and .dtype attributes.

Since TensorFlow 2.0 executes eagerly by default and no longer relies on sessions, a Tensor is evaluated with .numpy() instead of .eval().
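
A minimal sketch of these attributes and of the round trip between a tensor and a NumPy array follows. The names `arr` and `back` are illustrative only.

```python
import numpy as np
import tensorflow as tf

t = tf.constant([[1., 2.], [3., 4.]])

# Attributes available directly on the tensor.
print(t.shape, t.dtype)

# .numpy() materializes the values as a NumPy array (eager execution).
arr = t.numpy()
print(type(arr).__name__, arr.shape, arr.dtype)

# A NumPy array can be turned back into a tensor with tf.constant.
back = tf.constant(arr)
print(back.dtype)
```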

Initialization Functions

With initialization functions you can create zero vectors, zero matrices, identity matrices, diagonal matrices, random tensors, and so on.

import tensorflow as tf

a = tf.zeros(2) # 2-D zero vector
print('a', a)

b = tf.ones((4, 4)) # 4x4 matrix of ones
print('b', b)

c = tf.eye(4) # 4x4 Identity Matrix
print('c', c)

d = tf.fill((3, 2), value=5.) # 3x2 Matrix
print('d', d)

e = tf.linalg.diag(tf.range(1, 5, 1)) # Diagonal Matrix
print('e', e)

f = tf.random.normal((2, 2), mean=0, stddev=1) # Normal Distribution
print('f', f)

g = tf.random.uniform((2, 2), minval=-2, maxval=2) # Uniform Distribution
print('g', g)

a tf.Tensor([0. 0.], shape=(2,), dtype=float32)
b tf.Tensor(
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]], shape=(4, 4), dtype=float32)
c tf.Tensor(
[[1. 0. 0. 0.]
 [0. 1. 0. 0.]
 [0. 0. 1. 0.]
 [0. 0. 0. 1.]], shape=(4, 4), dtype=float32)
d tf.Tensor(
[[5. 5.]
 [5. 5.]
 [5. 5.]], shape=(3, 2), dtype=float32)
e tf.Tensor(
[[1 0 0 0]
 [0 2 0 0]
 [0 0 3 0]
 [0 0 0 4]], shape=(4, 4), dtype=int32)
f tf.Tensor(
[[ 0.26831564 -1.0274279 ]
 [-0.26385054 -2.047377  ]], shape=(2, 2), dtype=float32)
g tf.Tensor(
[[-1.0781832   1.899333  ]
 [-0.43762493 -1.9079366 ]], shape=(2, 2), dtype=float32)
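
In the output above, every tensor is float32 except e, which is int32. The sketch below suggests why: tf.range with integer arguments yields integers, and most initializers accept an explicit dtype. The variable names are made up for this example.

```python
import tensorflow as tf

# Most initializers default to float32 but accept an explicit dtype.
z = tf.zeros((2, 2), dtype=tf.int32)

# Integer arguments to tf.range produce int32, so the diagonal matrix is int32.
e_int = tf.linalg.diag(tf.range(1, 5, 1))

# Float arguments produce a float32 diagonal matrix instead.
e_float = tf.linalg.diag(tf.range(1., 5., 1.))

# The *_like helpers copy shape and dtype from an existing tensor.
o = tf.ones_like(e_float)

print(z.dtype, e_int.dtype, e_float.dtype, o.dtype)
```

Mixing int32 and float32 tensors in one arithmetic expression raises an error in TensorFlow, so choosing the dtype at creation time is usually worth the extra argument.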

Manipulating Tensor Shapes

The tf.reshape function turns a tensor into one with a different shape that has the same number of elements.

import tensorflow as tf

a = tf.constant([[1, 2], [3, 4], [5, 6]])
print(f'a: {a}, a.shape: {a.shape}')
b = tf.reshape(a, (6, )) # flatten
print(f'b: {b}, b.shape: {b.shape}')
c = tf.reshape(b, (2, 3))
print(f'c: {c}, c.shape: {c.shape}')

a: [[1 2]
 [3 4]
 [5 6]], a.shape: (3, 2)
b: [1 2 3 4 5 6], b.shape: (6,)
c: [[1 2 3]
 [4 5 6]], c.shape: (2, 3)
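
Two details of tf.reshape are worth a short sketch: a dimension given as -1 is inferred from the others, and a target shape with a different element count is rejected. The names `flat`, `wide`, and `mismatch_rejected` are illustrative.

```python
import tensorflow as tf

a = tf.constant([[1, 2], [3, 4], [5, 6]])  # 6 elements, shape (3, 2)

flat = tf.reshape(a, (-1,))     # -1 is inferred from the element count: (6,)
wide = tf.reshape(a, (2, -1))   # the second dimension is inferred: (2, 3)

print(flat.shape, wide.shape)

# A target shape with 8 elements cannot hold 6 values, so reshape fails.
try:
    tf.reshape(a, (4, 2))
    mismatch_rejected = False
except (tf.errors.InvalidArgumentError, ValueError):
    mismatch_rejected = True
print(mismatch_rejected)
```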

Broadcasting

Broadcasting is a concept that comes from NumPy. It is used, for example, when adding a matrix and a vector of a different size.

Such an addition is not defined in standard linear algebra, but the vector is broadcast (expanded) into a shape that makes the computation possible.

https://i.ibb.co/5YZ6Gsk/2022-03-03-1-37-24.png

import tensorflow as tf

a = tf.ones((2, 2))
print('a', a)

b = tf.ones(1)
print('b', b)

print('a+b', a + b)

a tf.Tensor(
[[1. 1.]
 [1. 1.]], shape=(2, 2), dtype=float32)
b tf.Tensor([1.], shape=(1,), dtype=float32)
a+b tf.Tensor(
[[2. 2.]
 [2. 2.]], shape=(2, 2), dtype=float32)
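
A slightly larger case makes the expansion visible. This sketch adds a (3,) vector to a (2, 3) matrix and compares the result with an explicit expansion through tf.broadcast_to; the values are arbitrary.

```python
import tensorflow as tf

m = tf.constant([[1., 2., 3.], [4., 5., 6.]])  # shape (2, 3)
v = tf.constant([10., 20., 30.])               # shape (3,)

# Implicit broadcasting: v is treated as if it were repeated for each row of m.
implicit = m + v

# The same expansion written out by hand.
explicit = m + tf.broadcast_to(v, (2, 3))

print(implicit.numpy())
print(bool(tf.reduce_all(implicit == explicit)))
```

Shapes are compared from the last axis backwards, and an axis is compatible when the sizes match or one of them is 1, which is why the (1,) vector in the earlier example could be added to a (2, 2) matrix.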

Tensor Variables

A tf.Variable object acts as a variable that holds a tf.Tensor.

Through the .assign() method, it can be assigned a different tf.Tensor.

Let's think about this from an engineering perspective.

For a = tf.Variable(tf.constant([1., 2.])), the reference structure becomes a -> tf.Variable -> tf.Tensor.

tf.Variable behaves like a variable because the reference it holds to a tf.Tensor can be replaced.

import tensorflow as tf

a = tf.constant([1., 2.])
v_a = tf.Variable(a)

print(v_a)

v_a.assign([3., 4.])

print(v_a)

<tf.Variable 'Variable:0' shape=(2,) dtype=float32, numpy=array([1., 2.], dtype=float32)>
<tf.Variable 'Variable:0' shape=(2,) dtype=float32, numpy=array([3., 4.], dtype=float32)>
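
The point that the tf.Variable object itself stays the same while its contents change can be sketched as follows. Besides .assign(), this uses .assign_add() and .assign_sub(), which are not mentioned in the text above; the names `v` and `same_object` are illustrative.

```python
import tensorflow as tf

v = tf.Variable([1., 2.])
same_object = v            # a second name for the same tf.Variable

v.assign([3., 4.])         # replace the stored value
v.assign_add([1., 1.])     # add to the stored value in place
v.assign_sub([0.5, 0.5])   # subtract from the stored value in place

print(v.numpy())
print(v is same_object)    # the Python object never changed, only its contents
```

Contrast this with a constant tensor, where `a = a + 1` creates a new object and rebinds the name.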

This makes it possible to represent the weights that are updated during machine learning.

import tensorflow as tf

w_init = tf.transpose(tf.constant([[1., 2.]]))
W = tf.Variable(w_init)
x = tf.constant([[2., 3.]])

y = tf.linalg.matmul(x, W)

print(y)

w_add = tf.constant([1.])
W.assign(W + w_add) # add 1 to each weight value

y = tf.linalg.matmul(x, W)

print(y)

tf.Tensor([[8.]], shape=(1, 1), dtype=float32)
tf.Tensor([[13.]], shape=(1, 1), dtype=float32)
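
In actual training the update is not a fixed +1 but is derived from a loss. The following is only a rough sketch of one gradient descent step on the same W and x, going slightly beyond what this chapter covers. The `target` value, the squared-error loss, and `learning_rate` are assumptions made up for the illustration.

```python
import tensorflow as tf

W = tf.Variable([[1.], [2.]])
x = tf.constant([[2., 3.]])
target = tf.constant([[10.]])   # hypothetical desired output (assumption)
learning_rate = 0.01            # hypothetical step size (assumption)

def loss_fn():
    # Squared error between the prediction xW and the target.
    return tf.reduce_sum((tf.linalg.matmul(x, W) - target) ** 2)

# Record the forward pass so TensorFlow can differentiate it.
with tf.GradientTape() as tape:
    loss_before = loss_fn()
grad = tape.gradient(loss_before, W)   # d(loss)/dW, same shape as W

# Move the weights a small step against the gradient.
W.assign_sub(learning_rate * grad)
loss_after = loss_fn()

print(float(loss_before), float(loss_after))
```

Because W is a tf.Variable, the tape tracks it automatically, and assign_sub updates it in place so the next forward pass sees the new weights.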