TensorFlow Notes (Lecture 2)

The TensorFlow Framework

Tensors, Computation Graphs, and Sessions

A neural network (NN) in TensorFlow: represent data with tensors, build the network as a computation graph, execute the graph in a session, and optimize the weights (parameters) on the connections to obtain a trained model.

Tensor

  • A multidimensional array (list); can represent arrays of rank 0 through n
  • Rank: the number of dimensions of a tensor

| Dimensionality | Rank | Name   | Example                                       |
|----------------|------|--------|-----------------------------------------------|
| 0-D            | 0    | scalar | s = 1                                         |
| 1-D            | 1    | vector | v = [1, 2, 3]                                 |
| 2-D            | 2    | matrix | m = [[1, 2, 3], [4, 5, 6]]                    |
| n-D            | n    | tensor | $t = \underbrace{[[[}_{n\ \text{brackets}}\cdots$ |
python2
import tensorflow as tf

a = tf.constant([1.0, 2.0])
b = tf.constant([3.0, 4.0])
result = a + b
print result

# Tensor("add:0", shape=(2,), dtype=float32)
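
To connect the rank table above with tensor shapes, here is a minimal sketch (the values are arbitrary; shapes are read from the tensors' static .shape attribute of TF 1.x):

python2
import tensorflow as tf

s = tf.constant(1.0)                         # 0-D: scalar
v = tf.constant([1.0, 2.0, 3.0])             # 1-D: vector
m = tf.constant([[1.0, 2.0], [3.0, 4.0]])    # 2-D: matrix
print s.shape, v.shape, m.shape              # () (3,) (2, 2)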

Computation Graph (graph)

  • Describes how a neural network's computation is assembled; it only builds the graph, it does not run any operations
    (figure: tensenflower005.jpg)

Session

  • Executes the node operations in the computation graph
python2
import tensorflow as tf

x = tf.constant([[1.0, 2.0]])    # 1x2 matrix
w = tf.constant([[3.0], [4.0]])  # 2x1 matrix
y = tf.matmul(x, w)
print y
# Tensor("MatMul:0", shape=(1, 1), dtype=float32)

with tf.Session() as sess:
    print sess.run(y)

# [[11.]]

Neural Network Implementation Process

  1. Prepare the dataset and extract features to feed to the neural network as input
  2. Build the NN structure from input to output (build the computation graph first, then execute it in a session)
    (NN forward propagation $\Rightarrow$ compute the output)
  3. Feed large amounts of feature data to the NN and iteratively optimize the NN parameters
    (NN backpropagation $\Rightarrow$ optimize parameters to train the model)
  4. Use the trained model for prediction and classification

Forward Propagation

Parameters

  • Weights W are represented as variables and given random initial values
    # How to randomly generate W
    w = tf.Variable(tf.random_normal([2, 3], stddev = 2, mean = 0, seed = 1))
    # tf.random_normal: normal distribution
    # [2, 3]: a 2x3 matrix
    # stddev: standard deviation
    # mean: mean
    # seed: random seed; the same seed always produces the same random numbers
| Function              | Purpose                                                | Example                   | Result                   |
|-----------------------|--------------------------------------------------------|---------------------------|--------------------------|
| tf.truncated_normal() | normal distribution with large-deviation points removed | -                         | -                        |
| tf.random_uniform()   | uniform distribution                                   | -                         | -                        |
| tf.zeros()            | all-zeros array                                        | tf.zeros([3,2], tf.int32) | [[0, 0], [0, 0], [0, 0]] |
| tf.ones()             | all-ones array                                         | tf.ones([3,2], tf.int32)  | [[1, 1], [1, 1], [1, 1]] |
| tf.fill()             | array filled with a given value                        | tf.fill([3,2], 6)         | [[6, 6], [6, 6], [6, 6]] |
| tf.constant()         | directly given values                                  | tf.constant([3, 2, 1])    | [3, 2, 1]                |
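
A minimal sketch of these generators used to initialize variables (the shapes and values here are arbitrary):

python2
import tensorflow as tf

w1 = tf.Variable(tf.truncated_normal([2, 3], stddev = 1, seed = 1))  # normal, large deviations re-drawn
w2 = tf.Variable(tf.random_uniform([2, 3], minval = 0, maxval = 1))  # uniform in [0, 1)
b = tf.Variable(tf.zeros([3]))                                       # all zeros

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print sess.run(b)   # [0. 0. 0.]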

Forward Propagation

  • Build the model and run inference

(figure: tensenflower006.jpg)

  • $X$ is the input, a $1 \times 2$ matrix
  • $w^{(k)}_{i,j}$ are the parameters to be optimized: $i$ is the index of the preceding node, $j$ the index of the following node, and $k$ the layer number

$$
W^{(1)} =
\begin{bmatrix}
w^{(1)}_{1,1} & w^{(1)}_{1,2} & w^{(1)}_{1,3} \\
w^{(1)}_{2,1} & w^{(1)}_{2,2} & w^{(1)}_{2,3}
\end{bmatrix}
$$

$$ a^{(1)} = [a_{11}, a_{12}, a_{13}] = XW^{(1)} $$

$$
W^{(2)} =
\begin{bmatrix}
w^{(2)}_{1,1} \\
w^{(2)}_{2,1} \\
w^{(2)}_{3,1}
\end{bmatrix}
$$

$$ y = a^{(1)}W^{(2)} $$
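
As a quick shape check of the two matrix products above, a NumPy sketch (the weight values are arbitrary placeholders):

python2
import numpy as np

X = np.array([[0.7, 0.5]])   # 1x2 input
W1 = np.random.randn(2, 3)   # 2x3 first-layer weights
W2 = np.random.randn(3, 1)   # 3x1 second-layer weights

a = X.dot(W1)                # a^(1) = X W^(1), shape 1x3
y = a.dot(W2)                # y = a^(1) W^(2), shape 1x1
print a.shape, y.shape       # (1, 3) (1, 1)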

a = tf.matmul(X, W1)
y = tf.matmul(a, W2)

# Variable initialization and graph-node computation are both done in a session (a with block)
with tf.Session() as sess:
    sess.run(...)

# Initialize all variables: run tf.global_variables_initializer() via sess.run
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)

# Compute graph nodes: pass the node(s) to be evaluated to sess.run
with tf.Session() as sess:
    sess.run(y)

# Use tf.placeholder to reserve space for the input, then feed data via feed_dict in sess.run
# Feed one sample:
x = tf.placeholder(tf.float32, shape = (1, 2))
with tf.Session() as sess:
    sess.run(y, feed_dict = {x: [[0.5, 0.6]]})

# Feed several samples:
x = tf.placeholder(tf.float32, shape = (None, 2))
with tf.Session() as sess:
    sess.run(y, feed_dict = {x: [[0.1, 0.2], [0.2, 0.3], [0.3, 0.4], [0.4, 0.5]]})

python2 example1
# coding:utf-8
# A simple two-layer (fully connected) neural network

import tensorflow as tf

# Define the input and the parameters
x = tf.constant([[0.7, 0.5]])
w1 = tf.Variable(tf.random_normal([2, 3], stddev = 1, seed = 1))
w2 = tf.Variable(tf.random_normal([3, 1], stddev = 1, seed = 1))

# Define forward propagation
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)

# Compute the result in a session
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print "y is: ", sess.run(y)

# y is: [[3.0904665]]

python2 example2
# coding:utf-8
# A simple two-layer (fully connected) neural network

import tensorflow as tf

# Define the input and the parameters
# Use a placeholder for the input (feed one sample in sess.run)
x = tf.placeholder(tf.float32, shape = (1, 2))
w1 = tf.Variable(tf.random_normal([2, 3], stddev = 1, seed = 1))
w2 = tf.Variable(tf.random_normal([3, 1], stddev = 1, seed = 1))

# Define forward propagation
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)

# Compute the result in a session
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print "y is: ", sess.run(y, feed_dict = {x: [[0.7, 0.5]]})

# y is: [[3.0904665]]

python2 example3
# coding:utf-8
# A simple two-layer (fully connected) neural network

import tensorflow as tf

# Define the input and the parameters
# Use a placeholder for the input (feed several samples in sess.run)
x = tf.placeholder(tf.float32, shape = (None, 2))
w1 = tf.Variable(tf.random_normal([2, 3], stddev = 1, seed = 1))
w2 = tf.Variable(tf.random_normal([3, 1], stddev = 1, seed = 1))

# Define forward propagation
a = tf.matmul(x, w1)
y = tf.matmul(a, w2)

# Compute the result in a session
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print "y is: \n", sess.run(y, feed_dict = {x: [[0.7, 0.5], [0.2, 0.3], [0.3, 0.4], [0.4, 0.5]]})

# y is:
# [[3.0904665]
#  [1.2236414]
#  [1.72707319]
#  [2.23050475]]

Backpropagation

Trains the model parameters: gradient descent is applied to all parameters so that the NN model's loss function on the training data is minimized.

Loss Function (loss)

The gap between the predicted value and the known answer.
Mean squared error (MSE):
$$ MSE(y\_, y) = \frac{\sum_{i=1}^{n}(y - y\_)^2}{n} $$

loss = tf.reduce_mean(tf.square(y_ - y))
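
For example, with y_ = [1, 2, 3] and y = [3, 2, 1], MSE = ((-2)^2 + 0^2 + 2^2) / 3 = 8/3 ≈ 2.667; a quick check:

python2
import tensorflow as tf

y_ = tf.constant([1.0, 2.0, 3.0])
y = tf.constant([3.0, 2.0, 1.0])
loss = tf.reduce_mean(tf.square(y_ - y))

with tf.Session() as sess:
    print sess.run(loss)   # 2.6666667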

Backpropagation training methods: all take reducing the loss value as the optimization objective

train_step = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)
train_step = tf.train.MomentumOptimizer(learning_rate, momentum).minimize(loss)
train_step = tf.train.AdamOptimizer(learning_rate).minimize(loss)

Learning rate: determines how much the parameters move on each update (keep it small, e.g. 0.001)
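
Concretely, plain gradient descent updates each parameter in the direction opposite its gradient, scaled by the learning rate:

$$ w_{n+1} = w_n - learning\_rate \cdot \frac{\partial\, loss}{\partial w_n} $$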

python2 example4
# coding:utf-8
# 0 Import modules and generate simulated data

import tensorflow as tf
import numpy as np
BATCH_SIZE = 8
seed = 23455

# Generate random numbers based on seed
rng = np.random.RandomState(seed)

# Input dataset: a random 32x2 matrix, i.e. 32 samples with two features each (e.g. volume and weight)
X = rng.rand(32, 2)

# Input labels: for each row of X, assign Y = 1 if the two features sum to less than 1, else Y = 0
Y = [[int(x0 + x1 < 1)] for (x0, x1) in X]

# 1 Define the network's input, parameters, and output, and the forward propagation
x = tf.placeholder(tf.float32, shape = (None, 2))
y_ = tf.placeholder(tf.float32, shape = (None, 1))

w1 = tf.Variable(tf.random_normal([2, 3], stddev = 1, seed = 1))
w2 = tf.Variable(tf.random_normal([3, 1], stddev = 1, seed = 1))

a = tf.matmul(x, w1)
y = tf.matmul(a, w2)

# 2 Define the loss function and the backpropagation method
loss = tf.reduce_mean(tf.square(y - y_))
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss)
# train_step = tf.train.MomentumOptimizer(0.001, 0.9).minimize(loss)
# train_step = tf.train.AdamOptimizer(0.001).minimize(loss)


# 3 Create a session and train for STEPS rounds
with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    # Print the parameter values before training
    print "w1:\n", sess.run(w1)
    print "w2:\n", sess.run(w2)
    print "\n"

    # Train the model
    STEPS = 3000
    for i in range(STEPS):
        start = (i * BATCH_SIZE) % 32
        end = start + BATCH_SIZE
        sess.run(train_step, feed_dict = {x: X[start:end], y_: Y[start:end]})
        if i % 500 == 0:
            total_loss = sess.run(loss, feed_dict = {x: X, y_: Y})
            print("After %d training step(s), loss on all data is %g" % (i, total_loss))

    print "\n"
    print "w1:\n", sess.run(w1)
    print "w2:\n", sess.run(w2)


# w1:
# [[-0.81131822  1.48459876  0.06532937]
#  [-2.4427042   0.0992484   0.59122431]]
# w2:
# [[-0.81131822]
#  [ 1.48459876]
#  [ 0.06532937]]
#
# After 0 training step(s), loss on all data is 5.13118
# After 500 training step(s), loss on all data is 0.429111
# After 1000 training step(s), loss on all data is 0.409789
# After 1500 training step(s), loss on all data is 0.399923
# After 2000 training step(s), loss on all data is 0.394146
# After 2500 training step(s), loss on all data is 0.390597
#
# w1:
# [[-0.70006633  0.9136318   0.08953571]
#  [-2.3402493  -0.14641267  0.58823055]]
# w2:
# [[-0.06024267]
#  [ 0.91956186]
#  [-0.0682071 ]]

Steps for Building a Neural Network

1 Preparation

import
define constants
generate the dataset

2 Forward propagation: define the input, parameters, and output

x =
y_ =

w1 =
w2 =

a =
y =

3 Backpropagation: define the loss function and the backpropagation method

loss =
train_step =

4 Create a session and train for STEPS rounds

with tf.Session() as sess:
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    STEPS =
    for i in range(STEPS):
        start =
        end =
        sess.run(train_step, feed_dict)