TensorFlow Deep Learning #22: Implementing AlexNet (CIFAR-10 Dataset)

1. The Model
For the calculation of the model's forward- and backward-pass times, see the earlier post in this series: Tensorflow深度学习之十:Tensorflow实现经典卷积神经网络AlexNet (part 10, implementing the classic AlexNet CNN).
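As a quick sanity check on the architecture, the feature-map sizes produced by the layer parameters used in the code below (input 224x224x3) work out as:

conv1 (11x11, stride 4, SAME):  56 x 56 x 64
pool1 (3x3, stride 2, VALID):   27 x 27 x 64
conv2 (5x5, stride 1, SAME):    27 x 27 x 192
pool2 (3x3, stride 2, VALID):   13 x 13 x 192
conv3/4/5 (3x3, stride 1, SAME): 13 x 13 x 384 / 256 / 256
pool5 (3x3, stride 2, VALID):   6 x 6 x 256

which is why the fully connected part of the code flattens to 6*6*256 = 9216 features.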

2. Project Structure
The machine I train on does not have enough RAM/VRAM to load 10000 images at once, so beforehand I extracted every image by class and saved each one as a jpg. While saving the images I also built a Python dict whose keys are each image's relative path and whose values are the corresponding class labels, and stored that dict as an npy file. Each jpg is 32*32, while AlexNet expects 224*224 input, so cv2.resize is used to rescale each image as it is read.
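The jpg-extraction step itself is not listed in this post. For completeness, here is a minimal sketch of how it could be done, assuming the standard cifar-10-python pickle layout; the output filename pattern and the batch-file path are my own choices, not necessarily what was used originally:

import os
import pickle
import numpy as np
import cv2

def extract_batch(batch_path, out_root, start_index):
    # Each CIFAR-10 batch file is a pickled dict with b'data'
    # (10000 x 3072, each row laid out as R-, G-, then B-plane)
    # and b'labels' (a list of 10000 integers in 0..9).
    with open(batch_path, 'rb') as f:
        batch = pickle.load(f, encoding='bytes')
    for k, (row, lab) in enumerate(zip(batch[b'data'], batch[b'labels'])):
        img = row.reshape(3, 32, 32).transpose(1, 2, 0)  # CHW -> HWC, RGB
        img = np.ascontiguousarray(img[:, :, ::-1])      # RGB -> BGR for cv2
        folder = os.path.join(out_root, str(lab))
        os.makedirs(folder, exist_ok=True)
        cv2.imwrite(os.path.join(folder, '{}.jpg'.format(start_index + k)), img)

# Adjust the batch-file paths to wherever the dataset files live.
for i in range(1, 6):
    extract_batch('cifar-10-python/data_batch_{}'.format(i), 'data/train', (i - 1) * 10000)
extract_batch('cifar-10-python/test_batch', 'data/test', 0)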

Do the above for the training set and the test set separately. The resulting project layout is shown below; each file and folder serves the following purpose:

- AlexNet folder: the training logs
- cifar-10-python folder: the raw CIFAR-10 dataset files
- data\train: training images, split by label into folders 0~9
- data\test: test images, organized the same way
- model folder: saved model checkpoints
- AlexNet.py: builds the AlexNet graph and trains it
- AlexNetPrediction.py: runs predictions with the trained model
- label.npy: a dict mapping training-image filenames to labels
- test-label.npy: a dict mapping test-image filenames to labels
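Laid out as a tree, the project looks roughly like this:

AlexNet/
cifar-10-python/
data/
    train/0 ... train/9
    test/0 ... test/9
model/
AlexNet.py
AlexNetPrediction.py
label.npy
test-label.npy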

3. Training Code

import tensorflow as tf
import numpy as np
import random
import cv2

# Convert a list of integer labels into one-hot form.
def getOneHotLabel(label, depth):
    m = np.zeros([len(label), depth])
    for i in range(len(label)):
        m[i][label[i]] = 1
    return m

# Build the AlexNet graph.
def alexnet(image, keepprob=0.5):

    # Convolutional layer 1. Kernel size, bias shape and the other
    # parameters are as given in the code; the same pattern repeats below.
    with tf.name_scope("conv1") as scope:
        kernel = tf.Variable(tf.truncated_normal([11, 11, 3, 64], dtype=tf.float32, stddev=1e-1), name="weights")
        conv = tf.nn.conv2d(image, kernel, [1, 4, 4, 1], padding="SAME")
        biases = tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[64]), trainable=True, name="biases")
        bias = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(bias, name=scope)

    # LRN (local response normalization)
    lrn1 = tf.nn.lrn(conv1, 4, bias=1.0, alpha=0.001/9, beta=0.75, name="lrn1")

    # Max pooling
    pool1 = tf.nn.max_pool(lrn1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding="VALID", name="pool1")

    # Convolutional layer 2
    with tf.name_scope("conv2") as scope:
        kernel = tf.Variable(tf.truncated_normal([5, 5, 64, 192], dtype=tf.float32, stddev=1e-1), name="weights")
        conv = tf.nn.conv2d(pool1, kernel, [1, 1, 1, 1], padding="SAME")
        biases = tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[192]), trainable=True, name="biases")
        bias = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(bias, name=scope)

    # LRN
    lrn2 = tf.nn.lrn(conv2, 4, bias=1.0, alpha=0.001 / 9, beta=0.75, name="lrn2")

    # Max pooling
    pool2 = tf.nn.max_pool(lrn2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding="VALID", name="pool2")

    # Convolutional layer 3
    with tf.name_scope("conv3") as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 192, 384], dtype=tf.float32, stddev=1e-1), name="weights")
        conv = tf.nn.conv2d(pool2, kernel, [1, 1, 1, 1], padding="SAME")
        biases = tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[384]), trainable=True, name="biases")
        bias = tf.nn.bias_add(conv, biases)
        conv3 = tf.nn.relu(bias, name=scope)

    # Convolutional layer 4
    with tf.name_scope("conv4") as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 384, 256], dtype=tf.float32, stddev=1e-1), name="weights")
        conv = tf.nn.conv2d(conv3, kernel, [1, 1, 1, 1], padding="SAME")
        biases = tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[256]), trainable=True, name="biases")
        bias = tf.nn.bias_add(conv, biases)
        conv4 = tf.nn.relu(bias, name=scope)

    # Convolutional layer 5
    with tf.name_scope("conv5") as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 256, 256], dtype=tf.float32, stddev=1e-1), name="weights")
        conv = tf.nn.conv2d(conv4, kernel, [1, 1, 1, 1], padding="SAME")
        biases = tf.Variable(tf.constant(0.0, dtype=tf.float32, shape=[256]), trainable=True, name="biases")
        bias = tf.nn.bias_add(conv, biases)
        conv5 = tf.nn.relu(bias, name=scope)

    # Max pooling
    pool5 = tf.nn.max_pool(conv5, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding="VALID", name="pool5")

    # Fully connected layers (no bias terms; sigmoid activations,
    # including on the output, which is then fed to softmax cross-entropy).
    flatten = tf.reshape(pool5, [-1, 6*6*256])

    weight1 = tf.Variable(tf.truncated_normal([6*6*256, 4096], mean=0, stddev=0.01))
    fc1 = tf.nn.sigmoid(tf.matmul(flatten, weight1))
    dropout1 = tf.nn.dropout(fc1, keepprob)

    weight2 = tf.Variable(tf.truncated_normal([4096, 4096], mean=0, stddev=0.01))
    fc2 = tf.nn.sigmoid(tf.matmul(dropout1, weight2))
    dropout2 = tf.nn.dropout(fc2, keepprob)

    weight3 = tf.Variable(tf.truncated_normal([4096, 10], mean=0, stddev=0.01))
    fc3 = tf.nn.sigmoid(tf.matmul(dropout2, weight3))

    return fc3


def alexnet_main():
    # Load the training-set filenames and labels.
    files = np.load("label.npy", encoding='bytes')[()]

    # Extract the filenames (the dict keys).
    keys = [i for i in files]
    print(len(keys))

    myinput = tf.placeholder(dtype=tf.float32, shape=[None, 224, 224, 3], name='input')
    mylabel = tf.placeholder(dtype=tf.float32, shape=[None, 10], name='label')

    # Build the network with keepprob = 0.6.
    myoutput = alexnet(myinput, 0.6)

    # Training loss.
    loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=myoutput, labels=mylabel))

    # Optimizer; the learning rate is set to 0.09 here, but other values work too.
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.09).minimize(loss)

    # Accuracy.
    valaccuracy = tf.reduce_mean(
        tf.cast(
            tf.equal(
                tf.argmax(myoutput, 1),
                tf.argmax(mylabel, 1)),
            tf.float32))

    # TensorFlow saver for checkpointing the model.
    saver = tf.train.Saver()
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        # 40 epochs.
        for loop in range(40):

            # Generate and shuffle the order of the training set. (Note:
            # because this happens every epoch, the 40000/10000 split
            # below is not a fixed train/validation split.)
            indices = np.arange(50000)
            random.shuffle(indices)

            # Batch size is 200. Of the 50000 images, the first 40000
            # (after shuffling) are used for training, the last 10000
            # for validation.
            for i in range(0, 40000, 200):
                photo = []
                label = []
                for j in range(0, 200):
                    # Images are stored at 32x32 and resized to the 224x224
                    # input AlexNet expects. (The /225 normalizer is kept
                    # from the original; 255 would be the usual constant.)
                    photo.append(cv2.resize(cv2.imread(keys[indices[i + j]]), (224, 224))/225)
                    label.append(files[keys[indices[i + j]]])
                m = getOneHotLabel(label, depth=10)
                a, b = sess.run([optimizer, loss], feed_dict={myinput: photo, mylabel: m})
                print("\r%lf" % b, end='')

            acc = 0
            # Validate 200 images at a time; each run returns the accuracy
            # on that batch of 200.
            for i in range(40000, 50000, 200):
                photo = []
                label = []
                for j in range(i, i + 200):
                    photo.append(cv2.resize(cv2.imread(keys[indices[j]]), (224, 224))/225)
                    label.append(files[keys[indices[j]]])
                m = getOneHotLabel(label, depth=10)
                acc += sess.run(valaccuracy, feed_dict={myinput: photo, mylabel: m})
            # 50 validation batches were summed, so divide by 50.
            print("Epoch ", loop, ': validation rate: ', acc/50)
            # Save the model.
            saver.save(sess, "model/model.ckpt")

if __name__ == '__main__':
    alexnet_main()
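A small aside: instead of building one-hot labels in NumPy with getOneHotLabel, the same thing can be done inside the graph with TensorFlow's built-in tf.one_hot. A sketch (the integer-label placeholder here is illustrative, not part of the original script):

label_ids = tf.placeholder(dtype=tf.int32, shape=[None], name='label_ids')  # raw integer labels 0..9
onehot = tf.one_hot(label_ids, depth=10)                                    # same layout as getOneHotLabel
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=myoutput, labels=onehot))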

Partial output from training:

50000
1.781297 Epoch 0 : validation rate: 0.562699974775
1.775934 Epoch 1 : validation rate: 0.547099971175
1.768913 Epoch 2 : validation rate: 0.52679997623
1.719084 Epoch 3 : validation rate: 0.548099977374
1.721695 Epoch 4 : validation rate: 0.562299972177
1.745009 Epoch 5 : validation rate: 0.56409997642
1.746290 Epoch 6 : validation rate: 0.612299977541
1.726248 Epoch 7 : validation rate: 0.574799978137
1.735083 Epoch 8 : validation rate: 0.617399973869
1.722523 Epoch 9 : validation rate: 0.61839998126
1.712282 Epoch 10 : validation rate: 0.643999977112
1.697912 Epoch 11 : validation rate: 0.63789998889
1.708088 Epoch 12 : validation rate: 0.641699975729
1.716783 Epoch 13 : validation rate: 0.64499997735
1.718689 Epoch 14 : validation rate: 0.664099971056
1.712452 Epoch 15 : validation rate: 0.659299976826
1.699410 Epoch 16 : validation rate: 0.666799970865
1.682442 Epoch 17 : validation rate: 0.660699977875
1.650028 Epoch 18 : validation rate: 0.673199976683
1.662869 Epoch 19 : validation rate: 0.692699990273
1.652857 Epoch 20 : validation rate: 0.687699975967
1.672175 Epoch 21 : validation rate: 0.710799975395
1.662848 Epoch 22 : validation rate: 0.707699980736
1.653844 Epoch 23 : validation rate: 0.708999979496
1.636483 Epoch 24 : validation rate: 0.736199990511
1.658812 Epoch 25 : validation rate: 0.688499983549
1.658808 Epoch 26 : validation rate: 0.748899987936
1.642705 Epoch 27 : validation rate: 0.751199992895
1.609915 Epoch 28 : validation rate: 0.742099983692
1.610037 Epoch 29 : validation rate: 0.757699984312
1.647516 Epoch 30 : validation rate: 0.771899987459
1.615854 Epoch 31 : validation rate: 0.762699997425
1.598617 Epoch 32 : validation rate: 0.785299996138
1.579349 Epoch 33 : validation rate: 0.791699982882
1.615915 Epoch 34 : validation rate: 0.780799984932
1.586894 Epoch 35 : validation rate: 0.790699990988
1.573043 Epoch 36 : validation rate: 0.799299983978
1.580690 Epoch 37 : validation rate: 0.812399986982
1.598764 Epoch 38 : validation rate: 0.824699985981
1.566866 Epoch 39 : validation rate: 0.821999987364

In practice I trained several times, each run continuing from the previous model with an adjusted learning rate. The final loss comes down to about 1.3~1.4, and validation accuracy reaches about 0.96~0.97.
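The resume-from-checkpoint step is not shown in the listing above; here is a minimal sketch of the pattern, assuming the same graph-construction code and the model/model.ckpt path used during training (the 0.01 learning rate is only an example):

# Rebuild the identical graph, restore the saved weights instead of
# training from scratch, then continue with a smaller learning rate.
myoutput = alexnet(myinput, 0.6)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=myoutput, labels=mylabel))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.01).minimize(loss)

saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.restore(sess, "model/model.ckpt")  # overwrite the fresh init with the checkpoint
    # ...run the same epoch loop as before, then saver.save(sess, "model/model.ckpt")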

4. Prediction Code
The prediction script:

import tensorflow as tf
import numpy as np
import cv2

# getOneHotLabel() and alexnet() are defined exactly as in AlexNet.py
# above; the definitions are not repeated here.

def alexnet_main():
    # Load the test-set filenames and labels.
    files = np.load("test-label.npy", encoding='bytes')[()]
    keys = [i for i in files]
    print(len(keys))

    myinput = tf.placeholder(dtype=tf.float32, shape=[None, 224, 224, 3], name='input')
    mylabel = tf.placeholder(dtype=tf.float32, shape=[None, 10], name='label')
    # keepprob stays at 0.6, matching the original run; for deterministic
    # inference it would normally be set to 1.0.
    myoutput = alexnet(myinput, 0.6)

    prediction = tf.argmax(myoutput, 1)
    truth = tf.argmax(mylabel, 1)
    valaccuracy = tf.reduce_mean(
        tf.cast(
            tf.equal(
                prediction,
                truth),
            tf.float32))

    saver = tf.train.Saver()

    with tf.Session() as sess:
        # Load the trained model; adjust the path to your own setup.
        saver.restore(sess, r"model/model.ckpt")

        cnt = 0
        for i in range(10000):
            photo = []
            label = []

            photo.append(cv2.resize(cv2.imread(keys[i]), (224, 224))/225)
            label.append(files[keys[i]])
            m = getOneHotLabel(label, depth=10)
            a, b = sess.run([prediction, truth], feed_dict={myinput: photo, mylabel: m})
            print(a, ' ', b)
            if a[0] == b[0]:
                cnt += 1

        print("Epoch ", 1, ': prediction rate: ', cnt / 10000)

if __name__ == '__main__':
    alexnet_main()
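Note that the loop above classifies a single image per sess.run call and never actually evaluates the valaccuracy op it defines. A batched variant of the same measurement, with a batch size of 200 (my choice), would look like this inside the session:

acc = 0.0
for i in range(0, 10000, 200):
    photo = [cv2.resize(cv2.imread(keys[j]), (224, 224)) / 225 for j in range(i, i + 200)]
    label = [files[keys[j]] for j in range(i, i + 200)]
    m = getOneHotLabel(label, depth=10)
    acc += sess.run(valaccuracy, feed_dict={myinput: photo, mylabel: m})
print('prediction rate:', acc / 50)  # mean over 50 batches of 200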

Prediction results (only part of the output is shown):

10000
[3] [3]
[8] [8]
[6] [6]
[4] [4]
[5] [9]
[2] [3]
[9] [9]
[5] [5]
[1] [7]
[3] [4]
[4] [4]
[4] [3]
[9] [9]
[5] [5]
[8] [8]
[3] [8]
[0] [0]
[8] [8]
[7] [7]
[7] [4]
[7] [7]
[5] [5]
[6] [5]
...
[7] [7]
[3] [3]
[0] [0]
[7] [4]
[6] [2]
[0] [0]
[7] [7]
[2] [5]
[8] [8]
[5] [3]
[5] [5]
[1] [1]
[7] [7]
Epoch 1 : prediction rate: 0.7685

5. Analysis
On the test set, this self-trained AlexNet reaches a prediction accuracy of 0.7685, i.e. 76.85%. That is considerably better than LeNet: the network uses more convolutional layers, so it can extract more latent features, and on CIFAR-10 it clearly outperforms LeNet.

Still, 76.85% is not fully satisfying. Possible next steps are to keep training while tuning the network's hyperparameters, to adjust the network structure, to start from a pretrained model, or to try a stronger architecture altogether.

The next task is to tackle CIFAR-10 with the GoogLeNet model.

Update, June 13, 2018
Many readers asked in the comments how the two npy files are generated. All I did was save every image to disk and then record the path-to-label mapping. The code that collects and saves this information is below; it is very simple.

import numpy as np
import os

# Map every training image's relative path to its class label
# (the label is the name of the folder the image sits in).
train_label = {}

for i in range(10):
    search_path = './data/train/{}'.format(i)
    file_list = os.listdir(search_path)
    for file in file_list:
        train_label[os.path.join(search_path, file)] = i

np.save('label.npy', train_label)

# Same for the test set.
test_label = {}

for i in range(10):
    search_path = './data/test/{}'.format(i)
    file_list = os.listdir(search_path)
    for file in file_list:
        test_label[os.path.join(search_path, file)] = i

np.save('test-label.npy', test_label)
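For reference, np.save wraps a dict in a 0-dimensional object array, which is why the training script indexes the result of np.load with [()]. A quick round-trip check (on NumPy 1.16.3 and later, np.load additionally needs allow_pickle=True for this):

labels = np.load('label.npy', encoding='bytes')[()]  # unwrap the 0-d object array back into the dict
print(type(labels), len(labels))                     # expect: <class 'dict'> 50000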

If your directory structure matches the one above, you can drop these scripts into the project root and run them as-is; otherwise adjust them to your own setup, as long as the end result is the same.

If you need them, the label files can be downloaded here: CIFAR10 label
