For the basics of convolutional neural networks (CNNs), see the earlier introductory post; this article focuses mainly on the code implementation.
A CNN is a multi-layer neural network in which each layer consists of several two-dimensional planes (feature maps), and each plane consists of many independent neurons.
Using MNIST as the dataset, and following LeNet-5 and tiny-cnn, a simple 7-layer CNN structure is designed as follows:
Input layer: number of neurons 32*32 = 1024;
C1 layer: convolution kernel size 5*5, 6 output feature maps, 6 distinct kernels, output feature map size 28*28, trainable parameters (weights + thresholds (biases)) 5*5*6+6 = 150+6, number of neurons 28*28*6 = 4704;
S2 layer: pooling window size 2*2, 6 output subsampled maps, 6 distinct pooling windows, output map size 14*14, trainable parameters 1*6+6 = 6+6, number of neurons 14*14*6 = 1176;
C3 layer: convolution kernel size 5*5, 16 output feature maps, 6*16 = 96 distinct kernels, output feature map size 10*10, trainable parameters 5*5*(6*16)+16 = 2400+16, number of neurons 10*10*16 = 1600;
S4 layer: pooling window size 2*2, 16 output subsampled maps, 16 distinct pooling windows, output map size 5*5, trainable parameters 1*16+16 = 16+16, number of neurons 5*5*16 = 400;
C5 layer: convolution kernel size 5*5, 120 output feature maps, 16*120 = 1920 distinct kernels, output feature map size 1*1, trainable parameters 5*5*(16*120)+120 = 48000+120, number of neurons 1*1*120 = 120;
Output layer: kernel size 1*1, 10 output feature maps, 120*10 = 1200 distinct kernels, output feature map size 1*1, trainable parameters 1*(120*10)+10 = 1200+10, number of neurons 1*1*10 = 10.
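These sizes and counts are easy to verify programmatically. The following stand-alone snippet recomputes the trainable-parameter and neuron counts listed above; the LayerCfg structure and all names in it are purely illustrative and not part of the implementation given later:
#include <cstdio>

// Recompute the trainable parameters and neuron counts of the 7-layer network described above.
struct LayerCfg { const char* name; int kw, kh, in, out, ow, oh; bool pooling; };

int main()
{
    const LayerCfg cfg[] = {
        { "C1",  5, 5,   1,   6, 28, 28, false },
        { "S2",  2, 2,   6,   6, 14, 14, true  },
        { "C3",  5, 5,   6,  16, 10, 10, false },
        { "S4",  2, 2,  16,  16,  5,  5, true  },
        { "C5",  5, 5,  16, 120,  1,  1, false },
        { "OUT", 1, 1, 120,  10,  1,  1, false },
    };
    for (const auto& l : cfg) {
        // pooling layers share one multiplicative weight per map; convolution layers have one kw*kh kernel per (input map, output map) pair
        int weights = l.pooling ? l.out : l.kw * l.kh * l.in * l.out;
        int biases  = l.out;                 // one threshold (bias) per output map
        int neurons = l.ow * l.oh * l.out;   // neurons = output map area * number of output maps
        std::printf("%-3s: weights=%5d  biases=%3d  neurons=%5d\n", l.name, weights, biases, neurons);
    }
    return 0;
}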
The implementation and execution process is described below:
1. Obtain the training samples and test samples from the MNIST database:
(1) The original MNIST images are 28*28; here they are padded to 32*32, pixel values are scaled to the range [-1, 1], and the padded border values are set to -1. In total there are 60000 32*32 training samples and 10000 32*32 test samples;
(2) The output layer has 10 output nodes; during training, the node at the position of the target digit is set to 0.8 and the other nodes are set to -0.8.
2. Initialize the weights and thresholds (biases): the weights are the convolution kernels; all neurons in one feature map share the same weights and threshold, so the number of feature maps equals the number of thresholds:
(1) The weights are initialized with uniform random values;
(2) The thresholds are all initialized to 0.
3. Forward propagation: from the weights and thresholds, compute the value of every neuron in each layer (a code sketch of this step, together with the initialization of step 2, follows this list):
(1) Input layer: one 32*32 image is fed in at a time.
(2) C1 layer: each of the six 5*5 kernels is slid over the 32*32 image with a stride of 1; at every position the kernel is multiplied element-wise with the image patch and summed, producing a 28*28 map. A threshold is then added to every neuron, and finally the tanh activation function is applied to each neuron to obtain its output.
(3) S2 layer: each of the six 28*28 feature maps of C1 is reduced to a 14*14 subsampled map; every block of four adjacent neurons is summed, multiplied by a shared weight, averaged (i.e. divided by 4), and a threshold is added; finally tanh is applied to each neuron to obtain its output.
(4) C3 layer: the six 14*14 subsampled maps of S2 produce sixteen 10*10 feature maps; each 10*10 map is obtained by convolving the six 14*14 maps with six 5*5 kernels and summing the results element-wise. A threshold is then added to every neuron and tanh is applied to obtain the output.
(5) S4 layer: the sixteen 10*10 feature maps of C3 are reduced to sixteen 5*5 subsampled maps; every block of four adjacent neurons is summed, multiplied by a shared weight, averaged (i.e. divided by 4), and a threshold is added; finally tanh is applied to each neuron.
(6) C5 layer: the sixteen 5*5 subsampled maps of S4 produce 120 1*1 feature maps; each 1*1 map is obtained by convolving the sixteen 5*5 maps with sixteen 5*5 kernels and summing the results. A threshold is then added to every neuron and tanh is applied.
(7) Output layer: a fully connected layer; each output neuron is the sum of the 120 C5 neurons multiplied by the corresponding weights, plus a threshold, followed by the tanh activation function.
4. Backpropagation: compute the errors of the neurons, weights and thresholds of each layer, which are then used to update the weights and thresholds (a sketch of the output-layer step follows this list):
(1) Output layer: compute the output-layer neuron errors from the derivative of the MSE loss function and the derivative of the tanh activation function.
(2) C5 layer: compute the C5 neuron errors, the output-layer weight errors and the output-layer bias errors. Each C5 neuron error is the sum over output neurons of the output-layer neuron error times the corresponding output-layer weight, multiplied by the derivative of the tanh activation at that C5 neuron. Each output-layer weight error is the output-layer neuron error times the corresponding C5 neuron value. The output-layer neuron errors are themselves the output-layer bias errors.
(3) S4 layer: compute the S4 neuron errors, the C5 weight errors and the C5 bias errors. Each S4 neuron error is the sum of the C5 weights times the C5 neuron errors, multiplied by the derivative of the tanh activation at that S4 neuron. Each C5 weight error is the sum of the S4 neuron values times the C5 neuron errors. The C5 neuron errors are themselves the C5 bias errors.
(4) C3 layer: compute the C3 neuron errors, the S4 weight errors and the S4 bias errors;
(5) S2 layer: compute the S2 neuron errors, the C3 weight errors and the C3 bias errors;
(6) C1 layer: compute the C1 neuron errors, the S2 weight errors and the S2 bias errors;
(7) Input layer: compute the C1 weight errors and the C1 bias errors.
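To make steps 2 and 3 concrete, here is a minimal, self-contained sketch of the uniform random initialization, one C1-style 5*5 convolution map (stride 1, no padding) and one S2-style 2*2 pooling map with tanh activation. Everything in it is illustrative: the single-map simplification, the variable names and the init range of [-0.3, 0.3] are assumptions for demonstration only (the description above only says "uniform random"); the actual class implementation follows further below.
#include <cmath>
#include <random>
#include <vector>

// Minimal sketch: one C1-style feature map followed by one S2-style pooled map.
int main()
{
    const int in_w = 32, in_h = 32, k = 5;
    const int c1_w = in_w - k + 1, c1_h = in_h - k + 1;   // 28 x 28, stride 1, no padding
    const int s2_w = c1_w / 2, s2_h = c1_h / 2;           // 14 x 14 after 2*2 pooling

    std::vector<double> input(in_w * in_h, 0.0);          // a single normalized image in [-1, 1]
    std::vector<double> kernel(k * k), c1(c1_w * c1_h), s2(s2_w * s2_h);

    std::mt19937 gen(0);
    std::uniform_real_distribution<double> dist(-0.3, 0.3); // illustrative init range
    for (auto& w : kernel) w = dist(gen);                 // shared 5*5 weights of this map
    double conv_bias = 0.0;                               // thresholds start at 0
    double pool_weight = dist(gen), pool_bias = 0.0;      // one shared weight + bias for the pooled map

    // C1: dot product of the kernel with each 5*5 window, plus bias, then tanh
    for (int y = 0; y < c1_h; y++) {
        for (int x = 0; x < c1_w; x++) {
            double sum = conv_bias;
            for (int m = 0; m < k; m++)
                for (int n = 0; n < k; n++)
                    sum += kernel[m * k + n] * input[(y + m) * in_w + (x + n)];
            c1[y * c1_w + x] = std::tanh(sum);
        }
    }
    // S2: sum of each 2*2 block, scaled by the shared weight, averaged, plus bias, then tanh
    for (int y = 0; y < s2_h; y++) {
        for (int x = 0; x < s2_w; x++) {
            double sum = c1[(2 * y) * c1_w + 2 * x] + c1[(2 * y) * c1_w + 2 * x + 1]
                       + c1[(2 * y + 1) * c1_w + 2 * x] + c1[(2 * y + 1) * c1_w + 2 * x + 1];
            s2[y * s2_w + x] = std::tanh(pool_weight * sum / 4.0 + pool_bias);
        }
    }
    return 0;
}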
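Correspondingly, for the output layer in step 4: with an MSE loss and tanh outputs, the output delta is (y - t) * (1 - y*y), the weight gradient is that delta times the C5 activation feeding the weight, and the bias gradient is the delta itself. The sketch below applies a plain SGD step immediately for brevity, whereas the class below splits this work across Backward_output(), Backward_C5() and UpdateWeights(); the function and variable names here are illustrative only.
// Output-layer backpropagation sketch: 120 C5 activations fully connected to 10 output neurons.
void backward_output_layer(const double c5[120], const double y[10], const double t[10],
                           double weight[120 * 10], double bias[10],
                           double delta_c5[120], double learning_rate)
{
    double delta_out[10];
    for (int j = 0; j < 10; j++) {
        // dE/dy = (y - t) for MSE; dy/dnet = 1 - y*y for tanh, expressed via the output value
        delta_out[j] = (y[j] - t[j]) * (1.0 - y[j] * y[j]);
    }
    for (int i = 0; i < 120; i++) {
        delta_c5[i] = 0.0;
        for (int j = 0; j < 10; j++) {
            delta_c5[i] += delta_out[j] * weight[i * 10 + j];             // propagate the error back to C5
            weight[i * 10 + j] -= learning_rate * delta_out[j] * c5[i];   // weight gradient = delta * input activation
        }
        delta_c5[i] *= (1.0 - c5[i] * c5[i]);                             // times the tanh derivative at the C5 neuron
    }
    for (int j = 0; j < 10; j++) {
        bias[j] -= learning_rate * delta_out[j];                          // bias gradient = output delta
    }
}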
Code files:
CNN.hpp:
#ifndef _CNN_HPP_
#define _CNN_HPP_
#include <vector>
#include <utility>
namespace ANN {
#define width_image_input_CNN 32 //normalized image width
#define height_image_input_CNN 32 //normalized image height
#define width_image_C1_CNN 28
#define height_image_C1_CNN 28
#define width_image_S2_CNN 14
#define height_image_S2_CNN 14
#define width_image_C3_CNN 10
#define height_image_C3_CNN 10
#define width_image_S4_CNN 5
#define height_image_S4_CNN 5
#define width_image_C5_CNN 1
#define height_image_C5_CNN 1
#define width_image_output_CNN 1
#define height_image_output_CNN 1
#define width_kernel_conv_CNN 5 //convolution kernel size
#define height_kernel_conv_CNN 5
#define width_kernel_pooling_CNN 2
#define height_kernel_pooling_CNN 2
#define size_pooling_CNN 2
#define num_map_input_CNN 1 //number of input-layer maps
#define num_map_C1_CNN 6 //number of C1-layer maps
#define num_map_S2_CNN 6 //number of S2-layer maps
#define num_map_C3_CNN 16 //number of C3-layer maps
#define num_map_S4_CNN 16 //number of S4-layer maps
#define num_map_C5_CNN 120 //number of C5-layer maps
#define num_map_output_CNN 10 //number of output-layer maps
#define num_patterns_train_CNN 60000 //total number of training patterns
#define num_patterns_test_CNN 10000 //total number of test patterns
#define num_epochs_CNN 100 //maximum number of epochs
#define accuracy_rate_CNN 0.985 //target accuracy rate
#define learning_rate_CNN 0.01 //learning rate
#define eps_CNN 1e-8
#define len_weight_C1_CNN 150 //number of C1 weights, 5*5*6*1=150
#define len_bias_C1_CNN 6 //number of C1 biases, 6
#define len_weight_S2_CNN 6 //number of S2 weights, 1*6=6
#define len_bias_S2_CNN 6 //number of S2 biases, 6
#define len_weight_C3_CNN 2400 //number of C3 weights, 5*5*16*6=2400
#define len_bias_C3_CNN 16 //number of C3 biases, 16
#define len_weight_S4_CNN 16 //number of S4 weights, 1*16=16
#define len_bias_S4_CNN 16 //number of S4 biases, 16
#define len_weight_C5_CNN 48000 //number of C5 weights, 5*5*16*120=48000
#define len_bias_C5_CNN 120 //number of C5 biases, 120
#define len_weight_output_CNN 1200 //number of output-layer weights, 120*10=1200
#define len_bias_output_CNN 10 //number of output-layer biases, 10
#define num_neuron_input_CNN 1024 //number of input-layer neurons, 32*32=1024
#define num_neuron_C1_CNN 4704 //number of C1 neurons, 28*28*6=4704
#define num_neuron_S2_CNN 1176 //number of S2 neurons, 14*14*6=1176
#define num_neuron_C3_CNN 1600 //number of C3 neurons, 10*10*16=1600
#define num_neuron_S4_CNN 400 //number of S4 neurons, 5*5*16=400
#define num_neuron_C5_CNN 120 //number of C5 neurons, 1*120=120
#define num_neuron_output_CNN 10 //number of output-layer neurons, 1*10=10
class CNN {
public:
CNN();
~CNN();
void init(); // initialization: allocate buffers
bool train(); // train the network
int predict(const unsigned char* data, int width, int height); // predict a single image
bool readModelFile(const char* name); // load a previously trained CNN model
protected:
typedef std::vector<std::pair<int, int> > wi_connections;
typedef std::vector<std::pair<int, int> > wo_connections;
typedef std::vector<std::pair<int, int> > io_connections;
// ... (remaining class members and the beginning of the implementation file omitted) ...
static int reverseInt(int i) // byte-swap the big-endian integers in the MNIST file headers
{
unsigned char ch1, ch2, ch3, ch4;
ch1 = i & 255;
ch2 = (i >> 8) & 255;
ch3 = (i >> 16) & 255;
ch4 = (i >> 24) & 255;
return((int)ch1 << 24) + ((int)ch2 << 16) + ((int)ch3 << 8) + ch4;
}
static void readMnistImages(std::string filename, double* data_dst, int num_image)
{
const int width_src_image = 28;
const int height_src_image = 28;
const int x_padding = 2;
const int y_padding = 2;
const double scale_min = -1;
const double scale_max = 1;
std::ifstream file(filename, std::ios::binary);
assert(file.is_open());
int magic_number = 0;
int number_of_images = 0;
int n_rows = 0;
int n_cols = 0;
file.read((char*)&magic_number, sizeof(magic_number));
magic_number = reverseInt(magic_number);
file.read((char*)&number_of_images, sizeof(number_of_images));
number_of_images = reverseInt(number_of_images);
assert(number_of_images == num_image);
file.read((char*)&n_rows, sizeof(n_rows));
n_rows = reverseInt(n_rows);
file.read((char*)&n_cols, sizeof(n_cols));
n_cols = reverseInt(n_cols);
assert(n_rows == height_src_image && n_cols == width_src_image);
int size_single_image = width_image_input_CNN * height_image_input_CNN;
for (int i = 0; i < number_of_images; ++i) {
int addr = size_single_image * i;
for (int r = 0; r < n_rows; ++r) {
for (int c = 0; c < n_cols; ++c) {
unsigned char temp = 0;
file.read((char*)&temp, sizeof(temp));
data_dst[addr + width_image_input_CNN * (r + y_padding) + c + x_padding] = (temp / 255.0) * (scale_max - scale_min) + scale_min;
}
}
}
}
static void readMnistLabels(std::string filename, double* data_dst, int num_image)
{
const double scale_max = 0.8;
std::ifstream file(filename, std::ios::binary);
assert(file.is_open());
int magic_number = 0;
int number_of_images = 0;
file.read((char*)&magic_number, sizeof(magic_number));
magic_number = reverseInt(magic_number);
file.read((char*)&number_of_images, sizeof(number_of_images));
number_of_images = reverseInt(number_of_images);
assert(number_of_images == num_image);
for (int i = 0; i < number_of_images; ++i) {
unsigned char temp = 0;
file.read((char*)&temp, sizeof(temp));
data_dst[i * num_map_output_CNN + temp] = scale_max;
}
}
bool CNN::getSrcData()
{
assert(data_input_train && data_output_train && data_input_test && data_output_test);
std::string filename_train_images = "E:/GitCode/NN_Test/data/train-images.idx3-ubyte";
std::string filename_train_labels = "E:/GitCode/NN_Test/data/train-labels.idx1-ubyte";
readMnistImages(filename_train_images, data_input_train, num_patterns_train_CNN);
readMnistLabels(filename_train_labels, data_output_train, num_patterns_train_CNN);
std::string filename_test_images = "E:/GitCode/NN_Test/data/t10k-images.idx3-ubyte";
std::string filename_test_labels = "E:/GitCode/NN_Test/data/t10k-labels.idx1-ubyte";
readMnistImages(filename_test_images, data_input_test, num_patterns_test_CNN);
readMnistLabels(filename_test_labels, data_output_test, num_patterns_test_CNN);
return true;
}
bool CNN::train()
{
out2wi_S2.clear();
out2bias_S2.clear();
out2wi_S4.clear();
out2bias_S4.clear();
in2wo_C3.clear();
weight2io_C3.clear();
bias2out_C3.clear();
in2wo_C1.clear();
weight2io_C1.clear();
bias2out_C1.clear();
calc_out2wi(width_image_C1_CNN, height_image_C1_CNN, width_image_S2_CNN, height_image_S2_CNN, num_map_S2_CNN, out2wi_S2);
calc_out2bias(width_image_S2_CNN, height_image_S2_CNN, num_map_S2_CNN, out2bias_S2);
calc_out2wi(width_image_C3_CNN, height_image_C3_CNN, width_image_S4_CNN, height_image_S4_CNN, num_map_S4_CNN, out2wi_S4);
calc_out2bias(width_image_S4_CNN, height_image_S4_CNN, num_map_S4_CNN, out2bias_S4);
calc_in2wo(width_image_C3_CNN, height_image_C3_CNN, width_image_S4_CNN, height_image_S4_CNN, num_map_C3_CNN, num_map_S4_CNN, in2wo_C3);
calc_weight2io(width_image_C3_CNN, height_image_C3_CNN, width_image_S4_CNN, height_image_S4_CNN, num_map_C3_CNN, num_map_S4_CNN, weight2io_C3);
calc_bias2out(width_image_C3_CNN, height_image_C3_CNN, width_image_S4_CNN, height_image_S4_CNN, num_map_C3_CNN, num_map_S4_CNN, bias2out_C3);
calc_in2wo(width_image_C1_CNN, height_image_C1_CNN, width_image_S2_CNN, height_image_S2_CNN, num_map_C1_CNN, num_map_C3_CNN, in2wo_C1);
calc_weight2io(width_image_C1_CNN, height_image_C1_CNN, width_image_S2_CNN, height_image_S2_CNN, num_map_C1_CNN, num_map_C3_CNN, weight2io_C1);
calc_bias2out(width_image_C1_CNN, height_image_C1_CNN, width_image_S2_CNN, height_image_S2_CNN, num_map_C1_CNN, num_map_C3_CNN, bias2out_C1);
int iter = 0;
for (iter = 0; iter < num_epochs_CNN; iter++) {
std::cout << "epoch: " << iter + 1;
for (int i = 0; i < num_patterns_train_CNN; i++) {
data_single_image = data_input_train + i * num_neuron_input_CNN;
data_single_label = data_output_train + i * num_neuron_output_CNN;
Forward_C1();
Forward_S2();
Forward_C3();
Forward_S4();
Forward_C5();
Forward_output();
Backward_output();
Backward_C5();
Backward_S4();
Backward_C3();
Backward_S2();
Backward_C1();
Backward_input();
UpdateWeights();
}
double accuracyRate = test();
std::cout << ", accuray rate: " << accuracyRate << std::endl;
if (accuracyRate > accuracy_rate_CNN) {
saveModelFile("E:/GitCode/NN_Test/data/cnn.model");
std::cout << "generate cnn model" << std::endl;
break;
}
}
if (iter == num_epochs_CNN) {
saveModelFile("E:/GitCode/NN_Test/data/cnn.model");
std::cout << "generate cnn model" << std::endl;
}
return true;
}
double CNN::activation_function_tanh(double x)
{
double ep = std::exp(x);
double em = std::exp(-x);
return (ep - em) / (ep + em);
}
double CNN::activation_function_tanh_derivative(double x)
{
return (1.0 - x * x); // note: x here is the tanh output value, not the pre-activation input
}
double CNN::activation_function_identity(double x)
{
return x;
}
double CNN::activation_function_identity_derivative(double x)
{
return 1;
}
double CNN::loss_function_mse(double y, double t)
{
return (y - t) * (y - t) / 2;
}
double CNN::loss_function_mse_derivative(double y, double t)
{
return (y - t);
}
void CNN::loss_function_gradient(const double* y, const double* t, double* dst, int len)
{
for (int i = 0; i < len; i++) {
dst[i] = loss_function_mse_derivative(y[i], t[i]);
}
}
double CNN::dot_product(const double* s1, const double* s2, int len)
{
double result = 0.0;
for (int i = 0; i < len; i++) {
result += s1[i] * s2[i];
}
return result;
}
bool CNN::muladd(const double* src, double c, int len, double* dst)
{
for (int i = 0; i < len; i++) {
dst[i] += (src[i] * c);
}
return true;
}
int CNN::get_index(int x, int y, int channel, int width, int height, int depth)
{
assert(x >= 0 && x < width);
assert(y >= 0 && y < height);
assert(channel >= 0 && channel < depth);
return (height * channel + y) * width + x;
}
void CNN::calc_out2wi(int width_in, int height_in, int width_out, int height_out, int depth_out, std::vector<wi_connections>& out2wi)
{
for (int i = 0; i < depth_out; i++) {
int block = width_in * height_in * i;
for (int y = 0; y < height_out; y++) {
for (int x = 0; x < width_out; x++) {
int rows = y * width_kernel_pooling_CNN;
int cols = x * height_kernel_pooling_CNN;
wi_connections wi_connections_;
std::pair<int, int> pair_;
for (int m = 0; m < width_kernel_pooling_CNN; m++) {
for (int n = 0; n < height_kernel_pooling_CNN; n++) {
pair_.first = i;
pair_.second = (rows + m) * width_in + cols + n + block;
wi_connections_.push_back(pair_);
}
}
out2wi.push_back(wi_connections_);
}
}
}
}
void CNN::calc_out2bias(int width, int height, int depth, std::vector<int>& out2bias)
{
for (int i = 0; i < depth; i++) {
for (int y = 0; y < height; y++) {
for (int x = 0; x < width; x++) {
out2bias.push_back(i);
}
}
}
}
void CNN::calc_in2wo(int width_in, int height_in, int width_out, int height_out, int depth_in, int depth_out, std::vector<wo_connections>& in2wo)
{
int len = width_in * height_in * depth_in;
in2wo.resize(len);
for (int c = 0; c < depth_in; c++) {
for (int y = 0; y < height_in; y += height_kernel_pooling_CNN) {
for (int x = 0; x < width_in; x += width_kernel_pooling_CNN) {
int dymax = std::min(size_pooling_CNN, height_in - y);
int dxmax = std::min(size_pooling_CNN, width_in - x);
int dstx = x / width_kernel_pooling_CNN;
int dsty = y / height_kernel_pooling_CNN;
for (int dy = 0; dy < dymax; dy++) {
for (int dx = 0; dx < dxmax; dx++) {
int index_in = get_index(x + dx, y + dy, c, width_in, height_in, depth_in);
int index_out = get_index(dstx, dsty, c, width_out, height_out, depth_out);
wo_connections wo_connections_;
std::pair<int, int> pair_;
pair_.first = c;
pair_.second = index_out;
wo_connections_.push_back(pair_);
in2wo[index_in] = wo_connections_;
}
}
}
}
}
}
void CNN::calc_weight2io(int width_in, int height_in, int width_out, int height_out, int depth_in, int depth_out, std::vector<io_connections>& weight2io)
{
int len = depth_in;
weight2io.resize(len);
for (int c = 0; c < depth_in; c++) {
for (int y = 0; y < height_in; y += height_kernel_pooling_CNN) {
for (int x = 0; x < width_in; x += width_kernel_pooling_CNN) {
int dymax = std::min(size_pooling_CNN, height_in - y);
int dxmax = std::min(size_pooling_CNN, width_in - x);
int dstx = x / width_kernel_pooling_CNN;
int dsty = y / height_kernel_pooling_CNN;
for (int dy = 0; dy < dymax; dy++) {
for (int dx = 0; dx < dxmax; dx++) {
int index_in = get_index(x + dx, y + dy, c, width_in, height_in, depth_in);
int index_out = get_index(dstx, dsty, c, width_out, height_out, depth_out);
std::pair<int, int> pair_;
pair_.first = index_in;
pair_.second = index_out;
weight2io[c].push_back(pair_);
}
}
}
}
}
}
void CNN::calc_bias2out(int width_in, int height_in, int width_out, int height_out, int depth_in, int depth_out, std::vector<std::vector<int> >& bias2out)
{
int len = depth_in;
bias2out.resize(len);
// ... (remainder of calc_bias2out omitted) ...
}
// ... (Forward_*, Backward_*, UpdateWeights and the start of predict omitted) ...
int CNN::predict(const unsigned char* data, int width, int height)
{
// ... (input normalization and forward pass omitted) ...
int pos = -1;
double max_value = -9999.0;
for (int i = 0; i < num_neuron_output_CNN; i++) {
if (neuron_output[i] > max_value) {
max_value = neuron_output[i];
pos = i;
}
}
return pos;
}
bool CNN::readModelFile(const char* name)
{
FILE* fp = fopen(name, "rb");
if (fp == NULL) {
return false;
}
int width_image_input =0;
int height_image_input = 0;
int width_image_C1 = 0;
int height_image_C1 = 0;
int width_image_S2 = 0;
int height_image_S2 = 0;
int width_image_C3 = 0;
int height_image_C3 = 0;
int width_image_S4 = 0;
int height_image_S4 = 0;
int width_image_C5 = 0;
int height_image_C5 = 0;
int width_image_output = 0;
int height_image_output = 0;
int width_kernel_conv = 0;
int height_kernel_conv = 0;
int width_kernel_pooling = 0;
int height_kernel_pooling = 0;
int num_map_input = 0;
int num_map_C1 = 0;
int num_map_S2 = 0;
int num_map_C3 = 0;
int num_map_S4 = 0;
int num_map_C5 = 0;
int num_map_output = 0;
int len_weight_C1 = 0;
int len_bias_C1 = 0;
int len_weight_S2 = 0;
int len_bias_S2 = 0;
int len_weight_C3 = 0;
int len_bias_C3 = 0;
int len_weight_S4 = 0;
int len_bias_S4 = 0;
int len_weight_C5 = 0;
int len_bias_C5 = 0;
int len_weight_output = 0;
int len_bias_output = 0;
int num_neuron_input = 0;
int num_neuron_C1 = 0;
int num_neuron_S2 = 0;
int num_neuron_C3 = 0;
int num_neuron_S4 = 0;
int num_neuron_C5 = 0;
int num_neuron_output = 0;
fread(&width_image_input, sizeof(int), 1, fp);
fread(&height_image_input, sizeof(int), 1, fp);
fread(&width_image_C1, sizeof(int), 1, fp);
fread(&height_image_C1, sizeof(int), 1, fp);
fread(&width_image_S2, sizeof(int), 1, fp);
fread(&height_image_S2, sizeof(int), 1, fp);
fread(&width_image_C3, sizeof(int), 1, fp);
fread(&height_image_C3, sizeof(int), 1, fp);
fread(&width_image_S4, sizeof(int), 1, fp);
fread(&height_image_S4, sizeof(int), 1, fp);
fread(&width_image_C5, sizeof(int), 1, fp);
fread(&height_image_C5, sizeof(int), 1, fp);
fread(&width_image_output, sizeof(int), 1, fp);
fread(&height_image_output, sizeof(int), 1, fp);
fread(&width_kernel_conv, sizeof(int), 1, fp);
fread(&height_kernel_conv, sizeof(int), 1, fp);
fread(&width_kernel_pooling, sizeof(int), 1, fp);
fread(&height_kernel_pooling, sizeof(int), 1, fp);
fread(&num_map_input, sizeof(int), 1, fp);
fread(&num_map_C1, sizeof(int), 1, fp);
fread(&num_map_S2, sizeof(int), 1, fp);
fread(&num_map_C3, sizeof(int), 1, fp);
fread(&num_map_S4, sizeof(int), 1, fp);
fread(&num_map_C5, sizeof(int), 1, fp);
fread(&num_map_output, sizeof(int), 1, fp);
fread(&len_weight_C1, sizeof(int), 1, fp);
fread(&len_bias_C1, sizeof(int), 1, fp);
fread(&len_weight_S2, sizeof(int), 1, fp);
fread(&len_bias_S2, sizeof(int), 1, fp);
fread(&len_weight_C3, sizeof(int), 1, fp);
fread(&len_bias_C3, sizeof(int), 1, fp);
fread(&len_weight_S4, sizeof(int), 1, fp);
fread(&len_bias_S4, sizeof(int), 1, fp);
fread(&len_weight_C5, sizeof(int), 1, fp);
fread(&len_bias_C5, sizeof(int), 1, fp);
fread(&len_weight_output, sizeof(int), 1, fp);
fread(&len_bias_output, sizeof(int), 1, fp);
fread(&num_neuron_input, sizeof(int), 1, fp);
fread(&num_neuron_C1, sizeof(int), 1, fp);
fread(&num_neuron_S2, sizeof(int), 1, fp);
fread(&num_neuron_C3, sizeof(int), 1, fp);
fread(&num_neuron_S4, sizeof(int), 1, fp);
fread(&num_neuron_C5, sizeof(int), 1, fp);
fread(&num_neuron_output, sizeof(int), 1, fp);
fread(weight_C1, sizeof(weight_C1), 1, fp);
fread(bias_C1, sizeof(bias_C1), 1, fp);
fread(weight_S2, sizeof(weight_S2), 1, fp);
fread(bias_S2, sizeof(bias_S2), 1, fp);
fread(weight_C3, sizeof(weight_C3), 1, fp);
fread(bias_C3, sizeof(bias_C3), 1, fp);
fread(weight_S4, sizeof(weight_S4), 1, fp);
fread(bias_S4, sizeof(bias_S4), 1, fp);
fread(weight_C5, sizeof(weight_C5), 1, fp);
fread(bias_C5, sizeof(bias_C5), 1, fp);
fread(weight_output, sizeof(weight_output), 1, fp);
fread(bias_output, sizeof(bias_output), 1, fp);
fflush(fp);
fclose(fp);
out2wi_S2.clear();
out2bias_S2.clear();
out2wi_S4.clear();
out2bias_S4.clear();
calc_out2wi(width_image_C1_CNN, height_image_C1_CNN, width_image_S2_CNN, height_image_S2_CNN, num_map_S2_CNN, out2wi_S2);
calc_out2bias(width_image_S2_CNN, height_image_S2_CNN, num_map_S2_CNN, out2bias_S2);
calc_out2wi(width_image_C3_CNN, height_image_C3_CNN, width_image_S4_CNN, height_image_S4_CNN, num_map_S4_CNN, out2wi_S4);
calc_out2bias(width_image_S4_CNN, height_image_S4_CNN, num_map_S4_CNN, out2bias_S4);
return true;
}
bool CNN::saveModelFile(const char* name)
{
FILE* fp = fopen(name, "wb");
if (fp == NULL) {
return false;
}
int width_image_input = width_image_input_CNN;
int height_image_input = height_image_input_CNN;
int width_image_C1 = width_image_C1_CNN;
int height_image_C1 = height_image_C1_CNN;
int width_image_S2 = width_image_S2_CNN;
int height_image_S2 = height_image_S2_CNN;
int width_image_C3 = width_image_C3_CNN;
int height_image_C3 = height_image_C3_CNN;
int width_image_S4 = width_image_S4_CNN;
int height_image_S4 = height_image_S4_CNN;
int width_image_C5 = width_image_C5_CNN;
int height_image_C5 = height_image_C5_CNN;
int width_image_output = width_image_output_CNN;
int height_image_output = height_image_output_CNN;
int width_kernel_conv = width_kernel_conv_CNN;
int height_kernel_conv = height_kernel_conv_CNN;
int width_kernel_pooling = width_kernel_pooling_CNN;
int height_kernel_pooling = height_kernel_pooling_CNN;
int num_map_input = num_map_input_CNN;
int num_map_C1 = num_map_C1_CNN;
int num_map_S2 = num_map_S2_CNN;
int num_map_C3 = num_map_C3_CNN;
int num_map_S4 = num_map_S4_CNN;
int num_map_C5 = num_map_C5_CNN;
int num_map_output = num_map_output_CNN;
int len_weight_C1 = len_weight_C1_CNN;
int len_bias_C1 = len_bias_C1_CNN;
int len_weight_S2 = len_weight_S2_CNN;
int len_bias_S2 = len_bias_S2_CNN;
int len_weight_C3 = len_weight_C3_CNN;
int len_bias_C3 = len_bias_C3_CNN;
int len_weight_S4 = len_weight_S4_CNN;
int len_bias_S4 = len_bias_S4_CNN;
int len_weight_C5 = len_weight_C5_CNN;
int len_bias_C5 = len_bias_C5_CNN;
int len_weight_output = len_weight_output_CNN;
int len_bias_output = len_bias_output_CNN;
int num_neuron_input = num_neuron_input_CNN;
int num_neuron_C1 = num_neuron_C1_CNN;
int num_neuron_S2 = num_neuron_S2_CNN;
int num_neuron_C3 = num_neuron_C3_CNN;
int num_neuron_S4 = num_neuron_S4_CNN;
int num_neuron_C5 = num_neuron_C5_CNN;
int num_neuron_output = num_neuron_output_CNN;
fwrite(&width_image_input, sizeof(int), 1, fp);
fwrite(&height_image_input, sizeof(int), 1, fp);
fwrite(&width_image_C1, sizeof(int), 1, fp);
fwrite(&height_image_C1, sizeof(int), 1, fp);
fwrite(&width_image_S2, sizeof(int), 1, fp);
fwrite(&height_image_S2, sizeof(int), 1, fp);
fwrite(&width_image_C3, sizeof(int), 1, fp);
fwrite(&height_image_C3, sizeof(int), 1, fp);
fwrite(&width_image_S4, sizeof(int), 1, fp);
fwrite(&height_image_S4, sizeof(int), 1, fp);
fwrite(&width_image_C5, sizeof(int), 1, fp);
fwrite(&height_image_C5, sizeof(int), 1, fp);
fwrite(&width_image_output, sizeof(int), 1, fp);
fwrite(&height_image_output, sizeof(int), 1, fp);
fwrite(&width_kernel_conv, sizeof(int), 1, fp);
fwrite(&height_kernel_conv, sizeof(int), 1, fp);
fwrite(&width_kernel_pooling, sizeof(int), 1, fp);
fwrite(&height_kernel_pooling, sizeof(int), 1, fp);
fwrite(&num_map_input, sizeof(int), 1, fp);
fwrite(&num_map_C1, sizeof(int), 1, fp);
fwrite(&num_map_S2, sizeof(int), 1, fp);
fwrite(&num_map_C3, sizeof(int), 1, fp);
fwrite(&num_map_S4, sizeof(int), 1, fp);
fwrite(&num_map_C5, sizeof(int), 1, fp);
fwrite(&num_map_output, sizeof(int), 1, fp);
fwrite(&len_weight_C1, sizeof(int), 1, fp);
fwrite(&len_bias_C1, sizeof(int), 1, fp);
fwrite(&len_weight_S2, sizeof(int), 1, fp);
fwrite(&len_bias_S2, sizeof(int), 1, fp);
fwrite(&len_weight_C3, sizeof(int), 1, fp);
fwrite(&len_bias_C3, sizeof(int), 1, fp);
fwrite(&len_weight_S4, sizeof(int), 1, fp);
fwrite(&len_bias_S4, sizeof(int), 1, fp);
fwrite(&len_weight_C5, sizeof(int), 1, fp);
fwrite(&len_bias_C5, sizeof(int), 1, fp);
fwrite(&len_weight_output, sizeof(int), 1, fp);
fwrite(&len_bias_output, sizeof(int), 1, fp);
fwrite(&num_neuron_input, sizeof(int), 1, fp);
fwrite(&num_neuron_C1, sizeof(int), 1, fp);
fwrite(&num_neuron_S2, sizeof(int), 1, fp);
fwrite(&num_neuron_C3, sizeof(int), 1, fp);
fwrite(&num_neuron_S4, sizeof(int), 1, fp);
fwrite(&num_neuron_C5, sizeof(int), 1, fp);
fwrite(&num_neuron_output, sizeof(int), 1, fp);
fwrite(weight_C1, sizeof(weight_C1), 1, fp);
fwrite(bias_C1, sizeof(bias_C1), 1, fp);
fwrite(weight_S2, sizeof(weight_S2), 1, fp);
fwrite(bias_S2, sizeof(bias_S2), 1, fp);
fwrite(weight_C3, sizeof(weight_C3), 1, fp);
fwrite(bias_C3, sizeof(bias_C3), 1, fp);
fwrite(weight_S4, sizeof(weight_S4), 1, fp);
fwrite(bias_S4, sizeof(bias_S4), 1, fp);
fwrite(weight_C5, sizeof(weight_C5), 1, fp);
fwrite(bias_C5, sizeof(bias_C5), 1, fp);
fwrite(weight_output, sizeof(weight_output), 1, fp);
fwrite(bias_output, sizeof(bias_output), 1, fp);
fflush(fp);
fclose(fp);
return true;
}
double CNN::test()
{
int count_accuracy = 0;
for (int num = 0; num < num_patterns_test_CNN; num++) {
data_single_image = data_input_test + num * num_neuron_input_CNN;
data_single_label = data_output_test + num * num_neuron_output_CNN;
Forward_C1();
Forward_S2();
Forward_C3();
Forward_S4();
Forward_C5();
Forward_output();
int pos_t = -1;
int pos_y = -2;
double max_value_t = -9999.0;
double max_value_y = -9999.0;
for (int i = 0; i < num_neuron_output_CNN; i++) {
if (neuron_output[i] > max_value_y) {
max_value_y = neuron_output[i];
pos_y = i;
}
if (data_single_label[i] > max_value_t) {
max_value_t = data_single_label[i];
pos_t = i;
}
}
if (pos_y == pos_t) {
++count_accuracy;
}
Sleep(1);
}
return (count_accuracy * 1.0 / num_patterns_test_CNN);
}
}
The test code is as follows:
int test_CNN_train()
{
ANN::CNN cnn1;
cnn1.init();
cnn1.train();
return 0;
}
int test_CNN_predict()
{
ANN::CNN cnn2;
bool flag = cnn2.readModelFile("E:/GitCode/NN_Test/data/cnn.model");
if (!flag) {
std::cout << "read cnn model error" << std::endl;
return -1;
}
int width{ 32 }, height{ 32 };
std::vector target{ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };
std::string image_path{ "E:/GitCode/NN_Test/data/images/" };
for (auto i : target) {
std::string str = std::to_string(i);
str += ".png";
str = image_path + str;
cv::Mat src = cv::imread(str, 0);
if (src.data == nullptr) {
fprintf(stderr, "read image error: %s\n", str.c_str());
return -1;
}
cv::Mat tmp(src.rows, src.cols, CV_8UC1, cv::Scalar::all(255));
cv::subtract(tmp, src, tmp);
cv::resize(tmp, tmp, cv::Size(width, height));
auto ret = cnn2.predict(tmp.data, width, height);
fprintf(stdout, "the actual digit is: %d, correct digit is: %d\n", ret, i);
}
return 0;
}
Running the test_CNN_train() function generates the cnn model file; the execution output is shown below:
Running the test_CNN_predict() function tests the accuracy of the CNN; using a paint tool, one image was created for each digit, ten images in total, as shown below:
The test results are as follows: