
An Architecture Combining Convolutional Neural Network (CNN) and Support Vector Machine (SVM) for Image Classification

Abien Fred M. Agarap
abienfred.agarap@gmail.com

arXiv:1712.03541v2 [cs.CV] 7 Feb 2019

ABSTRACT

Convolutional neural networks (CNNs) are similar to "ordinary" neural networks in the sense that they are made up of hidden layers consisting of neurons with "learnable" parameters. These neurons receive inputs, perform a dot product, and then follow it with a non-linearity. The whole network expresses the mapping between raw image pixels and their class scores. Conventionally, the Softmax function is the classifier used at the last layer of this network. However, there have been studies [2, 3, 11] conducted to challenge this norm. The cited studies introduce the usage of a linear support vector machine (SVM) in an artificial neural network architecture. This project is yet another take on the subject, and is inspired by [11]. Empirical data has shown that the CNN-SVM model was able to achieve a test accuracy of ≈99.04% using the MNIST dataset[10]. On the other hand, the CNN-Softmax was able to achieve a test accuracy of ≈99.23% using the same dataset. Both models were also tested on the recently published Fashion-MNIST dataset[13], which is supposed to be a more difficult image classification dataset than MNIST[15]. This proved to be the case, as CNN-SVM reached a test accuracy of ≈90.72%, while the CNN-Softmax reached a test accuracy of ≈91.86%. The said results may be improved if data preprocessing techniques were employed on the datasets, and if the base CNN model were relatively more sophisticated than the one used in this study.

CCS CONCEPTS

• Computing methodologies → Supervised learning by classification; Support vector machines; Neural networks;

KEYWORDS

artificial intelligence; artificial neural networks; classification; image classification; machine learning; mnist dataset; softmax; supervised learning; support vector machine

1 INTRODUCTION

A number of studies involving deep learning approaches have claimed state-of-the-art performance in a considerable number of tasks. These include, but are not limited to, image classification[9], natural language processing[12], speech recognition[4], and text classification[14]. The models used in the said tasks employ the softmax function at the classification layer.

However, there have been studies[2, 3, 11] that take a look at an alternative to the softmax function for classification: the support vector machine (SVM). The aforementioned studies have claimed that the use of SVM in an artificial neural network (ANN) architecture produces relatively better results than the use of the conventional softmax function. Of course, there is a drawback to this approach, and that is the restriction to binary classification. As SVM aims to determine the optimal hyperplane separating two classes in a dataset, a multinomial case is seemingly ignored. With the use of SVM in a multinomial classification, the case becomes a one-versus-all, in which the positive class represents the class with the highest score, while the rest represent the negative class.

In this paper, we emulate the architecture proposed by [11], which combines a convolutional neural network (CNN) and a linear SVM for image classification. However, the CNN employed in this study is a simple 2-Convolutional Layer with Max Pooling model, in contrast with the relatively more sophisticated model and preprocessing in [11].

2 METHODOLOGY

2.1 Machine Intelligence Library

Google TensorFlow[1] was used to implement the deep learning algorithms in this study.

2.2 The Dataset

MNIST[10] is an established standard handwritten digit classification dataset that is widely used for benchmarking deep learning models. It is a 10-class classification problem having 60,000 training examples and 10,000 test cases, all in grayscale. However, it is argued that the MNIST dataset is "too easy" and "overused", and "it can not represent modern CV [Computer Vision] tasks"[15]. Hence, [13] proposed the Fashion-MNIST dataset. The said dataset consists of Zalando's article images having the same distribution, the same number of classes, and the same color profile as MNIST.

Table 1: Dataset distribution for both MNIST and Fashion-MNIST.

Dataset     MNIST     Fashion-MNIST
Training    60,000    60,000
Testing     10,000    10,000

Both datasets were used as they were, with no preprocessing such as normalization or dimensionality
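
To make the one-versus-all formulation from the introduction concrete, the following is a minimal sketch of how a softmax output layer and an SVM output layer would score the same vector of class scores. It is not the author's TensorFlow implementation: the function names, the example scores, and the optional squared-hinge variant are assumptions introduced here for illustration, and the weight-regularization term of a full SVM objective is omitted.

```python
import math


def softmax_cross_entropy(scores, label):
    """Cross-entropy of softmax(scores) against the integer class `label`."""
    shift = max(scores)  # subtract the max score for numerical stability
    log_sum_exp = shift + math.log(sum(math.exp(s - shift) for s in scores))
    return log_sum_exp - scores[label]


def one_vs_all_hinge(scores, label, squared=False):
    """One-versus-all hinge loss: the true class is +1, every other class is -1.

    The `squared` flag is an illustrative assumption; the regularization
    term on the weights is intentionally left out of this sketch.
    """
    total = 0.0
    for k, score in enumerate(scores):
        target = 1.0 if k == label else -1.0
        margin = max(0.0, 1.0 - target * score)
        total += margin ** 2 if squared else margin
    return total


def predict(scores):
    """Both output layers predict the class with the highest score."""
    return max(range(len(scores)), key=lambda k: scores[k])


# Hypothetical class scores for a 3-class problem; the true class is 0.
scores = [2.0, -1.5, 0.5]
label = 0

print("predicted class:", predict(scores))
print("hinge loss:", one_vs_all_hinge(scores, label))
print("softmax cross-entropy:", softmax_cross_entropy(scores, label))
```

In this example only the third class violates its margin (its score of 0.5 is above -1), so it alone contributes to the hinge loss, whereas the softmax loss is non-zero for any finite scores. The two output layers share the same prediction rule and differ only in the training objective.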
