This network is running live in your browser.

The Convolutional Neural Network in this example classifies images live in your browser using JavaScript, at about 10 milliseconds per image. It takes an input image and transforms it through a series of functions into class probabilities at the end. The transformed representations in this visualization can be loosely thought of as the activations of the neurons along the way. The parameters of these functions are learned with backpropagation on a dataset of (image, label) pairs. This particular network classifies CIFAR-10 images into one of 10 classes and was trained with ConvNetJS. Its exact architecture is [conv-relu-conv-relu-pool]x3-fc-softmax, for a total of 17 layers and 7000 parameters. It uses 3x3 convolutions and 2x2 pooling regions. By the end of the class, you will know exactly what all these numbers mean.
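To make the layer count concrete, here is a minimal Python sketch that expands the architecture string [conv-relu-conv-relu-pool]x3-fc-softmax into a list of layers and follows the spatial size of a 32x32 CIFAR-10 image through the pooling stages. The function names are hypothetical. The sketch also assumes that each 3x3 convolution uses a padding of 1 so that it preserves the spatial size, which the text above does not state. The number of filters per layer is not given either, so the parameter helper only shows the general formula for one convolution layer and makes no attempt to reproduce the 7000 figure.

```python
def build_layers(num_blocks=3):
    """Expand [conv-relu-conv-relu-pool] x num_blocks - fc - softmax."""
    block = ["conv", "relu", "conv", "relu", "pool"]
    return block * num_blocks + ["fc", "softmax"]


def final_spatial_size(layers, size=32, kernel=3, pad=1, pool=2):
    """Follow the image width/height up to the fully connected layer.

    Assumption (not from the text): stride-1 convolutions with padding `pad`.
    """
    for name in layers:
        if name == "conv":
            size = size + 2 * pad - kernel + 1  # unchanged when kernel=3, pad=1
        elif name == "pool":
            size = size // pool                 # 2x2 pooling halves each side
        elif name == "fc":
            break                               # spatial layout ends here
    return size


def conv_param_count(in_channels, out_channels, kernel=3):
    """Weights plus one bias per filter for a single convolution layer."""
    return kernel * kernel * in_channels * out_channels + out_channels


if __name__ == "__main__":
    layers = build_layers()
    print(len(layers))                 # 17
    print(final_spatial_size(layers))  # 4
```

Under these assumptions, the three pooling stages shrink the image from 32x32 to 16x16, 8x8, and finally 4x4 before the fully connected layer. The layer count of 17 comes from three blocks of five layers plus the fc and softmax layers.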

L0CV Description

Computer vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization, and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This repo is a deep dive into the details of deep learning architectures, with a focus on learning end-to-end models for these tasks, particularly image classification. Over the 21-chapter course, learners will implement and train their own neural networks and gain a detailed understanding of cutting-edge research in computer vision. In addition, the self-built L0CV package and the live Jupyter Notebooks give them the opportunity to train and apply multi-million-parameter networks on real-world vision problems of their choice. Through multiple hands-on tasks and an elementary research project, learners will acquire the toolset for setting up deep learning tasks, along with practical engineering tricks for training and fine-tuning deep neural networks.

Book Logistics

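This learning resource is built around the historical development of computer vision and a top-down learning process, and it aims to give readers an open platform where anyone can learn computer vision. It is organized around the following questions: What is computer vision? What problems does computer vision solve, and how are they solved? Traditional methods: neural networks centered on convolutional neural networks. Modern methods: Transformers, reinforcement learning, transfer learning, generative adversarial networks, and others. How is each method implemented, and which frameworks does it use? This resource addresses all of these questions.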