In which direction will AI training and inference develop?

Embedded
8 min read · Dec 7, 2020

Last year, in my conversations with Nigel Toon, co-founder and CEO of the UK company Graphcore, he remarked several times that "the training and inference technologies of AI are essentially the same." In many discussions, Toon has, consciously or not, suggested that training and inference should not be too strictly separated.

At the time, Lu Tao, Graphcore Senior Vice President and General Manager of China, also said: "Future applications will continuously train and evolve the model. Some customers have brought up online training or streaming training before. Generally speaking, the training everyone talks about is offline: the data is collected, the model is trained, and then it is put into service for online inference. But now, many applications may shift to online training and inference; that is, while the model is in use, new data keeps coming in and the knowledge base is updated at the same time."
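Before unpacking these passages, the offline-versus-online distinction Lu Tao describes is easy to sketch in code. The following is a minimal, purely illustrative Python/PyTorch loop (the model, sizes and learning rate are all invented) in which every incoming sample is first answered and then immediately used to update the model, rather than training once offline and freezing the weights:

```python
import torch
import torch.nn as nn

# Toy stand-in for any deployed model (all names and sizes are hypothetical).
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

def online_step(x, y):
    """Serve a prediction, then immediately learn from the labeled sample."""
    model.eval()
    with torch.no_grad():
        prediction = model(x).argmax(dim=1)   # online inference: answer now
    model.train()
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)               # online training: fold new data in
    loss.backward()
    optimizer.step()
    return prediction

# Streaming usage: each incoming sample is both answered and learned from.
for _ in range(100):
    x, y = torch.randn(1, 16), torch.randint(0, 2, (1,))
    online_step(x, y)
```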

What do these two passages mean?
Graphcore made the same point in last year's promotional materials for its IPU chips: "Users can use the same IPU chip for inference or for training; from an architectural point of view, this is very important to us, because as machine learning evolves, systems will be able to learn from experience. The keys to inference performance include low latency, the ability to use small models and small batches, and the option of bringing in sparsely trained models; the IPU can do all of these things effectively."

"So in a 4U chassis, 16 IPUs can work together on training, and each IPU can also run an independent training or inference task under the control of a virtual machine executing on a CPU. What you end up with is training hardware; and once the model is trained and deployed, as the model evolves and we start to want it to learn from experience, we can keep using the same hardware."

AI training and inference
Consumers mostly know AI, or "deep neural networks", from contemporary smartphones: the phone's SoC may contain a dedicated AI unit, which Arm and Huawei call an NPU and Apple calls the NE (Neural Engine). We usually say that putting a neural network to work involves two key phases: training the model, and inference with the trained model (inference is also sometimes translated as "reasoning").

Training is the process by which a neural network learns. Putting the trained network into use, applying what it has learned to concrete scenarios such as recognizing images or speech, is the process of inference. Without training there is no inference.
When training a neural network, training data is first fed into the first layer of the network, and individual neurons assign weights to the input according to the task at hand. In an image-recognition network, for example, the first layer looks for the edges of objects in the picture; the next layer looks for the shapes those edges form, rectangles or circles; the third layer looks for specific features, such as the eyes and nose in the picture; and so on, until the last layer, where the final output is determined by all of the accumulated weights.

A classic example is recognizing pictures of cats: feed a large number of training images to the neural network, which must judge whether each one is a cat or not; the training algorithm's only feedback is "correct" or "wrong". When the algorithm tells the network that a judgment is "wrong", the error is passed back through the network's layers and the network must try again. Each new attempt considers different attributes, re-weighting what is checked at every layer, until the right weights are finally found and the network gets the correct answer every time.
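That cycle of guess, feedback and weight adjustment is exactly what a standard training loop implements. Here is a minimal PyTorch sketch with faked cat/not-cat data (all shapes, labels and hyperparameters are invented for illustration):

```python
import torch
import torch.nn as nn

# Tiny stand-in for a cat/not-cat classifier on 32x32 RGB "images".
net = nn.Sequential(nn.Flatten(),
                    nn.Linear(3 * 32 * 32, 64), nn.ReLU(),
                    nn.Linear(64, 2))
optimizer = torch.optim.SGD(net.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

images = torch.randn(8, 3, 32, 32)    # fake training batch
labels = torch.randint(0, 2, (8,))    # 1 = cat, 0 = not cat (made up)

for step in range(100):
    logits = net(images)              # forward pass: the network's current guess
    loss = loss_fn(logits, labels)    # the only feedback: how "wrong" the guess is
    optimizer.zero_grad()
    loss.backward()                   # error propagated back through the layers
    optimizer.step()                  # weights at every layer adjusted for the next try
```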

What comes out of this "learning" process is the data structure and all of its weights. The defining characteristic of the whole process is an enormous demand for computing power.
Handing small batches of real-world data to the network and getting answers back very quickly is the process of inference. In the example above, it means giving the trained network a brand-new photo and asking it to decide whether there is a cat in the picture. In practice, though, when inference runs on a giant neural network, guaranteeing the speed and latency an application needs usually calls for "pruning" the model, for example leaving parts of the network inactive, or fusing multiple layers of the network into a single computation step.
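Both of those optimizations have off-the-shelf equivalents in common frameworks. As an illustration of the general technique rather than any vendor-specific method, the PyTorch snippet below prunes the smallest conv weights and then fuses a conv + batchnorm + relu stage into one op (the 30% pruning ratio is arbitrary):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# A small conv stage standing in for one block of a much larger network.
model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.BatchNorm2d(16), nn.ReLU())

# Pruning: zero out the 30% smallest-magnitude conv weights.
prune.l1_unstructured(model[0], name="weight", amount=0.3)
prune.remove(model[0], "weight")   # bake the pruning mask into the weights

# Layer fusion: fold conv + batchnorm + relu into a single computation step.
model.eval()
fused = torch.quantization.fuse_modules(model, [["0", "1", "2"]])
```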

The inference process usually runs a simpler, runtime-optimized model that achieves similar prediction accuracy. Looked at through this division of labor, the NPU in a phone and the AI chips carried by cars with ADAS systems exist chiefly to do local inference. Inference is also a highly parallel computation, but its demands on computing power and numerical precision are usually far lower than training's.
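Reduced precision is one standard way to obtain that simpler, runtime-optimized model. For instance, PyTorch's dynamic quantization stores the weights of linear layers as int8; a minimal sketch, with an invented toy model:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))
model.eval()

# Dynamic quantization: weights stored as int8, matmuls run at reduced precision.
quantized = torch.quantization.quantize_dynamic(model, {nn.Linear},
                                                dtype=torch.qint8)

x = torch.randn(1, 128)
print(quantized(x).shape)   # same interface, lighter-weight arithmetic
```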

Talking about inference may be too narrow
Nigel Toon wrote an article specifically about training and inference years ago. Graphcore's IPU supports both, but in his words: "If you look at the IPU in terms of training versus inference, you may have some misunderstandings about machine learning hardware," and "as for whether the IPU is for training or for inference, my answer may surprise a lot of people."

Graphcore officially prefers the term "machine intelligence", calling its IPU products machine intelligence hardware: "we design the IPU to help change-makers develop a new generation of machine intelligence systems." I never understood the term well before and basically equated it with conventional machine learning. The key to the term may well lie in Graphcore's understanding of training and inference.

Toon said at the time: "Today's machine learning systems are still relatively simple," meaning conventional solutions such as convolutional neural networks. As described above, you train a classification system, and the system can then recognize pictures, text and other objects fairly accurately. The whole process needs a great deal of labeled data; more importantly, because training and inference are separated, the system never learns from new experience.

Graphcore's ideal "machine intelligence" hardware, by contrast, can keep learning and use the knowledge it has learned to build on itself. That way, "machine intelligence systems are likely to keep improving and deliver better and better results," Toon said. "These systems should not stop at training; they should be able to evolve rapidly even after deployment and become learning systems."

"Inference is a relatively narrow way of putting it; inference can in fact develop rapidly into a much more complex machine learning system." "Take autonomous driving: we expect cars to keep learning, because what we meet on the road is always some new situation. If a self-driving car sees a ball rolling across the road with a child behind it, we want the driving system to learn that when a ball appears on the road, a child is very likely to follow."

"And this new knowledge shouldn't stay in just one car. We want to share it: when the car connects to the learning system in the cloud that night, the knowledge base is updated, so that all cars can update their models and get the latest understanding. Ideally, we'd expect other cars to realize, when a toy tractor suddenly appears on the road, that a child may be right behind it, reacting just as they would to a ball."
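Graphcore has not published what such an overnight sync would look like in code, so the following is purely a toy sketch in the spirit of federated averaging (every name and number here is invented): each car's locally learned weights are merged in the cloud, and the refreshed model is pushed back to the whole fleet.

```python
import copy
import torch
import torch.nn as nn

def nightly_fleet_sync(cloud_model, car_models):
    """Hypothetical overnight sync: average per-car updates, push the result back."""
    merged = copy.deepcopy(cloud_model.state_dict())
    for key in merged:
        # Combine what every car learned today (simple federated averaging).
        merged[key] = torch.stack(
            [car.state_dict()[key].float() for car in car_models]
        ).mean(dim=0)
    cloud_model.load_state_dict(merged)
    # Every car pulls the refreshed model, gaining knowledge it never saw itself.
    for car in car_models:
        car.load_state_dict(cloud_model.state_dict())

# Usage with toy models standing in for each car's perception network.
cloud = nn.Linear(8, 2)
cars = [copy.deepcopy(cloud) for _ in range(3)]
nightly_fleet_sync(cloud, cars)
```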

This description shows fairly clearly why Toon considers "inference" a one-sided, narrow word for machine intelligence in deployment: in his view, machine intelligence that keeps learning after deployment is the right direction.

The ideal hardware form of machine intelligence
So if we stop talking about inference and instead deploy a machine intelligence system in the sense above, what does the hardware actually look like? Toon has given his own answer to this as well. Ideally, machine intelligence computation is performed in the cloud, while the edge performs "intelligent processing", embedded in end-side products or in edge servers, providing local support for certain specific tasks.
The cloud needs to support many different users and many different intelligence tasks, running them in a highly parallel way, while deployments on more specific equipment need computing power that is quite flexible; I believe this is also an important reason why the second-generation IPU now scales so flexibly. Interested readers can refer to the article "Second-generation IPU: How to make a chip that kills supercomputers?"

At the edge and on the end side, hardware for "intelligent processing" is still required to guarantee real-time response. Take self-driving cars: "L5 autonomous driving requires a great deal of local intelligent processing. Relying on the mobile Internet to reach the cloud is impractical; the system needs to respond in real time to emergencies, making quick decisions the moment danger appears. So the car itself requires considerable intelligent processing power."
But as mentioned earlier, as an integral part of the machine intelligence system, the local side still takes part in the learning process, and can obtain more knowledge from the cloud.

"Innovation is still moving at a fairly fast pace. We will soon have a new hardware platform that lets innovators in machine learning push the boundaries. Train-now-infer-later will give way to machine intelligence systems that, once deployed, can learn and continue to evolve. Machine intelligence will dramatically change computing," Toon wrote.

In fact, we are now seeing more and more AI systems developing in this direction, even if they are not called "machine intelligence" and still carry on the training-and-inference tradition. Rereading the quotes at the top of this article, don't they now seem much clearer? In particular, "the system will be able to learn from experience", and the support of both training and inference on the same chip.
