SemifreddoNets: Partially Frozen Neural Networks for Efficient Computer Vision Systems

August 12, 2020

Video Transcript

We propose a partially frozen neural network architecture that is optimized for an efficient hardware implementation.

Unlike traditional layer-level freezing approaches, our method vertically freezes a portion of the weights, distributed across the layers. This leaves room for adapting to both new tasks and new kinds of data. We named this architecture SemifreddoNets, after the Italian dessert semifreddo, because of its partially frozen nature.
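As a rough illustration, vertical freezing can be emulated in software by splitting each layer's filters into a frozen slice and a trainable slice, rather than freezing whole layers. The PyTorch sketch below does exactly that; the class name, layer shapes, and split ratio are our own illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class VerticallyFrozenConv(nn.Module):
    """A conv layer whose output channels are split into a frozen slice
    and a trainable slice, so freezing happens *within* the layer rather
    than at whole-layer granularity. Illustrative sketch only."""

    def __init__(self, in_ch, out_ch, frozen_fraction=0.5):
        super().__init__()
        frozen_ch = int(out_ch * frozen_fraction)
        self.frozen = nn.Conv2d(in_ch, frozen_ch, 3, padding=1)
        self.trainable = nn.Conv2d(in_ch, out_ch - frozen_ch, 3, padding=1)
        # Freeze one slice of the filters; in hardware, these weights
        # could be hard-wired instead of stored in memory.
        for p in self.frozen.parameters():
            p.requires_grad_(False)

    def forward(self, x):
        # Frozen and trainable filters see the same input; their
        # outputs are concatenated along the channel dimension.
        return torch.cat([self.frozen(x), self.trainable(x)], dim=1)

# Usage: only the trainable slice receives gradient updates.
layer = VerticallyFrozenConv(16, 32)
y = layer(torch.randn(1, 16, 8, 8))
```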

Our system consists of one frozen core and two parallel trainable cores. The trainable cores selectively transfer and enrich frozen-core features using trainable alpha-blending parameters. An optional core shuffle module lets the two trainable cores exchange feature maps so they can work together more efficiently.
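A minimal sketch of what such alpha blending and core shuffling could look like in PyTorch follows. The per-channel alpha parameterization, the sigmoid squashing, and the half-channel swap in core_shuffle are assumptions made for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class AlphaBlend(nn.Module):
    """Learnable blend of frozen-core features into a trainable core.
    The sigmoid keeps each alpha in (0, 1); alpha initialized to 0.5."""

    def __init__(self, channels):
        super().__init__()
        # One blending weight per channel (an assumption of this sketch).
        self.alpha = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, frozen_feat, trainable_feat):
        a = torch.sigmoid(self.alpha)
        return a * frozen_feat + (1.0 - a) * trainable_feat


def core_shuffle(feat_a, feat_b):
    """Exchange half of the channels between the two trainable cores --
    one plausible realization of the optional shuffle module."""
    c = feat_a.shape[1] // 2
    out_a = torch.cat([feat_a[:, :c], feat_b[:, c:]], dim=1)
    out_b = torch.cat([feat_b[:, :c], feat_a[:, c:]], dim=1)
    return out_a, out_b
```

Because the blending parameters are trainable, each trainable core can learn per-channel how much of the frozen features to keep versus replace with its own.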

Both the frozen and trainable cores have their topologies hard-wired in fully pipelined hardware. Fixing the topology and a portion of the weights in hardware reduces the silicon area, logic delay, and memory requirements, leading to significant savings in cost and power consumption.

Furthermore, SemifreddoNets can implement deeper and larger neural network architectures by reusing the last blocks repeatedly in a single inference pass. This block modularity provides the flexibility to find a reasonable balance between accuracy and speed, without requiring any hardware change.
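Below is a minimal PyTorch sketch of this block-reuse idea, where the same final block, and therefore the same weights, is applied several times within one forward pass. The block contents and repeat count are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ReusableTail(nn.Module):
    """Deepens the network by running the final block repeatedly; the
    repeat count trades accuracy for speed with no hardware change."""

    def __init__(self, channels, repeats=3):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        self.repeats = repeats

    def forward(self, x):
        for _ in range(self.repeats):
            x = self.block(x)  # same weights reused on each pass
        return x

# Usage: channel count must stay fixed so the block can feed itself.
tail = ReusableTail(32, repeats=4)
y = tail(torch.randn(1, 32, 8, 8))
```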

Thanks for watching, and check out our paper to learn more.

Paper: SemifreddoNets: Partially Frozen Neural Networks for Efficient Computer Vision Systems