Invited talk: “Scaling CNN inference for extreme throughput”

Performance scaling with traditional computing architectures is becoming increasingly challenging as next-generation technology nodes deliver diminishing benefits. Semiconductor companies aim to unlock new levels of performance by further specializing compute and memory subsystems for specific application domains.

In this talk, we will discuss examples of extreme specialization that help scale CNN inference to hundreds of millions of inputs per second to handle ML workloads in novel applications such as network intrusion detection.
