Invited talk: “Scaling CNN inference for extreme throughput”

Performance scaling with traditional computing architectures is becoming increasingly challenging as next-generation technology nodes provide diminishing benefits. Semiconductor companies aim to unlock new levels of performance by further specializing compute and memory subsystems for specific application domains.

In this talk, we will discuss examples of extreme specialization that help scale CNN inference to hundreds of millions of inputs per second, in order to handle ML workloads in novel applications such as network intrusion detection.
