Intel® Nervana Graph

A modular library for fast deep learning
An open source, Python-based library for converting descriptions of neural networks
into programs that run efficiently on a variety of platforms


Intel® Nervana™ Graph is currently a preview release, and the APIs and implementation are subject to change.
Join us by making pull requests, suggestions, and comments in our discussion group to shape the future of this technology.

Why Intel® Nervana™ Graph?

Modern data scientists need:

  • the freedom to choose the right frontend interface for the job, specifying models at the desired level of granularity.
  • to be able to mix and match models built across these frameworks for ever more complicated topologies.
  • to rely on the execution runtime to perform algebraic simplifications, automated tensor layout, and memory-sharing optimizations by default, so we don’t have to.
  • these optimizations to work out of the box while still exposing the compilation machinery when needed.
  • to execute these models efficiently across a wide variety of target hardware platforms such as heterogeneous mixtures of CPUs, GPUs, and/or Nervana Silicon Technology.

To enable these capabilities, as tool builders, we need:

  • the ability to write new frontends easily which leverage existing backend hardware targets and optimizations.
  • the ability to add new compiler techniques that all frontend users can try with a single configuration switch.
  • these new compilation modules to achieve high performance by leveraging the shared optimization machinery used by existing backends.
  • to expose new hardware, network, storage, and data processing systems without writing new libraries from scratch by plugging into an existing system which has its batteries included.

From our years of experience maintaining one of the fastest deep learning libraries, and over a year iterating on graph-based designs, we are building Intel® Nervana™ Graph (ngraph) to address these aims. Intel® Nervana™ Graph is composed of three parts:

  1. An API for creating computational graphs (ngraphs).
  2. Two higher level frontend APIs (TensorFlow and neon) utilizing the ngraph API for common deep learning workflows.
  3. A transformer API for compiling these graphs and executing them on GPUs and CPUs.
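To convey the idea behind the first and third parts, here is a minimal self-contained sketch of a computational graph: operations are recorded lazily as graph nodes, and a separate step later walks (and could optimize) the graph before executing it. All class and method names here are illustrative inventions for this sketch, not the actual ngraph API.

```python
class Node:
    """A node in an expression graph; operators build larger graphs lazily."""
    def __add__(self, other):
        return Add(self, other)

    def __mul__(self, other):
        return Mul(self, other)


class Placeholder(Node):
    """A named input whose value is supplied at execution time."""
    def __init__(self, name):
        self.name = name

    def evaluate(self, env):
        return env[self.name]


class Add(Node):
    def __init__(self, a, b):
        self.a, self.b = a, b

    def evaluate(self, env):
        return self.a.evaluate(env) + self.b.evaluate(env)


class Mul(Node):
    def __init__(self, a, b):
        self.a, self.b = a, b

    def evaluate(self, env):
        return self.a.evaluate(env) * self.b.evaluate(env)


# Build the graph once; nothing is computed yet.
x = Placeholder("x")
w = Placeholder("w")
y = x * w + x          # graph: Add(Mul(x, w), x)

# A transformer-like step would compile/optimize here; this sketch just
# interprets the graph directly with concrete input values.
print(y.evaluate({"x": 3, "w": 4}))  # 15
```

In ngraph, the analogous compile step is performed by a transformer, which can apply the optimizations described above (algebraic simplification, tensor layout, memory sharing) before emitting code for a specific CPU or GPU target.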