Nervana Cloud is a full-stack hosted platform for deep learning that enables businesses to develop and deploy high-accuracy deep learning solutions at a fraction of the cost of building their own infrastructure and data science teams. We recently updated Nervana Cloud’s ncloud command-line interface (CLI) syntax to support subcommands and shortcuts for improved usability and consistency.

Previously, our syntax did not support subcommands, so even closely related commands were completely separate. For example, under the old syntax, if you wanted to list models you would type ncloud list, but if you wanted to list your uploaded datasets you would type ncloud dataset-list, two entirely separate commands. To achieve the same results with the new syntax, you use the list subcommand and type ncloud model list and ncloud dataset list, respectively. The new syntax also supports tab completion and prefix shortcuts, so these command/subcommand combinations can be shortened to ncloud m l and ncloud d l. To see ncloud and the new syntax in action, watch this ncloud demo. For a complete list of commands, subcommands, and shortcuts, see ncloud Commands.
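Side by side, the old and new forms described above look like this (a sketch of the command mapping, not an exhaustive list):

```shell
# Old syntax: flat, unrelated command names
ncloud list            # list models
ncloud dataset-list    # list datasets

# New syntax: noun plus subcommand
ncloud model list
ncloud dataset list

# Prefix shortcuts: an unambiguous prefix of each word works
ncloud m l             # same as: ncloud model list
ncloud d l             # same as: ncloud dataset list
```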

In addition to these syntax improvements, Nervana Cloud now includes a number of other enhancements, including multi-instance model deployment, batch inference, GPU-based inference, and support for additional data files and custom scripts and extensions.

Multi-instance model deployment

Under certain scenarios, you might want to deploy multiple instances of the same trained model. For example, you might want different output formatters, or you might want different user groups or apps to work with separate instances. Nervana Cloud now allows you to deploy multiple “streams” (instances) of the same model and direct predictions to a given stream based on a pre-signed token. When you deploy a model using ncloud model deploy <model_id>, you will get back a <token> and a <stream_id>. You can then run predictions against the stream using ncloud stream predict <token> <input>. You can undeploy a stream using ncloud stream undeploy <stream_id>.

ncloud stream predict replaces the older ncloud predict command for one-at-a-time inference. Note, too, that you now pass a token to each prediction call, rather than a model_id as in the older approach.
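Putting the stream lifecycle together, a session might look like the following sketch (the model ID and input file name are hypothetical; the token and stream ID come back from the deploy call):

```shell
# Deploy two streams of the same trained model (model ID 123 is hypothetical);
# each deploy returns its own <token> and <stream_id>
ncloud model deploy 123
ncloud model deploy 123

# Direct a prediction to a specific stream using its token
ncloud stream predict <token> input.jpg

# Tear down one stream without affecting the other
ncloud stream undeploy <stream_id>
```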

Batch inference

You can now run predictions in bulk against a pre-trained model. Use the command ncloud batch predict <model_id> <dataset_id> to kick off the batch prediction. The process assigns an ID number to the batch prediction, which you can retrieve using the ncloud batch list command. Then, you can issue ncloud batch results <batch_id> to download the results in a .csv file. For complete syntax and options, see batch in the ncloud Commands documentation.
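The three steps above can be sketched as a short session (the model and dataset IDs are hypothetical):

```shell
# Kick off bulk inference against model 123 using dataset 45
ncloud batch predict 123 45

# Look up the ID assigned to the batch prediction
ncloud batch list

# Download the results as a .csv once the job completes
ncloud batch results <batch_id>
```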

GPU-based inference

In addition to the previous CPU-based inference deployments, you can now run inference on allocated GPU resources to improve throughput. When deploying a model, append the -g/--gpus flag to ncloud model deploy to add one or more GPUs to the instance. Subsequent ncloud stream predict calls to that instance will then run on the GPU. Note that the default behavior remains a CPU-based model deployment if no flags are set.
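For example (a sketch; the model ID is hypothetical, and the exact argument form for requesting a GPU count is documented under model deploy in ncloud Commands):

```shell
# Default: CPU-based deployment, unchanged from before
ncloud model deploy 123

# GPU-backed deployment of the same model
ncloud model deploy 123 --gpus 1

# Predictions against this stream now run on the GPU
ncloud stream predict <token> input.jpg
```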

Additional data files

When deploying models, you can now include auxiliary files that you can then reference in inference tasks. For example, you can provide a file containing classification labels for use in formatting the prediction output. To include a zipfile during deployment, add the -f/--extra_files flag to the ncloud model deploy command. The contents of the zipfile are automatically extracted and made available under the /code directory inside your deployed instance container, so you can reference them in inference scripts. See model deploy in the ncloud Commands documentation for complete syntax.
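For instance, to ship a labels file along with a deployment (a sketch; the file names and model ID are hypothetical):

```shell
# Bundle the auxiliary file(s) into a zip archive
zip extra.zip labels.txt

# Deploy with the archive attached; its contents are extracted
# under /code inside the instance container
ncloud model deploy 123 --extra_files extra.zip

# Inference scripts in the container can then read /code/labels.txt
# when formatting prediction output
```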

Custom scripts and extensions

When deploying models, you can now reference custom functions that either pre-process the data before it is passed to the trained model for inference or post-process the inference results after they are output from a deployed model. The first step is to create a Python script that defines the preprocess(input, model) and/or postprocess(output) functions. Next, add the script to your own repository. Finally, reference this repository during deployment by adding the --custom_code_url flag to the ncloud model deploy command. See model deploy in the ncloud Commands documentation for complete syntax.
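As a sketch, such a script might look like the following. Only the preprocess(input, model) and postprocess(output) signatures come from the docs; the function bodies, the label set, and the scaling logic here are purely illustrative:

```python
# Hypothetical pre/post-processing hooks for a deployed model.
# Only the function signatures are prescribed by ncloud; the bodies
# below are illustrative examples, not the platform's behavior.

def preprocess(input, model):
    """Normalize raw input values before they reach the trained model."""
    # Example: scale pixel values from [0, 255] down to [0.0, 1.0]
    return [x / 255.0 for x in input]

def postprocess(output):
    """Pair raw model scores with labels and rank them, highest first."""
    labels = ["cat", "dog", "bird"]  # hypothetical classification labels
    return sorted(zip(labels, output), key=lambda p: p[1], reverse=True)
```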

Defaulting to neon v1.5.4

We’ve updated the base neon instance used to process cloud training and inference jobs from v1.3.0 to v1.5.4. The biggest improvements between these versions are data loading enhancements and some faster kernels. For a full list of the updates between these releases, see the ChangeLog. Note that you can still override this and use a specific version/commit of neon by passing the -f/--framework-version flag to ncloud model train.
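For example, to pin a training job to the previous default rather than v1.5.4 (a sketch; the script name is hypothetical, and the accepted version-string format is documented under model train in ncloud Commands):

```shell
# Train with a specific neon release instead of the v1.5.4 default
ncloud model train mnist_mlp.py --framework-version v1.3.0
```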