Caffe2

The following guide applies to the DGX-1 cluster only.

Run the container image. A typical command to launch the
container is:

nvidia-docker run -it --rm -v local_dir:container_dir nvcr.io/nvidia/caffe2:<xx.xx>

Where:

  • -it runs the container in interactive mode
  • --rm deletes the container when it exits
  • -v mounts a directory or file from the host into the container
  • local_dir is the directory or file from your host
    system (absolute path) that you want to access from inside your
    container. For example, the local_dir in the following
    path is /homes/tim/data/mnist.

    -v ~tim/data/mnist:/data/mnist
    

    If you are inside the container and run, for example, ls
    /data/mnist, you will see the same files as if you had issued
    the ls /homes/tim/data/mnist command from outside
    the container.

  • container_dir is the target directory when you are
    inside your container. For example, /data/mnist is the
    target directory in the example:

    -v ~tim/data/mnist:/data/mnist
    
  • <xx.xx> is the container image tag, for example, 17.06.

You have pulled the latest files and run the container
image.
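
The placeholders described above can be put together as in the
following sketch. It assumes the example values used in this guide
(/homes/tim/data/mnist on the host, /data/mnist inside the container,
and the 17.06 tag); the variable names are illustrative only, so
substitute your own values. The script only assembles and prints the
command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Illustrative values taken from the examples in this guide; replace with your own.
LOCAL_DIR="/homes/tim/data/mnist"   # absolute path on the host
CONTAINER_DIR="/data/mnist"         # target directory inside the container
TAG="17.06"                         # container image tag

# Store the command as positional parameters so each argument stays intact.
set -- nvidia-docker run -it --rm \
    -v "${LOCAL_DIR}:${CONTAINER_DIR}" \
    "nvcr.io/nvidia/caffe2:${TAG}"

# Print the assembled command for review.
LAUNCH_CMD="$*"
echo "$LAUNCH_CMD"

# To launch the container, run: "$@"
```

Keeping the host path, container path, and tag in separate variables
makes it easier to see which part of the -v argument refers to which
side of the mount.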

Note: To share data between ranks, NCCL may require shared system
memory for IPC and pinned (page-locked) system memory resources. The
operating system’s limits on these resources may need to be increased
accordingly. Refer to your system’s documentation for details. In
particular, Docker containers default to limited shared and pinned
memory resources.

When using NCCL inside a container, it is recommended that you
increase these resources by adding the following options to the
nvidia-docker run command line:

--shm-size=1g --ulimit memlock=-1
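
As a rough sketch, the recommended options can be combined with the
typical launch command shown earlier. The 17.06 tag is only the
example value from this guide, and the variable names are
illustrative. The options are placed before the image name because
they are options to the run subcommand. The script prints the command
rather than running it.

```shell
#!/bin/sh
# Example tag from this guide; replace with the tag you pulled.
TAG="17.06"

# Raise the shared memory size and remove the locked-memory limit for NCCL.
set -- nvidia-docker run -it --rm \
    --shm-size=1g --ulimit memlock=-1 \
    "nvcr.io/nvidia/caffe2:${TAG}"

# Print the assembled command for review.
NCCL_LAUNCH_CMD="$*"
echo "$NCCL_LAUNCH_CMD"

# To launch the container, run: "$@"
```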

See /workspace/README.md inside the container for information on
customizing your Caffe2 image.

Suggested Reading

For the latest Release Notes, see the Caffe2 Release Notes
Documentation website.

For more information about Caffe2, including tutorials,
documentation, and examples, see: