CNTK

What is the Cognitive Toolkit?

The Cognitive Toolkit, formerly known as CNTK, is a unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph. In this directed graph, leaf nodes represent input values or network parameters, while other nodes represent matrix operations upon their inputs.

Running the Cognitive Toolkit

  • Run the container image. A typical command to launch the container is:
    nvidia-docker run -it --rm -v local_dir:container_dir nvcr.io/nvidia/cntk:<xx.xx>
    
    

    Where:

    • -it runs the container in interactive mode
    • --rm deletes the container when it exits
    • -v mounts a host directory or file into the container
    • local_dir is the directory or file from your host system (absolute path) that you want to access from inside your container. For example, the local_dir in the following path is /home/jsmith/data/mnist.
      -v /home/jsmith/data/mnist:/data/mnist
      
      

      From inside the container, running ls /data/mnist shows the same files as running ls /home/jsmith/data/mnist from outside the container.

    • container_dir is the target directory when you are inside your container. For example, /data/mnist is the target directory in the example:
      -v /home/jsmith/data/mnist:/data/mnist
      
      
    • <xx.xx> is the tag. For example, 17.06.
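
As a small sketch of how the pieces above fit together, the following script builds the -v argument from a host directory and a container directory, then prints the resulting command instead of running it. The variable names (LOCAL_DIR, CONTAINER_DIR, TAG) and the helper make_mount are illustrative assumptions, not part of the container; the paths and tag are the example values used above.

```shell
#!/bin/sh
# Hypothetical helper: build a "host:container" mount spec for -v.
# It rejects relative host paths, because the text above asks for an
# absolute path on the host side.
make_mount() {
    case "$1" in
        /*) printf '%s:%s\n' "$1" "$2" ;;
        *)  echo "error: host path must be absolute: $1" >&2; return 1 ;;
    esac
}

LOCAL_DIR=/home/jsmith/data/mnist   # host directory (example value)
CONTAINER_DIR=/data/mnist           # path seen inside the container
TAG=17.06                           # image tag (example value)

MOUNT=$(make_mount "$LOCAL_DIR" "$CONTAINER_DIR") || exit 1

# Dry run: print the command rather than executing it.
echo nvidia-docker run -it --rm -v "$MOUNT" "nvcr.io/nvidia/cntk:$TAG"
```

Removing the echo would launch the container, assuming nvidia-docker is installed and the image has been pulled.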

    a. When running on a single GPU, invoke the Cognitive Toolkit with a command similar to the following:

    cntk configFile=myscript.cntk ...

    
    

    b. When running on multiple GPUs, run the Cognitive Toolkit through MPI. The following example uses four GPUs, numbered 0..3, for training:

    export OMP_NUM_THREADS=10
    export CUDA_DEVICE_ORDER=PCI_BUS_ID
    export CUDA_VISIBLE_DEVICES=0,1,2,3
    mpirun --allow-run-as-root --oversubscribe --npernode 4 \
           -x OMP_NUM_THREADS -x CUDA_DEVICE_ORDER -x CUDA_VISIBLE_DEVICES \
           cntk configFile=myscript.cntk ...
    
    

    c. Running on all eight GPUs of a DGX-1 together is even simpler:

    export OMP_NUM_THREADS=10
    mpirun --allow-run-as-root --oversubscribe --npernode 8 \
           -x OMP_NUM_THREADS cntk configFile=myscript.cntk ...
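
In the multi-GPU examples above, the --npernode value matches the number of GPUs in use. The following is a minimal sketch, assuming one MPI rank per visible GPU, that derives that count from CUDA_VISIBLE_DEVICES so the two values stay in sync. The variable NUM_GPUS is an illustrative name, and the script only prints the mpirun command.

```shell
#!/bin/sh
export OMP_NUM_THREADS=10
export CUDA_DEVICE_ORDER=PCI_BUS_ID
export CUDA_VISIBLE_DEVICES=0,1,2,3

# Count the comma-separated entries in CUDA_VISIBLE_DEVICES.
NUM_GPUS=$(printf '%s\n' "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | grep -c .)

# Dry run: print the command; remove "echo" to launch it inside the container.
echo mpirun --allow-run-as-root --oversubscribe --npernode "$NUM_GPUS" \
     -x OMP_NUM_THREADS -x CUDA_DEVICE_ORDER -x CUDA_VISIBLE_DEVICES \
     cntk configFile=myscript.cntk
```

Changing CUDA_VISIBLE_DEVICES to, for example, 0,1 would then adjust --npernode to 2 without a second edit.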
    
    

    When running the Cognitive Toolkit containers, it is important to include at least the following options:

    nvidia-docker run --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864 ... nvcr.io/nvidia/cntk:17.02 ...
    
    

    You might want to pull in data and model descriptions from locations outside the container for use by the Cognitive Toolkit. The easiest way to do this is to mount one or more host directories as Docker data volumes. You have pulled the latest files and run the container image.
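
The recommended options and the volume mounts can be combined in one command. The following sketch, which only prints the command, appends one -v flag per mount to the minimum options listed above. The MOUNTS list, the /home/jsmith/models path, and the CMD variable are hypothetical; the tag is an example value.

```shell
#!/bin/sh
TAG=17.06   # example tag

# Hypothetical host:container pairs, one per line.
# Paths containing spaces are not handled by this simple loop.
MOUNTS="/home/jsmith/data/mnist:/data/mnist
/home/jsmith/models:/models"

# Start with the options the text above lists as the minimum.
set -- --shm-size=1g --ulimit memlock=-1 --ulimit stack=67108864

# Append one -v flag per mount.
for m in $MOUNTS; do
    set -- "$@" -v "$m"
done

# Dry run: assemble and print the command instead of executing it.
CMD="nvidia-docker run -it --rm $* nvcr.io/nvidia/cntk:$TAG"
echo "$CMD"
```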

Note: In order to share data between ranks, NCCL may require shared system memory for IPC and pinned (page-locked) system memory resources. The operating system’s limits on these resources may need to be increased accordingly. Refer to your system’s documentation for details. In particular, Docker containers default to limited shared and pinned memory resources. When using NCCL inside a container, it is recommended that you increase these resources by issuing:

   --shm-size=1g --ulimit memlock=-1

in the command line to nvidia-docker run.
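
To confirm that these options took effect, the limits can be inspected from a shell inside the running container. This is a minimal sketch using standard tools (df, awk, and the shell's ulimit builtin); the variable names are illustrative.

```shell
#!/bin/sh
# Size of the shared-memory filesystem in KB (controlled by --shm-size).
SHM_KB=$(df -Pk /dev/shm 2>/dev/null | awk 'NR==2 {print $2}')

# Maximum locked (pinned) memory (controlled by --ulimit memlock).
MEMLOCK=$(ulimit -l)

echo "shared memory (/dev/shm): ${SHM_KB:-unknown} KB"
echo "max locked memory:        $MEMLOCK"
```

With the options above, the shared-memory size should be close to 1 GB and the locked-memory limit should read unlimited; smaller values suggest the container was started without them.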

  1. See /workspace/README.md inside the container for information on customizing your Cognitive Toolkit image.

Suggested Reading

For the latest Release Notes, see the Cognitive Toolkit Release Notes Documentation website.

For more information about the Cognitive Toolkit, including tutorials, documentation, and examples, see the CNTK wiki.