How do I install Theano on Ubuntu 14.04 x64 and configure it to use the GPU?

I tried to follow the instructions in "Easy installation of optimized theano on current Ubuntu", but they don't work: whenever I run a Theano script on the GPU, I get this error message:

CUDA is installed, but device gpu is not available (error: Unable to get the number of gpus available: no CUDA-capable device is detected)


More specifically, following the linked web page, I ran these steps:

# Install Theano
sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libopenblas-dev git
sudo pip install Theano

# Install Nvidia drivers and CUDA
sudo apt-get install nvidia-current
sudo apt-get install nvidia-cuda-toolkit

      

Then I rebooted and tried to run:

THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python gpu_test.py # gpu_test.py comes from http://deeplearning.net/software/theano/tutorial/using_gpu.html

      

But I am getting:

f@f-Aurora-R4:~$ THEANO_FLAGS='mode=FAST_RUN,device=gpu,floatX=float32,cuda.root=/usr/lib/nvidia-cuda-toolkit' python gpu_test.py
WARNING (theano.sandbox.cuda): CUDA is installed, but device gpu is not available (error: Unable to get the number of gpus available: no CUDA-capable device is detected)
[Elemwise{exp,no_inplace}(<TensorType(float32, vector)>)]
Looping 1000 times took 2.199992 seconds
Result is [ 1.23178029 1.61879337 1.52278066 ..., 2.20771813 2.29967761 1.62323284]
Used the cpu

      



1 answer


(I tested the following on Ubuntu 14.04.4 LTS x64 and Kubuntu 14.04.4 LTS x64; I think it should work on most Ubuntu flavors.)

Installing Theano and Configuring the GPU (CUDA)

The instructions on the official website are out of date. Instead, you can use the following instructions (assuming a fresh install of Kubuntu 14.04 LTS x64):

# Install Theano
sudo apt-get install python-numpy python-scipy python-dev python-pip python-nose g++ libopenblas-dev git
sudo pip install Theano

# Install Nvidia drivers, CUDA and CUDA toolkit, following some instructions from http://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html
wget http://developer.download.nvidia.com/compute/cuda/7.5/Prod/local_installers/cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb # Got the link at https://developer.nvidia.com/cuda-downloads
sudo dpkg -i cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb
sudo apt-get update
sudo apt-get install cuda

sudo reboot

      

At this point, running nvidia-smi should work, but running nvcc won't work.

# Execute in console, or (add in ~/.bash_profile then run "source ~/.bash_profile"):
export PATH=/usr/local/cuda-7.5/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-7.5/lib64:$LD_LIBRARY_PATH

      

At this point, both nvidia-smi and nvcc should work.
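A quick way to confirm that the two exports took effect is to look for both executables on PATH. The following is only a minimal Python sketch, not part of the original instructions; find_on_path is a hypothetical helper name chosen for illustration.

```python
import os


def find_on_path(program, path=None):
    """Return the full path of `program` if a PATH-style string contains it, else None."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, program)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


if __name__ == "__main__":
    for tool in ("nvidia-smi", "nvcc"):
        print("%-10s -> %s" % (tool, find_on_path(tool) or "not found on PATH"))
```

If nvcc is reported as not found, the most likely cause is that the export lines were run in a different shell session or that ~/.bash_profile was not sourced.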

To check if Theano can use the GPU:

Copy-paste the following into gpu_test.py:

# Start gpu_test.py
# From http://deeplearning.net/software/theano/tutorial/using_gpu.html#using-gpu
from theano import function, config, shared, sandbox
import theano.tensor as T
import numpy
import time

vlen = 10 * 30 * 768  # 10 x #cores x # threads per core
iters = 1000

rng = numpy.random.RandomState(22)
x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
f = function([], T.exp(x))
print(f.maker.fgraph.toposort())
t0 = time.time()
for i in xrange(iters):
    r = f()
t1 = time.time()
print("Looping %d times took %f seconds" % (iters, t1 - t0))
print("Result is %s" % (r,))
if numpy.any([isinstance(x.op, T.Elemwise) for x in f.maker.fgraph.toposort()]):
    print('Used the cpu')
else:
    print('Used the gpu')
# End gpu_test.py

      

and run it:

THEANO_FLAGS='mode=FAST_RUN,device=gpu,floatX=float32' python gpu_test.py

      

which should return:

f@f-Aurora-R4:~$ THEANO_FLAGS='mode=FAST_RUN,device=gpu,floatX=float32' python gpu_test.py
Using gpu device 0: GeForce GTX 690
[GpuElemwise{exp,no_inplace}(<CudaNdarrayType(float32, vector)>), HostFromGpu(GpuElemwise{exp,no_inplace}.0)]
Looping 1000 times took 0.658292 seconds
Result is [ 1.23178029  1.61879349  1.52278066 ...,  2.20771813  2.29967761
  1.62323296]
Used the gpu
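The same flags can also be set from inside Python instead of on the command line, as long as this happens before Theano is imported, because THEANO_FLAGS is read at import time. A minimal sketch, assuming nothing else in the process has imported Theano yet (build_theano_flags is a hypothetical helper name):

```python
import os


def build_theano_flags(options):
    """Join (name, value) pairs into the comma-separated THEANO_FLAGS format."""
    return ",".join("%s=%s" % (name, value) for name, value in options)


flags = build_theano_flags([
    ("mode", "FAST_RUN"),
    ("device", "gpu"),
    ("floatX", "float32"),
])

# THEANO_FLAGS is read when theano is first imported, so set it beforehand.
os.environ["THEANO_FLAGS"] = flags
print(flags)

# import theano  # uncomment on a machine where Theano is installed
```

Keeping the flags in a list of pairs makes it easy to add or drop a single flag without editing a long string by hand.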

      

To find out the CUDA version:

nvcc -V

      

Example:

username@server:~$ nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17
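If a script needs the release number, it can be pulled out of that output with a regular expression. This is a small sketch that parses the sample text shown above; parse_nvcc_release is a hypothetical helper, and the "release X.Y" pattern is assumed from this one sample rather than from any documented format.

```python
import re

# Sample output copied from the nvcc -V example above.
SAMPLE = """nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2015 NVIDIA Corporation
Built on Tue_Aug_11_14:27:32_CDT_2015
Cuda compilation tools, release 7.5, V7.5.17
"""


def parse_nvcc_release(text):
    """Return (major, minor) from 'release X.Y' in nvcc -V output, or None."""
    match = re.search(r"release\s+(\d+)\.(\d+)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(parse_nvcc_release(SAMPLE))  # prints (7, 5)
```

On a real machine the text could come from subprocess.check_output(["nvcc", "-V"]), decoded to a string first on Python 3.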

      


Adding cuDNN

To add cuDNN (instructions from http://deeplearning.net/software/theano/library/sandbox/cuda/dnn.html ):

  • Download cuDNN from https://developer.nvidia.com/rdp/cudnn-download (free registration required)
  • Extract the archive: tar -xvf cudnn-7.0-linux-x64-v3.0-prod.tgz
  • Do one of the following:

Option 1: Copy the *.h files to CUDA_ROOT/include and the *.so* files to CUDA_ROOT/lib64 (by default, CUDA_ROOT is /usr/local/cuda on Linux).

sudo cp cuda/lib64/* /usr/local/cuda/lib64/
sudo cp cuda/include/cudnn.h /usr/local/cuda/include/

      

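Option 2: Leave the files where they were extracted and add the cuDNN folder to the relevant environment variables: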

export LD_LIBRARY_PATH=/home/user/path_to_CUDNN_folder/lib64:$LD_LIBRARY_PATH
export CPATH=/home/user/path_to_CUDNN_folder/include:$CPATH
export LIBRARY_PATH=/home/user/path_to_CUDNN_folder/lib64:$LD_LIBRARY_PATH

      

By default, Theano will detect whether it can use cuDNN. If so, it will use it. If not, Theano's optimizations will not introduce cuDNN operations, so Theano will still work as long as the user has not introduced them manually.

To get an error message if Theano cannot use cuDNN, use this Theano flag: optimizer_including=cudnn.

Example:

THEANO_FLAGS='mode=FAST_RUN,device=gpu,floatX=float32,optimizer_including=cudnn' python gpu_test.py

      



To check your cuDNN version:

cat /usr/local/cuda/include/cudnn.h | grep CUDNN_MAJOR -A 2
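The grep prints the three #define lines that make up the version. Below is a rough Python equivalent, offered as a sketch: the header text is a made-up sample (the values 5, 0, 5 are illustrative, not read from a real file), parse_cudnn_version is a hypothetical helper, and the combined number assumes the major * 1000 + minor * 100 + patchlevel scheme, which is consistent with the "cuDNN 5005" string Theano prints.

```python
import re

# Illustrative header fragment; on a real system read
# /usr/local/cuda/include/cudnn.h instead.
SAMPLE_HEADER = """
#define CUDNN_MAJOR      5
#define CUDNN_MINOR      0
#define CUDNN_PATCHLEVEL 5
"""


def parse_cudnn_version(header_text):
    """Return (major, minor, patchlevel) from cudnn.h text, or None if missing."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#define\s+%s\s+(\d+)" % name
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


major, minor, patch = parse_cudnn_version(SAMPLE_HEADER)
combined = major * 1000 + minor * 100 + patch  # assumed numbering scheme
print("cuDNN %d.%d.%d (combined: %d)" % (major, minor, patch, combined))
```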

      


Adding CNMeM

The CNMeM library is "A simple library that helps Deep Learning frameworks manage CUDA memory."

# Build CNMeM without the unit tests
git clone https://github.com/NVIDIA/cnmem.git cnmem
cd cnmem
mkdir build
cd build
sudo apt-get install -y cmake
cmake ..
make

# Copy files to proper location
sudo cp ../include/cnmem.h /usr/local/cuda/include
sudo cp *.so /usr/local/cuda/lib64/
cd ../..

      

To use it with Theano, add the lib.cnmem flag. Example:

THEANO_FLAGS='mode=FAST_RUN,device=gpu,floatX=float32,lib.cnmem=0.8,optimizer_including=cudnn' python gpu_test.py

      

The first line of the script's output should be:

Using gpu device 0: GeForce GTX TITAN X (CNMeM is enabled with initial size: 80.0% of memory, cuDNN 5005)

      

lib.cnmem=0.8 means that CNMeM starts with a memory pool of 80% of the GPU's memory.
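To make the arithmetic concrete, here is a sketch of how the flag value maps to an initial pool size. It follows my reading of the Theano configuration documentation (0 disables CNMeM, a value up to 1 is a fraction of total GPU memory capped at 0.95, and a value above 1 is a size in megabytes); treat those rules as assumptions to verify against your Theano version. The function name and the 12288 MB total are hypothetical.

```python
def cnmem_initial_pool_mb(lib_cnmem, total_gpu_mb):
    """Estimate the initial CNMeM pool size in MB for a given lib.cnmem value.

    Assumed rules (check the Theano config docs for your version):
      0          -> CNMeM disabled
      0 < v <= 1 -> fraction of total GPU memory, capped at 0.95
      v > 1      -> absolute size in megabytes
    """
    if lib_cnmem < 0:
        raise ValueError("lib.cnmem must be non-negative")
    if lib_cnmem == 0:
        return 0.0
    if lib_cnmem <= 1:
        return min(lib_cnmem, 0.95) * total_gpu_mb
    return float(lib_cnmem)


# Hypothetical card with 12288 MB of memory.
print(cnmem_initial_pool_mb(0.8, 12288))
```

With lib.cnmem=0.8 on such a card, the pool would start at roughly 9830 MB, leaving the rest for the driver and other processes.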

CNMeM is reported to provide some interesting speed improvements and is supported by Theano, Torch and Caffe.

Theano - source 1:

The speed-up depends on many factors, such as the shapes and the model itself. The speed-up ranges from 0 to 2x.

Theano - source 2:

If you don't change Theano's allow_gc flag, you can expect up to a 20% speed-up on the GPU. In some cases (small models), we saw a 50% speed-up.


Running Theano on Multiple CPU Cores

As a side note, you can run Theano on multiple CPU cores by setting the OMP_NUM_THREADS=[number_of_cpu_cores] environment variable. Example:

OMP_NUM_THREADS=4 python gpu_test.py 
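The variable can also be set from within a script, provided this is done before NumPy or Theano is imported, since OpenMP runtimes typically read it when they initialize. A minimal sketch, assuming you want one thread per logical core reported by the operating system:

```python
import multiprocessing
import os

# Number of logical cores; this counts hyper-threads, so it may exceed
# the number of physical cores.
n_cores = multiprocessing.cpu_count()

# setdefault keeps any value already exported in the shell.
os.environ.setdefault("OMP_NUM_THREADS", str(n_cores))
print("OMP_NUM_THREADS=%s" % os.environ["OMP_NUM_THREADS"])

# import theano  # import only after the variable is set
```

Using setdefault means a value given on the command line, as in the example above, still takes precedence.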

      

The script theano/misc/check_blas.py outputs information about which BLAS is being used:

cd [theano_git_directory]
OMP_NUM_THREADS=4 python theano/misc/check_blas.py

      


To run the Theano test suite:

nosetests theano

      

or

# In a shell:
sudo pip install nose-parameterized
# Then, in a Python interpreter:
import theano
theano.test()

      

Common problems:

  • Import theano: AttributeError: 'module' object has no attribute 'find_graphviz'