Hey Guys. I have spent days if not months installing CUDA 7.0 and CUDA 7.5 in Ubuntu 14.04 and Ubuntu 16.04 over different laptops(Dell and Asus). I finally got them working. While there are numerous tutorials present over the web, most are incomplete and error prone. The difficulty comes in installing the Nvidia drivers. If one does a mistake then the system crashes(since you need Nvidia drivers for Cuda only but not for graphics display. The default settings messes the OpenGL display drivers) and you need to do a fresh install (again there are many messy ways to escape without a fresh install but lets try to make it correct from the start anyways).
Instructions on Ubuntu 16.04/14.04 after a fresh install
(1.) Install build essentials.
$ sudo apt-get install build-essential
(2.) Go to https://developer.nvidia.com/cuda-downloads and download CUDA toolkit 7.5 for Ubuntu 15.04 (No Version supports 16.04 yet) or if you are on Ubuntu 14.04, just choose that. I have tested the 64 bit version but I think the 32 bit will work too if your machine is 32 bit.
(3.) Open up a terminal and extract the separate installers via:
$ mkdir ~/Downloads/nvidia_installers; $ cd ~/Downloads $ ./cuda_7.5.18_linux.run -extract=~/Downloads/nvidia_installers;
(4.) Completely uninstall anything in the ubuntu repositories with
nvidia-*. I used synaptic and did a purge, AKA completely uninstall programs and configuration.
$ sudo apt-get --purge remove nvidia-*
(5.) No need to create an xorg.conf file. If you have one, remove it (assuming you have a fresh OS install).
$ sudo rm /etc/X11/xorg.conf
(6.) Create the
/etc/modprobe.d/blacklist-nouveau.conf file with the 2 following lines:
blacklist nouveau options nouveau modeset=0
Then do a
$ sudo update-initramfs -u
(7.) Reboot computer. Nothing should have changed in loading up menu. You should be taken to the login screen. Once there type: Ctrl + Alt + F1, and login to your user. Keep the next commands handy in another machine since now you are in tty.
(8.) In tty:
cd ~/Downloads/nvidia_installers; sudo service lightdm stop
The top line is a necessary step for installing the driver.
(9.) [For Ubuntu 14.04]
sudo ./NVIDIA-Linux-x86_64-352.39.run --no-opengl-files
[For Ubuntu 16.04] –> Download nvidia-367 instead of the default nvidia-352 that comes with the toolkit from here: http://in.download.nvidia.com/XFree86/Linux-x86_64/367.27/NVIDIA-Linux-x86_64-367.27.run then do
sudo ./NVIDIA-Linux-x86_64-367.27.run --no-opengl-files
I cannot stress how important is the opengl flag in the above command. If you miss that, either you will get stuck in “login loop” or your computer would boot with a black screen at all times.
(10.) Now install the toolkit also
sudo ./cuda-linux64-rel-6.0.37-18176142.run sudo ./cuda-samples-linux-6.0.37-18176142.run
(11.) Set Environment path variables in .bashrc:
$ export PATH=/usr/local/cuda-7.5/bin:$PATH $ export LD_LIBRARY_PATH=/usr/local/cuda-7.5/lib64:$LD_LIBRARY_PATH
(12.) Verify the driver version:
$ cat /proc/driver/nvidia/version My current resutls are: NVRM version: NVIDIA UNIX x86_64 Kernel Module 367.27 Thu Jun 9 18:53:27 PDT 2016 GCC version: gcc version 5.3.1 20160413 (Ubuntu 5.3.1-14ubuntu2.1)
(13.) Check CUDA driver version:
$ nvcc -V
(14.) At this point you can switch the lightdm back on again by doing:
$ sudo service lightdm start.
You are done if on Ubuntu 14.04 & go to step 17. If on Ubuntu 16.04, the gcc version is higher than what is supported by any CUDA toolkit right now.
READ ON FOR UBUNTU 16.04 ONLY
(15.) Fix/break the header file that doesn’t want to let us use gcc > 4.8. All we are going to do is comment out (//) the error line that drops you out of a build.
$ sudo vim /usr/local/cuda/include/host_config.h
line: 115 comment out error //#error – unsupported GNU version! gcc versions later than 4.9 are not supported!
(16.) To see if we are properly done with the installation, we need to run the samples that came along the downloaded toolkit runfile. By default it is installed in /usr/local/cuda/samples. Go there.
$ cd /usr/local/cuda/samples $ grep -r nvidia-352 -l --null . | sudo xargs -0 sed -i 's#nvidia-352#nvidia-367#g'
The above command replaces all the places where sample’s default nvidia-352 driver was used with nvidia-367
(17.) BOTH 16.04 and 14.04
$ cd /usr/local/cuda/samples/1_Utilities/deviceQuery $ sudo make $ ./deviceQuery
Something like this should show up
magneto@magneto-dell:/usr/local/cuda/samples/1_Utilities/deviceQuery$ ./deviceQuery ./deviceQuery Starting... CUDA Device Query (Runtime API) version (CUDART static linking) Detected 1 CUDA Capable device(s) Device 0: "Quadro M1000M" CUDA Driver Version / Runtime Version 8.0 / 7.5 CUDA Capability Major/Minor version number: 5.0 Total amount of global memory: 2002 MBytes (2099642368 bytes) ( 4) Multiprocessors, (128) CUDA Cores/MP: 512 CUDA Cores GPU Max Clock rate: 1072 MHz (1.07 GHz) Memory Clock rate: 2505 Mhz Memory Bus Width: 128-bit L2 Cache Size: 2097152 bytes Maximum Texture Dimension Size (x,y,z) 1D=(65536), 2D=(65536, 65536), 3D=(4096, 4096, 4096) Maximum Layered 1D Texture Size, (num) layers 1D=(16384), 2048 layers Maximum Layered 2D Texture Size, (num) layers 2D=(16384, 16384), 2048 layers Total amount of constant memory: 65536 bytes Total amount of shared memory per block: 49152 bytes Total number of registers available per block: 65536 Warp size: 32 Maximum number of threads per multiprocessor: 2048 Maximum number of threads per block: 1024 Max dimension size of a thread block (x,y,z): (1024, 1024, 64) Max dimension size of a grid size (x,y,z): (2147483647, 65535, 65535) Maximum memory pitch: 2147483647 bytes Texture alignment: 512 bytes Concurrent copy and kernel execution: Yes with 1 copy engine(s) Run time limit on kernels: No Integrated GPU sharing Host Memory: No Support host page-locked memory mapping: Yes Alignment requirement for Surfaces: Yes Device has ECC support: Disabled Device supports Unified Addressing (UVA): Yes Device PCI Domain ID / Bus ID / location ID: 0 / 1 / 0 Compute Mode: < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) > deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 8.0, CUDA Runtime Version = 7.5, NumDevs = 1, Device0 = Quadro M1000M Result = PASS