CUDA Memory

CUDA is widely used in deep learning. Although many deep learning practitioners never work with CUDA directly, most of them already rely on it, because frameworks such as PyTorch provide GPU support through CUDA.

To optimize the computational efficiency of our models, it is crucial to understand how data moves between the host and the device. In this note, we build up the fundamentals of memory transfer for CUDA.

Segmented Memory and Paged Memory

CUDA Cannot Use Paged Memory

A CPU host uses paged memory. However, the GPU cannot directly access data in pageable memory on the host [1]. Before the transfer, the CUDA driver has to stage the data in pinned, that is, page-locked, host memory [2]. Pinned memory stays in physical memory and is never moved to secondary storage, so the GPU does not need the CPU to page the memory in or out.

[1, 2] Harris M. How to Optimize Data Transfers in CUDA C/C++. In: NVIDIA Technical Blog [Internet]. 5 Dec 2012 [cited 19 Oct 2022]. Available: https://developer.nvidia.com/blog/how-optimize-data-transfers-cuda-cc/
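
As a minimal sketch of the pinning step in PyTorch, a host tensor can be page-locked with `Tensor.pin_memory()` and then copied to the GPU. The helper name `to_pinned` and the tensor shape are illustrative and do not come from the reference. The sketch assumes that pinning needs a CUDA runtime, so it falls back to pageable memory when no CUDA device is found.

```python
import torch


def to_pinned(host_tensor: torch.Tensor) -> torch.Tensor:
    """Return a page-locked copy of a CPU tensor when CUDA is available.

    Without a CUDA device the tensor is returned unchanged (assumption:
    pinning is only useful, and only supported, with a CUDA runtime).
    """
    if torch.cuda.is_available():
        return host_tensor.pin_memory()
    return host_tensor


# Illustrative data; the shape is arbitrary.
host = torch.arange(1024, dtype=torch.float32)
staged = to_pinned(host)

if torch.cuda.is_available():
    # From pinned memory, the copy can be issued asynchronously
    # with respect to the host.
    on_gpu = staged.to("cuda", non_blocking=True)
    torch.cuda.synchronize()  # wait for the copy before using the data
    print(staged.is_pinned(), on_gpu.device)
else:
    print("No CUDA device found; the tensor stays in pageable memory.")
```

Note that `pin_memory()` returns a new tensor in page-locked memory rather than pinning the original in place, so the pinned copy is the one to hand to the GPU.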

Pinned Memory is Fast

I took two screenshots from a video by CoffeeBeforeArch.

[Screenshot: Unpinned Memory]

[Screenshot: Pinned Memory]

CoffeeBeforeArch. CUDA Crash Course (v2): Pinned Memory. YouTube. 2019. Available: https://www.youtube.com/watch?v=ShT7raBPP8k
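
To reproduce this kind of comparison locally, the following is a rough benchmark sketch, assuming PyTorch and a CUDA device. The helper name `time_h2d_copy`, the buffer size, and the repeat count are my own choices rather than anything taken from the video, and the measured times will vary with the hardware and driver.

```python
import time

import torch


def time_h2d_copy(num_bytes: int, pinned: bool, repeats: int = 10) -> float:
    """Average wall-clock time in milliseconds of one host-to-device copy."""
    host = torch.empty(num_bytes, dtype=torch.uint8, pin_memory=pinned)
    device_buf = torch.empty(num_bytes, dtype=torch.uint8, device="cuda")

    device_buf.copy_(host)  # warm-up copy, excluded from the timing
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(repeats):
        # non_blocking only makes a difference when the source is pinned.
        device_buf.copy_(host, non_blocking=pinned)
    torch.cuda.synchronize()  # make sure every copy has finished
    elapsed = time.perf_counter() - start
    return elapsed / repeats * 1e3


results = {}
if torch.cuda.is_available():
    num_bytes = 64 * 1024 * 1024  # 64 MiB, an arbitrary test size
    for label, pinned in (("pageable", False), ("pinned", True)):
        results[label] = time_h2d_copy(num_bytes, pinned)
        print(f"{label:>8}: {results[label]:.3f} ms per copy")
else:
    print("No CUDA device found; skipping the benchmark.")
```

The time to pin the host buffer is deliberately left out of the measurement, so this only compares the copy itself.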

Why Don’t We Pin Memory All the Time in the PyTorch DataLoader?

The DataLoader in PyTorch provides the option pin_memory. By default, this option is set to False. It is tempting to set it to True all the time.

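However, pinning memory also costs time and computing capacity, and may cause issues [3, 4]. Page-locked memory cannot be swapped out, so pinning too much of it leaves less physical memory for the operating system and other programs.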
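
One cautious pattern is to enable `pin_memory` only when a GPU will actually consume the batches. The sketch below assumes a toy `TensorDataset` whose sizes and variable names are made up for illustration. It is one possible setup, not a recommendation from the PyTorch documentation.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Toy dataset; the sizes are illustrative only.
features = torch.randn(64, 8)
labels = torch.randint(0, 2, (64,))
dataset = TensorDataset(features, labels)

use_cuda = torch.cuda.is_available()
device = torch.device("cuda" if use_cuda else "cpu")

# Pin only when a GPU will consume the batches;
# without a GPU, pinning would be pure overhead.
loader = DataLoader(dataset, batch_size=16, pin_memory=use_cuda)

seen = 0
for x, y in loader:
    # non_blocking=True only helps for copies from pinned memory to the GPU.
    x = x.to(device, non_blocking=use_cuda)
    y = y.to(device, non_blocking=use_cuda)
    seen += x.shape[0]

print(f"moved {seen} samples to {device}")
```

Whether pinning pays off depends on the batch size and on how much host memory is free, so it is worth measuring both settings on the actual workload.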

L Ma (2022). 'CUDA Memory', Datumorphism, 10 April. Available at: https://datumorphism.leima.is/cards/machine-learning/practice/cuda-memory/.