NVTX C API Reference v3
NVIDIA Tools Extension Library
Memory CUDA Runtime

Data Structures

struct  nvtxMemCudaArrayRangeDesc_v1
 Structure describing a range of memory in a CUDA runtime array object (cudaArray_t).
 
struct  nvtxMemCuArrayRangeDesc_v1
 Structure describing a range of memory in a CUDA driver array object (CUarray).
 
struct  nvtxMemMarkInitializedBatch_v1
 Mark memory ranges as initialized.
 

Macros

#define NVTX_MEM_TYPE_CUDA_ARRAY   0x11
 The memory is from a CUDA runtime array.
 
#define NVTX_MEM_TYPE_CU_ARRAY   0x12
 The memory is from a CUDA device array.
 
#define NVTX_MEM_CUDA_PEER_ALL_DEVICES   -1
 

Typedefs

typedef nvtxMemCudaArrayRangeDesc_v1 nvtxMemCudaArrayRangeDesc_t
 
typedef nvtxMemCuArrayRangeDesc_v1 nvtxMemCuArrayRangeDesc_t
 
typedef nvtxMemMarkInitializedBatch_v1 nvtxMemMarkInitializedBatch_t
 

Functions

NVTX_DECLSPEC nvtxMemPermissionsHandle_t NVTX_API nvtxMemCudaGetProcessWidePermissions (nvtxDomainHandle_t domain)
 Get the permissions object that represents the CUDA runtime device or CUDA driver context.
 
NVTX_DECLSPEC nvtxMemPermissionsHandle_t NVTX_API nvtxMemCudaGetDeviceWidePermissions (nvtxDomainHandle_t domain, int device)
 Get the permissions object that represents the CUDA runtime device or CUDA driver context.
 
NVTX_DECLSPEC void NVTX_API nvtxMemCudaSetPeerAccess (nvtxDomainHandle_t domain, nvtxMemPermissionsHandle_t permissions, int devicePeer, uint32_t flags)
 Change the default behavior for all memory mapped in from a particular device.
 
NVTX_DECLSPEC void NVTX_API nvtxMemCudaMarkInitialized (nvtxDomainHandle_t domain, cudaStream_t stream, uint8_t isPerThreadStream, nvtxMemMarkInitializedBatch_t const *desc)
 Mark the memory ranges described by a batch as initialized on a CUDA stream.
 

Detailed Description

See page PAGE_MEMORY_CUDART.

Macro Definition Documentation

◆ NVTX_MEM_CUDA_PEER_ALL_DEVICES

#define NVTX_MEM_CUDA_PEER_ALL_DEVICES   -1

Definition at line 101 of file nvToolsExtMemCudaRt.h.

◆ NVTX_MEM_TYPE_CU_ARRAY

#define NVTX_MEM_TYPE_CU_ARRAY   0x12

The memory is from a CUDA device array.

Relevant functions: cuArrayCreate, cuArray3DCreate. Also applies to a CUarray obtained from other types, such as CUmipmappedArray.

NVTX_MEM_HEAP_HANDLE_PROCESS_WIDE is not supported.

nvtxMemHeapRegister receives a heapDesc of type cudaArray_t because the description can be retrieved by tools through cudaArrayGetInfo().

nvtxMemRegionRegisterEx receives a regionDesc of type nvtxMemCuArrayRangeDesc_t.

Definition at line 84 of file nvToolsExtMemCudaRt.h.

◆ NVTX_MEM_TYPE_CUDA_ARRAY

#define NVTX_MEM_TYPE_CUDA_ARRAY   0x11

The memory is from a CUDA runtime array.

Relevant functions: cudaMallocArray, cudaMalloc3DArray. Also applies to a cudaArray_t obtained from other types, such as cudaMipmappedArray_t.

NVTX_MEM_HEAP_HANDLE_PROCESS_WIDE is not supported.

nvtxMemHeapRegister receives a heapDesc of type cudaArray_t because the description can be retrieved by tools through cudaArrayGetInfo().

nvtxMemRegionRegisterEx receives a regionDesc of type nvtxMemCudaArrayRangeDesc_t.

Definition at line 58 of file nvToolsExtMemCudaRt.h.
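The pairing between the two array memory types and their region descriptor types, as documented above, can be summarized in a small lookup. This is only an illustrative sketch: demo_region_desc_type_name and demo_show_mapping are hypothetical helpers that are not part of NVTX, and the macro values are copied from this page so that the snippet builds without the NVTX or CUDA headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the macro documentation above; guarded so the
 * sketch also compiles if the real NVTX header was included first. */
#ifndef NVTX_MEM_TYPE_CUDA_ARRAY
#define NVTX_MEM_TYPE_CUDA_ARRAY 0x11
#endif
#ifndef NVTX_MEM_TYPE_CU_ARRAY
#define NVTX_MEM_TYPE_CU_ARRAY 0x12
#endif

/* Hypothetical helper (not part of NVTX): returns the name of the
 * region descriptor type documented for a memory type, or NULL when
 * the memory type is not one of the two CUDA array types. */
static const char *demo_region_desc_type_name(uint32_t memType)
{
    switch (memType) {
    case NVTX_MEM_TYPE_CUDA_ARRAY:
        return "nvtxMemCudaArrayRangeDesc_t";
    case NVTX_MEM_TYPE_CU_ARRAY:
        return "nvtxMemCuArrayRangeDesc_t";
    default:
        return NULL;
    }
}

/* Hypothetical usage: checks that the runtime and driver array types
 * map to different descriptors. Returns 1 if they do, 0 otherwise. */
static int demo_show_mapping(void)
{
    const char *rt = demo_region_desc_type_name(NVTX_MEM_TYPE_CUDA_ARRAY);
    const char *drv = demo_region_desc_type_name(NVTX_MEM_TYPE_CU_ARRAY);
    assert(rt != NULL && drv != NULL);
    return strcmp(rt, drv) != 0;
}
```

Keeping the mapping in one place makes it harder to pass a runtime-array descriptor together with the driver-array memory type, or the reverse.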

Typedef Documentation

◆ nvtxMemCuArrayRangeDesc_t

Definition at line 97 of file nvToolsExtMemCudaRt.h.

◆ nvtxMemCudaArrayRangeDesc_t

Definition at line 71 of file nvToolsExtMemCudaRt.h.

◆ nvtxMemMarkInitializedBatch_t

Definition at line 184 of file nvToolsExtMemCudaRt.h.

Function Documentation

◆ nvtxMemCudaGetDeviceWidePermissions()

NVTX_DECLSPEC nvtxMemPermissionsHandle_t NVTX_API nvtxMemCudaGetDeviceWidePermissions ( nvtxDomainHandle_t  domain,
int  device 
)

Get the permissions object that represents the CUDA runtime device or CUDA driver context.

This object allows developers to adjust the permissions applied to work executed on the GPU. It may be inherited or overridden by a permissions object bound with NVTX_MEM_PERMISSIONS_BIND_SCOPE_CUDA_STREAM, depending on the binding flags.

Example: change the peer-to-peer access permissions between devices in their entirety, or punch through special holes.

By default, all memory that would naturally be accessible to a CUDA kernel is accessible, until modified by nvtxMemCudaSetPeerAccess or by changing regions.

This object should also represent the CUDA driver API level context.

◆ nvtxMemCudaGetProcessWidePermissions()

NVTX_DECLSPEC nvtxMemPermissionsHandle_t NVTX_API nvtxMemCudaGetProcessWidePermissions ( nvtxDomainHandle_t  domain)

Get the permissions object that represents the CUDA runtime device or CUDA driver context.

This object allows developers to adjust the permissions applied to work executed on the GPU. It may be inherited or overridden by a permissions object bound with NVTX_MEM_PERMISSIONS_BIND_SCOPE_CUDA_STREAM, depending on the binding flags.

Example: change the peer-to-peer access permissions between devices in their entirety, or punch through special holes.

By default, all memory that would naturally be accessible to a CUDA kernel is accessible, until modified by nvtxMemCudaSetPeerAccess or by changing regions.

This object should also represent the CUDA driver API level context.

◆ nvtxMemCudaMarkInitialized()

NVTX_DECLSPEC void NVTX_API nvtxMemCudaMarkInitialized ( nvtxDomainHandle_t  domain,
cudaStream_t  stream,
uint8_t  isPerThreadStream,
nvtxMemMarkInitializedBatch_t const *  desc 
)

Mark the memory ranges described by a batch as initialized on a CUDA stream.

stream is the CUDA stream where the range was accessed and initialized.

◆ nvtxMemCudaSetPeerAccess()

NVTX_DECLSPEC void NVTX_API nvtxMemCudaSetPeerAccess ( nvtxDomainHandle_t  domain,
nvtxMemPermissionsHandle_t  permissions,
int  devicePeer,
uint32_t  flags 
)

Change the default behavior for all memory mapped in from a particular device.

While all memory typically defaults to readable and writable, users may wish to reduce the default permissions, for example to read-only, on a per-device basis.

Regions can be used to further override smaller windows of memory.

devicePeer can be NVTX_MEM_CUDA_PEER_ALL_DEVICES
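
The per-device default described above, including the NVTX_MEM_CUDA_PEER_ALL_DEVICES sentinel, can be pictured with a toy model. The sketch below is not how NVTX or any tool implements this call. The demo_* names, the DEMO_PERM_* flag bits and the fixed device limit are all assumptions made for illustration, and only the sentinel value -1 is taken from this page.

```c
#include <assert.h>
#include <stdint.h>

/* Sentinel value copied from the macro documentation on this page. */
#ifndef NVTX_MEM_CUDA_PEER_ALL_DEVICES
#define NVTX_MEM_CUDA_PEER_ALL_DEVICES -1
#endif

/* Hypothetical stand-ins for permission flag bits. The real flag
 * values are defined elsewhere in NVTX and are not shown here. */
#define DEMO_PERM_READ  0x1u
#define DEMO_PERM_WRITE 0x2u

/* Arbitrary limit chosen for this sketch only. */
#define DEMO_MAX_DEVICES 8

/* Toy model of the per-peer defaults held by one permissions object. */
typedef struct demo_peer_defaults {
    uint32_t flags[DEMO_MAX_DEVICES];
} demo_peer_defaults;

/* Every peer starts readable and writable, matching the stated default. */
static void demo_peer_defaults_init(demo_peer_defaults *d)
{
    for (int i = 0; i < DEMO_MAX_DEVICES; ++i)
        d->flags[i] = DEMO_PERM_READ | DEMO_PERM_WRITE;
}

/* Mirrors the devicePeer and flags parameters: a device index updates
 * one entry, the all-devices sentinel updates every entry. Returns 0
 * on success and -1 if devicePeer is out of range. */
static int demo_set_peer_access(demo_peer_defaults *d, int devicePeer,
                                uint32_t flags)
{
    if (devicePeer == NVTX_MEM_CUDA_PEER_ALL_DEVICES) {
        for (int i = 0; i < DEMO_MAX_DEVICES; ++i)
            d->flags[i] = flags;
        return 0;
    }
    if (devicePeer < 0 || devicePeer >= DEMO_MAX_DEVICES)
        return -1;
    d->flags[devicePeer] = flags;
    return 0;
}

/* Looks up the default permissions currently recorded for one peer. */
static uint32_t demo_get_peer_access(const demo_peer_defaults *d, int device)
{
    assert(device >= 0 && device < DEMO_MAX_DEVICES);
    return d->flags[device];
}
```

In this model, one call with NVTX_MEM_CUDA_PEER_ALL_DEVICES followed by calls for individual devices gives a restrictive baseline with a few exceptions, which is the pattern the description suggests.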