Last edited by cs.yeung on 2016-12-30 22:31
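Multi-GPU parallel computing in MATLAB works by assigning each GPU to its own CPU core (thread) on the machine, which MATLAB calls a worker; parallel computing across several workers in MATLAB is organized around a pool. The number of CPU cores and GPUs in a single computer is limited, but MATLAB's MDCS service can link several computers together. You then create a parallel pool, assign a GPU to each worker, and use MATLAB's parallel constructs such as parfor and spmd to run the computation across multiple GPUs. For how to set up a distributed cluster, see: http://developer.nvidia-china.co ... 7994&extra=page%3D1. If you have used CPU parallelism in MATLAB, you will already know matlabpool. At present, to use multiple GPUs in MATLAB you open a pool with several workers, one worker per GPU, so that each CPU worker is paired with one GPU. For example: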
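- % Open a pool with 2 workers (newer MATLAB releases use parpool(2) instead).
- matlabpool 2
- % Show which GPU each worker currently has selected.
- spmd
-     gpuDevice
- end
- % Worker 1 switches to GPU 2; worker 2 keeps the GPU it already has.
- spmd
-     if labindex == 1
-         gpuDevice(2);
-     end
- end
- % Show the selection again to confirm that the workers now use different GPUs.
- spmd
-     gpuDevice
- end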
If the computer has two GPUs, this produces the following output:
- Lab 1:
-
- ans =
-
- CUDADevice with properties:
-
- Name: 'Quadro FX 370'
- Index: 2
- ComputeCapability: '1.1'
- SupportsDouble: 0
- DriverVersion: 5.5000
- ToolkitVersion: 5
- MaxThreadsPerBlock: 512
- MaxShmemPerBlock: 16384
- MaxThreadBlockSize: [512 512 64]
- MaxGridSize: [65535 65535 1]
- SIMDWidth: 32
- TotalMemory: 268435456
- FreeMemory: NaN
- MultiprocessorCount: 2
- ClockRateKHz: 720000
- ComputeMode: 'Default'
- GPUOverlapsTransfers: 1
- KernelExecutionTimeout: 1
- CanMapHostMemory: 1
- DeviceSupported: 0
- DeviceSelected: 1
-
- Lab 2:
-
- ans =
-
- CUDADevice with properties:
-
- Name: 'Tesla K20c'
- Index: 1
- ComputeCapability: '3.5'
- SupportsDouble: 1
- DriverVersion: 5.5000
- ToolkitVersion: 5
- MaxThreadsPerBlock: 1024
- MaxShmemPerBlock: 49152
- MaxThreadBlockSize: [1024 1024 64]
- MaxGridSize: [2.1475e+09 65535 65535]
- SIMDWidth: 32
- TotalMemory: 5.0330e+09
- FreeMemory: 4.9166e+09
- MultiprocessorCount: 13
- ClockRateKHz: 705500
- ComputeMode: 'Default'
- GPUOverlapsTransfers: 1
- KernelExecutionTimeout: 0
- CanMapHostMemory: 1
- DeviceSupported: 1
- DeviceSelected: 1
The above is reposted from: https://zhidao.baidu.com/question/580164775.html
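As a supplement, here is a minimal sketch of the same one-worker-per-GPU idea written so that it does not hard-code two GPUs. It is not from the reposted answer. It assumes the Parallel Computing Toolbox is installed, at least one supported GPU is present, and a release where parpool and labindex are available. The helper pickGpu, the matrix size and the variable names are made up for illustration.

```matlab
% Sketch: bind each worker to one GPU, then do a small computation per GPU.
nGpus = gpuDeviceCount;                  % number of GPUs MATLAB can see
assert(nGpus > 0, 'No supported GPU found.');

pool = gcp('nocreate');                  % reuse an open pool if there is one
if isempty(pool)
    pool = parpool(nGpus);               % otherwise start one worker per GPU
end

% Hypothetical helper: map a worker index to a GPU index (1-based).
% If the pool has more workers than GPUs, the indices wrap around.
pickGpu = @(workerIdx, n) mod(workerIdx - 1, n) + 1;

spmd
    dev = gpuDevice(pickGpu(labindex, nGpus));  % select this worker's GPU
    devIndex = dev.Index;                       % remember which GPU was chosen
    x = rand(1000, 'gpuArray');                 % data is created on that GPU
    partial = gather(sum(x(:)));                % sum on the GPU, copy scalar back
end

% On the client, variables assigned inside spmd are Composite objects,
% indexed by worker number.
total = 0;
for w = 1:pool.NumWorkers
    fprintf('worker %d used GPU %d\n', w, devIndex{w});
    total = total + partial{w};
end
```

The GPU is selected once at the top of the spmd block because calling gpuDevice with an index resets that device, so it is better not to repeat the call between steps that share data on the GPU.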