concurrency - CUDA: Concurrent, Unique Kernels on the Same Multiprocessor?
Is it possible, using streams, to have multiple unique kernels running on the same streaming multiprocessor on Kepler 3.5 GPUs? That is, can I run 30 kernels of size <<<1,1024>>> at the same time on a Kepler GPU with 15 SMs?
On a compute capability 3.5 device, it might be possible.
Those devices support up to 32 concurrent kernels per GPU and 2,048 threads per multiprocessor. With 64K registers per multiprocessor, two blocks of 1,024 threads can run concurrently on one multiprocessor if each thread uses no more than 32 registers (65,536 / 2,048) and each block uses no more than 24 KB of shared memory (half of the 48 KB maximum).
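To make the resource arithmetic above concrete, here is a minimal sketch. The helper `blocksThatFit` and the placeholder kernel `smallKernel` are hypothetical names introduced for illustration, and the helper ignores allocation granularity and the per-SM block-count limit, so treat it as an estimate. The runtime call `cudaOccupancyMaxActiveBlocksPerMultiprocessor` asks the driver for the real figure on whatever device is installed.

```cuda
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: rough upper bound on resident blocks per SM from the
// thread, register, and shared-memory limits. Ignores allocation granularity.
static int blocksThatFit(int threadsPerSM, int regsPerSM, size_t smemPerSM,
                         int blockThreads, int regsPerThread, size_t smemPerBlock)
{
    int byThreads = threadsPerSM / blockThreads;
    int byRegs = regsPerSM / (blockThreads * regsPerThread);
    int bySmem = smemPerBlock ? static_cast<int>(smemPerSM / smemPerBlock) : byThreads;
    return std::min(byThreads, std::min(byRegs, bySmem));
}

// Placeholder kernel (an assumption); substitute the kernel you care about.
__global__ void smallKernel(float *out)
{
    out[threadIdx.x] = 2.0f * threadIdx.x;
}

int main()
{
    // Compute capability 3.5 limits quoted in the answer: 2048 threads,
    // 64K registers, and up to 48 KB shared memory per SM.
    std::printf("estimate for cc 3.5: %d blocks per SM\n",
                blocksThatFit(2048, 65536, 48 * 1024, 1024, 32, 24 * 1024));

    cudaDeviceProp prop;
    if (cudaGetDeviceProperties(&prop, 0) != cudaSuccess) {
        std::fprintf(stderr, "no usable CUDA device\n");
        return 1;
    }
    std::printf("SMs: %d, threads/SM: %d, registers/SM: %d, concurrent kernels: %d\n",
                prop.multiProcessorCount, prop.maxThreadsPerMultiProcessor,
                prop.regsPerMultiprocessor, prop.concurrentKernels);

    // Registers and static shared memory the compiler actually assigned.
    cudaFuncAttributes attr;
    cudaFuncGetAttributes(&attr, smallKernel);
    std::printf("registers/thread: %d, static shared memory: %zu bytes\n",
                attr.numRegs, attr.sharedSizeBytes);

    // Resident 1024-thread blocks of this kernel per SM, as reported by the runtime.
    int blocksPerSM = 0;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSM, smallKernel, 1024, 0);
    std::printf("resident 1024-thread blocks per SM: %d\n", blocksPerSM);
    return 0;
}
```

Compiling with `nvcc -Xptxas -v` also prints the per-thread register count, which is the number to compare against the limit.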
You can find the full hardware description in the appendices of the CUDA Programming Guide.
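For reference, the launch pattern from the question might look like the sketch below: 30 launches of <<<1,1024>>>, each in its own stream. The kernel `work` and its buffers are placeholders I made up; in practice each launch could be a different kernel. Separate streams only allow the kernels to overlap. Whether two of them actually share a multiprocessor is decided by the hardware scheduler, and a profiler timeline is the way to confirm it.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Placeholder kernel (an assumption) standing in for the "unique" kernels.
__global__ void work(float *data, float scale)
{
    data[threadIdx.x] = scale * threadIdx.x;
}

int main()
{
    const int numKernels = 30;   // as in the question
    const int threads = 1024;    // one block of 1024 threads per kernel
    cudaStream_t streams[numKernels];
    float *buffers[numKernels];

    // One stream and one device buffer per kernel.
    for (int i = 0; i < numKernels; ++i) {
        cudaStreamCreate(&streams[i]);
        cudaMalloc(&buffers[i], threads * sizeof(float));
    }

    // Launches in different non-default streams are independent,
    // so the device is free to run them concurrently.
    for (int i = 0; i < numKernels; ++i)
        work<<<1, threads, 0, streams[i]>>>(buffers[i], static_cast<float>(i));

    // Check for launch errors first, then wait for every stream to finish.
    cudaError_t err = cudaGetLastError();
    if (err == cudaSuccess)
        err = cudaDeviceSynchronize();
    std::printf("status: %s\n", cudaGetErrorString(err));

    for (int i = 0; i < numKernels; ++i) {
        cudaFree(buffers[i]);
        cudaStreamDestroy(streams[i]);
    }
    return err == cudaSuccess ? 0 : 1;
}
```

A kernel this short finishes almost immediately, so any overlap will be hard to see; a longer-running kernel makes the concurrency visible in the timeline.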