NVIDIA NVCC and CUDA: Cubin vs. PTX?



First, sorry for my English. Here is my question: I am using CUDA 4.0 with a compute capability 2.0 device (GTX 460).

What is the difference between a cubin file and a PTX file? I think the cubin is native code for the GPU, so it is architecture-specific, while PTX is an intermediate language that runs on Fermi devices (e.g. GeForce GTX 460) via JIT compilation.

When I compile a .cu source file, I can choose between the PTX and cubin targets. If I want a cubin file I choose "code=sm_20", but if I want a PTX file I use "code=compute_20". Is that correct?

Laci

Tags: cuda, nvidia, nvcc. Asked Oct 8 '11 at 10:35 by user9737641.

.cubin is a CUDA binary, .ptx is CUDA assembler source (text) that gets passed to the ptxas assembler – Paul R Oct 8 '11 at 11:46

You have mixed up the options that select a compilation phase (-ptx and -cubin) with the options that control which devices to target (-code), so you should revisit the documentation. NVCC is the NVIDIA compiler driver. The -ptx and -cubin options select specific phases of compilation; by default, without any phase-specific options, nvcc will attempt to produce an executable from the inputs.

Most people use the -c option to make nvcc produce an object file, which is later linked into an executable by the default platform linker. The -ptx and -cubin options are only really useful if you are using the Driver API. For more information on the intermediate stages, check out the nvcc manual, which is installed with the CUDA Toolkit. The output from -ptx is a plain-text PTX file.

PTX is an intermediate assembly language for NVIDIA GPUs which has not yet been fully optimised and will later be assembled to device-specific code (different devices have different register counts, for example, so fully optimising PTX would be wrong). The output from -cubin is a device-specific binary image for a single target architecture. A fat binary (produced with -fatbin, or embedded in the object file or executable) may contain one or more device-specific binary images as well as (optionally) PTX. The -code argument you refer to has a different purpose entirely.
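To make the phase options concrete, here is a rough sketch of one nvcc invocation per phase. The file name kernel.cu and the output names are made up for illustration, and -arch=sm_20 is only assumed from the compute capability 2.0 target in the question (recent toolkits no longer accept it). The small run helper is also just an assumption of this sketch: it prints each command and only executes it when nvcc and the source file are actually present.

```shell
#!/bin/sh
# Sketch: one nvcc invocation per compilation phase.
# kernel.cu and the output names are hypothetical.
SRC=kernel.cu

# Print the command; execute it only if nvcc and the source file exist.
run() {
    echo "$*"
    if command -v nvcc >/dev/null 2>&1 && [ -f "$SRC" ]; then
        "$@"
    fi
}

# Stop after PTX generation: plain-text intermediate assembly.
run nvcc -ptx -arch=sm_20 "$SRC" -o kernel.ptx

# Stop after device assembly: device-only binary for the Driver API.
run nvcc -cubin -arch=sm_20 "$SRC" -o kernel.cubin

# Usual case: host object file with the device code embedded.
run nvcc -c -arch=sm_20 "$SRC" -o kernel.o
```

The first two outputs are what you would hand to the Driver API; the third is what you would pass on to the host linker.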

I'd encourage you to check out the nvcc documentation, which contains several examples. In general I would advise using the -gencode option instead, since it gives more control and lets you target multiple devices in one binary. As a quick example: -gencode arch=compute_xx,code=\'compute_xx,sm_yy,sm_zz\' causes nvcc to target all devices with compute capability xx (that's the arch= bit) and to embed PTX (code=compute_xx) as well as device-specific binaries for sm_yy and sm_zz into the final fat binary.
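As a hedged illustration of the -gencode pattern, the helper below builds the flags for one virtual architecture plus any number of real ones. The function name gencode_flags and the values 20, 20 and 21 are assumptions for the example, not something from the nvcc documentation. It emits one -gencode option per code= target, which as far as I know is equivalent to listing them all in a single option.

```shell
#!/bin/sh
# Build -gencode flags: PTX for one virtual architecture plus a device
# binary for each listed real architecture.
# Usage: gencode_flags <compute number> <sm number>...
# (gencode_flags is a hypothetical helper name.)
gencode_flags() {
    virt="compute_$1"
    shift
    # code=compute_xx embeds PTX, which can be JIT-compiled on newer devices.
    flags="-gencode arch=$virt,code=$virt"
    for sm in "$@"; do
        # code=sm_yy embeds a device-specific binary for that architecture.
        flags="$flags -gencode arch=$virt,code=sm_$sm"
    done
    echo "$flags"
}

gencode_flags 20 20 21
```

You would then splice the result into a normal build line, for example nvcc $(gencode_flags 20 20 21) -c kernel.cu -o kernel.o, where kernel.cu is again a placeholder name.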


