Hello everyone. I want to execute an inline assembly instruction of the following form:

    BLENDPD xmm1, xmm2/m128, imm8

I am new to inline assembly, so I am having some difficulties. My code is:

    #include <iostream>
    using namespace std;

    int main() {
        long long y;
        __asm("blendpd %0,$0xabcd000000001111abcd000000001111,$0x1" : "=r" (y) : );
        cout << y;
        return 0;
    }

Edited version, using the intrinsic:

    #include <smmintrin.h>
    using namespace std;

    int main() {
        const int mask = 5;
        __m128d v2 = _mm_set_pd(1.0, 2.0);
        __m128d v1;
        v1 = _mm_blend_pd(v1, v2, mask);
        return 0;
    }

Tags: c++, gcc, assembly, inline-assembly. Asked Jan 7 '11 at 0:08 by Syntax_Error; edited Jan 8 '11 at 20:30 by Brooks Moses.
Well, one problem is that you have an odd number of quotes there. – Falmarri Jan 7 '11 at 0:09

Pro tip: You can find SSE extensions for any major compiler. Don't bother with inline assembler. – DeadMG Jan 7 '11 at 0:10

@Falmarri: copy-paste error! Thanks :) @DeadMG: I want to insert this in a loop to check how much power it is consuming! It's not really a computation... but I want to be able to show y. – Syntax_Error Jan 7 '11 at 0:20

@Syntax_Error: Why not just use a profiler? – DeadMG Jan 7 '11 at 0:54

@Syntax_Error: When you edit a question to ask about a partial answer, please don't overwrite the original question -- remember that the point of StackOverflow is to create things that will also be useful to other people who read it in the future, and for that they need to understand the question as well as the answer. (I've edited the original code version back in; hope that's okay.) – Brooks Moses Jan 7 '11 at 19:34
As an alternate answer to my other answer, here's how to do this with inline assembly rather than an intrinsic. (As Thomas Pornin notes on my other answer, intrinsics are generally better because they're more portable, but sometimes you want something like this too.)

First, I cheated -- I took the version with an intrinsic function, compiled it with -S, and looked at the resulting assembly code, which is:

    movsd   -64(%rbp), %xmm0
    movhpd  -56(%rbp), %xmm0
    movsd   -48(%rbp), %xmm1
    movhpd  -40(%rbp), %xmm1
    blendpd $3, %xmm1, %xmm0
    movlpd  %xmm0, -64(%rbp)
    movhpd  %xmm0, -56(%rbp)

You can see here a few things different from your original code.
First, note that the two 128-bit arguments are not immediates -- they're the xmm0 and xmm1 registers. Also, you've got the operands in the wrong order. GCC uses AT&T syntax, which reverses the operand order shown in Intel's manual: the mask goes first, and the register that contains the output goes last. Fix those, and the code compiles.
The second problem here is that you're storing the result from a general register into y, and the blendpd instruction doesn't touch general registers, so that's just storing garbage. You want the xmm0 register, which you get with =Yz (See GCC's documentation here). And you can't store that into a long long, which is 64 bits; you need a 128-bit vector variable.
Solving all of those problems, the corrected code is:

    #include <iostream>
    #include <smmintrin.h>
    using namespace std;

    int main() {
        __m128d y;
        __asm("blendpd $0x3,%%xmm1,%%xmm0" : "=Yz" (y) : );
        // (output statement commented out)
        return 0;
    }

Of course, the input values are undefined (whatever is randomly in those registers), so the output is garbage anyway -- you'd need to add something to load values into the registers first if you wanted to have a meaningful result.
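One way to get meaningful inputs, sketched here under my own assumptions rather than taken from the answer above, is to let GCC pick the XMM registers through operand constraints instead of hard-coding xmm0 and xmm1. The function name `blend_low_asm` and the use of `std::array` are illustrative only; the sketch assumes x86-64, GCC or Clang, and a CPU with SSE4.1.

```cpp
#include <emmintrin.h>  // SSE2: __m128d, _mm_loadu_pd, _mm_storeu_pd
#include <array>

// Hypothetical helper: returns {b[0], a[1]} by running blendpd with mask 1.
std::array<double, 2> blend_low_asm(std::array<double, 2> a,
                                    std::array<double, 2> b) {
    __m128d dst = _mm_loadu_pd(a.data());  // lane 0 = a[0], lane 1 = a[1]
    __m128d src = _mm_loadu_pd(b.data());  // lane 0 = b[0], lane 1 = b[1]

    // AT&T operand order: immediate mask, source, destination.
    // "+x" makes dst a read-write operand in some XMM register;
    // "x" makes src a read-only operand in some XMM register.
    __asm__("blendpd $0x1, %1, %0" : "+x"(dst) : "x"(src));

    std::array<double, 2> out{};
    _mm_storeu_pd(out.data(), dst);  // unaligned store, safe for any array
    return out;
}
```

Calling `blend_low_asm({10.0, 11.0}, {20.0, 21.0})` should give lane 0 from the second argument and lane 1 from the first, because only bit 0 of the mask is set. Since the compiler now knows which variables the instruction reads and writes, no register contents are left to chance.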
Thank you! My bad about editing the question! – Syntax_Error Jan 9 '11 at 11:46

You're welcome. And no worries about the editing; everyone has to learn things sometime! :) – Brooks Moses Jan 10 '11 at 8:42
First, for this sort of thing you very rarely need to use inline assembly. GCC generally provides "compiler intrinsic" functions which allow you to call a given special instruction using C function syntax rather than assembly syntax. In this case, the intrinsic function you want is _mm_blend_pd(), and it has this function signature:

    #include <smmintrin.h>
    __m128d _mm_blend_pd(__m128d v1, __m128d v2, const int mask);

The compiler will replace that with the single blendpd instruction; this is not actually a function call.
The __m128d data type is a vector containing two double-precision float values; you can create one from two doubles like so:

    __m128d v = _mm_set_pd(1.0, 2.0);

To retrieve the values from a vector to print them, you can store the vector into an array of double-precision floats:

    double a[2];
    _mm_store_pd(a, v);

All of this is based on the Intel Intrinsics manual at info.univ-angers.fr/~richer/ens/l3info/ao/in...; although this refers to the Intel C++ compiler, GCC supports the same syntax.

Edit: Replaced erroneous emmintrin.h with correct smmintrin.h.
Also, note that the mask value needs to be 2-bit (one bit per value in the vector); values other than 0, 1, 2, or 3 produce an error. And of course you need to compile this with the -msse4 GCC option.
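To make the mask rule concrete, here is a minimal sketch that tries each of the four valid masks. The helper name `blend` and its array-based interface are my own illustration, not part of the intrinsics API. The `target("sse4.1")` attribute is a GCC/Clang-specific assumption that lets this one function use SSE4.1 without the command-line flag; compiling with -msse4 as described above works just as well.

```cpp
#include <smmintrin.h>  // SSE4.1: _mm_blend_pd
#include <array>
#include <stdexcept>

// Hypothetical helper. The intrinsic needs a compile-time constant mask,
// so the runtime value is dispatched through a switch.
__attribute__((target("sse4.1")))
std::array<double, 2> blend(std::array<double, 2> a,
                            std::array<double, 2> b, int mask) {
    // _mm_set_pd takes the HIGH lane first, so a[0] lands in lane 0.
    __m128d va = _mm_set_pd(a[1], a[0]);
    __m128d vb = _mm_set_pd(b[1], b[0]);
    __m128d r;
    switch (mask) {
        case 0: r = _mm_blend_pd(va, vb, 0); break;  // both lanes from a
        case 1: r = _mm_blend_pd(va, vb, 1); break;  // lane 0 from b
        case 2: r = _mm_blend_pd(va, vb, 2); break;  // lane 1 from b
        case 3: r = _mm_blend_pd(va, vb, 3); break;  // both lanes from b
        default: throw std::invalid_argument("blendpd mask must be 0..3");
    }
    std::array<double, 2> out{};
    _mm_storeu_pd(out.data(), r);  // unaligned store, no alignment needed
    return out;
}
```

Each mask bit selects one lane from the second vector: bit 0 controls the low double and bit 1 the high double. A mask such as 5 has a bit set outside that range, which is why the helper rejects it.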
It is worth noting that the intrinsics header actually comes from the Intel compiler and will work with GCC, the Intel C compiler (ICC), and Microsoft Visual C. This is much more portable than inline assembly: the exact same code will work with GCC on Linux and Visual C on Windows. Also, it works better with the GCC optimizer, because GCC understands what the intrinsics are about and can allocate XMM registers accordingly, whereas inline assembly is an opaque dump-to-assembly-output thing with GCC. – Thomas Pornin Jan 7 '11 at 14:32

I did as you requested and got an undefined reference to the function! Please refer to the edited question :) Thanks. – Syntax_Error Jan 7 '11 at 22:55

Corrected; please see notes in the "Edit" I added. – Brooks Moses Jan 8 '11 at 19:22