Suppose that:
1) you create a new C++ project (Empty C++ project) in Visual Studio 2008 Express,
2) you have a Core 2 Duo 6400 @ 2.13 GHz like mine, or something similar,
3) you write, compile and execute this code:
Code:
#include <math.h>

int main(void)
{
    sqrt(71634.4174f);
    return 0;
}
Will the call sqrt(71634.4174f); be mapped to one (or a set of) math instruction(s) built into the processor, or will the program use a software implementation of the square root found in the library that math.h points to? And if it is the latter, are there ways to make sure the program uses the math instructions available in the CPU, without diving into writing assembly code?
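One way to request the hardware instruction without writing assembly is a compiler intrinsic. The sketch below assumes an x86 target with SSE and a compiler that ships <xmmintrin.h>; the function name sqrt_sse is made up for illustration. As far as I know, MSVC also has the /Oi option (expand some math functions inline) and /arch:SSE2 (allow SSE2 code generation), but what they emit for a given call is best checked in the disassembly. Note too that the result of sqrt in the program above is never used, so an optimized build may remove the call entirely.

```cpp
#include <xmmintrin.h>  // SSE intrinsics: _mm_set_ss, _mm_sqrt_ss, _mm_store_ss

// Hypothetical helper: single-precision square root via the SSE SQRTSS instruction.
inline float sqrt_sse(float x)
{
    __m128 v = _mm_set_ss(x);   // put x in the lowest lane, zero the other three
    v = _mm_sqrt_ss(v);         // square root of the lowest lane only
    float result;
    _mm_store_ss(&result, v);   // copy the lowest lane back to a plain float
    return result;
}
```

Calling sqrt_sse(71634.4174f) and storing or printing the result gives the compiler a reason to keep the computation.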
For reference, I found this while looking into the Core 2 Duo:
Sqrt is an intrinsic on the SSE-platform, and thus reduces to a single instruction. The implementation for ppu/spu is based on the inverse-square-root estimate intrinsic of these platforms. This instruction is combined with one iteration of the Newton-Raphson algorithm to provide the final result.
I note in passing (and with admiration) that their implementation manages to get a sufficiently accurate result with a single iteration of Newton's algorithm. Their initial approximation must be devilishly good. It reminds me of 0x5f3759df.
I ask because I timed 10 million calls to sqrt() against 10 million square roots computed with that very Newton algorithm, and I was surprised to find that the hand-written code took 2/3 to 3/4 less time.
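For anyone who wants to repeat this kind of measurement, here is a rough timing harness, only a sketch and not the code I used. The names time_calls and library_sqrt are made up. The input is read through a volatile and every result is added to a checksum, so the optimizer can neither fold the call into a constant nor discard it, either of which would make the timing meaningless.

```cpp
#include <cmath>   // std::sqrt
#include <ctime>   // std::clock, CLOCKS_PER_SEC

// Hypothetical harness: calls f(input) n times and returns elapsed CPU seconds.
// The sum of all results is written to *checksum so the work cannot be skipped.
template <typename F>
double time_calls(F f, int n, float* checksum)
{
    volatile float input = 71634.4174f;  // volatile read blocks constant folding
    float sum = 0.0f;
    std::clock_t start = std::clock();
    for (int i = 0; i < n; ++i)
        sum += f(input);
    std::clock_t stop = std::clock();
    *checksum = sum;
    return double(stop - start) / CLOCKS_PER_SEC;
}

// Wrapper with a fixed signature, since std::sqrt is overloaded.
inline float library_sqrt(float x) { return std::sqrt(x); }
```

A call such as time_calls(library_sqrt, 10000000, &sum) can then be compared with the same call on any hand-written routine taking and returning a float. Comparing the two checksums also shows whether the faster routine is paying for its speed with accuracy.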
So, what will the VC++ compiler do with code like that?