[PlanetCCRMA] Pentium-4 and denormal numbers on planetccrma

Fernando Lopez-Lezcano nando@ccrma.Stanford.EDU
Fri Jan 28 14:52:03 2005


On Wed, 2005-01-12 at 03:40, Steve Harris wrote:
> On Mon, Jan 10, 2005 at 11:07:55 +0100, andersvi@extern.uio.no wrote:
> NB, the SSE instruction set has a more efficient denormal handler
> (about 40x slower than processing a normal number), but it still doesn't
> zero them. If you also call this function I hacked up when the program
> starts:

Hi Steve, I'm trying to use this to see if I can get freeverb back from
denormal hell, but I can't compile it as position-independent code: in
that mode the compiler uses the ebx register itself, which clashes with
the cpuid asm. (I did some searches but could not find a solution that
both compiled and did not segfault :-)
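
For reference, one workaround often suggested for the ebx clash is to park
ebx in another register around cpuid. What follows is only a minimal sketch
under assumptions, not something tested with freeverb: it assumes GCC-style
inline asm, the helper name cpuid_pic_safe is made up for illustration, and
the special case is only taken for 32-bit PIC builds (on x86-64 the plain
"=b" constraint is used).

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): run cpuid for one
   leaf and store eax, ebx, ecx, edx in regs[0..3].
   Returns 1 on x86 builds, 0 (with regs zeroed) elsewhere. */
static int cpuid_pic_safe(uint32_t leaf, uint32_t regs[4])
{
#if defined(__i386__) && defined(__PIC__)
    /* In 32-bit PIC code ebx holds the GOT pointer, so save it in edi,
       run cpuid, then swap: ebx gets its old value back and edi ends
       up holding the ebx result of cpuid. */
    __asm__ __volatile__(
        "movl %%ebx, %%edi\n\t"
        "cpuid\n\t"
        "xchgl %%ebx, %%edi"
        : "=a"(regs[0]), "=D"(regs[1]), "=c"(regs[2]), "=d"(regs[3])
        : "a"(leaf), "c"(0));
    return 1;
#elif defined(__i386__) || defined(__x86_64__)
    /* No register clash here, so ebx can be named directly. */
    __asm__ __volatile__(
        "cpuid"
        : "=a"(regs[0]), "=b"(regs[1]), "=c"(regs[2]), "=d"(regs[3])
        : "a"(leaf), "c"(0));
    return 1;
#else
    /* Not an x86 build: cpuid does not exist. */
    regs[0] = regs[1] = regs[2] = regs[3] = 0;
    (void)leaf;
    return 0;
#endif
}
```

With a helper like this, the feature test in the quoted function would read
regs[3] after cpuid_pic_safe(1, regs) instead of d. GCC's own <cpuid.h>
header provides __get_cpuid(), which may be a simpler option where it is
available.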

-- Fernando

> #include <stdio.h>   /* fprintf */
> #include <stdlib.h>  /* exit */
> 
> #ifdef __SSE__
> #include <xmmintrin.h>
> #endif
> 
> void set_denormal_flags()
> {
>     unsigned long a, b, c, d;
> 
> #ifdef __SSE__
> 
>     asm("cpuid": "=a" (a), "=b" (b), "=c" (c), "=d" (d) : "a" (1));
>     if (d & 1<<25) { /* It has SSE support */
>         _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
> 
>         asm("cpuid": "=a" (a), "=b" (b), "=c" (c), "=d" (d) : "a" (0));
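>         if (b == 0x756e6547) { /* "Genu": it's an Intel */
>             int stepping, model, family, extfamily;
> 
>             /* Leaf 0 overwrote a, so re-query leaf 1 to get the
>                family/model/stepping signature in eax. */
>             asm("cpuid": "=a" (a), "=b" (b), "=c" (c), "=d" (d) : "a" (1));
>             family = (a >> 8) & 0xf;
>             extfamily = (a >> 20) & 0xff;
>             model = (a >> 4) & 0xf;
>             stepping = a & 0xf;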
>             if (family == 15 && extfamily == 0 && model == 0 && stepping < 7) {
>                 return;
>             }
>         }
>         asm("cpuid": "=a" (a), "=b" (b), "=c" (c), "=d" (d) : "a" (1));
>         if (d & 1<<26) { /* bit 26, SSE2 support */
>             _mm_setcsr(_mm_getcsr() | 0x40); /* MXCSR bit 6: denormals-are-zero (DAZ) */
>         }
>     } else {
>         fprintf(stderr, "This code has been built with SSE support, but your processor does not support\nthe SSE instruction set.\nexiting\n");
>         exit(1);
>     }
> #endif
> }
> 
> The FPU will zero any denormals when it encounters them (in SSE
> operations; x87 code is unaffected by these flags).
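
To see what the flush-to-zero flag actually changes, here is a small sketch
(not from Steve's post). It assumes an x86 build where float math runs in
SSE registers, which GCC and Clang signal with the __SSE_MATH__ macro; the
helper names is_denormal_bits and halved_is_denormal are made up for the
illustration. On other builds the second helper just reports -1.

```c
#include <float.h>
#include <stdint.h>
#include <string.h>
#ifdef __SSE_MATH__
#include <xmmintrin.h>
#endif

/* Look at the raw bits (zero exponent, nonzero mantissa) so the check
   itself does not depend on the FTZ/DAZ settings. */
static int is_denormal_bits(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return (u & 0x7f800000u) == 0 && (u & 0x007fffffu) != 0;
}

/* Halve FLT_MIN, the smallest normal float; the exact result is a
   denormal. Returns 1 if a denormal was produced, 0 if the result was
   flushed to zero, -1 if float math is not done in SSE on this build. */
static int halved_is_denormal(int flush)
{
#ifdef __SSE_MATH__
    unsigned int saved = _mm_getcsr();  /* remember the caller's MXCSR */
    volatile float tiny = FLT_MIN;      /* volatile: no constant folding */
    volatile float r;                   /* volatile: keep the multiply
                                           between the two MXCSR writes */

    _MM_SET_FLUSH_ZERO_MODE(flush ? _MM_FLUSH_ZERO_ON : _MM_FLUSH_ZERO_OFF);
    r = tiny * 0.5f;
    _mm_setcsr(saved);                  /* restore the previous flags */
    return is_denormal_bits(r);
#else
    (void)flush;
    return -1;
#endif
}
```

On an SSE-math build, halved_is_denormal(0) should report a denormal and
halved_is_denormal(1) should report that it was flushed. Note this only
exercises the flush-to-zero bit on results; the DAZ bit set in the quoted
function additionally treats denormal inputs as zero.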