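Parallel Vector Processor
Basic Characteristics
The defining feature of a parallel vector processor is that the CPUs in the system are custom-built vector processors (VPs). The system also provides shared memory and a high-speed crossbar switch connected to the VPs.
A real-world example: using vector instructions on the x86 architecture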
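// Adapted from the English Wikipedia article /wiki/Vector_processor
// SSE SIMD function for element-wise multiplication of two arrays of
// single-precision floating-point numbers.
// 1st param: pointer to the source/destination array
// 2nd param: pointer to the second source array
// 3rd param: number of floats per array
// Note: pointers are held in 32-bit registers, so this is for 32-bit x86 only
// (GCC inline assembly, e.g. compiled with -m32 -msse).
void mul_asm(float *out, float *in, unsigned int leng)
{
    unsigned int count, rest;

    // split the length in bytes into a multiple of 16 and a remainder
    rest  = (leng * 4) % 16;
    count = (leng * 4) - rest;

    // vectorized part; 4 floats per loop iteration
    if (count > 0) {
        unsigned int offset = count;  // scratch copy: the loop counts ecx down to 0
        __asm__ __volatile__(
            ".intel_syntax noprefix  \n\t"
            "vec_loop%=:             \n\t"
            "sub ecx, 16             \n\t"  // decrease the byte offset by 4 floats
            "movups xmm0, [ebx+ecx]  \n\t"  // load 4 floats into the first register (xmm0)
            "movups xmm1, [eax+ecx]  \n\t"  // load 4 floats into the second register (xmm1)
            "mulps xmm0, xmm1        \n\t"  // multiply both vector registers
            "movups [eax+ecx], xmm0  \n\t"  // write the result back to memory
            "jnz vec_loop%=          \n\t"  // flags still come from the sub above
            ".att_syntax prefix      \n\t"
            : "+c"(offset)
            : "a"(out), "b"(in)
            : "xmm0", "xmm1", "cc", "memory");
    }

    // scalar part; 1 float per loop iteration
    if (rest != 0) {
        // scratch copies: the assembly modifies eax, ebx and edx
        float *out_tail = out, *in_tail = in;
        unsigned int left = rest;
        __asm__ __volatile__(
            ".intel_syntax noprefix  \n\t"
            "add eax, ecx            \n\t"  // skip the part already handled above
            "add ebx, ecx            \n\t"
            "tail_loop%=:            \n\t"
            "sub edx, 4              \n\t"  // decrease the byte offset by 1 float
            "movss xmm0, [ebx+edx]   \n\t"  // load 1 float into the first register (xmm0)
            "movss xmm1, [eax+edx]   \n\t"  // load 1 float into the second register (xmm1)
            "mulss xmm0, xmm1        \n\t"  // multiply the scalar parts of both registers
            "movss [eax+edx], xmm0   \n\t"  // write the result back to memory
            "jnz tail_loop%=         \n\t"
            ".att_syntax prefix      \n\t"
            : "+a"(out_tail), "+b"(in_tail), "+d"(left)
            : "c"(count)
            : "xmm0", "xmm1", "cc", "memory");
    }
}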
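The same element-wise multiplication can also be written with the SSE compiler intrinsics from <xmmintrin.h> instead of inline assembly, which leaves register allocation to the compiler and also builds for 64-bit targets. The following is only a minimal sketch under stated assumptions: the function name mul_sse is hypothetical and does not come from the original article, and the preprocessor guard simply falls back to the plain scalar loop when the compiler does not report SSE support.

```c
#include <stddef.h>
#if defined(__SSE__)
#include <xmmintrin.h>  /* SSE intrinsics: _mm_loadu_ps, _mm_mul_ps, _mm_storeu_ps */
#endif

/* Hypothetical helper (not from the original article): out[i] = out[i] * in[i]. */
void mul_sse(float *out, const float *in, size_t len)
{
    size_t i = 0;

#if defined(__SSE__)
    /* Vectorized part: 4 floats per iteration, using unaligned loads and stores
       (the intrinsic counterparts of movups and mulps). */
    for (; i + 4 <= len; i += 4) {
        __m128 a = _mm_loadu_ps(out + i);
        __m128 b = _mm_loadu_ps(in + i);
        _mm_storeu_ps(out + i, _mm_mul_ps(a, b));
    }
#endif

    /* Scalar part: the remaining 0 to 3 floats
       (or the whole array when SSE is not available). */
    for (; i < len; ++i) {
        out[i] *= in[i];
    }
}
```

Called as mul_sse(a, b, 7) on two arrays of seven floats, the first four elements go through the 4-wide loop and the last three through the scalar loop, mirroring the vector/scalar split of the assembly version.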
See also
Parallel computing