
Lies, damned lies, and benchmarks (R13A SMP performance test) - erlang非业余研究

Updated: 2015-05-04 00:00:00  Editor: Author_N14

 

Original article: http://www.erlangatwork.com/2009/03/lies-damned-lies-and-benchmarks.html

Erlang/OTP R13A was released today with a number of major SMP improvements. I've been playing with R13 snapshots for a while and wrote a simple HTTP server to compare the SMP performance on R12 and R13. This server uses {packet, http} to decode requests, increments a counter with a transactional mnesia:read/3 and mnesia:write/1, and responds with the counter's previous value. You'll find the source here.
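The counter step described above can be sketched roughly as follows. This is not the original ehttpd source: the module name counter_sketch, the counter record, and the hits key are all made up for illustration, and the table is a RAM-only table on the local node.

```erlang
-module(counter_sketch).
-export([init/0, increment/0]).

%% Hypothetical record; the table takes its name from the record.
-record(counter, {key, value}).

%% Start mnesia with a RAM-only schema and create the table.
%% Calling this twice in one node fails because the table already exists.
init() ->
    ok = mnesia:start(),
    {atomic, ok} = mnesia:create_table(counter,
        [{attributes, record_info(fields, counter)}]),
    ok.

%% Read the counter under a write lock, store the incremented value,
%% and return the previous value (0 if the counter did not exist yet).
increment() ->
    F = fun() ->
            Prev = case mnesia:read(counter, hits, write) of
                       [#counter{value = V}] -> V;
                       [] -> 0
                   end,
            ok = mnesia:write(#counter{key = hits, value = Prev + 1}),
            Prev
        end,
    {atomic, Result} = mnesia:transaction(F),
    Result.
```

After compiling the module, counter_sketch:init() followed by repeated calls to counter_sketch:increment() should return the value the counter held before each call. Taking the write lock in mnesia:read/3 avoids a lock upgrade between the read and the write inside the transaction.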

I ran the HTTP server on an x86_64 CentOS 5 machine running Linux 2.6.18-53.el5. The server has two quad-core Intel Xeon E5450 CPUs and 8GB of RAM. Erlang/OTP R12B-5 and R13A were compiled from source and run as erl -pa ebin +SN -s ehttpd start, where N is the number of schedulers to run.

To get performance numbers I ran ab on another server connected via a 100 Mb/s private VLAN as ab -c N -n 100000 http://10.0.0.32:8889/ where N was the number of concurrent requests. ab was run 3 times for each value of N and the following chart shows the average requests/sec with 4 and 8 schedulers.


[img]schedulers.png [/img]

R13A's SMP improvements include multiple run queues and improved locking. It also supports binding schedulers to specific CPU cores and hardware threads. Binding isn't enabled by default, so the following chart shows the result of setting erlang:system_flag(scheduler_bind_type, thread_no_node_processor_spread) and running with 100 concurrent requests.
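For reference, here is a minimal sketch of setting and inspecting the bind type from the Erlang shell. It assumes an SMP-enabled runtime on a platform where the emulator can detect the CPU topology; otherwise the first call fails with notsup or badarg. Later OTP releases deprecate this system flag in favour of the +sbt command-line option (for example +sbt tnnps), so treat this as specific to the R13 era.

```erlang
%% Returns the previously active bind type.
Old = erlang:system_flag(scheduler_bind_type, thread_no_node_processor_spread),

%% Confirm the setting now in effect.
erlang:system_info(scheduler_bind_type),

%% Tuple with one element per scheduler: the logical processor
%% it is bound to, or unbound.
erlang:system_info(scheduler_bindings).
```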

[img]requests_sec.png [/img]

A lot is missing from these benchmarks: I didn't test kernel polling and only generated load from one client machine. The drop between 500 and 1000 concurrent requests on R13A +S8 looks too steep and may be the result of using ab. That said, the SMP optimizations in R13 are looking very promising!

Based on the experiment I ran at ECUG, on an 8-core CPU:
[spawn(ring, run,[["100", "10000000000"]]) || _X <- lists:seq(1,1000)].

R12B5:
CPU  User%  Sys%  Wait%  Idle%
  1   21.3  62.4    0.0   16.3
  2   20.9  61.7    0.0   17.4
  3   19.9  63.2    0.0   16.9
  4   18.9  64.2    0.0   16.9
  5   19.9  62.7    0.0   17.4
  6   20.9  63.2    0.0   15.9
  7   19.4  62.7    0.0   17.9
  8   19.4  63.7    0.0   16.9

R13A:

CPU  User%  Sys%  Wait%  Idle%
  1   61.2  31.8    0.0    7.0
  2   64.7  29.9    0.0    5.5
  3   62.7  29.9    0.0    7.5
  4   61.0  32.5    0.0    6.5
  5   62.5  30.5    0.0    7.0
  6   64.2  29.4    0.0    6.5
  7   63.7  29.9    0.0    6.5
  8   65.7  27.9    0.0    6.5
Avg   63.2  30.2    0.0    6.5


The sys time is spent mainly in futex calls, so the reliance on locks has been greatly reduced in R13A!

Conclusion: it is nearly 2x faster. A really good result, yeah!