[Spce-user] Help - many packet loss and packets to unknown port

Walter Klomp walter at myrepublic.com.sg
Wed Jun 22 23:01:14 EDT 2016


Hi,

I am running SPCE 3.8.5 on a dedicated ESXi host (Dell R320 with Xeon E2460 & 16GB RAM) with ~30,000 registered subscribers, all of them online.

Last week we had terrible statistics and heavy packet loss. After tweaking the network settings as shown below, I managed to reduce the packet loss, but some still remains.

sysctl -w net.core.rmem_max=33554432
sysctl -w net.core.wmem_max=33554432
sysctl -w net.core.rmem_default=65536
sysctl -w net.core.wmem_default=65536
sysctl -w net.ipv4.tcp_mem='8388608 8388608 8388608'
sysctl -w net.ipv4.udp_mem='4096 174760 33554432'
sysctl -w net.ipv4.tcp_rmem='4096 87380 8388608'
sysctl -w net.ipv4.tcp_wmem='4096 65536 8388608'
sysctl -w net.ipv4.route.flush=1
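
A note on units, offered as a hedged aside rather than a diagnosis: according to the kernel's ip-sysctl documentation, net.ipv4.udp_mem and net.ipv4.tcp_mem are counted in memory pages, while the rmem/wmem values are in bytes. The sketch below converts a page count to MiB. The helper name pages_to_mib is made up for illustration, and the 4096-byte page size is an assumption (check with `getconf PAGESIZE`).

```shell
#!/bin/sh
# Convert a page count (the unit used by net.ipv4.udp_mem / tcp_mem) to MiB.
# pages_to_mib is a hypothetical helper name, not part of any tool.
# Usage: pages_to_mib PAGES [PAGE_SIZE_BYTES]
pages_to_mib() {
    pages=$1
    # Assumption: 4096-byte pages unless told otherwise (see: getconf PAGESIZE).
    page_size=${2:-4096}
    echo $(( pages * page_size / 1024 / 1024 ))
}

# Third (max) udp_mem value from the settings above:
pages_to_mib 33554432 4096    # prints 131072 (MiB, i.e. 128 GiB)
```

If that reading is right, the udp_mem ceiling is far above the RAM in the machine, so the per-socket limits (rmem_default, rmem_max and whatever each application requests with SO_RCVBUF) are more likely to be the values that matter.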

I am still seeing around 300 packets per second going to unknown ports; the statistics are below. That means about one fifth of all packets received are not being processed, which seems like a lot to me.

 10:43:40 up 2 days,  5:11,  3 users,  load average: 1.52, 2.05, 2.17

Every 1.0s: netstat -anus|grep -A 7 Udp:          Thu Jun 23 10:40:45 2016

Udp:
    310870895 packets received
    61212884 packets to unknown port received.
    103338 packet receive errors
    312245302 packets sent
    RcvbufErrors: 103249
    SndbufErrors: 765
    InCsumErrors: 75
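
The "one fifth" figure can be checked from the counters above. The following is a minimal sketch, assuming a Linux host where the same counters are exposed in /proc/net/snmp as a header line plus a value line, both starting with "Udp:". The helper names udp_counter and percent are hypothetical.

```shell
#!/bin/sh
# Print one UDP counter from text in /proc/net/snmp format, read from stdin.
# udp_counter and percent are hypothetical helper names.
# Usage: udp_counter NoPorts < /proc/net/snmp
udp_counter() {
    awk -v name="$1" '
        # First "Udp:" line is the header: remember each column by name.
        $1 == "Udp:" && !seen { for (i = 2; i <= NF; i++) col[$i] = i; seen = 1; next }
        # Second "Udp:" line holds the values.
        $1 == "Udp:"          { if (name in col) print $(col[name]); exit }
    '
}

# Print PART as a percentage of TOTAL, with one decimal place.
percent() {
    awk -v part="$1" -v total="$2" 'BEGIN {
        if (total == 0) exit 1
        printf "%.1f\n", 100 * part / total
    }'
}

# "packets to unknown port" relative to "packets received", as quoted above:
percent 61212884 310870895    # prints 19.7
```

One caveat: if "packets received" only counts datagrams that were actually delivered to a socket (as the SNMP definition of InDatagrams suggests), the share relative to everything that arrived would be somewhat lower than this.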



I had to do a lot of buffer tweaking to get the RcvbufErrors, and even the SndbufErrors, down. The errors come in bursts (sporadically every 10 minutes, but definitely every half hour); every time one happens, one gets silence and the packet receive errors shoot up by between 200 and 800 packets.

The load average can shoot up to 4.x at times. Given that Sipwise Pro runs on the same hardware and supports up to 50,000 users, what am I missing?

rtpengine is running in kernel mode. The major contributor to CPU usage is actually MySQL, which regularly maxes out at 100%, especially when it is doing the fraud check. Below is a snapshot of top.
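
To pin down when the bursts happen, the growth of RcvbufErrors can be logged with a timestamp. This is only a sketch under stated assumptions: it expects a Linux /proc/net/snmp, and the names SNMP_FILE, rcvbuf_errors and watch_bursts are invented for this example.

```shell
#!/bin/sh
# Log how much the UDP RcvbufErrors counter grows per interval.
# Assumptions: /proc/net/snmp is readable; all names here are hypothetical.
SNMP_FILE=${SNMP_FILE:-/proc/net/snmp}

# Print the current RcvbufErrors value from $SNMP_FILE.
rcvbuf_errors() {
    awk '
        # Header line: find the column named RcvbufErrors.
        $1 == "Udp:" && !seen { for (i = 2; i <= NF; i++) if ($i == "RcvbufErrors") c = i; seen = 1; next }
        # Value line: print that column.
        $1 == "Udp:" && c     { print $c; exit }
    ' "$SNMP_FILE"
}

# Sample every INTERVAL seconds (default 1) and print a line on each increase.
watch_bursts() {
    interval=${1:-1}
    prev=$(rcvbuf_errors)
    while sleep "$interval"; do
        cur=$(rcvbuf_errors)
        delta=$(( cur - prev ))
        if [ "$delta" -gt 0 ]; then
            echo "$(date '+%F %T') RcvbufErrors +$delta"
        fi
        prev=$cur
    done
}

# watch_bursts 1    # runs until interrupted with Ctrl-C
```

The timestamps can then be compared with the times of periodic jobs on the box, for example to see whether a burst coincides with a scheduled task.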

top - 10:56:53 up 2 days,  5:24,  3 users,  load average: 2.39, 2.14, 1.94
Tasks: 184 total,   1 running, 183 sleeping,   0 stopped,   0 zombie
%Cpu(s): 25.3 us,  7.0 sy,  0.0 ni, 63.7 id,  1.0 wa,  0.0 hi,  2.9 si,  0.0 st
KiB Mem:  12334464 total, 12157676 used,   176788 free,   144944 buffers
KiB Swap:  2096124 total,        0 used,  2096124 free,  4430336 cached

  PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
 4063 mysql     20   0 6127m 5.6g 7084 S  54.7 47.7 809:35.18 mysqld
 2576 root      20   0  253m 7176 1816 S   9.9  0.1 164:02.97 rsyslogd
 5058 root      20   0 67176  11m 5308 S   6.0  0.1   7:05.16 rate-o-mat
15432 root      20   0  276m  12m 3696 S   5.0  0.1 117:56.92 rtpengine
 5257 sems      20   0  873m  37m 7624 S   4.0  0.3 139:44.03 ngcp-sems
30996 kamailio  20   0  539m 100m  53m S   4.0  0.8   6:02.68 kamailio

Does anybody have any pointers on how to eliminate the packet loss completely? And where do these packets to unknown ports go?

Thanks
Walter.


