[Spce-user] Help - many packet loss and packets to unknown port

Skyler skchopperguy at gmail.com
Thu Jun 23 22:16:12 EDT 2016


Hi,

Try looking up udp.analysis.retransmission and tcp.analysis.retransmission
in Wireshark. Maybe those filters will isolate the traffic.
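
For the "which port is unattended" question, one rough approach is to tally
the destination ports seen in a short capture and then drop every port that
has a listener. The sketch below assumes tcpdump's default one-line text
output and a plain list of listening ports taken from netstat. The function
names dst_ports and unattended_ports and the MY_IP placeholder are made up
for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and file layout are illustrative only.

# Read tcpdump one-line text output on stdin and print the destination
# port of each packet (the last dot-separated part of the field after ">").
dst_ports() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == ">") {
                dst = $(i + 1)
                sub(/:$/, "", dst)            # strip the trailing colon
                n = split(dst, part, ".")
                # 5 parts: IPv4 address plus port; 2 parts: IPv6 plus port.
                if ((n == 5 || n == 2) && part[n] ~ /^[0-9]+$/) print part[n]
                break
            }
        }
    }'
}

# $1: file with one observed destination port per line
# $2: file with one listening port per line
# Prints "port count" for ports without a listener, busiest first.
unattended_ports() {
    sort -n "$1" | uniq -c | sort -rn |
        awk -v lf="$2" '
            BEGIN { while ((getline p < lf) > 0) known[p] = 1 }
            !($2 in known) { print $2, $1 }
        '
}

# Possible usage (needs root; MY_IP is a placeholder for the box's address):
#   tcpdump -nn -i any -c 20000 "udp and dst host $MY_IP" | dst_ports > seen.txt
#   netstat -lnu | awk '/^udp/ { n = split($4, a, ":"); print a[n] }' > listen.txt
#   unattended_ports seen.txt listen.txt | head
```

Ports at the top of that list with no listener would be the first candidates
to look at more closely.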

- Skyler

On Jun 23, 2016 7:06 PM, "Walter Klomp" <walter at myrepublic.com.sg> wrote:
>
> Hi,
>
> On the suggestion to let kamailio-lb listen on the port, that's where I am
> stuck. How do I find the unknown port? There is so much traffic going on
> that I do not know which port is unattended. Is there a simple command for
> that?
>
> Yours sincerely,
>
> Walter Klomp
>
>
> On 24 Jun 2016, at 05:27, Skyler <skchopperguy at gmail.com> wrote:
>
>> Hi,
>>
>> On Jun 23, 2016 3:18 AM, "Walter Klomp" <walter at myrepublic.com.sg> wrote:
>> >
>> > Hi,
>> >
>> > MySQL is not locking up, other than due to the anti-fraud script which
>> > runs every half an hour.
>> >
>>
>> Oh, you mentioned mysql pinning the CPU, so I assumed we may have had the
>> same problem.
>>
>> > I think I can also rule out DDoS because it’s a steady 300-350 packets
>> > per second that go to unknown ports.
>> >
>>
>> Wow, so one device is doing that, you figure? How do you know it's that
>> many pps if the port is unknown?
>>
>> If they are UDP, I'd set the kamailio lb to listen on that unknown port
>> and look in the logs to see what shows up.
>>
>> > What I have not figured out yet is how the heck I find out which
>> > packets are the actual culprits… Even doing a tcpdump on UDP packets
>> > only, and excluding the hosts I know are legit and the ports I know are
>> > legit, still gives me a heck of a lot of traffic, probably actual
>> > payload traffic of ongoing voice calls (around 250 concurrent)…
>> >
>> > Now the packets to unknown port could also be some equipment sending
>> > some garbage (Grandstream ATAs like to do this) to keep the NAT port
>> > open, and it may not actually be a problem, but I still can’t seem to
>> > figure out what causes the RcvbufErrors which periodically happen, and
>> > when I listen to, for instance, the conference bridge music, it breaks
>> > up for a while…
>> >
>>
>> I've never heard of Grandstream devices sending that kind of pps before.
>> Unless it's like 3000 of them all misconfigured and pointing at you. All
>> UAs do NAT ping on the port configured on the device, so 5060 usually.
>> Can't see devices being the problem here. The pps is too high.
>>
>> > How do I find out, when the RcvbufErrors occur, which application is
>> > causing them?
>>
>> First find out where the packets are coming from and why. Then you'll
>> know if they can be dropped or what app to look at.
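
On the "which application" part, one thing that may help is the per-socket
drop counter that reasonably recent kernels expose as the last column of
/proc/net/udp. The sketch below assumes that layout (local address as hex
ADDR:PORT in the second field, inode in the tenth, drops last); the function
name udp_socket_drops is made up for illustration.

```shell
#!/usr/bin/env bash
# Print "port inode drops" for every UDP socket whose drop counter is
# non-zero. Assumes the /proc/net/udp layout described above.
udp_socket_drops() {
    local file="${1:-/proc/net/udp}"
    local -a f
    local drops port
    # tail skips the header line; read splits each row on whitespace.
    tail -n +2 "$file" | while read -r -a f; do
        [ "${#f[@]}" -ge 13 ] || continue
        drops="${f[${#f[@]}-1]}"
        [ "$drops" -gt 0 ] 2>/dev/null || continue
        port=$(( 16#${f[1]##*:} ))      # hex port to decimal
        echo "$port ${f[9]} $drops"
    done
}
```

Running udp_socket_drops with no argument reads the live table. A port that
keeps accumulating drops can then be matched to its owner with something like
netstat -anup | grep ':PORT ' (IPv6 sockets live in /proc/net/udp6 instead).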
>>
>> > Thanks for any suggestions.
>> > Walter
>> >
>> >> On 23 Jun 2016, at 4:25 PM, Skyler <skchopperguy at gmail.com> wrote:
>> >>
>> >> Dang these thumbs..now to the list.
>> >>
>> >> On Jun 23, 2016 2:06 AM, "Skyler" <skchopperguy at gmail.com> wrote:
>> >>>
>> >>> Sorry, in the list now.
>> >>>
>> >>> I had a similar issue last month. Basically mysql locking up the
>> >>> box. I think there's an update for hackers out there. Kamailio is
>> >>> tough...but mysql can be broken..
>> >>>
>> >>> It was resolved by exiting/dropping on common hacker UAs, which were
>> >>> retrieved from logs, and the IPs. Eventually they gave up and moved
>> >>> along.
>> >>>
>> >>> DDoS-type attack.
>> >>>
>> >>> -Skyler
>> >>>
>> >>> On Jun 23, 2016 1:59 AM, "Skyler" <skchopperguy at gmail.com> wrote:
>> >>>>
>> >>>> Looks like a flood to me. Yer spec is 2 days here, are you seeing
>> >>>> anything in lb or proxy log when tailing?
>> >>>>
>> >>>> - Skyler
>> >>>>
>> >>>> On Jun 22, 2016 9:01 PM, "Walter Klomp" <walter at myrepublic.com.sg> wrote:
>> >>>>>
>> >>>>> Hi,
>> >>>>>
>> >>>>> Running SPCE 3.8.5 on a dedicated ESXi host (Dell R320 with Xeon
>> >>>>> E2460 & 16GB RAM) with ~30.000 registered subscribers (and online).
>> >>>>>
>> >>>>> Last week we were having horrible statistics and packet loss
>> >>>>> galore… After tweaking the network settings with the below, I have
>> >>>>> managed to minimize the packet loss, but there is still some.
>> >>>>>
>> >>>>> sysctl -w net.core.rmem_max=33554432
>> >>>>> sysctl -w net.core.wmem_max=33554432
>> >>>>> sysctl -w net.core.rmem_default=65536
>> >>>>> sysctl -w net.core.wmem_default=65536
>> >>>>> sysctl -w net.ipv4.tcp_mem='8388608 8388608 8388608'
>> >>>>> sysctl -w net.ipv4.udp_mem='4096 174760 33554432'
>> >>>>> sysctl -w net.ipv4.tcp_rmem='4096 87380 8388608'
>> >>>>> sysctl -w net.ipv4.tcp_wmem='4096 65536 8388608'
>> >>>>> sysctl -w net.ipv4.route.flush=1
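
One thing worth double-checking in that list is the unit of each setting. As
far as I know, net.ipv4.udp_mem and net.ipv4.tcp_mem are counted in memory
pages, while the net.core.* values and tcp_rmem/tcp_wmem are in bytes. The
helper below is only a sketch to convert a page count into GiB; the name
pages_to_gib is made up, and the page size defaults to whatever getconf
reports on the running machine.

```shell
#!/usr/bin/env bash
# Convert a page count (the unit assumed here for net.ipv4.udp_mem and
# net.ipv4.tcp_mem) to GiB. $2 is the page size in bytes; it defaults to
# the value reported by getconf on this machine.
pages_to_gib() {
    local pages="$1" page_size="${2:-$(getconf PAGESIZE)}"
    awk -v p="$pages" -v s="$page_size" \
        'BEGIN { printf "%.1f\n", p * s / (1024 * 1024 * 1024) }'
}
```

With 4096-byte pages, pages_to_gib 33554432 4096 gives 128.0, so the udp_mem
maximum above would be far more than the RAM in the box and effectively never
applies. That is probably harmless, but it may not be what was intended.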
>> >>>>>
>> >>>>> I am currently still seeing around 300 packets per second going to
>> >>>>> unknown ports. Below are the statistics. That’s about 1/5th of all
>> >>>>> the packets received that are not being processed… That seems a lot
>> >>>>> to me.
>> >>>>>
>> >>>>>  10:43:40 up 2 days,  5:11,  3 users,  load average: 1.52, 2.05, 2.17
>> >>>>>
>> >>>>> Every 1.0s: netstat -anus|grep -A 7 Udp:      Thu Jun 23 10:40:45 2016
>> >>>>>
>> >>>>> Udp:
>> >>>>>     310870895 packets received
>> >>>>>     61212884 packets to unknown port received.
>> >>>>>     103338 packet receive errors
>> >>>>>     312245302 packets sent
>> >>>>>     RcvbufErrors: 103249
>> >>>>>     SndbufErrors: 765
>> >>>>>     InCsumErrors: 75
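
The two figures quoted for these counters (about 1/5th, around 300 per
second) can be checked with a little arithmetic. This is only a sketch: the
helper names pct and per_second are made up, and the 191460 seconds is simply
the "2 days, 5:11" uptime converted, so the rate is an average since boot
rather than a current reading.

```shell
#!/usr/bin/env bash
# pct PART WHOLE: PART as a percentage of WHOLE, one decimal place.
pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'
}

# per_second COUNT SECONDS: average events per second, one decimal place.
per_second() {
    awk -v c="$1" -v s="$2" 'BEGIN { printf "%.1f\n", c / s }'
}

# Unknown-port packets relative to "packets received" (counters above).
pct 61212884 310870895
# Average unknown-port packets per second over the uptime (assumed 191460 s).
per_second 61212884 191460
```

That works out to a bit under 20% and roughly 320 per second, which suggests
the unknown-port traffic has been arriving at about the same rate since boot
rather than in a recent burst. If the received counter only counts datagrams
that were actually delivered to a socket (my understanding, not verified
here), the share of everything that arrived is slightly lower, about 16%.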
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> I had to do a lot of buffer tweaking to get the RcvbufErrors down,
>> >>>>> and even the SndbufErrors, as every time it happens (in bursts -
>> >>>>> sporadically every 10 minutes, but definitely every half hour), one
>> >>>>> would get silence and the packet receive errors would shoot up by
>> >>>>> between 200 and 800 packets.
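
Since the bursts seem to follow a schedule, it may help to timestamp each
jump of the counter and compare the times with cron jobs such as the
half-hourly fraud check. The sketch below assumes the /proc/net/snmp layout
(a "Udp:" header line followed by a "Udp:" value line); the function names
and the default thresholds are made up for illustration.

```shell
#!/usr/bin/env bash
# Print the current value of one Udp counter (e.g. RcvbufErrors) from a
# /proc/net/snmp style file.
udp_counter() {
    local name="$1" file="${2:-/proc/net/snmp}"
    awk -v want="$name" '
        $1 == "Udp:" && !header {
            for (i = 2; i <= NF; i++) col[$i] = i
            header = 1
            next
        }
        $1 == "Udp:" { if (want in col) print $(col[want]); exit }
    ' "$file"
}

# Read "timestamp value" samples on stdin; print the timestamp and the
# increase whenever the value grew by at least $1 since the last sample.
report_jumps() {
    awk -v min="$1" '
        NR > 1 && $2 - prev >= min { print $1, "+" ($2 - prev) }
        { prev = $2 }
    '
}

# Sample counter $1 every $2 seconds (default 5) and report jumps of at
# least $3 (default 50). Runs until interrupted.
watch_udp_counter() {
    local name="$1" interval="${2:-5}" min="${3:-50}"
    while :; do
        echo "$(date +%T) $(udp_counter "$name")"
        sleep "$interval"
    done | report_jumps "$min"
}
```

Something like watch_udp_counter RcvbufErrors 5 50 | tee rcvbuf.log left
running for an hour or two should show whether the jumps line up with the
half-hourly job or with something else.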
>> >>>>>
>> >>>>> The load average can shoot up to 4.x at times. Knowing that
>> >>>>> Sipwise Pro is on the same hardware, and they support up to 50.000
>> >>>>> users, what am I missing?
>> >>>>>
>> >>>>> rtpengine is running in kernel. The major contributor to CPU usage
>> >>>>> is actually MySQL, regularly maxing out at 100%, especially when
>> >>>>> it’s doing the fraud check. Below is a snapshot of top….
>> >>>>>
>> >>>>> top - 10:56:53 up 2 days,  5:24,  3 users,  load average: 2.39, 2.14, 1.94
>> >>>>> Tasks: 184 total,   1 running, 183 sleeping,   0 stopped,   0 zombie
>> >>>>> %Cpu(s): 25.3 us,  7.0 sy,  0.0 ni, 63.7 id,  1.0 wa,  0.0 hi,  2.9 si,  0.0 st
>> >>>>> KiB Mem:  12334464 total, 12157676 used,   176788 free,   144944 buffers
>> >>>>> KiB Swap:  2096124 total,        0 used,  2096124 free,  4430336 cached
>> >>>>>
>> >>>>>   PID USER      PR  NI  VIRT  RES  SHR S  %CPU %MEM    TIME+  COMMAND
>> >>>>>  4063 mysql     20   0 6127m 5.6g 7084 S  54.7 47.7 809:35.18 mysqld
>> >>>>>  2576 root      20   0  253m 7176 1816 S   9.9  0.1 164:02.97 rsyslogd
>> >>>>>  5058 root      20   0 67176  11m 5308 S   6.0  0.1   7:05.16 rate-o-mat
>> >>>>> 15432 root      20   0  276m  12m 3696 S   5.0  0.1 117:56.92 rtpengine
>> >>>>>  5257 sems      20   0  873m  37m 7624 S   4.0  0.3 139:44.03 ngcp-sems
>> >>>>> 30996 kamailio  20   0  539m 100m  53m S   4.0  0.8   6:02.68 kamailio
>> >>>>>
>> >>>>> Does anybody have any pointers I can try to completely eliminate
>> >>>>> the packet loss, and where do these unknown port packets go to?
>> >>>>>
>> >>>>> Thanks
>> >>>>> Walter.
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>>
>> >>>>> _______________________________________________
>> >>>>> Spce-user mailing list
>> >>>>> Spce-user at lists.sipwise.com
>> >>>>> https://lists.sipwise.com/listinfo/spce-user
>> >>>>>
>> >