[Spce-user] mysql 100% cpu after upgrade to mr6.5.2

Rene Krenn rkrenn at sipwise.com
Mon Dec 3 19:15:03 EST 2018


What is your exact setup, ie.
- how many contracts use billing profiles with monthly fraud limits, how many with profiles with daily fraud limits?
- how many contracts have contract fraudpreferences in place?

As a general tip for now, try avoid using fraud limts with trunk subscriber contracts or system contracts, which tend to produce exceptional big call histories. To avoid piling up, run auto-lock and fraud-daily with a proper job period.
*update* since you reported the count() query of the fraud job appear in the slow query log in particular, we will prepare hotfixes in next days to strictly suppress them. This should mitigate the issues you see right away.
Thanks for pointing/insisting here.

some backgrounds:

- the fraud queries in older versions were incorrect or not accurate at least. 
- in 5.x branches there were multiple fixes tot he query, at the cost of a berformance draw. Also it was finally changed to iterate the /api/customerfraudevents/ rail directly (which added the unneccesary count queries), which is one more definite difference to 3.8. 
- in 6.3 a particular aspect of the query was optimized by introducing rasterized billing mapping intervals. Pure query performnce got 7x better in tested sceanrios, but still it's not scalabale.

People around (Walter iirc) reported improved query times by increasing ram and/or innodb buffer sizes (make sure perf settings are in effect in configs). This is because the query structure still relies on preparing (big) temptables, and indexes of involved tables must fit into ram. This not really scalable and will however only help to some limited extend -> ie. as long the stuff still fits into ram.

One fundamental problem with the job/query approach is calculating cost sums from cdrs on-the-fly, which is - and ever was - just painfully slow (also remember ie. dashboard loading), as any sum() aggregate inherently requires to scan each value to add.

It therefore originally was intended to run the queries/jobs once per day at most.

Refactoring is on the roadmap and will look like follows:
1. get rid of sum() by using (and populating) daily/monthly cost sums in (sip-)stats db
2. entirely drop fraud and credit warning jobs and change to scalable "lazy "approach (trigger event like fraud notification directly at the time the call is rated and a limit is exceeded). It's the proven appraoch already used for ie. the balance underrung machanism of profile packages and will ultimatively satisfy any obsession to run fraud checks in shortest intervals as possible.

We then could also add more detailed controls how to consider intervals in a more flxible way (ie. previous month) for the limit (since user requirements differ strongly in this regards).

Regards, rene

PS. Btw., how is call list loading for you now with new recent version, can you confirm the improvement?

-----Ursprüngliche Nachricht-----
Von: Jon Bonilla (Manwe) [mailto:manwe at sipdoc.net] 
Gesendet: Montag, 3. Dezember 2018 21:57
An: Rene Krenn <rkrenn at sipwise.com>
Cc: spce-user at lists.sipwise.com
Betreff: Re: [Spce-user] mysql 100% cpu after upgrade to mr6.5.2

El Fri, 30 Nov 2018 19:50:07 +0100
"Jon Bonilla (Manwe)" <manwe at sipdoc.net> escribió:


> 
> 
> I think the query is the daily fraud calc. I hacve several of them 
> stuck in the server.
> 
> The lab server is an exact copy of the production server but upgraded 
> to version mr6.5.2: 8 cores, 20GB RAM and SSD disks. The CDR has about 8M rows.
> I have 2.8 and 3.8 servers with much more rows and never had an issue before. 
> 
> The complete query is:
> 
> 
> Top shows 766% CPU usage by mysqld, 0%wa
> 


Any hints on this?


-- 
https://pekepbx.com
https://www.issabel.com/multitenant




More information about the Spce-user mailing list