[Spce-user] Billing fee selection by rate-o-mat

Tue May 17 03:09:11 EDT 2016

hi Julian,

thanks for starting this interesting discussion.

by specifying source and destination patterns, fee selection can be
controlled at the most granular level: you can define a fee that only
matches a particular caller-callee sip uri pair. while there might be no
doubt about this approach, it comes down to the actual sip uri matching
method. changing the system's match method for rating fees means moving
from regexp to something faster at the cost of expressive power, so by
definition it will be impossible to translate existing patterns. if this
is ever required, there is a plan and room to move: we can
maintain backward compatibility by introducing a "fee type"
discriminator for fee records, to support multiple matching methods
including the existing (regexp) one, e.g.:

1. "regexp": regexp matching - current implementation
2. "trie": prefix tree matching; one table column to store the upper and
lower lexical limit per e164 number segment. fast range queries, could
cover patterns in the form of [11..22][666..777][000001..999999]
3. "static": dialplan described by fixed e164 segments (except
suffix/subscriber number digits). fastest(!) by using
multicolumn indexes, one column per segment (i.e. digit) as in your n=10
example
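
To make the discriminator idea concrete, here is a minimal Python sketch
of dispatching on a fee type. Everything in it is an assumption for
illustration: the record layout, the field name fee_type and the matcher
functions are hypothetical and not part of rate-o-mat. The "trie" matcher
only checks the per-segment lexical limits; it does not build a prefix
tree or use any database index.

```python
import re

# Hypothetical fee records; "fee_type" is the proposed discriminator.
FEES = [
    {"id": 1, "fee_type": "regexp", "destination": r"^(00|\+)?43"},
    {"id": 2, "fee_type": "trie", "destination": [("11", "22"), ("666", "777")]},
    {"id": 3, "fee_type": "static", "destination": "4319"},
]

def match_regexp(pattern, number):
    # Current behaviour: the fee's pattern is applied to the number.
    return re.search(pattern, number) is not None

def match_trie(segments, number):
    # Each segment is a (lower, upper) pair of equal-width digit strings.
    # For equal-width digit strings, lexical order equals numeric order.
    pos = 0
    for lower, upper in segments:
        part = number[pos:pos + len(lower)]
        if len(part) < len(lower) or not (lower <= part <= upper):
            return False
        pos += len(lower)
    return True

def match_static(prefix, number):
    # Fixed leading digits; the subscriber suffix is left open.
    return number.startswith(prefix)

MATCHERS = {"regexp": match_regexp, "trie": match_trie, "static": match_static}

def matching_fee_ids(number):
    # Pick the matcher per record, based on its fee type.
    return [f["id"] for f in FEES
            if MATCHERS[f["fee_type"]](f["destination"], number)]
```

With these sample records, matching_fee_ids("4319001") selects the regexp
and the static fee, while matching_fee_ids("15700123") selects only the
range-based one.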

as you stated already, of these options 1. will be very flexible but
slowest, while 3. is less flexible (or requires a verbose setup at
least) but fastest. 2. and 3. have the downside of being somewhat limited
to e164 sip uris. In return, they are accelerated by compound
(multi-column database) indexes, which to my knowledge are not the same
as multidimensional indexes (e.g. quadtrees known from spatial
databases).
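
As a rough illustration of the compound index behind the "static" type,
the sketch below uses Python's built-in sqlite3 module rather than mysql,
with only three digit columns. The table name static_fees, the columns
d1..d3 and the zone values are made up for the example; a real schema
would look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE static_fees ("
    " id INTEGER PRIMARY KEY, d1 TEXT, d2 TEXT, d3 TEXT, zone TEXT)"
)
# One compound (multi-column) index over the digit columns.
conn.execute("CREATE INDEX static_fees_digits ON static_fees (d1, d2, d3)")
conn.executemany(
    "INSERT INTO static_fees (d1, d2, d3, zone) VALUES (?, ?, ?, ?)",
    [("4", "3", "1", "zone_a"), ("4", "9", "3", "zone_b")],
)

LOOKUP_SQL = "SELECT zone FROM static_fees WHERE d1 = ? AND d2 = ? AND d3 = ?"

def lookup_zone(number):
    # Split the leading digits into one value per indexed column.
    if len(number) < 3:
        return None
    row = conn.execute(LOOKUP_SQL, tuple(number[:3])).fetchone()
    return row[0] if row else None

def query_plan():
    # The last column of each EXPLAIN QUERY PLAN row is the plan detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + LOOKUP_SQL,
                        ("4", "3", "1")).fetchall()
    return " ".join(str(r[-1]) for r in rows)
```

The lookup is a plain equality match on every indexed column, so no
pattern has to be evaluated per row; query_plan() can be used to check
that the index is actually picked.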

To my knowledge, the ngcp billing fees have always used regular
expressions. This has the benefit of a consistent design, as regular
expressions are used throughout the whole platform for sip uri matching
(fees, rewrite rules, ..). We came across the rate-o-mat performance
topic in recent discussions, but it is not identified as a bottleneck
atm. although ngcp 4.x provides performance enhancements, those
primarily come from reducing interference (deadlock rollbacks) of sql
queries from different components against the cdr table. Large setups
basically scale by having rate-o-mat instances running on each node.
another recent enhancement for rate-o-mat allows running multiple
rate-o-mat instances on a single node (see rateomat-05-parallel.t), if
this is ever needed.

To pin it down, the key metrics for rate-o-mat are the dial plan size
(number of fees) and throughput (calls per second). Encapsulating the
regexp approach with the tricky mysql query we currently use might be
optimal, so it's about the maximum frequency at which this query can be
performed.

I know there are multiple customers using a dialplan with multiple 10k's
of fee entries, but no optimized fee matching method has been requested
so far (you are the first one :-)). the rate-o-mat testsuite, which came
into existence some months ago, does not include such performance
related test cases yet. However, this is in preparation: e.g. rating
time measurement is already introduced, but there are no stats yet. We
have a "large-fee-table" testcase on the roadmap, i could prioritize it.
You are invited to send me a large sample set of fees to use for that
test if you would like to.

>do I understand correctly that it is not really longest _match_ but
>longest regular expression that has _a_ match
>that is being used to determine the billingzone?
>(so If my number is 499561123 and I have ^499561123 as regex and
>^(00|\+)4[9][5][6][1][1] (Yes ARTIFICIALLY constructed)
>the latter one would be the "best match" albeit the first being a
>_direct_ one to one match?

yes, that's what i originally wanted to point out here. it's a
performance/usability trade-off. the idea is that precedence of more
specific patterns like ^(00|\+)499561123 over ^(00|\+)4[9]{1,2} still
holds when just comparing the pattern length.
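
A small Python sketch of selection by pattern length, using Python's re
module as a stand-in for the mysql REGEXP operator (the two regex
dialects differ in details). The function name and the sample patterns
are assumptions for illustration. Note that ^(00|\+)499561123 and
^(00|\+)4[9]{1,2} happen to have the same character count (17), so the
sketch uses the shorter ^(00|\+)4[9] as the less specific pattern.

```python
import re

SPECIFIC = r"^(00|\+)499561123"  # 17 characters
GENERIC = r"^(00|\+)4[9]"        # 12 characters

def pick_by_pattern_length(patterns, number):
    # Keep the patterns that match at all, then prefer the longest
    # pattern text, regardless of how much of the number it matched.
    matching = [p for p in patterns if re.search(p, number)]
    return max(matching, key=len, default=None)
```

For "+499561123" both patterns match and the specific one is picked; for
"+4912345" only the generic one matches. Patterns of equal length tie,
and max() then simply returns the first one it encountered.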

picking the fee by actual match length instead of pattern length would
be possible with the query below, which requires
https://github.com/mysqludf/lib_mysqludf_preg to be linked to the mysql
server of ngcp nodes. I have not tried it, but i expect a considerable
performance decrease.

SELECT
  id,
  source,
  destination,
  PREG_CAPTURE(destination,?) as destination_match,
  PREG_CAPTURE(source,?) as source_match,
  onpeak_init_rate, onpeak_init_interval,
  onpeak_follow_rate, onpeak_follow_interval,
  offpeak_init_rate, offpeak_init_interval,
  offpeak_follow_rate, offpeak_follow_interval,
  billing_zones_history_id, use_free_time
FROM billing.billing_fees_history
WHERE billing_profile_id = ?
  AND bf_id IS NOT NULL AND type = ?
  AND direction = ? AND ? REGEXP(source) AND ? REGEXP(destination)
ORDER BY LENGTH(destination_match) DESC, LENGTH(source_match) DESC
LIMIT 1
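
To show how ordering by match length can change the result, here is a
minimal Python sketch. It uses Python's re module in place of
PREG_CAPTURE, so it only mirrors the idea and not the exact behaviour of
the query. The function name and both patterns are hypothetical and
constructed for the example.

```python
import re

DIRECT = r"^\+499561123"            # 12 characters, matches 10 characters
ARTIFICIAL = r"^(00|\+)4[9][9][5]"  # 18 characters, matches 5 characters

def pick_by_match_length(patterns, number):
    # Prefer the pattern whose actual match covers the most characters.
    best, best_len = None, -1
    for pattern in patterns:
        m = re.search(pattern, number)
        if m and len(m.group(0)) > best_len:
            best, best_len = pattern, len(m.group(0))
    return best

def pick_by_pattern_length(patterns, number):
    # For comparison: prefer the longest pattern text among the matches.
    matching = [p for p in patterns if re.search(p, number)]
    return max(matching, key=len, default=None)
```

For the number "+499561123", pick_by_match_length returns the direct
pattern, while pick_by_pattern_length returns the artificially long one.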

anyway, it would be possible to implement this matching mode as well,
and to distinguish the two via the additional fee type mentioned
earlier, e.g. "regexp/pattern-length" and this new "regexp/match-length".

regards, rene