OUTBOUND DIALING OPTIMIZATIONS DOC	Started: 2016-11-15	Updated: 2016-12-03


This document is a work in progress and is meant for reference only. It goes over approaches to optimizing and speeding up the outbound auto-dialing processes.



Places to look at for speeding up outbound auto-dialing call routing:
1. Checking for Local/ channel resolution
2. Database load and query speed
3. AGI script compilation and execution time





SECTION 1.  LOCAL/ CHANNEL RESOLUTION

First, an explanation of what exactly the "Local/ channel resolution" issue is. Outbound auto-dial calls are placed from VICIdial to Asterisk using a Local/ channel, which goes to the dialplan and places the call out through a carrier. This process creates up to 4 "Local" channels for a single call. When the call is placed, there is no audio stream yet, only signaling, and it can stay that way until after the Answer signal is received from the carrier. At that point, the Local/ channel will resolve to its proper channel pointer (usually SIP/... if using a SIP carrier) and the pseudo "Local" channels that were created go away.

Where issues sometimes come in is how long it takes for audio to begin arriving from the carrier so that the channel pointer can resolve. This is extremely variable, and depends on the end customer carrier, as well as all of the carrier equipment in between the dialer and the end customer. We have seen Local/ channels resolve anywhere from 0.01 seconds up to 2 seconds, and in a small percentage of cases, the channel never resolves because no audio stream is ever received, even after 30 seconds of waiting post-Answer-signal. Another thing to remember is that Asterisk isn't perfect, and sometimes those temporary pseudo "Local" channels will register as Answered for a split second before disappearing, which is another reason why it is important for the AGI routing script not to attempt to route Local/ channels.

Currently, the outbound call agent routing process will immediately attempt to route the call if the Local/ channel has been resolved. If it has not been resolved, the process waits 1 second before trying again, and if the channel is still not resolved, it gives up and logs the call with the LRERR status. One of the optimizations we are trying in the BETA script is to make multiple attempts at much shorter intervals to check for Local/ channel resolution. The compilation of the AGI script can take anywhere from 0.1 to 0.3 seconds, depending on the server hardware and system load. If the AGI script detects a Local/ channel, it quits and is tried again immediately, so each retry is spaced by roughly that compilation time. This also allows for a longer amount of time to wait for the channel to resolve because the number of loops is adjustable.
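The retry approach described above can be sketched as a simple polling loop. This is only an illustration in Python, not the actual AGI code: the real BETA setup re-launches the AGI script from the dialplan rather than looping in-process, and the function name, the get_channel callback and the channel names used here are hypothetical.

```python
import time
from typing import Callable


def wait_for_resolution(get_channel: Callable[[], str],
                        max_attempts: int = 100,
                        interval: float = 0.0,
                        sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll the channel name until it is no longer a Local/ channel.

    Returns the 1-based attempt number on which the channel was resolved,
    or 0 if it never resolved within max_attempts (the LRERR case).
    """
    for attempt in range(1, max_attempts + 1):
        channel = get_channel()
        if not channel.startswith("Local/"):
            return attempt          # resolved, safe to route the call
        if attempt < max_attempts:
            sleep(interval)         # stand-in for the script re-launch delay
    return 0                        # gave up, would be logged as LRERR


# Hypothetical sequence of channel names seen on successive attempts
names = iter(["Local/8376@default-0001;1",
              "Local/8376@default-0001;1",
              "SIP/carrier-0000abcd"])
print(wait_for_resolution(lambda: next(names), max_attempts=5,
                          sleep=lambda seconds: None))
```

Raising max_attempts lengthens the total time allowed for resolution without changing how quickly a call that resolves early gets routed.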

The AST_vdad_debug_log_report.php report was created to help evaluate the log entries from the BETA AGI scripts.

In our first three rounds of live testing, we have seen the following results:
- Calls that had no resolution delay were routed in 0.1 seconds on average post-Answer-signal (85-90% of Answered calls)
- Calls that had a resolution delay were routed in 1.5 seconds on average post-Answer-signal (1-5% of Answered calls)
- Calls that never resolved in over 30 seconds made up 5-10% of total Answered calls
- The Asterisk 11 server had a lower rate of LRERR calls, and faster overall routing, compared to the Asterisk 1.4 and 1.8 servers

Some conclusions based upon more rounds of live testing:
- Allowing more than 1 second for channel resolution increased the number of delayed-resolution calls that were routed by about 30%, although these are still only 1-2% of total Answered calls
- Using different carriers at different times can greatly affect the rate of LRERRs, from less than 1% to over 30%
- If a call does not resolve within 3 seconds post-Answer, there is very little chance it will ever resolve, although we did have some calls resolve at 25+ seconds
- Asterisk 11 appears to be more successful at forcing resolutions, and routing calls faster, when compared to earlier branches of Asterisk
- The quality of your telco carrier, and the state of that telco network, is a big factor in the LRERR rate for post-answer calls
- The speed of your database is a much larger factor in the speed of the routing of outbound VDAD calls post-Answer
- The loadavg of the dialer will be greatly increased by use of the BETA AGI script (loadavg over 20.00 on a quad-core server), although this does not seem to affect functionality






SECTION 2. DATABASE LOAD AND QUERY SPEED

Given the number of database queries in the outbound AGI routing process (there are over 100 queries), the speed of the database has a huge effect on the speed of the AGI call routing process. Because of this, we recommend keeping fewer than 2 million leads in the vicidial_list table for an average VICIdial system. That recommendation is of course variable depending on the level of hardware that is used. Our high-end live production database system had 5.5 million leads in that table while testing, and there was no measurable change in the speed of the call routing after removing large numbers of leads. However, when that system had over 10 million leads, the call routing was measurably slower. For reference, the hardware specs of that database server are: 4 x 4-core CPUs, 24GB RAM, 4 x SSD drives with an LSI Logic MegaRAID caching RAID controller.
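To see why per-query speed matters so much, here is a rough back-of-the-envelope estimate in Python. It assumes the roughly 100 queries run one after another for each call, and the per-query latencies are hypothetical round numbers chosen for illustration, not measurements from our systems.

```python
def db_time_per_call(query_count: int, avg_query_seconds: float) -> float:
    """Total database time for one routed call, assuming sequential queries."""
    return query_count * avg_query_seconds


# Hypothetical average latencies per query, in milliseconds
for latency_ms in (0.5, 2.0, 10.0):
    total = db_time_per_call(100, latency_ms / 1000.0)
    print(f"{latency_ms:5.1f} ms/query -> {total:.2f} s of database time per call")
```

Under these assumptions, even a few milliseconds of added latency per query adds up to tenths of a second per call, which is on the same order as the 0.1 second average routing time seen for calls with no resolution delay.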


Other Recommendations:

Basic my.cnf and proper hardware optimizations.

Faster DB = Faster call routing

Use LSI Logic MegaRAID controllers in a RAID-10 configuration with fast SSD drives, with the database data on a separate partition from the OS and applications.





SECTION 3. AGI SCRIPT COMPILATION AND EXECUTION TIME

This section goes over the option of stripping down the AGI routing process (agi-VDAD_ALL_outbound.agi) as much as possible to improve the speed of routing outbound auto-dialed calls.

The following features could safely be removed and still allow for basic outbound call routing functions within the AGI:
	"LO" agent search method (try to route to agent on dialed server first, not used in default configs)
	CPD AMD, Sangoma call progress, answering machine detection
	QueueMetrics logging
	concurrent transfers if set to AUTO
	extension append CIDname
	grade random next-agent-call routing
	reminder message, play only
	remote-agent call routing
	routing initiated recordings
	survey recording
	survey, play message and wait for dtmf response
	text-to-speech survey
	inactive trigger process

Removing these features will result in the removal of over 65% of the code in the AGI script. This should both greatly improve Perl interpretation time and reduce RAM usage, but it might not have a very significant effect on execution time, because all of these removed code segments are enclosed inside "if" statements that aren't run unless those features are activated anyway.

Testing this change will require either NOT removing the "remote-agent call routing", or testing on a production system that does not use remote-agents.

The "BETA2" AGI script was created to test this, and it has about 40% of the base code removed, while still keeping the basic routing and new enhanced logging intact. We did two rounds of live calling using the BETA2 script. Overall, the removal of 40% of the code had no measurable effect on the overall speed of outbound call routing post-Answer. In our controlled test environment, we did see an overall improvement of 2-3 hundredths of a second in the initiation time for the AGI script to start running, but in live calling there was no measurable improvement over the base BETA AGI script. The variances in the telco network were much greater than any small improvements that this optimization option might yield. Because of this, and because of the large increase in resources it would take to maintain stripped-down AGI scripts for each set of possible features, we will not be pursuing this as an optimization path.


Possible outbound AGI optimizations (sections to consider removing, listed as start line, end line, feature):
688	747		survey recording
750	988		text-to-speech
1026	1097		QueueMetrics logging
1101	1176		reminder message, play only
1182	2052		survey, play message and wait for dtmf response
2081	2126		CPD AMD, Sangoma call progress
2155	2176		concurrent transfers if set to AUTO
2298	2352		grade random next-agent-call routing
2419	2580		remote agent call routing
2583	2691		routing initiated recordings
2691	2712		extension append CIDname
2725	2811		more QueueMetrics logging
2149	2850		*LO agent search method
2856	2877		concurrent transfers if set to AUTO
2999	3053		grade random next-agent-call routing
3133	3294		remote agent call routing
3297	3405		routing initiated recordings
3407	3427		extension append CIDname
3442	3528		more QueueMetrics logging
3689	3705		more QueueMetrics logging
3721	3737		more QueueMetrics logging
3847	3908		DTMF detection for survey feature
3993	4146		CPD AMD, Sangoma call progress
4192	4211		text-to-speech
4214	4233		commented-out trigger process
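To estimate how much of the script a list like this covers, the ranges can be merged and counted. The Python sketch below assumes the two numeric columns are inclusive start and end line numbers in the AGI script, and it merges overlapping ranges so that no line is counted twice (one range in the list above spans several others). The function names are hypothetical, and only a few ranges from the list are used as sample input.

```python
from typing import Iterable, List, Tuple

Range = Tuple[int, int]  # (start line, end line), both inclusive


def merge_ranges(ranges: Iterable[Range]) -> List[Range]:
    """Merge overlapping or directly adjacent line ranges."""
    merged: List[Range] = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1:
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            merged.append((start, end))
    return merged


def covered_lines(ranges: Iterable[Range]) -> int:
    """Count distinct lines covered by the given ranges."""
    return sum(end - start + 1 for start, end in merge_ranges(ranges))


# A few of the ranges listed above
sample = [(688, 747), (750, 988), (2583, 2691), (2691, 2712)]
print(merge_ranges(sample))   # [(688, 747), (750, 988), (2583, 2712)]
print(covered_lines(sample))  # 429
```

Dividing the covered line count by the total length of the script would give the fraction of code removed, which is how a figure such as "over 65%" could be checked.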











NOTES:

The new script we will be testing changes on will be named "agi-VDAD_ALL_outboundBETA.agi".


To be added to the dialplan:
; BETA VICIDIAL_auto_dialer transfer script Load Balanced Survey:
exten => 8376,1,Playback(sip-silence)
exten => 8376,n,AGI(agi://127.0.0.1:4577/call_log)
exten => 8376,n,Set(LRct=1)
exten => 8376,n,While($[${LRct} < 100])
exten => 8376,n,AGI(agi-VDAD_ALL_outboundBETA.agi,SURVEYCAMP-----LB)
exten => 8376,n,Set(LRct=$[${LRct} + 1])
exten => 8376,n,EndWhile()
exten => 8376,n,AGI(agi-VDAD_ALL_outboundBETA.agi,SURVEYCAMP-----LB)
exten => 8376,n,Hangup()

; BETA VICIDIAL_auto_dialer transfer script Load Balanced:
exten => 8377,1,Playback(sip-silence)
exten => 8377,n,AGI(agi://127.0.0.1:4577/call_log)
exten => 8377,n,Set(LRct=1)
exten => 8377,n,While($[${LRct} < 100])
exten => 8377,n,AGI(agi-VDAD_ALL_outboundBETA.agi,NORMAL-----LB)
exten => 8377,n,Set(LRct=$[${LRct} + 1])
exten => 8377,n,EndWhile()
exten => 8377,n,AGI(agi-VDAD_ALL_outboundBETA.agi,NORMAL-----LB)
exten => 8377,n,Hangup()


; BETA2 VICIDIAL_auto_dialer transfer script Load Balanced Survey:
exten => 8386,1,Playback(sip-silence)
exten => 8386,n,AGI(agi://127.0.0.1:4577/call_log)
exten => 8386,n,Set(LRct=1)
exten => 8386,n,While($[${LRct} < 100])
exten => 8386,n,AGI(agi-VDAD_ALL_outboundBETA2.agi,SURVEYCAMP-----LB)
exten => 8386,n,Set(LRct=$[${LRct} + 1])
exten => 8386,n,EndWhile()
exten => 8386,n,AGI(agi-VDAD_ALL_outboundBETA2.agi,SURVEYCAMP-----LB)
exten => 8386,n,Hangup()

; BETA2 VICIDIAL_auto_dialer transfer script Load Balanced:
exten => 8387,1,Playback(sip-silence)
exten => 8387,n,AGI(agi://127.0.0.1:4577/call_log)
exten => 8387,n,Set(LRct=1)
exten => 8387,n,While($[${LRct} < 100])
exten => 8387,n,AGI(agi-VDAD_ALL_outboundBETA2.agi,NORMAL-----LB)
exten => 8387,n,Set(LRct=$[${LRct} + 1])
exten => 8387,n,EndWhile()
exten => 8387,n,AGI(agi-VDAD_ALL_outboundBETA2.agi,NORMAL-----LB)
exten => 8387,n,Hangup()



Added the vicidial_vdad_log table to store debug and timestamp data (examples below). To enable this level of logging in the BETA script, you must run the following SQL statement:

update system_settings set vdad_debug_logging='1';


| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:17 | 1479215657.011447 | 2016-11-15 08:14:15 | 0.004011 | agi-VDAD_ALL_outbound.agi |  849778 | end-drop-1.25          |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:15 | 1479215656.778370 | 2016-11-15 08:14:15 | 0.233063 | agi-VDAD_ALL_outbound.agi |  849778 | waiting-for-agent-1.25 |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:15 | 1479215656.545376 | 2016-11-15 08:14:15 | 0.232980 | agi-VDAD_ALL_outbound.agi |  849778 | waiting-for-agent-1    |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:15 | 1479215656.312302 | 2016-11-15 08:14:15 | 0.233063 | agi-VDAD_ALL_outbound.agi |  849778 | waiting-for-agent-0.75 |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:15 | 1479215656.079066 | 2016-11-15 08:14:15 | 0.233225 | agi-VDAD_ALL_outbound.agi |  849778 | waiting-for-agent-0.5  |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:15 | 1479215655.844731 | 2016-11-15 08:14:14 | 0.234320 | agi-VDAD_ALL_outbound.agi |  849778 | waiting-for-agent-0.25 |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:15 | 1479215655.841086 | 2016-11-15 08:14:14 | 0.003636 | agi-VDAD_ALL_outbound.agi |  849778 | prerouteQM             |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:15 | 1479215655.820757 | 2016-11-15 08:14:14 | 0.020319 | agi-VDAD_ALL_outbound.agi |  849778 | preroute               |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:15 | 1479215655.623380 | 2016-11-15 08:14:14 | 0.014294 | agi-VDAD_ALL_outbound.agi |  849778 | LocalEXIT-5 11         |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:15 | 1479215655.426731 | 2016-11-15 08:14:14 | 0.016263 | agi-VDAD_ALL_outbound.agi |  849778 | LocalEXIT-5 10         |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:15 | 1479215655.232397 | 2016-11-15 08:14:14 | 0.014853 | agi-VDAD_ALL_outbound.agi |  849778 | LocalEXIT-5 9          |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:15 | 1479215655.037638 | 2016-11-15 08:14:13 | 0.014383 | agi-VDAD_ALL_outbound.agi |  849778 | LocalEXIT-5 8          |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:14 | 1479215654.843460 | 2016-11-15 08:14:13 | 0.014945 | agi-VDAD_ALL_outbound.agi |  849778 | LocalEXIT-5 7          |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:14 | 1479215654.651362 | 2016-11-15 08:14:13 | 0.014983 | agi-VDAD_ALL_outbound.agi |  849778 | LocalEXIT-5 6          |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:14 | 1479215654.457621 | 2016-11-15 08:14:13 | 0.014685 | agi-VDAD_ALL_outbound.agi |  849778 | LocalEXIT-5 5          |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:14 | 1479215654.262349 | 2016-11-15 08:14:13 | 0.014732 | agi-VDAD_ALL_outbound.agi |  849778 | LocalEXIT-5 4          |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:14 | 1479215654.066875 | 2016-11-15 08:14:12 | 0.014224 | agi-VDAD_ALL_outbound.agi |  849778 | LocalEXIT-5 3          |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:13 | 1479215653.873137 | 2016-11-15 08:14:12 | 0.014942 | agi-VDAD_ALL_outbound.agi |  849778 | LocalEXIT-5 2          |
| V1150814130000849778 | 192.168.198.5 | 2016-11-15 08:14:13 | 1479215653.677652 | 2016-11-15 08:14:12 | 0.015336 | agi-VDAD_ALL_outbound.agi |  849778 | LocalEXIT-5 1          |



| V1150818290000833387 | 192.168.198.5 | 2016-11-15 08:18:35 | 1479215915.585579 | 2016-11-15 08:18:34 | 0.233213 | agi-VDAD_ALL_outbound.agi |  833387 | waiting-for-agent-0.5  |
| V1150818290000833387 | 192.168.198.5 | 2016-11-15 08:18:35 | 1479215915.351982 | 2016-11-15 08:18:34 | 0.233585 | agi-VDAD_ALL_outbound.agi |  833387 | waiting-for-agent-0.25 |
| V1150818290000833387 | 192.168.198.5 | 2016-11-15 08:18:35 | 1479215915.348672 | 2016-11-15 08:18:34 | 0.003301 | agi-VDAD_ALL_outbound.agi |  833387 | prerouteQM             |
| V1150818290000833387 | 192.168.198.5 | 2016-11-15 08:18:35 | 1479215915.329398 | 2016-11-15 08:18:34 | 0.019263 | agi-VDAD_ALL_outbound.agi |  833387 | preroute               |
| V1150818290000833387 | 192.168.198.5 | 2016-11-15 08:18:35 | 1479215915.133643 | 2016-11-15 08:18:33 | 0.014761 | agi-VDAD_ALL_outbound.agi |  833387 | LocalEXIT-5 1          |
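
Log rows like the ones above can be turned into timing figures with a short script. The Python sketch below is only an illustration: it assumes the fourth column is an epoch timestamp with microseconds and the last column is the stage label, it treats "LocalEXIT" rows as attempts that quit on an unresolved Local/ channel, and the LogRow, parse_row and seconds_to_preroute names are hypothetical. Decimal is used so the microsecond digits are not lost to floating-point rounding.

```python
from decimal import Decimal
from typing import List, NamedTuple


class LogRow(NamedTuple):
    caller_code: str
    epoch: Decimal  # 4th column, assumed epoch seconds with microseconds
    stage: str      # last column, assumed to be the stage label


def parse_row(line: str) -> LogRow:
    """Split one pipe-delimited log row into the fields used here."""
    cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
    return LogRow(caller_code=cells[0], epoch=Decimal(cells[3]), stage=cells[8])


def seconds_to_preroute(lines: List[str]) -> Decimal:
    """Time from the first LocalEXIT entry to the preroute entry."""
    rows = [parse_row(line) for line in lines]
    first_exit = min(r.epoch for r in rows if r.stage.startswith("LocalEXIT"))
    preroute = min(r.epoch for r in rows if r.stage == "preroute")
    return preroute - first_exit


# Two rows copied from the second example above
sample = [
    "| V1150818290000833387 | 192.168.198.5 | 2016-11-15 08:18:35 | 1479215915.329398 | 2016-11-15 08:18:34 | 0.019263 | agi-VDAD_ALL_outbound.agi |  833387 | preroute               |",
    "| V1150818290000833387 | 192.168.198.5 | 2016-11-15 08:18:35 | 1479215915.133643 | 2016-11-15 08:18:33 | 0.014761 | agi-VDAD_ALL_outbound.agi |  833387 | LocalEXIT-5 1          |",
]
print(seconds_to_preroute(sample))  # 0.195755
```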
