[Chapter 3] 3.10 Choosing the Best Performer

3.10 Choosing the Best Performer

If there is a difference in execution time between the performance of my two string-repeaters, it will not be a big one. I will need to execute the functions many times to compare the cumulative difference. The best way to calculate the elapsed time of PL/SQL code execution is with the GET_TIME function of the DBMS_UTILITY builtin package. I have encapsulated GET_TIME inside the PLVtmr package (PL/Vision TiMeR) to make it easier to use. Example 3.9 shows the kind of script I used.[4]

[4] By the way, this code was for the most part generated for me with the PLVgen package to compare the performance of repeated, rep_rpad, and also the recursion-based implementation of repeated (see sidebar).

This SQL*Plus script (stored in the file timerep.sql on the disk) takes three arguments. The first, &1, accepts the number of times to execute each function. The second, &2, accepts a string that is to be duplicated. The third, &3, accepts the number of repetitions of the string. I ran the script several times as shown below:

SQL> @timerep 100 abc 1
duprpad Elapsed: .77 seconds. Factored: .0077 seconds.
duploop Elapsed: .66 seconds. Factored: .0066 seconds.
recrep Elapsed: .71 seconds. Factored: .0071 seconds.

SQL> @timerep 100 abc 10
duprpad Elapsed: .71 seconds. Factored: .0071 seconds.
duploop Elapsed: .99 seconds. Factored: .0099 seconds.
recrep Elapsed: 1.54 seconds. Factored: .0154 seconds.

I ran each of these tests several times to allow the numbers to stabilize. The results are very interesting and certainly reinforce the need for a careful test plan. When repeating the string just once, the recursion-based implementation is superior. Upon reflection, this should not be a surprise. It handles a single repetition as a special case: an unmediated concatenation of two strings. The loop-based implementation comes in second, but all of the timings are very close. When we move to multiple repetitions of the string, however, the recrep function becomes extremely slow; again, I would expect that behavior because of the extra work performed by the PL/SQL runtime engine to manage a recursive program. The big news from this round, however, is that the RPAD implementation of repeated establishes itself clearly as the fastest technique.

Example 3.9: A Performance Comparison Script

DECLARE
   a VARCHAR2(100) := '&2';
   aa VARCHAR2(10000);
BEGIN
   PLVtmr.set_factor (&1);
   PLVtmr.capture;
   FOR rep IN 1 .. &1
   LOOP
      aa := rep_rpad (a, 'UL', &3);
   END LOOP;
   PLVtmr.show_elapsed ('duprpad');
   
   PLVtmr.set_factor (&1);
   PLVtmr.capture;
   FOR rep IN 1 .. &1
   LOOP
      aa := repeated (a, 'UL', &3);
   END LOOP;
   PLVtmr.show_elapsed ('duploop');

   PLVtmr.set_factor (&1);
   PLVtmr.capture;
   FOR rep IN 1 .. &1
   LOOP
      aa := recrep (a, 'UL', &3);
   END LOOP;
   PLVtmr.show_elapsed ('recrep');      
END;
/

Just when you think you've covered the bases, someone comes along and shows you a new way. I often use the twice function in my classes to demonstrate the development spiral. While training a group of 35 students at the Oracle Netherlands training center in De Meern, I reached the point where it was time to expand the scope of twice to allow any number of repetitions of the string. So I asked my class for some ideas. Immediately, a quiet voice piped up from the first row: "Use recursion." Recursion? It had never crossed my mind. I must admit that recursion is not an approach to which my brain readily turns. But it certainly seemed like a logical approach to take with the repeated function.

Never one to scorn a student's idea, we quickly cobbled together the version of repeated you see in Example 3.10. It is called recrep for RECursive REPeater.

Of course, I also need to compare the performance for different kinds of strings. I ran the same timer script as follows to see how each function handled NULL values:

SQL> @timerep 200 null 10
duprpad Elapsed: 1.59 seconds. Factored: .00795 seconds.
duploop Elapsed: 2.03 seconds. Factored: .01015 seconds.
recrep Elapsed: 2.91 seconds. Factored: .01455 seconds.

In this scenario, the RPAD implementation was considerably faster than the loop and recursion techniques (though, once again, I found that if the number of repetitions was set to 1, the recrep function was faster). Finally, I greatly increased the number of string repetitions and then all became clear:

SQL> @timerep 100 abc 100
duprpad Elapsed: .77 seconds. Factored: .0077 seconds.
duploop Elapsed: 4.28 seconds. Factored: .0428 seconds.
recrep Elapsed: 5.22 seconds. Factored: .0522 seconds.

The conclusion I draw from my tests is that the RPAD technique offers a much more stable solution than that based on the FOR loop. Regardless of the number of repetitions, RPAD takes about the same amount of time. With the FOR loop and recursion approaches, as the repetitions increase, the performance degrades. That is not the sign of a healthy algorithm.

Given the results, it would make sense to implement the repeated function using the RPAD technique. You could possibly optimize further by using the FOR loop approach for small numbers of repetitions, and then switch to RPAD for larger repetitions. The gain with the FOR loop for minimal repetitions is, however, minimal -- it's probably not worth the trouble.

I was glad to see that the RPAD approach is faster. You should always use a builtin if it exists, rather than build your own. The FOR loop technique arose quite naturally from the way I expanded the scope of the twice function. It turned out, however, that it was not the path to the optimal solution. As for recursion, well, it is always an interesting phenomenon to watch and puzzle out, but it rarely offers the best implementation (except when it is the only implementation feasible).

Example 3.10: The Code for the Recursive Implementation of repeated

CREATE OR REPLACE FUNCTION recrep 
   (string_in IN VARCHAR2, 
    action_in IN VARCHAR2 := NULL,
    num_in IN INTEGER := 1)
   RETURN VARCHAR2
IS
   v_action VARCHAR2(10) := UPPER (action_in);  
   initval VARCHAR2(32767);
   nextval VARCHAR2(32767); 
   v_retval VARCHAR2(32767);
   
BEGIN
   assert
      (v_action IN ('UL', 'LU', 'N'),
       'Please enter UL LU or N');
   assert
      (num_in >= 0, 'Duplication count must be at least 0.');

   IF v_action = 'UL'
   THEN
      initval := UPPER (string_in);   
      nextval := LOWER (string_in);
   ELSIF v_action = 'LU'
   THEN
      initval := LOWER (string_in);      
      nextval := UPPER (string_in);
   ELSE
      initval := string_in;      
      nextval := string_in;
   END IF; 

   IF num_in = 1
   THEN
      RETURN initval || nextval;
   ELSE
      /* No more case conversions performed... */
      RETURN (initval || repeated (nextval, 'N' , num_in-1));
   END IF;
END recrep;
/


3.9 Considering Implementation Options		3.11 Don't Forget Backward Compatibility

3.10 Choosing the Best Performer

Example 3.9: A Performance Comparison Script

Using Recursion to Repeat the String

Example 3.10: The Code for the Recursive Implementation of repeated