Vladimir Zabrodsky: About MODIFIND

ABOUT MODIFIND

I. Correspondence with Paul E. Black

I sent (2 April 1999) to the forum comp.programming and comp.theory the following subject Three Selection Algorithms:

Hello,
could you read my article FIND, SELECT, MODIFIND, please? See:
http://geocities.datacellar.net/SiliconValley/Garage/3323/3alg.html
Abstract:
A comparative study of three selection algorithms for finding K-th smallest of N elements:
Hoare's FIND (the H algorithm);
my modification of FIND (the Z algorithm);
Floyd's and Rivest's SELECT (the FR algorithm).
Algorithms Z and FR are always better than H; FR has lesser the number of comparison than Z; Z has lesser the number of swaps than FR. When the number of swaps isn't critical then is FR better than Z - and vice versa.

Paul E. Black wrote me on the 6th April 1999:

Dear Dr. Zabrodsky:
I saw your posting on comp.theory and was impressed with your article "Three Algorithms for Finding K-th Smallest of N Elements". I volunteered to be the area editor for Algorithms, Data Structures, and Problems of the new CRC Dictionary of Computer Science, Engineering and Technology.
http://hissa.nist.gov/dads/
I plan for this site to be an on-line reference for the entire computer science community.
I am recruiting people to add terms and definitions or check current definitions. Would you contribute?
Sincerely,
-paul-
--
Paul E. Black (p.black@acm.org)    100 Bureau Drive, Stop 8970
paul.black@nist.gov        Gaithersburg, Maryland 20899-8970
voice: +1 301 975-4794       fax: +1 301 926-3696
Web: http://hissa.ncsl.nist.gov/~black/black.html      KC7PKT

I replied him 12 April 1999:

Dear Dr Black,
I can offer you:
1) A succint definition of the Selection problem (entries Selection problem, select k-th element in your dictionary).
Given an array A of N elements and an the positive integer K <= N, determine the K-th smallest element of A and rearrange the array in such a way that

A[1], A[2], ..., A[K-1] <= A[K] <= A[K+1], ..., A[N]

2) The N. Wirth's implementation of the Hoare's algorithm FIND in Pascal (for the array X of integers).

procedure FIND(K:integer);
  var I,J,L,R,W,X:integer;
begin L:=1; R:=N;
  while L<R do
  begin X:=A[K]; I:=L; J:=R;
    repeat
      while A[I]<X do I:=I+1;
      while X<A[J] do J:=J-1;
      if I<=J then
        begin W:=A[I]; A[I]:=A[J]; A[J]:=W;
              I:=I+1; J:=J-1
        end
    until I>J;
    if J<K then L:=I;
    if K<I then R:=J
  end
end

3) V. Zabrodsky's algorithm MODIFIND (in Pascal), the simple improvement of FIND, which has lesser the number of swaps than FIND.

procedure MODIFIND(K:integer);
  var I,J,L,R,W,X:integer;
begin L:=1; R:=N;
  while L<R do
  begin X:=A[K]; I:=L; J:=R;
    repeat
      while A[I]<X do I:=I+1;
      while X<A[J] do J:=J-1;
      W:=A[I]; A[I]:=A[J]; A[J]:=W;
      I:=I+1; J:=J-1
    until (J<K) or (K<I);
    if J<K then L:=I;
    if K<I then R:=J
  end
end

--------------------------------------------------------------------------
You can use an arbitrary part of my article "Three Algorithms for Finding K-th Smallest of N Elements" as well as.
Regards
Vladimir Zabrodsky

And Dr. Black replied me on the 12th April 1999:

Dear Mr. Zabrodsky;
Thank you for your contribution. I have added the definitions and algorithms.
Sincerely,

-paul-

II. Why MODIFIND? Short History of my Algorithm

I discovered a strange behaviour of the Find algorithm in June 1989. I found the simple modification of the Find algorithm. A first implementation was written in the Rexx language but later I worked only with Pascal's versions. During autumn 1990 I wrote my article Finding K-th Smallest of N Elements. I sent this paper to the Czech computer magazin Bajt. But my paper didn't even printed in winter 1991. I revised, corrected and extended my article and I sent it again with poetic or music title Variation on the classic theme to another Czech computer magazin Elektronika (with permission from Bajt). It was published in No. 6 (June) 1992 on pages 33-34 [5]. But in Bajt, No. 8 1992 on pages 130-131 [6], was published the old version my paper, too! In these papers I called the Hoare's algorithm Find as P1 (the abbreviation from first procedure) and my modification as P2 (I resisted temptation named it QUICKFIND). In my notes (three notebooks A4 every with 40 sheets of paper) I described my algorithm sometimes as Fnd. In WEB's and Rexx's version I used the name Z (and H and FR). But for Dr. Black's dictionary I think Z it isn't a lifelike name for selection algorithm - I wrote MODIFIND in my mail ...

III. Notes on Proof of S(N, K)

The number of swaps made by MODIFIND in average case is:

S(N, K) = 0.5((N + 1)H_N - (N - K)H_{N - K + 1} - (K - 1)H_K)

To do the average-case analysis I used exercises 5.2.22 and following from the D. E. Knuth's book [7]. Is it the same method as for Find? Yes, it is. My explanation follows. See on this example.

In the first step Find pushs aside left all elements lesser or equal 4 (i.e. four elements) and it will find in the second step 3rd smallest element in the subarray 9, 8, 7, 5, 6.
MODIFIND stops work with the number 4, when it know that 4 can't be 7th smallest element. It will find 5th smallest element in the subarray 8, 9, 1, 2, 7, 5, 6. But the swaps the numbers 1 and 2 are unnecessary! This numbers don't change your position during execution of the MODIFIND algorithm. We can suppose (for the average-case analysis, counting swaps) that MODIFIND finds in the second step 3rd smallest element in the subarray 8, 9, 7, 5, 6.
The average number of swaps that will have been made in the first step is:
((K - 1) (N - K) / 2N) + (N - 1) / N
We thus obtain the reccurence relation for the average number of swaps in the complete MODIFIND (inspiration from D. E. Knuth):

IV. Succint definition of selection problem

C. A. R. Hoare wrote in [1]:
The purpose of the program Find is to find that element of an array A[1:N] whose value is fth in order of magnitude; and to rearrange the array in such a way that this element is placed in A[f]; and furthermore, all elements with subscribts lower than f have lesser values, and all elements with subscripts greater than f have greater values. Thus on completion of the program, the following relationship will hold:

A[1], A[2], ... , A[f - 1] <= A[f] <= A[f + 1], ... , A[N]

In [2] R. W. Floyd and R. L. Rivest wrote: The selection problem can be succinctly stated as follows: given a set X of n distinct numbers and an integer i, 1 <= i <= n, determine the ith smallest element of X with as few comparisons as possible. The ith smallest element, denoted by i o X, is that element which is larger than exactly i - 1 other elements, so that 1 o X is the smallest, and n o X the largest, element in X.
In [3] R. W. Floyd and R. L. Rivest wrote: SELECT will rearrange the values of array segment X[L : R] so that X[K] (for some given K; L <= K <= R) will contain the (K - L + 1)-th smallest value, L <= I <= K will imply X[I] <= X[K], and K <= I <= R will imply X[I] >= X[K].
In [4] we can read: The input to our selection routine will be the positive integer N, the array X[1 .. N], and the positive integer K <= N. The program must permute the array so that X[1 .. K - 1] <= X[K] <= X[K + 1 .. N]. At that point, the K^th-smallest element in the set resides in its proper position, X[K].
My article FIND, SELECT, MODIFIND includes the definition: The selection problem can be stated as follows (C. A. R. Hoare in [1]): given an array A of N distinct elements and an integer K, 1 <= K <= N, determine the Kth smallest element of A and rearrange the array in such a way that this element is placed in A[K] and all elements with subscripts lower than K have lesser values and all elements with subscripts greater than K have greater values. Thus on completion of the program, the following relationship will hold:

A[1], A[2], ..., A[K - 1] <= A[K] <= A[K + 1], ..., A[N]

My succint definition of selection problem:

I send to Dr. Black following definition:
Given an array A of N elements and an the positive integer K <= N, determine the K-th smallest element of A and rearrange the array in such a way that

A[1], A[2], ..., A[K - 1] <= A[K] <= A[K + 1], ..., A[N]

In this definition Dr. Black in New CRC Dictionary of Computer Science (Algorithms, Data Structures, and Problems) (term select and partition problem) mistaked the word rearrange with the word partition:
The dictionary says:
select and partition

(classic problem)
Definition: Given an array A of n elements and a positive integer k <= n, find the k^th smallest element of A and partition the array such that A[1], ..., A[k-1] <= A[k] <= A[k+1], ..., A[n].

Literature

1. C. A. R. HOARE: Proof of a Program: FIND
CACM, Vol 14, No. 1, January 1971, pp. 39-45

2. R. W. FLOYD, R. L. RIVEST: Expected Time Bounds for Selection.
CACM, March 1975, Vol. 18, No. 3, pp. 165-172

3. R. W. FLOYD, R. L. RIVEST: Algorithm 489 The Algorithm SELECT - for Finding the i-th Smallest of n Elements [M1]
CACM, March 1975, Vol. 18, No. 3, p. 173

4. J. BENTLEY: Selection (in Programming Pearls)
CACM, November 1985, Vol. 28, No. 11, pp. 1121-1127

5. V. ZABRODSKY: Variace na klasicke tema
Elektronika, No. 6 1992, p. 33-34

6. V. ZABRODSKY: Nalezeni k-teho nejmensiho prvku
Bajt, No. 8 1992, p. 130-131

7. D. E. KNUTH: The Art of Computer Programming. Sorting and Searching
Addison-Wesley, Reading, Mass., 1973