Random sampling

The Algorithm selects K elements at random from an array A. of N elements. For comparison, listed below are two algorithms: Knuth's SAMPLING_K and Bentley's SAMPLING_B.

IMPLEMENTATION
Unit: internal subroutine
 
Global variables: input array A. of arbitrary elements, output array B.
 
Parameters: a positive integer N - number of elements in A., positive integer K - size of the sample
 
Result: Array B. - the sample
 


SAMPLING_K: procedure expose A. B.
parse arg N, K
T = 0; M = 0
do while M < K
  if RANDOM(0, N - T - 1) < K - M
    then do
      M = M + 1; T = T + 1; B.M = A.T
    end
    else T = T + 1
end
return

 

 


SAMPLING_B: procedure expose A. B.
parse arg N, K
NotMember. = 1
S = 0
do while S < K
  T = RANDOM(1, N)
  if NotMember.T
    then do
      S = S + 1; B.S = A.T; NotMember.T = 0
    end
end
return

 

COMPARISON
For N=10000; K=5000 and 1000 trials, the average time SAMPLING_K was about 0.5 seconds and the average time SAMPLING_B was about 0.3 seconds.

 

CONNECTIONS

Literatura
Bentley J. Programming Pearls - A Sample of Brilliance
CACM September 1987 No. 9, p. 754-757
Knuth D. E. Seminumerical Algorithms, vol. 2 of The Art of Computer Programming
Addison-Wesley, 1973


Cover Contents Index Main page Rexx page   Mail

last modified 6th August 2001
Copyright © 2000-2001 Vladimir Zabrodsky
Czech Republic
 

1