Random sampling
The Algorithm selects K elements at random from an array A. of N elements. For comparison, listed below are two algorithms: Knuth's SAMPLING_K and Bentley's SAMPLING_B.
IMPLEMENTATION
Unit: internal subroutine
Global variables: input array A. of arbitrary elements, output array B.
Parameters: a positive integer N - number of elements in A., positive integer K - size of the sample
Result: Array B. - the sample
SAMPLING_K: procedure expose A. B.
parse arg N, K
T = 0; M = 0
do while M < K
if RANDOM(0, N - T - 1) < K - M
then do
M = M + 1; T = T + 1; B.M = A.T
end
else T = T + 1
end
return
|
SAMPLING_B: procedure expose A. B.
parse arg N, K
NotMember. = 1
S = 0
do while S < K
T = RANDOM(1, N)
if NotMember.T
then do
S = S + 1; B.S = A.T; NotMember.T = 0
end
end
return
|
COMPARISON
For N=10000; K=5000 and 1000 trials, the average time SAMPLING_K was about 0.5 seconds and the average time SAMPLING_B was about 0.3 seconds.
CONNECTIONS
Literatura
Bentley J. Programming Pearls - A Sample of Brilliance
CACM September 1987 No. 9, p. 754-757
Knuth D. E. Seminumerical Algorithms, vol. 2 of The Art of Computer Programming
Addison-Wesley, 1973