python - Weighted random selection with and without replacement -
recently needed weighted random selection of elements list, both , without replacement. while there known , algorithms unweighted selection, , weighted selection without replacement (such modifications of resevoir algorithm), couldn't find algorithms weighted selection replacement. wanted avoid resevoir method, selecting significant fraction of list, small enough hold in memory.
does have suggestions on best approach in situation? have own solutions, i'm hoping find more efficient, simpler, or both.
one of fastest ways make many replacement samples unchanging list alias method. core intuition can create set of equal-sized bins weighted list can indexed efficiently through bit operations, avoid binary search. turn out that, done correctly, need store 2 items original list per bin, , can represent split single percentage.
let's take example of 5 equally weighted choices, (a:1, b:1, c:1, d:1, e:1)
to create alias lookup:
normalize weights such sum
1.0
.(a:0.2 b:0.2 c:0.2 d:0.2 e:0.2)
probability of choosing each weight.find smallest power of 2 greater or equal number of variables, , create number of partitions,
|p|
. each partition represents probability mass of1/|p|
. in case, create8
partitions, each able contain0.125
.take variable least remaining weight, , place of it's mass possible in empty partition. in example, see
a
fills first partition.(p1{a|null,1.0},p2,p3,p4,p5,p6,p7,p8)
(a:0.075, b:0.2 c:0.2 d:0.2 e:0.2)
if partition not filled, take variable weight, , fill partition variable.
repeat steps 3 , 4, until none of weight original partition need assigned list.
for example, if run iteration of 3 , 4, see
(p1{a|null,1.0},p2{a|b,0.6},p3,p4,p5,p6,p7,p8)
(a:0, b:0.15 c:0.2 d:0.2 e:0.2)
left assigned
at runtime:
get
u(0,1)
random number, binary0.001100000
bitshift
lg2(p)
, finding index partition. thus, shift3
, yielding001.1
, or position 1, , partition 2.if partition split, use decimal portion of shifted random number decide split. in case, value
0.5
, ,0.5 < 0.6
, returna
.
here code , explanation, unfortunately doesn't use bitshifting technique, nor have verified it.
Comments
Post a Comment