python - Weighted random selection with and without replacement -


recently needed weighted random selection of elements list, both , without replacement. while there known , algorithms unweighted selection, , weighted selection without replacement (such modifications of resevoir algorithm), couldn't find algorithms weighted selection replacement. wanted avoid resevoir method, selecting significant fraction of list, small enough hold in memory.

does have suggestions on best approach in situation? have own solutions, i'm hoping find more efficient, simpler, or both.

one of fastest ways make many replacement samples unchanging list alias method. core intuition can create set of equal-sized bins weighted list can indexed efficiently through bit operations, avoid binary search. turn out that, done correctly, need store 2 items original list per bin, , can represent split single percentage.

let's take example of 5 equally weighted choices, (a:1, b:1, c:1, d:1, e:1)

to create alias lookup:

  1. normalize weights such sum 1.0. (a:0.2 b:0.2 c:0.2 d:0.2 e:0.2) probability of choosing each weight.

  2. find smallest power of 2 greater or equal number of variables, , create number of partitions, |p|. each partition represents probability mass of 1/|p|. in case, create 8 partitions, each able contain 0.125.

  3. take variable least remaining weight, , place of it's mass possible in empty partition. in example, see a fills first partition. (p1{a|null,1.0},p2,p3,p4,p5,p6,p7,p8) (a:0.075, b:0.2 c:0.2 d:0.2 e:0.2)

  4. if partition not filled, take variable weight, , fill partition variable.

repeat steps 3 , 4, until none of weight original partition need assigned list.

for example, if run iteration of 3 , 4, see

(p1{a|null,1.0},p2{a|b,0.6},p3,p4,p5,p6,p7,p8) (a:0, b:0.15 c:0.2 d:0.2 e:0.2) left assigned

at runtime:

  1. get u(0,1) random number, binary 0.001100000

  2. bitshift lg2(p), finding index partition. thus, shift 3, yielding 001.1, or position 1, , partition 2.

  3. if partition split, use decimal portion of shifted random number decide split. in case, value 0.5, , 0.5 < 0.6, return a.

here code , explanation, unfortunately doesn't use bitshifting technique, nor have verified it.


Comments

Popular posts from this blog

javascript - Karma not able to start PhantomJS on Windows - Error: spawn UNKNOWN -

c# - Display ASPX Popup control in RowDeleteing Event (ASPX Gridview) -

Nuget pack csproj using nuspec -