Solving Knapsack with Powershell
One of the most common tasks when designing the migration from a Microsoft Exchange organisation to Exchange Online, is to separate the mailboxes into batches. I recently had to solve a quiz like that where I had to select the right mailboxes to migrate on each day, given the restriction imposed by the network bandwidth available. The way I finally approached it - and trigger for this post - was to apply the knapsack algorithm in order to make the best selection for each migration batch. There are of course many other parameters that you should take into account in cases like that, such as the role of the user and any permission delegations.
A short description of the knapsack problem, for those that have not heard it before. We are given a collection of items that each one has a specific weight and value and a knapsack that has specific weight capacity. The goal is to identify the optimal selection of items to put in the knapsack that will not only keep its total weight under the maximum allowed, but also provide the maximum value. A more detailed description of the Knapsack problem is available on Wikipedia.
To solve problems like Knapsack that may arise in the life of the administrator or engineer, I've decided to convert the C# code I've developed over the years to cmdlets and publish them as Powershell Core module. The name of the module is CPolydorou.AdvancedAlgorithms and it is available on the Powershell Gallery over here. At the time of this post, it only contains the Get-Knapsack cmdlet, but I'm going to update it in the coming weeks.
Let's go through some examples.
Assuming a collection of items that have Name, Value and Weight properties and are created using the below code:
# Create an array $items = New-Object -TypeName System.Collections.ArrayList # Create objects and add them to the array $item1 = New-Object -TypeName PSObject -Property @{Name="Item1"; Value=60; Weight=10} $items.Add($item1) | Out-Null $item2 = New-Object -TypeName PSObject -Property @{Name="Item2"; Value=100; Weight=20} $items.Add($item2) | Out-Null $item3 = New-Object -TypeName PSObject -Property @{Name="Item3"; Value=120; Weight=30} $items.Add($item3) | Out-Null
the optimal selection for a knapsack size of 50 would be items 2 and 3, since their combined weight is exactly 50 and they hold the maximum value of 220.
Executing the Get-Knapsack cmdlet according to the knapsack size limit and using the array of objects as input just like below:
Get-Knapsack -KnapsackSize 50 ` -Items $items ` -WeightPropertyName "Weight" ` -ValuePropertyName "Value"
will prove our guess right:
This algorithm can also be used when there's no value associated with the objects, where we assume that the value is the same as the weight. Executing the cmdlet without specifying a name for the value property will use the items' weight to perform the selection:
Get-Knapsack -KnapsackSize 50 ` -Items $items ` -WeightPropertyName "Weight"
I hope you'll find it useful!