Data Structures in Powershell
A good algorithm is not always the solution to a given problem. In most cases, a mechanism to save the data used by the algorithm is required, in order not only to organise the data but also optimise its execution. Such mechanisms are called Data Structures and you can think of them as small databases. Data Structures not only hold data, but also expose methods to access and update it.
My DataStructures module contains the most frequently used data structures, implemented for Powershell objects in C#. At the time this post is written, the module contains the stack and queue structures, but more are on the way.
Stack
The idea behind the stack data structure is to save objects in order to be processed later in the execution of an algorithm. The below figure shows the way stacks work:
A classic example of algorithms that take advantage of stacks are algorithms that perform some kind of recursive processing. Take the following code, for example, that recursively searches a directory tree using the DFS - Depth First Search method:
function dfs { param ( [string]$Path ) $items = Get-ChildItem -Path $Path -Directory foreach($i in $items) { Write-Output $i.fullname dfs -Path $i.fullname } } dfs -Path C:\top
The dfs function invokes itself for all the directories in the path being processed. Each time the function invokes itself, the current invocation goes into an internal stack that saves all the information so that the execution can resume when the invocation finishes. The second invocation of the function may also invoke itself, adding another state in the stack. As each invocation completes, the latest state saved in the stack is used in order to continue until the function completes.
The commands below show how to use the stack implementation from my module. Note that since the stack is a data type, the module has to be explicitly imported using the Import-Module cmdlet.
The key methods of the stack type are:
- Push: inserts an item
- Pop: removes an item
- Peek: shows the top item without removing it
- Clear: removes all the items from the stack
- Count: gets the number of items in the stack
- ToArray: returns an array object containing the items in the stack
- ToList: returns a list containing the items in the stack
To create a stack, use the New-PSObject cmdlet and to add items use the Push method:
The Peek and Pop methods return the top item in the stack, either keeping in or removing it:
The Count method returns the number of items in the stack and the Clear method removes all items from it:
Queue
The second type included in the module is Queue. Queues operate in a similar manner, with the difference that when an item is removed from the queue, the item to be removed is the last one. This is why queues are also named FIFO - First In First Out.
The below diagram shows how queues operate:
The key methods of the queue type implemented in my module are:
- Enqueue: inserts an item
- Dequeue: removes an item
- Peek: shows the top item without removing it
- Clear: removes all the items from the queue
- Count: gets the number of items in the stack
- ToArray: returns an array object containing the items in the stack
- ToList: returns a list containing the items in the stack
Stacks and Queues are usually based on some form of a pointed list or even array, making the insertion and removal of elements easier and faster.
I hope you find those types useful! Happy coding!