Delete Azure storage table entities with PowerShell

Deleting massive amounts of Azure table entities based on a query can be a time- consuming task. This example Script provides a way to use "Entity Group Transactions" and "Paging" to speed things up...

Overview

In short, the PowerShell function does the following logic:

  • Query the existing entities
  • take a bundle of max. 100 rows to delete them within a batch request
  • if the query returns more than 1000 rows, use pagination to get all rows
  • return the amount of deleted rows

Note: Recovering data takes much more time than deleting... I learned it the hard way. Make sure you have a Backup! and double check your queries !!!

Example

The PowerShell function requires some parameters:

Remove-AzureTableEntities -SASToken "sv=2019-02-02&se=2022-03-10T13%3A38%3A38Z&sp=raud&sig=vCQ4%2Bg0rt5YGGI8bFGv5tW0o5s2KMl31NuXwIGjJ3vo%3D&tn=test" -StorageUri "https://myTest.table.core.windows.net" -TableName "TEST" -Query "PartitionKey eq 'lookup' and Timestamp ge datetime'2020-03-08T13:49:49.000Z' and Timestamp lt datetime'2021-11-30T13:49:49.000Z'"

StorageUri

StorageUri is the URL of your StorageAccount like https://myTest.table.core.windows.net

TableName

TableName specifies the table you want to use

SASToken

SASToken is used to authenticate on the Storage Account. You can use "Azure Storage Explorer" or the Azure Portal to generate a SAS Token. Make sure you grant delete rights on the Azure Table. Example: "sv=2019-02-02&se=2022-03-10T13%3A38%3A38Z&sp=raud&sig=vCQ4%2Bg0rt5YGGI8bFGv5tW0o5s2KMl31NuXwIGjJ3vo%3D&tn=test"

Query

Query is used to filter your data. Check out https://docs.microsoft.com/en-us/rest/api/storageservices/querying-tables-and-entities for more details.
Example: "PartitionKey eq 'part1' and Timestamp ge datetime'2020-03-08T13:49:49.000Z' and Timestamp lt datetime'2021-11-30T13:49:49.000Z'"

Return value

The PowerShell function returns the number of deleted rows.

Query the existing entities

As first step, we need to get the existing items as we need the PartitionKey and RowKey for every item.
Based on the Documentation, we can use the $filter option to return only entities that matches with this filter.
The script does an HTTP GET request on "$($StorageUri)/$($TableName)()?$SASToken&`$filter=$($Query)"

Note: To prevent PowerShell to interpret $filter as a PowerShell variable, you have to escape the dollar sign like `$filter

You have to be aware, that you only get the first 1000 Rows...

Pagination

If your filter affects more than 1000 rows, the HTTP GET request returns a Request-Header with the next PartitionKey and RowKey (x-ms-continuation-NextPartitionKey and x-ms-continuation-NextRowKey).
To get the next 1000 rows, you have to call an HTTP GET request including the NextPartitionKey and NextRowKey:"$($StorageUri)/$($TableName)()?$SASToken&`$filter=$($Query)&NextPartitionKey=$($nextPartkey)&NextRowKey=$($nextRowKey)"

Deleting entities

To delete an entity on an Azure storage table, you have to call an HTTP DELETE request including the PartitionKey and RowKey of the entity to delete.
You will realize that the performance is not very good if you have to delete thousands of entities. That's because Azure table storage has some performance restrictions.

Deleting entities in a batch

By using entity group transactions, you can significant speed up the process as you can delete up to 100 rows in a single HTTP POST request.
The HTTP Post request must contain a body, that includes the activity (e.g. DELETE) plus the URL of the StorageAccount including the name of the table and the partitionkey and rowkey as parameters.
Example of a body with a single DELETE request:

--batch_8360b73e-53ce-4ab8-8de8-3894086bd697
content-type: multipart/mixed; boundary=changeset_7d6bec6f-eced-4cd9-8b40-9f9c528fd991

--changeset_7d6bec6f-eced-4cd9-8b40-9f9c528fd991
Content-Type: application/http
Content-Transfer-Encoding: binary

DELETE $StorageUri/$TableName(PartitionKey='APARTITIONKEY',RowKey='AROWKEY') HTTP/1.1
Accept: application/json;odata=minimalmetadata        
Accept-Encoding: identity
if-match: *

--changeset_7d6bec6f-eced-4cd9-8b40-9f9c528fd991--
--batch_8360b73e-53ce-4ab8-8de8-3894086bd697

Remove-AzureTableEntities

The full Script, without warranty and support...

<#
.SYNOPSIS
Delete Azure Table Entities

.DESCRIPTION
Delete Azure Table Entities based on a Query by using Entity Group Transactions and Paging for mass deletion.

.PARAMETER SASToken
Your SAS Token starting with "sv="

.PARAMETER StorageUri
The URL of your Storage Account (without the name of the table) like: https://myTest.table.core.windows.net

.PARAMETER TableName
The Name of the Table you want to query and delete entities

.PARAMETER Query
The $filter query like "PartitionKey eq 'part1' and Timestamp ge datetime'2020-03-08T13:49:49.000Z' and Timestamp lt datetime'2021-11-30T13:49:49.000Z'"

.EXAMPLE
$deletedEntities = Remove-AzureTableEntities -SASToken "sv=2019-02-02&se=2022-03-10T13%3A38%3A38Z&sp=raud&sig=vCQ4%2Bg0rt5YGGI8bFGv5tW0o5s2KMl31NuXwIGjJ3vo%3D&tn=test" -StorageUri "https://myTest.table.core.windows.net" -TableName "TEST" -Query "PartitionKey eq 'lookup' and Timestamp ge datetime'2020-03-08T13:49:49.000Z' and Timestamp lt datetime'2021-11-30T13:49:49.000Z'"

.NOTES
Created 2022 by Roger Zander -> https://rzander.azurewebsites.net
#>

function Remove-AzureTableEntities {
    [CmdletBinding()]
    param(
        [string] $SASToken,
        [string] $StorageUri,
        [string] $TableName,
        [string] $Query
    )
    
    
    $queryuri = "$($StorageUri)/$($TableName)()?$SASToken&`$filter=$($Query)"
    $batchuri = "$($StorageUri)/`$batch?$($SASToken)"

    $version = "2017-04-17"
    $GMTTime = (Get-Date).ToUniversalTime().toString('R')
    $headersGet = @{
        'x-ms-date'    = $GMTTime
        'x-ms-version' = $version
        'Accept'       = "application/json;odata=fullmetadata"
    }

    $headersBatch = @{
        'x-ms-date'             = $GMTTime
        'x-ms-version'          = $version
        'Content-Type'          = "multipart/mixed; boundary=batch_8360b73e-53ce-4ab8-8de8-3894086bd697"
        'DataServiceVersion'    = "3.0"
        'Accept-Charset'        = "UTF-8"
        'MaxDataServiceVersion' = "3.0;NetFx"
    }

    $body = @"
--batch_8360b73e-53ce-4ab8-8de8-3894086bd697
content-type: multipart/mixed; boundary=changeset_7d6bec6f-eced-4cd9-8b40-9f9c528fd991


"@

    $changeset = @"

--changeset_7d6bec6f-eced-4cd9-8b40-9f9c528fd991
Content-Type: application/http
Content-Transfer-Encoding: binary

DELETE $storageuri/$tableName(PartitionKey='APARTITIONKEY',RowKey='AROWKEY') HTTP/1.1
Accept: application/json;odata=minimalmetadata        
Accept-Encoding: identity
if-match: *

"@

    $i = 0
    $webReq = Invoke-WebRequest -uri $queryuri -Headers $headersGet -Method GET
    $ToDel = $webReq.Content | ConvertFrom-json
    $nextPage = "$($queryuri)&NextPartitionKey=$($webReq.Headers."x-ms-continuation-NextPartitionKey")&NextRowKey=$($webReq.Headers."x-ms-continuation-NextRowKey")"
    $totalitems = $ToDel.value.Count 
    $startblock = 0
    $endblock = $startblock + 99
    #while ($todel.value.Count -gt 0) {
    while ($totalitems -gt 0) {
        while ($endblock -lt ($totalitems + 99)) {
    
            $tempBody = $body
            $ToDel.value[$startblock..$endblock] | ForEach-Object {
                $i++
                $tempBody += $changeset.Replace('AROWKEY', $_.RowKey).Replace('APARTITIONKEY',$_.PartitionKey)
            }

            $tempBody += @"

--changeset_7d6bec6f-eced-4cd9-8b40-9f9c528fd991--

"@
            $tempBody += @"
--batch_8360b73e-53ce-4ab8-8de8-3894086bd697
"@

            Invoke-RestMethod -uri $batchuri -Headers $headersBatch -Method POST -Body $tempBody | Out-Null
        
            $startblock = $endblock + 1
            $endblock = $startblock + 99
        }

        if ($webReq.Headers."x-ms-continuation-NextPartitionKey") {
            $nextPage = "$($queryuri)&NextPartitionKey=$($webReq.Headers."x-ms-continuation-NextPartitionKey")&NextRowKey=$($webReq.Headers."x-ms-continuation-NextRowKey")"
            $webReq = Invoke-WebRequest -uri $nextPage -Headers $headersGet -Method GET
            $ToDel = $webReq.Content | ConvertFrom-json
            $totalitems = $ToDel.value.Count 
            $startblock = 0
            $endblock = $startblock + 99
        }
        else {
            $totalitems = 0
        }
    }

    return $i
}