Recently, I was deploying a Helm chart that requested persistent volumes into an AWS EKS cluster. For the uninitiated: I was getting something running on Kubernetes that needed access to longer-term storage. Except it wasn’t working. Most of the pods deployed fine, but not the ones that needed the volumes. Here we go…

I go digging and see there’s something about waiting for a Persistent Volume Claim (PVC). That means the storage hasn’t become available when requested. Okidoke, we can investigate the PVC and see what’s up.

Warning  ProvisioningFailed   119s                  persistentvolume-controller  (combined from similar events): Failed to provision volume with StorageClass "gp2": UnauthorizedOperation: You are not authorized to perform this operation. Encoded authorization failure message: N24SBXsb8kL_...
           status code: 403, request id: ef5acce1-...

Hm, 403? UnauthorizedOperation? That’s a bit odd. I’ve had PVCs create volumes in the past using the same credentials. What gives? Time to dig into that encoded failure message.

aws sts decode-authorization-message --encoded-message 'N24SBXsb8kL_...' --query DecodedMessage --output text | jq '.'
{
  "allowed": false,
  "explicitDeny": true,
  "matchedStatements": {
    "items": [
      {
        "statementId": "DenyUnencryptedEBSVolumes",
        "effect": "DENY",
...

Ah, that’s getting much more specific. That’s even ringing some bells. Some time ago, I had applied an AWS Organizations Service Control Policy (SCP) to this account that prevented the creation of unencrypted Elastic Block Store (EBS) volumes. In EC2 I had set up encryption by default, but the default storage provisioner in EKS was ignorant of this and was diligently headbutting the wall. I’d also noticed that it was trying to create a volume using the gp2 storage class, which has been superseded by gp3.
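For reference, an SCP of that kind looks roughly like the sketch below. This is not my exact policy: only the statement ID comes from the decoded message above. The rest is an assumption that follows the common pattern of denying ec2:CreateVolume whenever the ec2:Encrypted condition key is false, and the file name is made up for illustration.

```shell
# Write a hypothetical SCP document to a local file.
# Only the Sid is taken from the decoded failure message; the action,
# resource and condition are assumed, not copied from the real policy.
cat > deny-unencrypted-ebs.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyUnencryptedEBSVolumes",
      "Effect": "Deny",
      "Action": "ec2:CreateVolume",
      "Resource": "*",
      "Condition": {
        "Bool": { "ec2:Encrypted": "false" }
      }
    }
  ]
}
EOF
```

A request to create a volume that arrives without encryption requested matches the condition and is explicitly denied, which lines up with the "explicitDeny": true in the decoded output.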

How to unbreak this situation? Enter the EBS Container Storage Interface (CSI) driver. I’ll spare you the nitty-gritty, as I was able to follow the documentation and everything basically just-worked™️. Happy days! A previously established and somewhat forgotten security rule had stopped me from accidentally letting a bad thing happen.
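For anyone who does want a taste of the nitty-gritty, here is a minimal sketch of the kind of StorageClass the CSI driver lets you define, assuming the driver is already installed in the cluster. The class name and file name are hypothetical, and this is not necessarily the exact manifest I ended up with. The provisioner ebs.csi.aws.com and the type and encrypted parameters belong to the EBS CSI driver.

```shell
# Write a hypothetical StorageClass manifest for the EBS CSI driver.
# "gp3-encrypted" is an illustrative name, not one from the original setup.
cat > gp3-encrypted-sc.yaml <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp3-encrypted
provisioner: ebs.csi.aws.com
parameters:
  type: gp3            # use gp3 rather than the older gp2
  encrypted: "true"    # ask for an encrypted volume explicitly
volumeBindingMode: WaitForFirstConsumer
reclaimPolicy: Delete
EOF

# To apply it against a real cluster (needs kubectl and cluster access):
#   kubectl apply -f gp3-encrypted-sc.yaml
```

Requesting encryption explicitly in the StorageClass means the volume request no longer trips the deny rule, and PVCs that name this class get gp3 volumes.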

The system works.