Saturday 16 February 2019

AWS tip: Wildcard characters in S3 lifecycle policy prefixes

A quick word of warning regarding S3's treatment of asterisks (*) in object lifecycle policies. In S3, asterisks are valid 'special' characters and can be used in object key names. This can lead to a lifecycle action not being applied as expected when the prefix contains an asterisk.

Historically, an asterisk is treated as a wildcard that pattern-matches 'any', so you can conveniently match all files for a certain pattern: 'rm *', for example, would delete all files in the current directory. This is NOT how an asterisk behaves in S3 lifecycle prefixes, where the prefix is compared literally against the start of each object key. If you specify a prefix of '*' or '/*', the rule will only be applied to objects whose keys start with those exact characters, not to all objects. The '*' prefix rule would be applied to these objects:
example-bucket/*/key
example-bucket/*yek

But would not be applied to:
example-bucket/object
example-bucket/directory/key
example-bucket/tcejbo*
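
The matching behaviour above can be modelled with a plain string comparison. This is only a minimal Python sketch of the literal prefix rule, not S3's actual implementation; the function name rule_applies is made up for illustration, and the keys are the example objects listed above with the bucket name stripped.

```python
def rule_applies(prefix: str, key: str) -> bool:
    """Model a lifecycle prefix as a literal match on the start of the key.

    The asterisk gets no special treatment: it is just another character.
    """
    return key.startswith(prefix)


# Object keys from the examples above (bucket name removed).
keys = ["*/key", "*yek", "object", "directory/key", "tcejbo*"]

# Only keys that literally begin with '*' are selected by a '*' prefix.
matched = [k for k in keys if rule_applies("*", k)]
print(matched)  # ['*/key', '*yek']

# An empty prefix, by contrast, matches every key.
print(all(rule_applies("", k) for k in keys))  # True
```

In this model, a rule meant to cover the whole bucket corresponds to an empty prefix rather than an asterisk.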

It is not an error to specify an asterisk; the rule is simply not applied to the objects you intended, so you may not even know there is an issue. Fortunately, it is fairly easy to check for this configuration with the CLI. The following bash one-liner iterates through all buckets owned by the caller and checks whether the bucket has any lifecycle rules with an asterisk in their prefix. It prints the bucket name and policy if affected, or a '.' (to show progress) if not.
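
The same kind of check can also be sketched in Python. This is a rough approximation under stated assumptions, not the CLI one-liner described above: it assumes boto3 is installed and credentials are configured, the helper names (rule_prefix, asterisk_rules, scan_all_buckets) are hypothetical, and it assumes rules arrive in the dictionary shape returned by boto3's get_bucket_lifecycle_configuration, where the prefix may sit at the top level, under Filter, or under Filter.And.

```python
def rule_prefix(rule: dict) -> str:
    """Return a rule's prefix from whichever of the three places holds it."""
    filt = rule.get("Filter") or {}
    and_part = filt.get("And") or {}
    return rule.get("Prefix") or filt.get("Prefix") or and_part.get("Prefix") or ""


def asterisk_rules(rules: list) -> list:
    """Keep only the rules whose prefix contains a literal asterisk."""
    return [r for r in rules if "*" in rule_prefix(r)]


def scan_all_buckets() -> None:
    """Print affected buckets and their rules, or '.' for unaffected ones."""
    # Imported here so the helpers above work without boto3 installed.
    import boto3
    from botocore.exceptions import ClientError

    s3 = boto3.client("s3")
    for bucket in s3.list_buckets()["Buckets"]:
        name = bucket["Name"]
        try:
            rules = s3.get_bucket_lifecycle_configuration(Bucket=name)["Rules"]
        except ClientError as err:
            # A bucket with no lifecycle configuration is not a problem.
            if err.response["Error"]["Code"] != "NoSuchLifecycleConfiguration":
                raise
            rules = []
        affected = asterisk_rules(rules)
        if affected:
            print(f"\n{name}: {affected}")
        else:
            print(".", end="", flush=True)
```

Calling scan_all_buckets() runs the check against every bucket the caller owns. The helpers can be tried without touching AWS, for example asterisk_rules([{"ID": "expire", "Filter": {"Prefix": "*"}}]) returns that one rule.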

You can also check the S3 console. It should display 'Whole bucket' under the 'Applied To' column for any lifecycle rules you intended to apply to the entire bucket.