Introduction
Earlier this year the S3 team announced that S3 will stop accepting API requests signed using AWS Signature Version 2 (SigV2) after June 24th, 2019. Customers will need to update their SDKs, CLIs, and custom implementations to use AWS Signature Version 4 (SigV4) to avoid impact after this date. It can be difficult to find older applications or instances using outdated versions of the AWS CLI or SDKs, so the purpose of this post is to explain how AWS CloudTrail data events and Amazon Athena can be used to identify applications that may need to be updated. We will cover the setup of the CloudTrail data events, the Athena table creation, and some Athena queries that filter and refine the results.
Update (January/February 2019)
S3 recently added a SignatureVersion item to the AdditionalEventData field of the S3 data events, which significantly simplifies the process of finding clients using SigV2. The SQL queries below have been updated to exclude events with a SigV4 signature (additionaleventdata NOT LIKE '%SigV4%'). You can equally search only for '%SigV2%' and skip the CLI version string munging entirely.
Setting up CloudTrail data events in the AWS console
The first step is to create a trail to capture S3 data
events. This should be done in the region where you plan to run your Athena
queries, in order to avoid unnecessary data transfer charges. In the CloudTrail console
for the region, create a new trail, specifying the trail name. The ‘Apply trail
to all regions’ option should be left as ‘Yes’ unless you plan on running
separate analyses for each region. Given that we are creating a data events
trail, select ‘None’ under the Management Events section and check the “Select
all S3 buckets in your account” checkbox. Finally, select the S3 location where
the CloudTrail data will be written; for simplicity, we will create a new bucket:
Setting up CloudTrail data events using the AWS CLI
If you prefer to create the trail using the AWS CLI then you
can use the create-subscription
command to create the S3 bucket and trail with the correct permissions,
updating it to be a global trail and then adding the S3 data event
configuration:
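A sketch of those CLI steps might look like the following (the trail and bucket names are placeholders; adjust them for your account):

```shell
# Create the S3 bucket and the trail, with the required bucket policy.
aws cloudtrail create-subscription --name sigv2-trail --s3-new-bucket my-sigv2-trail-bucket

# Make the trail global so that events from all regions are captured.
aws cloudtrail update-trail --name sigv2-trail --is-multi-region-trail

# Record data events for all S3 buckets in the account,
# excluding management events.
aws cloudtrail put-event-selectors --trail-name sigv2-trail \
    --event-selectors '[{"ReadWriteType": "All",
                         "IncludeManagementEvents": false,
                         "DataResources": [{"Type": "AWS::S3::Object",
                                            "Values": ["arn:aws:s3"]}]}]'
```

The `arn:aws:s3` value in the event selector matches every bucket in the account, which is equivalent to the “Select all S3 buckets in your account” checkbox in the console.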
A word on cost
Once the trail has been created, CloudTrail will start
recording S3 data events and delivering them to the configured S3 bucket. Data
events are currently priced at $0.10 per 100,000 events with the storage costs
being the standard S3 data storage charges for the (compressed) events; see the CloudTrail pricing page for additional details. It is recommended that you disable the data event trail once you are satisfied that you have gathered sufficient
request data; it can be re-enabled if further analysis is required at a later stage.
Creating the Athena table
The CloudTrail team simplified the process for using Athena
to analyse CloudTrail logs by adding
a feature to allow customers to create an Athena table directly from
the CloudTrail console event history page by simply clicking on the ‘Run
advanced queries in Amazon Athena’ link and selecting the corresponding S3
CloudTrail bucket:
An explanation of how to create the Athena table manually
can be found in the Athena
CloudTrail documentation.
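As a sketch, a manually created table might look like the following. The column list here is trimmed to just the fields used by the queries in this post, and the bucket and account in the LOCATION are placeholders; the full DDL is in the Athena CloudTrail documentation:

```sql
-- Minimal CloudTrail table for this analysis; replace the LOCATION
-- with the S3 path of your data event trail.
CREATE EXTERNAL TABLE cloudtrail_logs (
    useridentity STRUCT<
        type: STRING,
        principalid: STRING,
        arn: STRING,
        accountid: STRING>,
    eventtime STRING,
    eventsource STRING,
    eventname STRING,
    awsregion STRING,
    sourceipaddress STRING,
    useragent STRING,
    additionaleventdata STRING
)
ROW FORMAT SERDE 'com.amazon.emr.hive.serde.CloudTrailSerde'
STORED AS INPUTFORMAT 'com.amazon.emr.cloudtrail.CloudTrailInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://<your-cloudtrail-bucket>/AWSLogs/<account-id>/CloudTrail/';
```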
Analysing the data events with Athena
We now have all the components needed to begin searching for
clients that may need to be updated. We start with a basic query that filters
out most of the AWS-originated requests (for example the AWS Console, CloudTrail, Athena, Storage
Gateway, and CloudFront):
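A sketch of such a query (the table name and the exact user agent exclusion list are assumptions; the SigV4 exclusion relies on the SignatureVersion item described in the update above):

```sql
-- Count non-SigV4 S3 data-event requests, excluding requests made by
-- AWS services and the AWS Console. Extend the exclusion list as
-- needed for your account.
SELECT useragent, count(*) AS requests
FROM cloudtrail_logs
WHERE eventsource = 's3.amazonaws.com'
  AND additionaleventdata NOT LIKE '%SigV4%'
  AND useragent NOT LIKE '%Console%'
  AND useragent NOT LIKE '%CloudTrail%'
  AND useragent NOT LIKE '%Athena%'
  AND useragent NOT LIKE '%Storage Gateway%'
  AND useragent NOT LIKE '%CloudFront%'
GROUP BY useragent
ORDER BY requests DESC;
```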
These results should mostly be client API/CLI requests, but the
result set is still large; it can be refined further by only including regions that
actually support AWS Signature Version 2. From the region
and endpoint documentation for S3 we can see that we only need to
check eight of the regions. We can safely exclude the AWS Signature Version 4
(SigV4) regions, as clients would not work correctly against these regions if
they did not already have SigV4 support. Let’s also look at distinct user
agents and extract the version from the user agent string:
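A sketch of this refinement (table name assumed; the regions listed are the eight S3 regions that still accepted SigV2 at the time of writing):

```sql
-- Distinct user agents seen in the SigV2-capable regions, with the
-- version portion of the user agent string extracted for inspection.
SELECT DISTINCT useragent,
       regexp_extract(useragent, '/([0-9]+\.[0-9]+\.[0-9]+)', 1) AS version
FROM cloudtrail_logs
WHERE eventsource = 's3.amazonaws.com'
  AND additionaleventdata NOT LIKE '%SigV4%'
  AND awsregion IN ('us-east-1', 'us-west-1', 'us-west-2',
                    'eu-west-1', 'ap-southeast-1', 'ap-southeast-2',
                    'ap-northeast-1', 'sa-east-1');
```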
We are unfortunately not able to filter directly on the calculated
‘version’ column, and as it is a string it is also difficult to perform a direct
numerical version comparison. We can, however, use some arithmetic to create a version
number that can be compared. Using the AWS CLI requests as an example for the moment, and
adding back the source IP address and user identity:
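A sketch of this query (table name assumed; the subquery is needed because the WHERE clause cannot reference the calculated column directly):

```sql
-- Turn the aws-cli version string into a comparable integer
-- (major*10^7 + minor*10^4 + patch) and keep only requests made by
-- versions older than 1.11.108, the first to default to SigV4.
SELECT *
FROM (
    SELECT useridentity.arn,
           sourceipaddress,
           useragent,
           CAST(regexp_extract(useragent, 'aws-cli/([0-9]+)', 1) AS integer) * 10000000
         + CAST(regexp_extract(useragent, 'aws-cli/[0-9]+\.([0-9]+)', 1) AS integer) * 10000
         + CAST(regexp_extract(useragent, 'aws-cli/[0-9]+\.[0-9]+\.([0-9]+)', 1) AS integer) AS version_number
    FROM cloudtrail_logs
    WHERE eventsource = 's3.amazonaws.com'
      AND additionaleventdata NOT LIKE '%SigV4%'
      AND useragent LIKE 'aws-cli%'
) AS t
WHERE version_number < 10110108;
```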
The version comparison number (10110108) translates to the
version string 1.11.108, which is the first version of the AWS CLI to use SigV4
by default. This results in a list of clients accessing S3 objects in this
account using a version of the AWS CLI that needs to be updated:
The same query can be applied to all the AWS CLI and SDK
user agent strings by substituting the corresponding agent string and the version
comparator for the first SDK version that uses SigV4 by default:
AWS Client | SigV4 default version | User Agent String | Version comparator
--- | --- | --- | ---
Java | 1.11.x | aws-sdk-java | 10110000
.NET | 3.1.10.0 | aws-sdk-dotnet | 30010010
Node.js | 2.68.0 | aws-sdk-nodejs | 20680000
PHP | 3 | aws-sdk-php | 30000000
Python Botocore | 1.5.71 | Botocore | 10050071
Python Boto3 | 1.4.6 | Boto3 | 10040006
Ruby | 2.2.0 | aws-sdk-ruby | 20020000
AWS CLI | 1.11.108 | aws-cli | 10110108
Powershell | 3.1.10.0 | AWSPowerShell | 30010010
Note:
For .NET this applies to the .NET35, .NET45, and CoreCLR platforms only; the PCL, Xamarin, and UWP platforms do not support SigV4 at all.
All versions of the Go and C++ SDKs support SigV4 by default.
Additional note:
There is no need to look at the client version number for new events, which automatically include the SignatureVersion item.
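The comparator values in the table above follow a simple pattern: a version string a.b.c maps to a×10^7 + b×10^4 + c, so versions compare correctly as plain integers. As a sketch (not part of the original post), the same arithmetic in Python:

```python
def version_comparator(version: str) -> int:
    """Convert a dotted version string to a comparable integer
    (major*10**7 + minor*10**4 + patch), ignoring any components
    beyond the third and padding missing ones with zero."""
    parts = [int(p) for p in version.split(".")[:3]]
    parts += [0] * (3 - len(parts))          # pad e.g. "3" -> 3.0.0
    major, minor, patch = parts
    return major * 10**7 + minor * 10**4 + patch

# Examples matching the table above:
assert version_comparator("1.11.108") == 10110108   # AWS CLI
assert version_comparator("2.68.0") == 20680000     # Node.js
assert version_comparator("3") == 30000000          # PHP
```

This assumes minor stays below 1000 and patch below 10000, which holds for all the SDK versions in the table.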
Tracing the source of the requests
The source IP address will reflect the private IP of the EC2
instance accessing S3 through a VPC endpoint, or the public IP if accessing S3
directly. You can search for either of these IPs in the EC2 console for the
corresponding region. For non-EC2 access, or access through a NAT gateway, you should be able to use the
user identity ARN to track down the source of the requests.
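The IP lookup can also be done from the CLI; for example (region and IP addresses are placeholders):

```shell
# Find the EC2 instance that owns a given private IP address.
aws ec2 describe-instances --region us-east-1 \
    --filters Name=private-ip-address,Values=10.0.1.23 \
    --query 'Reservations[].Instances[].InstanceId'

# For public source IPs, filter on ip-address instead.
aws ec2 describe-instances --region us-east-1 \
    --filters Name=ip-address,Values=203.0.113.5
```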