How do I upload larger Alerts data via API?

Introduction

Larger alerts (>2MB) can be sent to Hummingbird via the Alerts Upload API.

The Alerts Upload API uses a four-step client flow:

  1. Client requests an upload URL by calling the POST /alert_uploads endpoint.

  2. Client uploads the encrypted alerts data to the provided URL.

  3. Client notifies Hummingbird that the upload has been completed by calling the PUT /alert_uploads/{token}/complete endpoint.

  4. Client polls for the outcome of the upload by calling the GET /alert_uploads/{token} endpoint.

Technical Implementation

POST /alert_uploads

This is the first API endpoint called to begin the Alerts Upload process.

Authentication for this endpoint is the same as for the rest of the Hummingbird API.

The request body should be an empty JSON object.

The response is a JSON object in the following format:

{
 "success": true,
 "alert_upload": {
   "token": "ZxiJDeAVWmDUaa7iDJLGArCJ",
   "upload": {
     "method": "post",
     "url": "https://s3.us-west-2.amazonaws.com/hummingbird....",
     "fields": {
       ...
     }
   },
   "encryption": {
     "version": 1,
     "public_key": {
       "id": "FVczGR6DDoKYHSY2LpiG6XGQ",
       "key": "-----BEGIN PUBLIC KEY-----..."
     }
   }
 }
}
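As a minimal sketch in Python, the values needed for the later steps can be pulled out of this response as shown here. UploadTicket and parse_create_response are hypothetical names introduced for illustration; only the JSON keys are taken from the response format above.

```python
import json
from dataclasses import dataclass


@dataclass
class UploadTicket:
    """Values from the POST /alert_uploads response that later steps need."""
    token: str
    upload_url: str
    upload_fields: dict
    encryption_version: int
    public_key_id: str
    public_key_pem: str


def parse_create_response(body: str) -> UploadTicket:
    """Parse the JSON response body of POST /alert_uploads (hypothetical helper)."""
    doc = json.loads(body)
    if not doc.get("success"):
        raise ValueError("POST /alert_uploads did not report success")
    upload = doc["alert_upload"]
    return UploadTicket(
        token=upload["token"],
        upload_url=upload["upload"]["url"],
        # Every entry here is sent back as a form field in the S3 upload step.
        upload_fields=dict(upload["upload"]["fields"]),
        encryption_version=upload["encryption"]["version"],
        public_key_id=upload["encryption"]["public_key"]["id"],
        public_key_pem=upload["encryption"]["public_key"]["key"],
    )
```

Checking the encryption version before encrypting lets a client fail early if it receives a scheme it does not implement.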

Client-Side Encryption

The JSON body to be uploaded is the same as a request body sent to the synchronous POST /alerts endpoint. The payload must be encrypted client-side before upload using the public key returned from the initial API call and uploaded to the URI provided in that same response.

Encrypted Format

The encrypted content is formatted as follows:

<JSON header>\0<encrypted payload>

A JSON-formatted encryption metadata header is followed by a NULL byte which is followed by the encrypted AlertRequest payload in binary.

The format for the JSON encryption metadata varies depending on the version of encryption scheme used, and is described in more detail below. At a minimum, the header will contain a “version” key.
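The framing described above can be sketched in Python as follows. This is an illustration only: frame_envelope and split_envelope are hypothetical helper names, and the encrypted payload is assumed to have been produced already by a separate encryption step.

```python
import json
from typing import Tuple


def frame_envelope(header: dict, encrypted_payload: bytes) -> bytes:
    """Return <JSON header>\\0<encrypted payload> as a single byte string."""
    # json.dumps escapes control characters, so the serialized header
    # never contains a raw NULL byte itself.
    header_bytes = json.dumps(header).encode("utf-8")
    return header_bytes + b"\0" + encrypted_payload


def split_envelope(envelope: bytes) -> Tuple[dict, bytes]:
    """Inverse of frame_envelope; handy for checking output locally."""
    # Split on the FIRST NULL byte only: the binary payload may contain more.
    header_bytes, separator, payload = envelope.partition(b"\0")
    if not separator:
        raise ValueError("envelope has no NULL separator")
    return json.loads(header_bytes), payload
```

Splitting on the first NULL byte is safe because the header is JSON text, while the binary payload after it may contain any byte value.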

Version 1

This encryption scheme implements a simple sealed envelope using RSA-4096 (PKCS1) for key wrapping and AES-256-GCM for data encryption. A random session AES key and initialization vector are generated for each message.

The RSA public key used for key wrapping is provided in the API response in the first step for convenience. This is a long-lived key, and may also be shared or verified out-of-band, or hard-coded in the client depending on your desired security posture.

The public key id is a unique identifier for the public key. It is used during decryption to locate the appropriate private key and facilitate key rotation.

The format for the version 1 encryption metadata header is:

{
 "version": 1,
 "key_id": "hb-123",
 "encrypted_session_key": "base64(aes session key)",
 "iv": "base64(iv)",
 "auth_tag": "base64(auth_tag)"
}
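A minimal Python sketch of assembling this header from raw bytes is shown here. It assumes standard (not URL-safe) base64, assumes that encrypted_session_key holds the RSA-wrapped AES session key, and assumes that key_id is the public key id returned by the first API call. build_v1_header is a hypothetical name, and the encryption itself is left to a cryptography library.

```python
import base64


def build_v1_header(key_id: str, wrapped_session_key: bytes,
                    iv: bytes, auth_tag: bytes) -> dict:
    """Build the version 1 metadata header from already-computed raw bytes.

    wrapped_session_key: the AES session key after RSA key wrapping.
    iv, auth_tag: the AES-256-GCM initialization vector and authentication tag.
    """
    def b64(raw: bytes) -> str:
        # Assumption: standard base64 alphabet with padding.
        return base64.b64encode(raw).decode("ascii")

    return {
        "version": 1,
        "key_id": key_id,
        "encrypted_session_key": b64(wrapped_session_key),
        "iv": b64(iv),
        "auth_tag": b64(auth_tag),
    }
```

The resulting dictionary is what gets serialized as the JSON header in front of the NULL byte and the encrypted payload.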

Example code that implements the version 1 encryption scheme in Ruby, using OpenSSL, is available as a GitHub Gist (listed under Resources below). Its encrypt() method outputs the full content to be uploaded using the pre-signed URL.

Upload to S3

After the content is encrypted, upload it to S3 with an HTTP POST, using the pre-signed URL and fields obtained from the first API call. See the AWS documentation for more details.

Total upload size is limited to 10MB in order to protect the user experience for your investigators who ultimately may have to manually review all of the uploaded data. Please reach out to our customer success team if this limit is too restrictive.

Pre-signed URLs are valid for only 5 minutes and should be used immediately after retrieval.

Tip: Ensure that the Content-Type is multipart/form-data, and that the form data includes everything provided in ‘upload.fields’, in addition to the encrypted envelope as ‘file’.

An example curl request to upload data to S3:

curl --location --request POST 'https://s3.us-west-2.amazonaws.com/hummingbird.XXX' \
--form 'key="alert_uploads/XX"' \
--form 'acl="private"' \
--form 'x-amz-meta-hb-organization-token="XXX"' \
--form 'x-amz-meta-hb-key-id="XXX"' \
--form 'policy="XXX"' \
--form 'x-amz-credential="XXX"' \
--form 'x-amz-algorithm="XXX"' \
--form 'x-amz-date="XXX"' \
--form 'x-amz-security-token="XXX"' \
--form 'x-amz-signature="XXX"' \
--form 'file=@"/tmp/encrypted_envelope"'
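The same upload can be sketched in Python using only the standard library. This is an illustrative sketch, not a reference client: build_s3_post_body and upload_envelope are hypothetical names, and the fields argument is assumed to be the 'upload.fields' object from the first API call. The file part is written last because S3 ignores form fields that appear after it.

```python
import urllib.request
import uuid
from typing import Dict, Tuple


def build_s3_post_body(fields: Dict[str, str], envelope: bytes) -> Tuple[bytes, str]:
    """Encode the form fields plus the encrypted envelope as multipart/form-data.

    Returns (body, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode("utf-8")
        )
    # The encrypted envelope goes in the 'file' part, after all other fields.
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="encrypted_envelope"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode("utf-8")
        + envelope + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def upload_envelope(url: str, fields: Dict[str, str], envelope: bytes) -> int:
    """POST the envelope to the pre-signed URL and return the HTTP status."""
    body, content_type = build_s3_post_body(fields, envelope)
    request = urllib.request.Request(
        url, data=body, method="POST", headers={"Content-Type": content_type}
    )
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urllib.request.urlopen(request) as response:
        return response.status
```

Passing the fields through unchanged, rather than hard-coding their names, keeps the client working if the set of fields returned by the API changes.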

PUT /alert_uploads/{token}/complete

Notify Hummingbird that the upload has been completed. Send an empty JSON object {}. Use the token obtained from the response to the initial POST /alert_uploads request.

Hummingbird will synchronously verify that the uploaded file exists in S3 and enqueue it for further processing. No other verification or processing will occur as a result of this call.

The response will be in the same format as the response to GET /alert_uploads/{token} described next.
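A minimal Python sketch of the completion call is shown here. The base URL and the authentication headers are assumptions supplied by the caller, since they depend on your Hummingbird environment and credentials, and build_complete_request is a hypothetical name.

```python
import urllib.parse
import urllib.request
from typing import Dict


def build_complete_request(base_url: str, token: str,
                           auth_headers: Dict[str, str]) -> urllib.request.Request:
    """Build the PUT /alert_uploads/{token}/complete request.

    base_url and auth_headers are caller-supplied assumptions, not values
    defined by this article.
    """
    safe_token = urllib.parse.quote(token, safe="")
    url = f"{base_url.rstrip('/')}/alert_uploads/{safe_token}/complete"
    headers = {"Content-Type": "application/json", **auth_headers}
    # The request body is an empty JSON object.
    return urllib.request.Request(url, data=b"{}", method="PUT", headers=headers)
```

The returned request can be sent with urllib.request.urlopen, and the response body parsed in the same way as the polling response.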

GET /alert_uploads/{token}

This endpoint can be polled for the results of the upload.

Alerts are processed asynchronously, and occasional bursts of high volume can temporarily lengthen queuing times. Keep this in mind when choosing how often to poll this endpoint. We generally aim to process alerts within 24 hours, though most alerts are processed within an hour.

This endpoint returns the overall status of the upload, an error message if one occurred, and an array of AlertFetchResponse objects, the same response typically provided by the GET /alerts/{token} endpoint on a per-alert basis. The array of alerts will remain empty until all alerts in the upload have been processed and the status is either PROCESSED or an error status (PROCESSING_ERROR or UPLOAD_ERROR).

The response format is as follows:

{
 "success": true,
 "alert_upload": {
   "token": "Hn4JsTGbMTy2nmJEDokwZArf",
   "updated_at": "2021-11-12T19:11:37.889Z",
   "created_at": "2021-11-12T19:11:36.948Z",
   "alert_processing_started_at": null,
   "failed_at": null,
   "error_message": null,
   "status": "PENDING|PROCESSING|PROCESSED|PROCESSING_ERROR|UPLOAD_ERROR",
   "alerts": [AlertFetchResponse, ... ]
 }
}
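A polling loop with exponential backoff might look like the following Python sketch. It assumes that PROCESSED, PROCESSING_ERROR and UPLOAD_ERROR are the terminal statuses, which is inferred from the status list above rather than stated by it. poll_until_done and its parameters are hypothetical, and fetch stands for any caller-supplied function that performs GET /alert_uploads/{token} and returns the parsed JSON.

```python
import time
from typing import Callable

# Assumption: these statuses mean no further change is expected.
TERMINAL_STATUSES = {"PROCESSED", "PROCESSING_ERROR", "UPLOAD_ERROR"}


def poll_until_done(fetch: Callable[[], dict],
                    initial_delay: float = 5.0,
                    max_delay: float = 300.0,
                    max_attempts: int = 50,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch() until the upload reaches a terminal status.

    Returns the 'alert_upload' object from the final response.
    """
    delay = initial_delay
    for _ in range(max_attempts):
        upload = fetch()["alert_upload"]
        if upload["status"] in TERMINAL_STATUSES:
            return upload
        sleep(delay)
        # Double the wait each time, up to max_delay, to avoid hammering
        # the endpoint while the upload is queued.
        delay = min(delay * 2, max_delay)
    raise TimeoutError("upload did not reach a terminal status in time")
```

Because processing can take up to 24 hours, a long-running client may prefer a larger max_delay and to persist the token so that polling can resume after a restart.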

Frequently Asked Questions

How can I test these APIs?

We recommend testing in your Sandbox environment. Remember that Sandbox is only designed for use with fake data, and no PII should be sent.

What is the lifespan of the pre-signed URLs?

5 minutes

Is the RSA public key the same for all customers?

It is not. A separate 4096-bit RSA key is generated for each customer.

How is the S3 bucket secured?

Data is encrypted with S3 server-side encryption in addition to the client-side encryption described above. All objects are ‘private’ and versioned by default. S3 bucket configurations are regularly inspected as part of security audits and penetration testing.

Is there rate limiting on the polling endpoint?

Not at this time.

How long does alert processing take?

We process alerts asynchronously, so processing times vary. We aim to process all alerts within 24 hours, though most alerts are processed within an hour.

Resources

Hummingbird API documentation

End-to-End Example in Python

Example Encryption in Ruby

Example Encryption in Golang
