AWS S3

Cloud storage

  • Integrates with third-party technologies through REST APIs and SDKs so developers can manage virtually unlimited amounts of data in one place

  • Protects data by supporting SSL transfers and automatic encryption so developers can keep data safe

  • Supports network-optimized, physical disk-based, or third-party connector methods so developers can import or export data with ease

  • Supports object tagging so developers can customize and identify unique categories to properly manage transitions between storage classes

How AWS S3 works

Getting raw analytics data into AWS S3 in a clean format is not easy. You could collect the data with an analytics tool like Mixpanel or Google Analytics, then use that tool's export APIs to ETL the data into S3. This approach takes a good amount of development work, and in the end you are left with data that was designed to power reports in an analytics tool, not the raw data you want for custom analysis.

Get more out of AWS S3 with Segment

When you enable the AWS S3 integration, raw Segment data is copied into your AWS S3 bucket automatically. The data is stored as line-separated JSON objects and follows the Segment Spec, which was designed to be clean and easy to understand. Each object contains the data from a single API call made to Segment. Segment warehouses are powered by an ETL process that uses Segment’s copy of these logs. If a warehouse fits your needs, it will save you considerable time, but if you need your raw Segment data for custom systems or pipelines, AWS S3 is a good option.
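
As a rough illustration of the line-separated JSON format, the Python sketch below parses a small log and counts records by call type. The sample records are invented for this example, and the helper names parse_log and count_by_type are hypothetical. The field names (type, event, userId, timestamp) follow common Segment Spec fields, but real log objects carry more fields than shown here.

```python
import json
from collections import Counter

# Hypothetical sample log: one JSON object per line, each standing in for
# a single API call. Real Segment logs contain more fields than these.
SAMPLE_LOG = "\n".join([
    json.dumps({"type": "track", "event": "Order Completed",
                "userId": "user-1", "timestamp": "2024-01-01T00:00:00Z"}),
    json.dumps({"type": "identify", "userId": "user-1",
                "traits": {"plan": "pro"}, "timestamp": "2024-01-01T00:00:05Z"}),
    "",  # blank lines are skipped by the parser
    json.dumps({"type": "track", "event": "Order Refunded",
                "userId": "user-2", "timestamp": "2024-01-01T00:01:00Z"}),
])


def parse_log(text):
    """Parse line-separated JSON into a list of dicts, skipping blank lines."""
    records = []
    for line_number, line in enumerate(text.split("\n"), start=1):
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError as exc:
            # Report which line failed instead of silently dropping it.
            raise ValueError(f"line {line_number} is not valid JSON") from exc
    return records


def count_by_type(records):
    """Count records by their 'type' field (track, identify, and so on)."""
    return Counter(record.get("type", "unknown") for record in records)


records = parse_log(SAMPLE_LOG)
counts = count_by_type(records)
```

Because each line is an independent JSON object, a log can be processed one line at a time without loading the whole object into memory, which is one reason this format suits custom pipelines.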

To set up AWS S3, configure the bucket (including its region) and an AWS IAM role. Then enable the S3 destination by entering the credentials of those AWS resources on the S3 settings page. Once the destination is enabled, the first sync begins within one hour. Detailed instructions and answers to common questions can be found in the S3 setup guide.

Integrate AWS S3 with Segment

Segment makes it easy to set up AWS S3.