Snowflake is a data warehouse built for the cloud. Snowflake delivers performance, simplicity, concurrency and affordability.
This document was last updated on January 29, 2018. If you notice any gaps, out-dated information or simply want to leave some feedback to help us improve our documentation, please let us know!
There are six steps to get started using Snowflake with Segment. Please make sure that you are running the commands in each step while logged in as an
SYSADMIN. While we are using predefined user (
SEGMENT_USER), role (
SEGMENT), warehouse (
SEGMENT_WAREHOUSE) and database (
SEGMENT_EVENTS) names, you can use any names you like.
- Create Virtual Warehouse
- Create Database
- Create Role for Segment
- Create User for Segment
- Test the User and Credentials
- Connect Snowflake to Segment
Create Virtual Warehouse
The Segment Snowflake destination requires a Snowflake virtual warehouse to load data in to. To avoid conflicts with other regular operations in your cluster, we recommend creating a new warehouse just for Segment loads, but this is not mandatory. An X-Small warehouse works for most customers when starting.
CREATE WAREHOUSE "SEGMENT_WAREHOUSE" WITH WAREHOUSE_SIZE = 'XSMALL' WAREHOUSE_TYPE = 'STANDARD' AUTO_SUSPEND = 600 AUTO_RESUME = TRUE;
AUTO_SUSPEND is set to ~10 minutes in the UI (or 600 if using SQL) and
AUTO_RESUME is enabled, to avoid extra costs.
The Segment Snowflake destination creates its own schemas and tables, so it’s recommended to create a new database for this purpose to avoid name conflicts with existing data.
CREATE DATABASE "SEGMENT_EVENTS";
Create Role for Segment
You need to run these commands rather than creating a role with the “Create Role” dialog in the UI.
This role will be attached to Segment’s user and it gives just enough permissions for loading data in your database. We recommend not reusing this role for other operations.
Create a new role:
CREATE ROLE "SEGMENT";
Grant access to the virtual warehouse:
GRANT USAGE ON WAREHOUSE "SEGMENT_WAREHOUSE" TO ROLE "SEGMENT";
Grant access to the database:
GRANT USAGE ON DATABASE "SEGMENT_EVENTS" TO ROLE "SEGMENT"; GRANT CREATE SCHEMA ON DATABASE "SEGMENT_EVENTS" TO ROLE "SEGMENT";
Create User for Segment
Finally, you need to create the user that will be connected to Segment. Be sure to use a strong, unique password.
CREATE USER "SEGMENT_USER" MUST_CHANGE_PASSWORD = FALSE DEFAULT_ROLE = "SEGMENT" PASSWORD = "my_strong_password"; -- Please do not use this password GRANT ROLE "SEGMENT" TO USER "SEGMENT_USER";
Test the User and Credentials
You should now test the new user and credentials. If the following commands are run successfully, you are will then be ready to connect Snowflake to Segment. We recommend using snowsql to run the following:
~$ snowsql --accountname myb10 --username SEGMENT_USER Password: * SnowSQL * v1.1.46 Type SQL statements or !help SEGMENT_USER#(no warehouse)@(no database).(no schema)>SELECT 1; +---+ | 1 | |---| | 1 | +---+ 1 Row(s) produced. Time Elapsed: 0.093s SEGMENT_USER#(no warehouse)@(no database).(no schema)>USE WAREHOUSE "SEGMENT_WAREHOUSE"; +----------------------------------+ | status | |----------------------------------| | Statement executed successfully. | +----------------------------------+ 1 Row(s) produced. Time Elapsed: 0.118s SEGMENT_USER#SEGMENT_WAREHOUSE@(no database).(no schema)>USE DATABASE "SEGMENT_EVENTS"; +----------------------------------+ | status | |----------------------------------| | Statement executed successfully. | +----------------------------------+ 1 Row(s) produced. Time Elapsed: 0.130s SEGMENT_USER#SEGMENT_WAREHOUSE@SEGMENT_EVENTS.(no schema)>!exit
If you would like to use the web interface, please switch to the new role for the Segment user, create a new Worksheet and execute:
SELECT 1; USE WAREHOUSE "SEGMENT_WAREHOUSE"; USE DATABASE "SEGMENT_EVENTS";
Connect Snowflake to Segment
After creating a Snowflake warehouse, the next step is to connect Segment.
- In the Segment App, select Add Destination.
- Search for and select “Snowflake”.
- Add your credentials as follows:
- User - The user name (as created above).
- Password - The password for the user.
- Account - The account id of your cluster, not the url (e.g. url: my-business.snowflakecomputing.com, account-id: myb10)
- Database - The database name (as created above).
- Warehouse - The warehouse name (as created above).
If you create a network policy with Snowflake, add the following IP address to the “Allowed IP Addresses” list:
Multi-Factor Authentication (MFA) & SSO
At this time, the Segment Snowflake destination is not compatible with Snowflake’s MFA or SSO settings. If your connected user has MFA or SSO enabled, you will need to disable it for syncs to run correctly.
AUTO_SUSPEND to ~10 minutes in the UI (or 600 if using SQL) to avoid credit consumption by the Segment syncing process.
If you enable the
AUTO_SUSPEND feature, we recommend that you also enable
AUTO-RESUME. This will ensure that your Snowflake warehouse automatically resumes when Segment loads data. Otherwise, Segment will not be able to load data unless you manually resume your Snowflake warehouse.
Unique Warehouse, Database, and Role
We strongly recommend creating a unique Warehouse, Database and Role for the Segment Snowflake connection to your Snowflake instance.
I get “Object does not exist” when running “USE DATABASE” or “USE WAREHOUSE”, even if the warehouse or the database are created.
Make sure you have created the role and assigned the proper permissions with the account
ACCOUNTADMIN. Other non-system accounts don’t assign the right permissions.
I’ve consumed all the credits after the initial sync.
If you have used all your credits, you will need to contact Snowflake to purchase more.
Also make sure
AUTO_SUSPEND is enabled and set to 5 or 10 minutes in the warehouse used by Segment. This setting will help avoid unintended use of credits by the Segment Snowflake destination.
My syncs are going slower than I expect.
This complaint is most often due to not using a separate Warehouse specifically for Segment.
If you’re already doing so, see this section of the Snowflake docs for more details on how to handle slow running processes.
What size should I start with when creating a new Snowflake instance?
Most customers have the best luck starting with a X-Small instance.
Why do I see so many ‘Rollback’ statements?
rollback is issued at the end of each session to make sure there’s no “in-flight” processes hanging out that could block other processes later.
Does Segment use transactions for loading data?
We don’t open transactions explicitly - that would lock resources. However, if autocommit is on, each statement functions as it’s own transaction, and a silent commit is issued after each.
### Indexes Queuing - can use a different WH for Segment or use the recommendations from the Snowflake docs.
If you have any questions, or see anywhere we can improve our documentation, please let us know!