How do I use CloudCasa with Wasabi?
Snowflake External Tables have been verified to use with Wasabi.
For more detail on configuring Wasabi storage, refer to Snowflake’s documentation on Working with Amazon S3-compatible storage.
Prerequisites
Active Wasabi Storage account
A dataset supported by Snowflake uploaded to your Wasabi Bucket
Active Snowflake account
An understanding of Data Lake / Data Warehouse terminology and basic usage of Snowflake
At this time, Snowflake must authorize using external s3 urls. You must open a support case with Snowflake ahead of time and ask that the S3 URL for the region your bucket lives can be authorized for your account. For more information on the Wasabi S3 URLs, review Service URLs for Wasabi's Storage Regions.
Uploading Data to your Wasabi Bucket
To use Snowflake with Wasabi, you must have a dataset you wish to read from Wasabi. In the example below, we are using sample data generated with Faker. The sample data consists of 1000 files with 1000 rows of fake customer names, addresses, emails, phone numbers, companies, and so on.
The data resides in the "snowflake-sample-test " bucket in the "us-east-2" region and a folder of "db1". The path to the data will be "//snowflake-sample-test/db1". You should adjust this accordingly when following along.
Creating a SQL Worksheet
After logging into your Snowflake account, create a new SQL Worksheet. This will be your scratchpad for working with the data living in Wasabi.
Creating a Snowflake Database
After you have the Worksheet open, create a Database to hold the information about the External Table and any related data.
To do so, run the following two commands. This example uses a database called "MY_TEST_WASABI_DATABASE". Be sure to adjust this accordingly to your use case.
The first command will create the database. The second command will select it to be used for further queries in the Worksheet. After copying the commands in, select the first statement and click Run in the top right hand corner (the blue button with an arrow on it).
CREATE DATABASE MY_TEST_WASABI_DATABASE;
USE MY_TEST_WASABI_DATABASE;
Creating a Snowflake Stage Object
A stage object must be created on Snowflake to tell where your data is being held or "staged". This is where we create the reference to a Wasabi bucket with the region and credentials.
Be sure to update the name of your stage, "wasabi_east2" in this example, the URL with your bucket name/path, region, and access keys.
This configuration example discusses the use of Wasabi's us-east-2 storage region. To use other Wasabi storage regions, use the appropriate Wasabi service URL as described in Service URLs for Wasabi's Storage Regions.
CREATE STAGE wasabi_east2
URL = 's3compat://snowflake-sample-test/db1'
ENDPOINT = 's3.us-east-2.wasabisys.com'
REGION = 'us-east-2'
CREDENTIALS = ( AWS_KEY_ID = 'xxx' AWS_SECRET_KEY = 'xxx' )
List the Data in Your Wasabi Account
After your stage has been created successfully, you should be able to list the contents of the files inside your Wasabi bucket.
We will call the list query on the stage name.
LIST @wasabi_east2;
If you are successful in listing the contents of your Wasabi bucket, we can then create the Snowflake External Table to be used to query your data. You will need to update the stage name, the file format type based upon your dataset, and the name chosen for the stage.
CREATE EXTERNAL TABLE et
LOCATION=@wasabi_east2/
AUTO_REFRESH = FALSE
REFRESH_ON_CREATE = TRUE
FILE_FORMAT = (TYPE = CSV);
Running a Query on Your Data
After creating your table, you should be all set to run some queries against your data to validate it.
SELECT * FROM et;
You have now created a Snowflake External Table referencing data sitting on Wasabi.