Sharing Data with Chord from Cloud Storage
Introduction

This guide provides step-by-step instructions for integrating external cloud storage services with Chord's data warehouse (Snowflake). We delegate authentication to Snowflake-managed identities, eliminating the need to manage credentials directly.

Below you'll find instructions for connecting AWS S3, Google Cloud Storage, or Azure Blob Storage to Chord's data warehouse.

Configuring an Integration for AWS S3

This section describes how to configure secure access to data files stored in an AWS S3 bucket.

1. Create an IAM policy that has the permissions required to access the bucket and get objects.
   a. Log in to the AWS Management Console.
   b. From the home dashboard, search for and select IAM.
   c. From the left-hand navigation pane, select Account settings.
   d. Under Security Token Service (STS) in the Endpoints list, find the US East (N. Virginia) region (i.e., us-east-1). If the STS status is inactive, move the toggle to Active.
   e. From the left-hand navigation pane, select Policies, then select Create policy.
   f. For Policy editor, select JSON.
   g. Add a policy document that allows Chord to access the S3 bucket and folder. The following policy (in JSON format) gives Chord the permissions required to read data from a single bucket and folder path. Copy and paste the text into the policy editor, replacing <bucket> and <prefix> with your bucket name and folder path.

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Action": [
              "s3:GetObject",
              "s3:GetObjectVersion"
            ],
            "Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
          },
          {
            "Effect": "Allow",
            "Action": [
              "s3:ListBucket",
              "s3:GetBucketLocation"
            ],
            "Resource": "arn:aws:s3:::<bucket>",
            "Condition": {
              "StringLike": {
                "s3:prefix": [
                  "<prefix>/*"
                ]
              }
            }
          }
        ]
      }

   h. Select Next.
   i. Enter a policy name and an optional description.
   j. Select Create policy.

2. Create an IAM role and attach the custom IAM policy.
   a. From the left-hand navigation pane in the Identity and Access Management (IAM) dashboard, select Roles.
   b. Select Create role.
   c. Select AWS account as the trusted entity type.
   d. Select Another AWS account.
   e. In the Account ID field, enter your own AWS account ID temporarily. Later, you will modify the trust relationship and grant access to Snowflake.
   f. Select the Require external ID option. An external ID is used to grant a third party, such as Snowflake, access to your AWS resources (such as S3 buckets). Enter a placeholder ID such as 0000. In a later step, you will modify the trust relationship for your IAM role and specify the external ID that Chord provides.
   g. Select Next.
   h. Select the IAM policy you created in the previous step.
   i. Select Next.
   j. Enter a name and description for the role, then select Create role. You have now created an IAM policy and an IAM role, and attached the policy to the role.
   k. On the role summary page, locate and record the Role ARN value. You will need to provide this ARN to Chord to complete the next steps.

3. Provide Chord with the IAM role ARN and bucket name. The full bucket name is a URL that looks like the following: s3://my-cool-bucket/
   Once your bucket name and IAM role ARN are confirmed, Chord will provide an IAM user ARN and an AWS external ID.

4. Grant the IAM user permissions to access bucket objects.
   a. Sign in to the AWS Management Console.
   b. Select IAM.
   c. From the left-hand navigation pane, select Roles.
   d. Select the IAM role you created in step 2.
   e. Select the Trust relationships tab.
   f. Select Edit trust policy.
   g. Modify the policy document with the values provided by Chord: the IAM user ARN (<user arn>) and the external ID (<external id>).

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Sid": "",
            "Effect": "Allow",
            "Principal": {
              "AWS": "<user arn>"
            },
            "Action": "sts:AssumeRole",
            "Condition": {
              "StringEquals": {
                "sts:ExternalId": "<external id>"
              }
            }
          }
        ]
      }

   h. Select Update policy to save your changes.

Configuring an Integration for Google Cloud Storage

This section describes how to configure secure access to data files stored in a Google Cloud Storage bucket.

1. Provide Chord with your GCS bucket name. The bucket name is the name of the Cloud Storage bucket that stores your data files. The full bucket name is a URL that looks like the following: gcs://my-cool-bucket/
   Once your bucket name is confirmed, Chord will provide a Cloud Storage service account name to use in the following steps.

2. Create a custom role that has the permissions required to access the bucket and get objects.
   a. Sign in to the Google Cloud console as a project editor.
   b. From the home dashboard, select IAM & Admin » Roles.
   c. Select Create role.
   d. Enter a title and optional description for the custom role.
   e. Select Add permissions.
   f. Filter the list of permissions, and add the following from the list: storage.buckets.get, storage.objects.get, and storage.objects.list.
   g. Select Add.
   h. Select Create.

3. Assign the custom role to the Cloud Storage service account.
   a. Sign in to the Google Cloud console as a project editor.
   b. From the home dashboard, select Cloud Storage » Buckets.
   c. Filter the list of buckets, and select the in-scope bucket.
   d. Select Permissions » View by principals, then select Grant access.
   e. Under Add principals, paste the service account name provided by Chord.
   f. Under Assign roles, select the custom IAM role that you created previously, then select Save.

4. (Optional) Grant the Cloud Storage service account permissions on the Cloud Key Management Service cryptographic keys. This step is required only if your GCS bucket is encrypted using a key stored in the Google Cloud Key Management Service (Cloud KMS).
   a. Sign in to the Google Cloud console as a project editor.
   b. From the home dashboard, search for and select Security » Key Management.
   c. Select the key ring that is assigned to your GCS bucket.
   d. Click Show info panel in the upper-right corner. The information panel for the key ring slides out.
   e. Click the Add principal button.
   f. In the New principals field, search for the service account name provided by Chord.
   g. From the Select a role dropdown, select the Cloud KMS CryptoKey Encrypter/Decrypter role.
   h. Click the Save button. The service account name is added to the Cloud KMS CryptoKey Encrypter/Decrypter role dropdown in the information panel.

Configuring an Integration for Microsoft Azure Blob Storage

This section describes how to configure secure access to data files stored in Microsoft Azure Blob Storage.

1. Provide Chord with your tenant ID and container name. Once your tenant ID and container name are confirmed, Chord will provide a consent URL and app name.
2. In a web browser, navigate to the consent URL. The browser displays a Microsoft permissions request page.
3. Click the Accept button. This action allows the Azure service principal created for Chord's Snowflake account to be granted an access token on specified resources inside your tenant. Obtaining an access token succeeds only if you grant the service principal the appropriate permissions on the container (see the following steps). The Microsoft permissions request page redirects to the Snowflake corporate site (snowflake.com).
4. Sign in to the Microsoft Azure portal.
5. Navigate to Azure services » Storage accounts. Click the name of the storage account you are granting the Snowflake service principal access to.
6. Click Access control (IAM) » Add role assignment.
7. Select the desired role to grant to the Snowflake service principal. Storage Blob Data Reader grants read access only, which allows loading data from files staged in the storage account.
8. Search for the Snowflake service principal. This is the identity in the app name property that Chord provides.
9. Click the Review + assign button.

Troubleshooting

This document has been created for customer reference purposes. For the most up-to-date information, consult the official Snowflake documentation:

https://docs.snowflake.com/en/user-guide/data-load-s3-config-storage-integration#configuring-secure-access-to-cloud-storage
https://docs.snowflake.com/en/user-guide/data-load-gcs-config#step-3-grant-the-service-account-permissions-to-access-bucket-objects
https://docs.snowflake.com/en/user-guide/data-load-azure-config#step-2-grant-snowflake-access-to-the-storage-locations
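
When checking the AWS S3 integration, one way to rule out copy-and-paste mistakes is to generate the two policy documents from your own values instead of editing the placeholders by hand. The following is a minimal sketch in Python, not part of Chord's or Snowflake's tooling. The function names `read_policy` and `trust_policy` are hypothetical, and the bucket, prefix, user ARN, and external ID in the example are placeholder assumptions; the JSON structure mirrors the policy documents shown in the AWS S3 section of this guide.

```python
import json


def read_policy(bucket: str, prefix: str) -> dict:
    """Build the read-only IAM policy from this guide for one bucket and folder.

    `bucket` is the bare bucket name (no s3:// scheme); `prefix` is the folder path.
    """
    prefix = prefix.strip("/")  # tolerate leading or trailing slashes
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read objects (and object versions) under the folder.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:GetObjectVersion"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
            {
                # List the bucket, restricted to keys under the folder.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
        ],
    }


def trust_policy(user_arn: str, external_id: str) -> dict:
    """Build the role trust policy using the values Chord provides."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "",
                "Effect": "Allow",
                "Principal": {"AWS": user_arn},
                "Action": "sts:AssumeRole",
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Placeholder values (assumptions): substitute your own bucket and folder,
    # and the IAM user ARN and external ID that Chord sends you.
    print(json.dumps(read_policy("my-cool-bucket", "exports"), indent=2))
    print(json.dumps(
        trust_policy("arn:aws:iam::123456789012:user/example", "0000"),
        indent=2,
    ))
```

The printed JSON can be pasted into the policy editor and the trust policy editor respectively. Building the documents as dictionaries and serializing them with `json.dumps` guarantees syntactically valid JSON, which hand-edited text does not.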

Recommended Next Steps

If you plan to use a cloud storage integration to share data with Chord, we have some helpful recommendations for structuring data to facilitate seamless data ingestion. See docid:xktajrailw5unrqrmpan9