🚀 Feature
Expose a mechanism to set boto3 client arguments in S3ArtifactStorage to allow full S3 client customization. This would allow logging/storing artifacts to new or particular S3 endpoints, applying settings such as credentials directly without requiring environment variables, and generally unlocking the full capabilities of the boto3 client (custom endpoints, proxy settings, SSL, etc.).
Motivation
My organization has an on-prem bucket-store solution (VAST S3) that I wanted to use to store artifacts. I needed a way to modify S3ArtifactStorage so as to specify the input arguments to the S3 client, e.g. boto3.client(endpoint_url=...).
It's possible to configure SOME of the arguments used by the client with environment variables (like AWS_ACCESS_KEY_ID), but not endpoint_url. It's also possible to set up the default boto3 Session and provide credentials, or refer to a config-file profile with boto3.setup_default_session(profile_name=...), but (a) I didn't want to define my connection params that way, and (b) I'm not even sure I can specify an endpoint_url or proxy servers that way.
Pitch
Create a subclass of S3ArtifactStorage in which the _get_s3_client() method is overridden. The overridden method builds the client with all the desired client arguments. The custom class with boto3 client arguments can be produced via an S3ArtifactStorage_factory method that returns a ready-to-go custom S3ArtifactStorageCustom class. All that then remains is to register the new S3ArtifactStorageCustom class with the artifact registry. To make the whole experience as seamless as possible, an S3ArtifactStorage_clientconfig(**boto3_client_kwargs) method can be exposed. Calling this method with boto3 client arguments performs all of the previously mentioned steps, after which, once a Run is initialized, run.set_artifacts_uri('s3://...') and run.log_artifact(...) can be used to save artifacts to diverse S3 endpoints (without mucking with environment variables).
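The mechanism above could be sketched roughly as follows. This is a hypothetical illustration, not aim's actual code: the base class and registry dict here are stand-ins for aim's internal S3ArtifactStorage and artifact registry, so the pattern is self-contained; the real version would look these up in aim.storage.artifacts.

```python
class S3ArtifactStorage:
    """Stand-in for aim's S3ArtifactStorage, reduced to the relevant method."""

    def __init__(self, url):
        self.url = url

    def _get_s3_client(self):
        # Default behavior: client configured only via environment variables.
        import boto3
        return boto3.client('s3')


def S3ArtifactStorage_factory(**boto3_client_kwargs):
    """Return a subclass whose _get_s3_client forwards the captured kwargs."""

    class S3ArtifactStorageCustom(S3ArtifactStorage):
        def _get_s3_client(self):
            import boto3
            # The captured kwargs (endpoint_url, credentials, proxies, ...)
            # are passed straight through to the boto3 client.
            return boto3.client('s3', **boto3_client_kwargs)

    return S3ArtifactStorageCustom


# Stand-in for aim's artifact registry: maps a URI scheme to a storage class.
registry = {'s3': S3ArtifactStorage}


def S3ArtifactStorage_clientconfig(**boto3_client_kwargs):
    """Build the custom class and patch the 's3' registry entry in one call."""
    registry['s3'] = S3ArtifactStorage_factory(**boto3_client_kwargs)
```

Because the factory closes over the kwargs, no changes to the base class's __init__ signature are needed, which is what lets the existing registry mechanism stay untouched apart from the patched 's3' entry.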
Usage
from aim import Run
from aim.storage.artifacts.s3_storage import S3ArtifactStorage_clientconfig
S3ArtifactStorage_clientconfig(
    endpoint_url='http://vast.myorg.net',
    aws_access_key_id='xxxxxxxxx',
    aws_secret_access_key='yyyyyyyyyyyyyyyyyy',
    config={...}, ...)
run = Run(...)
run.set_artifacts_uri('s3://MYBUCKET/aim_artifacts')
run.log_artifact(...)
Alternatives
I considered modifying S3ArtifactStorage directly so that the client configs could be included as part of the class's init, but the artifact_registry mechanism doesn't really allow for that.
I opted to patch the registry entry for s3, as opposed to creating a new registry entry such as s3+, because I just want my s3://BUCKET/whatever/... paths to just work.
Otherwise, I guess I could save the artifacts locally and have a different mechanism to upload them to s3. Or, at that point, just handle all the storage of artifacts myself without using aim at all... but that'd be a shame.