Object Storage management using s3cmd¶
s3cmd¶
s3cmd is a command-line client tool for interacting with S3 object storage.
Installation¶
You may want to activate a Python virtual environment first.
pip install s3cmd
Credentials¶
To set credentials, you need to create an s3cmd configuration file, which contains the credentials together with the s3cmd settings.
The basic configuration file consists of host_base, host_bucket, access_key and secret_key. For more configuration parameters, check the s3cmd configuration file documentation.
In the Swift S3 API Overview section we show in detail how to create a configuration file with S3 credentials. As a result, a configuration file named s3config.cfg is created, which is used throughout this section.
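A minimal s3config.cfg could look like the sketch below. The endpoint rgw.science-it.uzh.ch is taken from the URLs shown later in this section, and the key values are placeholders; replace them with the credentials created for your project.
[default]
# assumed endpoint, taken from the bucket URLs shown later in this section
host_base = rgw.science-it.uzh.ch
host_bucket = rgw.science-it.uzh.ch
# placeholders: use your own S3 credentials here
access_key = YOUR_ACCESS_KEY
secret_key = YOUR_SECRET_KEY
use_https = True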
Using s3cmd¶
For more s3cmd commands, refer to the s3cmd usage.
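You can also list the available commands and options locally:
# show the built-in help with the full list of commands and flags
s3cmd --help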
Note
In all s3cmd commands you need to point to the configuration file using --config=CONFIG_FILE.
If not specified, s3cmd falls back to the default configuration file $HOME/.s3cfg.
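As an optional convenience (assuming a bash-like shell and that s3config.cfg is in your current working directory), you can define an alias so the flag does not have to be repeated:
# optional: always pass the project configuration file (path is an assumption)
alias s3cmd='s3cmd --config=s3config.cfg'
s3cmd ls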
Interact with buckets¶
List buckets¶
s3cmd --config=s3config.cfg ls
# lists object storage containers in the currently active project
2024-06-07 06:23 s3://ps-test
2024-06-14 18:30 s3://sc-okd-v6pp5-image-registry-uswpbiefxqdaexgvxeeykiekhrpcijiyrk
2024-09-02 09:18 s3://usage_reports
Warning
When calling s3cmd commands, you may see the following error:
ERROR: SSL certificate verification failure: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1000)
Solution: skip SSL certificate verification by appending the --no-check-certificate flag to every s3cmd command, for example:
s3cmd --config=s3config.cfg --no-check-certificate ls
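Alternatively, instead of passing the flag on every call, you can set the corresponding option in the configuration file (a sketch, assuming your s3cmd version supports the check_ssl_certificate option):
# in s3config.cfg: disable SSL certificate verification (use with caution)
check_ssl_certificate = False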
Create bucket¶
s3cmd --config=s3config.cfg mb s3://aabish-bucket
Bucket 's3://aabish-bucket/' created
If you keep "creating" a bucket with already existing name while staying in your current active project, you will receive a misleading message that the bucket is created. In fact, the bucket is neither newly created, nor it is replaced. The bucket and its content are intact.
# bucket s3://aabish-bucket already exists in this project
$ s3cmd --config=s3config.cfg ls
2024-09-18 08:23 s3://aabish-bucket
2024-06-07 06:23 s3://ps-test
2024-06-14 18:30 s3://sc-okd-v6pp5-image-registry-uswpbiefxqdaexgvxeeykiekhrpcijiyrk
2024-09-02 09:18 s3://usage_reports
# if you keep executing mb (make bucket) command, you receive a misleading message instead of an error
$ s3cmd --config=s3config.cfg mb s3://aabish-bucket
Bucket 's3://aabish-bucket/' created
$ s3cmd --config=s3config.cfg mb s3://aabish-bucket
Bucket 's3://aabish-bucket/' created
However, if you (or another user) work in a different project and try to create a bucket with a name that already exists in another project, you receive an (expected) error.
The example below shows a user working in another project, hence pointing to another configuration file, who creates a bucket with a name that is already taken. Remember that each project has its own configuration file, as mentioned earlier.
$ s3cmd --config=s3config-of-another-project.cfg mb s3://aabish-bucket
ERROR: Bucket 'aabish-bucket' already exists
ERROR: S3 error: 409 (BucketAlreadyExists)
Bucket names have the following constraints:
- Must be unique globally, which means across ALL projects (even those where you are not a member) and ALL users.
- Must be between 3 and 63 characters in length.
- Must NOT contain uppercase characters or underscores (_).
- Must start with a lowercase letter or number.
Check the bucket operations documentation for further constraints and conventions.
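For illustration (the bucket names below are hypothetical), the naming rules mean, for example:
# valid: lowercase letters, digits and hyphens, 3-63 characters
s3cmd --config=s3config.cfg mb s3://project-data-2024
# invalid: uppercase characters and underscores are rejected
# s3cmd --config=s3config.cfg mb s3://Project_Data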
Retrieve bucket information¶
s3cmd --config=s3config.cfg info s3://aabish-bucket
s3://aabish-bucket/ (bucket):
Location: default
Payer: BucketOwner
Ownership: none
Versioning:none
Expiration rule: none
Block Public Access: none
Policy: none
CORS: none
ACL: science-it.aabish: FULL_CONTROL
ACL stands for Access Control List. It allows you to manage access permissions on buckets and objects; more examples of ACLs are given in the Permissions section.
Delete bucket¶
s3cmd --config=s3config.cfg rb s3://aabish-bucket
Bucket 's3://aabish-bucket' removed
If the bucket contains objects (files), you will receive the following error:
ERROR: S3 error: 409 (BucketNotEmpty)
Buckets can be deleted only when they are empty. However, you can force-delete the bucket and its contents by including both the --recursive and --force flags. Perform this operation with caution.
s3cmd --config=s3config.cfg rb s3://aabish-bucket --recursive --force
WARNING: Bucket is not empty. Removing all the objects from it first. This may take some time...
delete: 's3://aabish-bucket/test-file-s3cmd.txt'
Bucket 's3://aabish-bucket/' removed
Interact with objects (files)¶
Upload object to bucket¶
For our example, let's create a bucket and a local test file to be uploaded to the remote S3 bucket. Unlike bucket names, object names have almost no restrictions; for more details check the documentation.
echo -e "line1\nline2" > test-file-s3cmd.txt
s3cmd --config=s3config.cfg mb s3://aabish-bucket
Upload local test file to S3 bucket:
s3cmd --config=s3config.cfg put test-file-s3cmd.txt s3://aabish-bucket
upload: 'test-file-s3cmd.txt' -> 's3://aabish-bucket/test-file-s3cmd.txt' [1 of 1]
12 of 12 100% in 0s 77.91 B/s done
Warning
You may receive a warning message:
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
Solution: install the libmagic and python-magic libraries, for example with brew on macOS (shown below) or with apt install on Linux:
brew install libmagic
pip install python-magic
Upload object with existing name¶
If you upload an object with an existing name to the S3 bucket, you do not receive any error. The existing object in the bucket is overwritten with the new one.
In this example, we modify the existing file by adding a new line and upload it to the remote bucket.
echo "line3" >> test-file-s3cmd.txt
s3cmd --config=s3config.cfg put test-file-s3cmd.txt s3://aabish-bucket
No error is given; the remote file is replaced with the local copy.
upload: 'test-file-s3cmd.txt' -> 's3://aabish-bucket/test-file-s3cmd.txt' [1 of 1]
18 of 18 100% in 0s 188.80 B/s done
Retrieve object information¶
By analogy with bucket management, use the info command at the object level:
s3cmd --config=s3config.cfg info s3://aabish-bucket/test-file-s3cmd.txt
s3://aabish-bucket/test-file-s3cmd.txt (object):
File size: 18
Last mod: Thu, 19 Sep 2024 18:40:37 GMT
MIME type: text/plain
Storage: STANDARD
MD5 sum: cc3d5ed5fda53dfa81ea6aa951d7e1fe
SSE: none
Policy: none
CORS: none
ACL: science-it.aabish: FULL_CONTROL
x-amz-meta-s3cmd-attrs: atime:1726768815/ctime:1726768814/gid:20/gname:staff/md5:cc3d5ed5fda53dfa81ea6aa951d7e1fe/mode:33188/mtime:1726768814/uid:501/uname:
Upload directory with objects to bucket¶
Create a directory and place some files in it:
echo -e "line4\nline5" > test-file2-s3cmd.txt
mkdir test-dir
cp test-file* test-dir
Add the --recursive flag to upload the directory with its files. The bucket will mirror the structure of the directory:
s3cmd --config=s3config.cfg put test-dir s3://aabish-bucket --recursive
upload: 'test-dir/test-file-s3cmd.txt' -> 's3://aabish-bucket/test-dir/test-file-s3cmd.txt' [1 of 2]
18 of 18 100% in 0s 138.64 B/s done
upload: 'test-dir/test-file2-s3cmd.txt' -> 's3://aabish-bucket/test-dir/test-file2-s3cmd.txt' [2 of 2]
12 of 12 100% in 0s 79.85 B/s done
List objects in bucket¶
s3cmd --config=s3config.cfg ls s3://aabish-bucket
DIR s3://aabish-bucket/test-dir/
2024-09-19 18:18 18 s3://aabish-bucket/test-file-s3cmd.txt
View the structure of the remote directory test-dir/:
s3cmd --config=s3config.cfg ls s3://aabish-bucket/test-dir/
2024-09-19 18:14 18 s3://aabish-bucket/test-dir/test-file-s3cmd.txt
2024-09-19 18:14 12 s3://aabish-bucket/test-dir/test-file2-s3cmd.txt
Download object from bucket¶
s3cmd --config=s3config.cfg get s3://aabish-bucket/test-file-s3cmd.txt test-file-s3cmd-frombucket.txt
download: 's3://aabish-bucket/test-file-s3cmd.txt' -> 'test-file-s3cmd-frombucket.txt' [1 of 1]
18 of 18 100% in 0s 145.23 B/s done
Remove object from bucket¶
s3cmd --config=s3config.cfg del s3://aabish-bucket/test-file-s3cmd.txt
delete: 's3://aabish-bucket/test-file-s3cmd.txt'
If you want to delete multiple objects from the bucket, list them on the command line: s3cmd del <file1> <file2> ... <fileN>.
If you want to delete all objects from the bucket, add the --recursive and --force flags. This way the bucket is emptied, but not deleted.
For our example, we upload some files back to the bucket:
# upload 2 local files to S3 bucket
s3cmd --config=s3config.cfg put test-file-s3cmd.txt test-file-s3cmd-frombucket.txt s3://aabish-bucket
upload: 'test-file-s3cmd-frombucket.txt' -> 's3://aabish-bucket/test-file-s3cmd-frombucket.txt' [1 of 2]
18 of 18 100% in 0s 354.02 B/s done
upload: 'test-file-s3cmd.txt' -> 's3://aabish-bucket/test-file-s3cmd.txt' [2 of 2]
18 of 18 100% in 0s 341.62 B/s done
s3cmd --config=s3config.cfg del s3://aabish-bucket --recursive --force
delete: 's3://aabish-bucket/test-dir/test-file-s3cmd.txt'
delete: 's3://aabish-bucket/test-dir/test-file2-s3cmd.txt'
delete: 's3://aabish-bucket/test-file-s3cmd-frombucket.txt'
delete: 's3://aabish-bucket/test-file-s3cmd.txt'
s3cmd --config=s3config.cfg ls s3://aabish-bucket
Permissions¶
Access control on buckets and objects is managed by ACLs (Access Control Lists) and Policies.
By default, created buckets and their objects are private, accessible only by you.
By changing ACLs, it is possible to grant read access to other users per bucket, per object, or per set of objects. For more granular access control, e.g. at the level of individual users, one can also change bucket / object policies (not covered here).
Using s3cmd, you can grant or revoke read permissions on a bucket or an object with the setacl command in combination with the --acl-public / --acl-private flags.
ACLs are not inherited from parent objects. Therefore, note an important difference when calling setacl --acl-public:
- at the bucket level, this enables public listing of the bucket's directory only; it does not allow the objects within the bucket to be accessed.
- at the object level, this enables read access to the actual object's content.
By analogy, you can also create a bucket or upload an object with public read permissions using the mb --acl-public / put --acl-public commands, as sketched below.
Enable directory listing of a bucket¶
First let's upload some files to the bucket:
# upload 2 local files to S3 bucket
s3cmd --config=s3config.cfg put test-file-s3cmd.txt test-file-s3cmd-frombucket.txt s3://aabish-bucket
Using the setacl --acl-public command, you can enable public listing of the bucket directory (top level of the bucket):
s3cmd --config=s3config.cfg setacl s3://aabish-bucket --acl-public
s3://aabish-bucket/: ACL set to Public
s3cmd --config=s3config.cfg info s3://aabish-bucket
s3://aabish-bucket/ (bucket):
Location: default
Payer: BucketOwner
Ownership: none
Versioning:none
Expiration rule: none
Block Public Access: none
Policy: none
CORS: none
ACL: *anon*: READ
ACL: science-it.aabish: FULL_CONTROL
URL: http://rgw.science-it.uzh.ch/aabish-bucket/
Notice the difference compared to a private bucket:
- There is a new line ACL: *anon*: READ indicating that the bucket directory can be read by anyone.
- There is a new line URL which gives the public URL address of the bucket.
To view the content, either paste the URL into a browser or download it from the terminal with cURL. The output is an XML page, hence we pretty-print it with xmllint. The file names in the bucket are shown under the <Key>FILE_NAME</Key> tag.
Warning
You need to change the URL address to https:// instead of http://.
curl https://rgw.science-it.uzh.ch/aabish-bucket/ | xmllint --format -
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 913 0 913 0 0 10930 0 --:--:-- --:--:-- --:--:-- 11000
<?xml version="1.0" encoding="UTF-8"?>
<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Name>aabish-bucket</Name>
<Prefix/>
<MaxKeys>1000</MaxKeys>
<IsTruncated>false</IsTruncated>
<Contents>
<Key>test-file-s3cmd-frombucket.txt</Key>
<LastModified>2024-09-19T18:40:37.266Z</LastModified>
<ETag>"cc3d5ed5fda53dfa81ea6aa951d7e1fe"</ETag>
<Size>18</Size>
<StorageClass>STANDARD</StorageClass>
<Owner>
<ID>b8df8b4026724aee9501727b6094f296</ID>
<DisplayName>science-it.aabish</DisplayName>
</Owner>
<Type>Normal</Type>
</Contents>
<Contents>
<Key>test-file-s3cmd.txt</Key>
<LastModified>2024-09-19T18:40:37.422Z</LastModified>
<ETag>"cc3d5ed5fda53dfa81ea6aa951d7e1fe"</ETag>
<Size>18</Size>
<StorageClass>STANDARD</StorageClass>
<Owner>
<ID>b8df8b4026724aee9501727b6094f296</ID>
<DisplayName>science-it.aabish</DisplayName>
</Owner>
<Type>Normal</Type>
</Contents>
<Marker/>
</ListBucketResult>
We enabled public directory listing of the bucket, but if we try to access objects within the bucket, we get an expected "Access denied" error.
curl https://rgw.science-it.uzh.ch/aabish-bucket/test-file-s3cmd.txt | xmllint --format -
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 248 100 248 0 0 2947 0 --:--:-- --:--:-- --:--:-- 2952
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>AccessDenied</Code>
<Message/>
<BucketName>aabish-bucket</BucketName>
<RequestId>tx000009ec82dce5550579a-0066ed4526-10ac5c1-default</RequestId>
<HostId>10ac5c1-default-default</HostId>
</Error>
Make objects public / private¶
To set a single object as public, by analogy with buckets, use the setacl --acl-public command at the object level:
s3cmd --config=s3config.cfg setacl s3://aabish-bucket/test-file-s3cmd.txt --acl-public
s3://aabish-bucket/test-file-s3cmd.txt: ACL set to Public [1 of 1]
Retrieve information about the object and check the created URL. Make sure to change the URL address to https:// instead of http://.
s3cmd --config=s3config.cfg info s3://aabish-bucket/test-file-s3cmd.txt
s3://aabish-bucket/test-file-s3cmd.txt (object):
File size: 18
Last mod: Thu, 19 Sep 2024 18:40:37 GMT
MIME type: text/plain
Storage: STANDARD
MD5 sum: cc3d5ed5fda53dfa81ea6aa951d7e1fe
SSE: none
Policy: none
CORS: none
ACL: *anon*: READ
ACL: science-it.aabish: FULL_CONTROL
URL: http://rgw.science-it.uzh.ch/aabish-bucket/test-file-s3cmd.txt
x-amz-meta-s3cmd-attrs: atime:1726768815/ctime:1726768814/gid:20/gname:staff/md5:cc3d5ed5fda53dfa81ea6aa951d7e1fe/mode:33188/mtime:1726768814/uid:501/uname:
View the content of the remote file:
curl https://rgw.science-it.uzh.ch/aabish-bucket/test-file-s3cmd.txt
line1
line2
line3
Similarly, you can set an object as private by using the --acl-private flag:
s3cmd --config=s3config.cfg setacl s3://aabish-bucket/test-file-s3cmd.txt --acl-private
s3://aabish-bucket/test-file-s3cmd.txt: ACL set to Private [1 of 1]
To make all objects in the bucket public, execute the setacl command at the bucket level and include the --recursive flag:
s3cmd --config=s3config.cfg setacl s3://aabish-bucket/ --acl-public --recursive
s3://aabish-bucket/test-file-s3cmd-frombucket.txt: ACL set to Public [1 of 2]
s3://aabish-bucket/test-file-s3cmd.txt: ACL set to Public [2 of 2]
Disable directory listing of a bucket¶
To remove permissions to view the bucket content, set the ACL back to private:
s3cmd --config=s3config.cfg setacl s3://aabish-bucket --acl-private
s3://aabish-bucket/: ACL set to Private
When trying to "view" the content of your bucket, you will receive an expected "Access Denied" error:
curl https://rgw.science-it.uzh.ch/aabish-bucket/ | xmllint --format -
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 248 100 248 0 0 2792 0 --:--:-- --:--:-- --:--:-- 2818
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>AccessDenied</Code>
<Message/>
<BucketName>aabish-bucket</BucketName>
<RequestId>tx00000f7aa06e418b07a14-0066ec73fa-107766b-default</RequestId>
<HostId>107766b-default-default</HostId>
</Error>