mirror of
https://gitea.com/mcereda/oam.git
synced 2026-02-09 05:44:23 +00:00
274 lines
13 KiB
Markdown
274 lines
13 KiB
Markdown
# Simple Storage Service
|
|
|
|
1. [TL;DR](#tldr)
|
|
1. [Storage classes](#storage-classes)
|
|
1. [Lifecycle configuration](#lifecycle-configuration)
|
|
1. [Further readings](#further-readings)
|
|
1. [Sources](#sources)
|
|
|
|
## TL;DR
|
|
|
|
<details>
|
|
<summary>Usage</summary>
|
|
|
|
```sh
|
|
# List all buckets.
|
|
aws s3 ls
|
|
aws s3api list-buckets --output 'json' --query 'Buckets[].Name'
|
|
aws s3api list-buckets --output 'yaml-stream' | yq -r '.[].Buckets[].Name' -
|
|
|
|
# List prefixes and objects in buckets.
|
|
# Adding the trailing '/' or '--recurse' lists the content of prefixes.
|
|
aws s3 ls 's3://my-bucket'
|
|
aws s3 ls --recursive 's3://my-bucket/prefix/'
|
|
aws s3 ls 's3://arn:aws:s3:us-west-2:123456789012:accesspoint/myaccesspoint/'
|
|
|
|
# Find the size of buckets or objects.
|
|
# It will list all the contents *and* give a total size at the end.
|
|
aws s3 ls --human-readable --recursive --summarize 's3://my-bucket'
|
|
aws s3 ls … 's3://my-bucket/prefix/'
|
|
|
|
# Create buckets.
|
|
aws s3 mb 's3://my-bucket'
|
|
|
|
# Copy files to or from buckets.
|
|
aws s3 cp 'test.txt' 's3://my-bucket/test4.txt'
|
|
aws s3 cp 'test.txt' 's3://my-bucket/test2.txt' --expires '2024-10-01T20:30:00Z'
|
|
aws s3 cp 's3://my-bucket/test.txt' 'test2.txt'
|
|
aws s3 cp 's3://my-bucket/test.txt' 's3://my-bucket/test5.txt'
|
|
aws s3 cp 's3://my-bucket/test.txt' 's3://my-other-bucket/'
|
|
aws s3 cp 's3://my-bucket' '.' --recursive
|
|
aws s3 cp 'myDir' 's3://my-bucket/' --recursive --exclude "*.jpg"
|
|
aws s3 cp 's3://my-bucket/logs/' 's3://my-bucket2/logs/' --recursive \
|
|
--exclude "*" --include "*.log"
|
|
aws s3 cp 's3://my-bucket/test.txt' 's3://my-bucket/test2.txt' \
|
|
--acl 'public-read-write'
|
|
aws s3 cp 'file.txt' 's3://my-bucket/' \
|
|
--grants read=uri='http://acs.amazonaws.com/groups/global/AllUsers' \
|
|
'full=id=79a59df900b949e55d96a1e698fbacedfd6e09d98eacf8f8d5218e7cd47ef2be'
|
|
aws s3 cp 'mydoc.txt' 's3://arn:aws:s3:us-west-2:123456789012:accesspoint/myaccesspoint/mykey'
|
|
|
|
# Handle file streams.
|
|
# Useful for piping:
|
|
# - setting the source to '-' sends data from stdin
|
|
# - setting the destination to '-' sends data to stdout
|
|
aws s3 cp - 's3://my-bucket/stream.txt'
|
|
aws s3 cp - 's3://my-bucket/stream.txt' --expected-size '54760833024'
|
|
aws s3 cp 's3://my-bucket/stream.txt' -
|
|
|
|
# Directly print the contents of files to stdout.
|
|
aws s3 cp --quiet 's3://my-bucket/file.txt' '-'
|
|
aws s3 cp --quiet 's3://my-bucket/file.txt' '/dev/stdout'
|
|
|
|
# Remove objects.
|
|
aws s3 rm 's3://my-bucket/prefix-name' --recursive --dryrun
|
|
|
|
# Sync buckets.
|
|
aws s3 sync '.' 's3://my-bucket'
|
|
aws s3 sync 's3://my-bucket' '.' --delete
|
|
aws s3 sync 's3://my-bucket' 's3://my-other-bucket' --exclude "*.jpg"
|
|
aws s3 sync 's3://my-us-west-2-bucket' 's3://my-eu-east-1-bucket' \
|
|
--source-region 'us-west-2' --region 'eu-east-1'
|
|
aws s3 sync '.' 's3://arn:aws:s3:us-west-2:123456789012:accesspoint/myaccesspoint/'
|
|
|
|
# Delete buckets.
|
|
aws s3 rb 's3://my-bucket'
|
|
aws s3 rb 's3://my-bucket' --force
|
|
|
|
# Check permissions.
|
|
aws s3api get-bucket-acl --bucket 'my-bucket'
|
|
```
|
|
|
|
</details>
|
|
|
|
<details>
|
|
<summary>Lifecycle configurations</summary>
|
|
|
|
```sh
|
|
# Manage lifecycle configurations.
|
|
# Operations on lifecycle rules take a while.
|
|
aws s3api get-bucket-lifecycle-configuration --bucket 'bucketName'
|
|
aws s3api put-bucket-lifecycle-configuration --bucket 'bucketName' \
|
|
--lifecycle-configuration 'file://lifecycle.definition.json'
|
|
aws s3api delete-bucket-lifecycle-configuration --bucket 'bucketName'
|
|
```
|
|
|
|
</details>
|
|
|
|
<details>
|
|
<summary>Real life use cases</summary>
|
|
|
|
```sh
|
|
# Get objects with their storage class.
|
|
aws s3api list-objects --bucket 'my-bucket' \
|
|
--query 'Contents[].{Key: Key, StorageClass: StorageClass}'
|
|
|
|
# Show tags on objects.
|
|
aws s3api list-objects-v2 \
|
|
--bucket 'my-bucket' --prefix 'someObjectsInHereAreTagged' \
|
|
--query 'Contents[*].Key' --output text \
|
|
| xargs -n 1 \
|
|
aws s3api get-object-tagging --bucket 'my-bucket' --query 'TagSet[*]' --key
|
|
```
|
|
|
|
</details>
|
|
|
|
## Storage classes
|
|
|
|
| Class name | Console name | Fees | Latency | Minimum storage charge | Minimum billed object size | # of AZs |
|
|
| -------------------------- | --------------------- | -------------------- | ---------------- | ---------------------- | -------------------------- | -------- |
|
|
| Standard | `STANDARD` | ✗ | milliseconds | ✗ | | 3+ |
|
|
| Express One Zone | `EXPRESS_ONEZONE` | ✗ | single-digit ms | 1 hour | | 1 |
|
|
| Intelligent Tiering | `INTELLIGENT_TIERING` | per monitored object | milliseconds | ✗ | | 3+ |
|
|
| Standard Infrequent Access | `STANDARD_IA` | per GB retrieved | milliseconds | 30 days | 128 KB | 3+ |
|
|
| One Zone Infrequent Access | `ONEZONE_IA` | per GB retrieved | milliseconds | 30 days | 128 KB | 1 |
|
|
| Glacier Instant Retrieval | `GLACIER_IR` | per GB retrieved | milliseconds | 90 days | 128 KB | 3+ |
|
|
| Glacier Flexible Retrieval | `GLACIER` | per GB retrieved | minutes to hours | 90 days | | 3+ |
|
|
| Glacier Deep Archive | `DEEP_ARCHIVE` | per GB retrieved | hours | 180 days | | 3+ |
|
|
|
|
_Standard_ is the storage class used by default if none is specified when uploading objects.
|
|
|
|
_Express One Zone_ is purpose-built for consistency and low latency. It has the highest performance, and lower request
|
|
costs than standard, but is only available within a single Availability Zone at a time.
|
|
|
|
_Intelligent Tiering_ optimizes storage costs by automatically moving data between access tiers depending on its usage,
|
|
without performance impact or operational overhead.<br/>
|
|
Ideal for data that has unknown or changing access patterns.
|
|
|
|
Intelligent Tiering automatically moves objects that have not been accessed in some time to lower-cost access tiers that
|
|
still offer low-latency and high-throughput.
|
|
|
|
<details style='padding: 0 0 1rem 1rem'>
|
|
|
|
Objects in Intelligent Tiering are stored automatically in the following tiers:
|
|
|
|
- _Frequent Access_: contains objects that are uploaded, or transitioned, to the storage class.
|
|
- _Infrequent Access_: contains objects that have not been accessed for **30 consecutive days**.
|
|
- _Archive Instant Access_: contains objects that have not been accessed for **90 consecutive days**.
|
|
|
|
> [!important]
|
|
> Object less than 128 KB in size are **not** eligible for auto-tiering. These objects are kept in the Frequent Access
|
|
> tier at all times.
|
|
|
|
</details>
|
|
|
|
One can also enable automatic archiving capabilities within Intelligent Tiering for data that can be accessed
|
|
**asynchronously**. In this case, it will eventually move objects to access tiers with even lower costs, but that
|
|
require explicit retrieval processes.
|
|
|
|
<details style='padding: 0 0 1rem 1rem'>
|
|
|
|
The optional archive access tiers are the following:
|
|
|
|
- _Archive Access_: archives objects that have not been accessed for **at least 90 consecutive days**.
|
|
- _Deep Archive Access_: archives objects that have not been accessed for **at least 180 consecutive days**.
|
|
|
|
Objects in the Archive Access or Deep Archive Access tiers **must first be restored** to higher tiers by using the
|
|
`RestoreObject` action.
|
|
|
|
</details>
|
|
|
|
_Standard Infrequent Access_ and _One Zone Infrequent Access_ are designed for data that is both **long-lived** and
|
|
**infrequently accessed**, but still requires millisecond access.<br/>
|
|
Suitable for objects larger than 128 KB that are needed for at least 30 days.
|
|
|
|
> [!important]
|
|
> S3 charges for object smaller than 128 KB as if they were of 128 KB.<br/>
|
|
> Objects deleted, overwritten, or transitioned to a different storage class before the end of the 30-day minimum
|
|
> storage duration period will still incur in charges for the full 30 days.
|
|
|
|
_Glacier Instant Retrieval_, _Glacier Flexible Retrieval_, and _Glacier Deep Archive_ are designed for low-cost,
|
|
long-term data storage and data archiving.<br/>
|
|
All these storage classes require minimum storage durations and charge retrieval fees.
|
|
|
|
Glacier Instant Retrieval is the only one in the Glacier set that offers milliseconds retrieval and real-time
|
|
access.<br/>
|
|
Glacier Flexible Retrieval and Glacier Deep Archive archive the data they receive, making it **not** available for
|
|
real-time access.
|
|
|
|
## Lifecycle configuration
|
|
|
|
S3 supports specific lifecycle transitions between storage classes using Lifecycle configurations:
|
|
|
|

|
|
|
|
Objects can be transitioned **down** the storage classes, but **not** up.<br/>
|
|
Objects in need to be moved to a higher storage class need to be **_recreated_** in that storage class. This means that
|
|
they will take new metadata.
|
|
|
|
Other constraints apply, e.g., objects smaller than 128KiB are not usually transitioned in tier.<br/>
|
|
See [General considerations for transitions][lifecycle general considerations for transitions].
|
|
|
|
When multiple rules are applied through Lifecycle configurations, objects can become eligible for multiple Lifecycle
|
|
actions. In such cases:
|
|
|
|
1. Permanent deletion takes precedence over transitions.
|
|
1. Transitions takes precedence over creation of delete markers.
|
|
1. When objects are eligible for transition to both S3 Glacier Flexible Retrieval and S3 Standard-IA (or One Zone-IA),
|
|
precedence is given to S3 Glacier Flexible Retrieval transition.
|
|
|
|
> [!important]
|
|
> When adding Lifecycle configurations to buckets, there is usually some lag before a new, or updated, Lifecycle
|
|
> configuration is fully propagated to all S3's systems.<br/>
|
|
> Expect a delay of a few minutes before any change in configuration starts taking effect. This includes configuration
|
|
> deletions.
|
|
|
|
Examples: [1][lifecycle configuration examples], [2][s3 lifecycle rules examples]
|
|
|
|
## Further readings
|
|
|
|
- [Amazon Web Services]
|
|
- [Configure notification for lifecycle rules][lifecycle configure notification]
|
|
- AWS' [CLI]
|
|
- [Expiring Amazon S3 objects based on last accessed date to decrease costs]
|
|
- [Understanding and managing Amazon S3 storage classes]
|
|
- [Using S3 Intelligent-Tiering]
|
|
- [Amazon S3 cost optimization for predictable and dynamic access patterns]
|
|
- [Gateway Endpoints vs Internet Routing for S3]
|
|
- [Understanding your AWS billing and usage reports for Amazon S3]
|
|
- [Downloading objects]
|
|
- [Download and upload objects with presigned URLs]
|
|
|
|
### Sources
|
|
|
|
- [Amazon S3 Storage Classes]
|
|
- [General considerations for transitions][lifecycle general considerations for transitions]
|
|
- [Lifecycle configuration examples][lifecycle configuration examples]
|
|
- [CLI subcommand reference]
|
|
- [Find out the size of your Amazon S3 buckets]
|
|
- [How S3 Intelligent-Tiering works]
|
|
- [Amazon S3 Intelligent Tiering]
|
|
|
|
<!--
|
|
Reference
|
|
═╬═Time══
|
|
-->
|
|
|
|
<!-- In-article sections -->
|
|
<!-- Knowledge base -->
|
|
[amazon web services]: README.md
|
|
[cli]: cli.md
|
|
|
|
<!-- Files -->
|
|
[s3 lifecycle rules examples]: ../../../examples/aws/s3.lifecycle-rules
|
|
|
|
<!-- Upstream -->
|
|
[Amazon S3 cost optimization for predictable and dynamic access patterns]: https://aws.amazon.com/blogs/storage/amazon-s3-cost-optimization-for-predictable-and-dynamic-access-patterns/
|
|
[amazon s3 storage classes]: https://aws.amazon.com/s3/storage-classes/
|
|
[cli subcommand reference]: https://docs.aws.amazon.com/cli/latest/reference/s3/
|
|
[Download and upload objects with presigned URLs]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-presigned-url.html
|
|
[Downloading objects]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/download-objects.html
|
|
[expiring amazon s3 objects based on last accessed date to decrease costs]: https://aws.amazon.com/blogs/architecture/expiring-amazon-s3-objects-based-on-last-accessed-date-to-decrease-costs/
|
|
[find out the size of your amazon s3 buckets]: https://aws.amazon.com/blogs/storage/find-out-the-size-of-your-amazon-s3-buckets/
|
|
[how s3 intelligent-tiering works]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/intelligent-tiering-overview.html
|
|
[lifecycle configuration examples]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-configuration-examples.html
|
|
[lifecycle configure notification]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-configure-notification.html
|
|
[lifecycle general considerations for transitions]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/lifecycle-transition-general-considerations.html
|
|
[Understanding and managing Amazon S3 storage classes]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html
|
|
[Understanding your AWS billing and usage reports for Amazon S3]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/aws-usage-report-understand.html
|
|
[Using S3 Intelligent-Tiering]: https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-intelligent-tiering.html
|
|
|
|
<!-- Others -->
|
|
[Amazon S3 Intelligent Tiering]: https://awsfundamentals.com/blog/amazon-s3-intelligent-tiering
|
|
[Gateway Endpoints vs Internet Routing for S3]: https://awsfundamentals.com/blog/gateway-endpoints-vs-internet-routing-s3
|