diff --git a/.vscode/settings.json b/.vscode/settings.json
index 91321a3..1a3a040 100644
--- a/.vscode/settings.json
+++ b/.vscode/settings.json
@@ -273,6 +273,7 @@
"poweroff",
"powerpipe",
"powersave",
+ "powertop",
"preemptible",
"printenv",
"privs",
diff --git a/knowledge base/cloud computing/aws/opensearch.md b/knowledge base/cloud computing/aws/opensearch.md
index 5e76693..b10bbe0 100644
--- a/knowledge base/cloud computing/aws/opensearch.md
+++ b/knowledge base/cloud computing/aws/opensearch.md
@@ -9,6 +9,8 @@ Amazon offering for managed OpenSearch clusters.
1. [Migrate indexes to UltraWarm storage](#migrate-indexes-to-ultrawarm-storage)
1. [Return warm indexes to hot storage](#return-warm-indexes-to-hot-storage)
1. [Migrate indexes to Cold storage](#migrate-indexes-to-cold-storage)
+1. [Index state management plugin](#index-state-management-plugin)
+1. [Snapshots](#snapshots)
1. [Best practices](#best-practices)
1. [Dedicated master nodes](#dedicated-master-nodes)
1. [Cost-saving measures](#cost-saving-measures)
@@ -17,7 +19,8 @@ Amazon offering for managed OpenSearch clusters.
## Storage
-Clusters can be set up to use the [hot-warm architecture].
+Clusters can be set up to use the [hot-warm architecture].\
+Compared to upstream OpenSearch, AWS' managed OpenSearch service offers two extra storage options: `UltraWarm` and `Cold`.
_Hot_ storage provides the fastest possible performance for indexing and searching **new** data.
@@ -31,13 +34,12 @@ Aside that, they behave like any other hot index.
_UltraWarm_ nodes use **warm** storage in the form of S3 and caching.
-AWS' managed OpenSearch service offers also _Cold_ storage.
-It is meant for data accessed only occasionally or no longer in active use.
+_Cold_ storage is meant for data accessed only occasionally or no longer in active use.
Cold indexes are normally detached from nodes and stored in S3, meaning one **can't** read from nor write to cold
indexes by default.
Should one need to query them, one needs to selectively attach them to UltraWarm nodes.
-Use [Index State Management][index state management in amazon opensearch service] to automate indexes migration to
+If using the [hot-warm architecture], leverage the [Index State Management plugin] to automatically migrate indexes to
lower storage states after they meet specific conditions.
### UltraWarm storage
@@ -162,6 +164,172 @@ GET _cold/migration/my-index/_status
POST _cold/migration/my-index/_cancel
```
+## Index state management plugin
+
+Refer [OpenSearch's Index State Management plugin][opensearch index state management] and
+[Index State Management in Amazon OpenSearch Service].
+
+Compared to [OpenSearch] and [ElasticSearch], ISM for Amazon's managed OpenSearch service has several differences:
+
+- The managed OpenSearch service supports the three unique ISM operations `warm_migration`, `cold_migration`, and
+ `cold_delete`.
+
+ If one's domain has [UltraWarm storage] enabled, the `warm_migration` action transitions indexes to warm storage.\
+ If one's domain has [cold storage] enabled, the `cold_migration` action transitions indexes to cold storage, and the
+ `cold_delete` action deletes them from cold storage.
+
+ Should one of these actions not complete within the set timeout period, the migration or deletion of the affected
+ indexes will continue.\
+ Setting an `error_notification` for one of the above actions will send a notification about the action failing,
+ should it not complete within the timeout period, but the notification is only for one's own reference. The actual
+ operation has no inherent timeout, and will continue to run until it eventually succeeds or fails.
+
+- \[should the domain run OpenSearch or Elasticsearch 7.4 or later] The managed OpenSearch service supports the ISM
+ `open` and `close` operations.
+- \[should the domain run OpenSearch or Elasticsearch 7.7 or later] The managed OpenSearch service supports the ISM
+ `snapshot` operation.
+
+- Cold indexes API:
+  - Require specifying the `?type=_cold` parameter when using the following ISM APIs:
+ - Add policy
+ - Remove policy
+ - Update policy
+ - Retry failed index
+ - Explain index
+  - Do **not** support wildcard operators, except when used at the end of the path.\
+    E.g., `_plugins/_ism/add/logstash-*` is supported, but `_plugins/_ism/add/iad-*-prod` is not.
+  - Do **not** support multiple index names and patterns.\
+    E.g., `_plugins/_ism/remove/app-logs` is supported, but `_plugins/_ism/remove/app-logs,sample-data` is not.
+
+- The managed OpenSearch service allows changing only the following ISM settings:
+ - `plugins.index_state_management.enabled` and `plugins.index_state_management.history.enabled` at cluster level.
+ - `plugins.index_state_management.rollover_alias` at index level.
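+
+As an illustration of the AWS-specific actions above, a policy that walks indexes through the hot, warm, and cold tiers
+before deleting them might look like the following sketch.\
+The policy name, the index ages, the retry values, and the `@timestamp` field are assumptions made for the example;
+adapt them to one's own data.
+
+```plaintext
+PUT _plugins/_ism/policies/hot-warm-cold-delete
+{
+  "policy": {
+    "description": "Example only: hot -> warm -> cold -> delete",
+    "default_state": "hot",
+    "states": [
+      {
+        "name": "hot",
+        "actions": [],
+        "transitions": [{ "state_name": "warm", "conditions": { "min_index_age": "10d" } }]
+      },
+      {
+        "name": "warm",
+        "actions": [{ "warm_migration": {}, "retry": { "count": 5, "delay": "1h" } }],
+        "transitions": [{ "state_name": "cold", "conditions": { "min_index_age": "30d" } }]
+      },
+      {
+        "name": "cold",
+        "actions": [{ "cold_migration": { "timestamp_field": "@timestamp" } }],
+        "transitions": [{ "state_name": "delete", "conditions": { "min_index_age": "365d" } }]
+      },
+      {
+        "name": "delete",
+        "actions": [{ "cold_delete": {} }]
+      }
+    ]
+  }
+}
+```
+
+Given the `?type=_cold` requirement, explaining the ISM state of a cold index (here the hypothetical `app-logs`) should
+look like this:
+
+```plaintext
+GET _plugins/_ism/explain/app-logs?type=_cold
+```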
+
+## Snapshots
+
+Refer [Snapshots][opensearch snapshots] and [Creating index snapshots in Amazon OpenSearch Service].
+
+AWS-managed OpenSearch Service snapshots come in the following forms:
+
+- _Automated_ snapshots: only for cluster recovery, stored in a **preconfigured** S3 bucket at **no** additional cost.\
+ One can use them to restore the domain in the event of red cluster status or data loss.
+- _Manual_ snapshots: for cluster recovery or moving data from one cluster to another.\
+  Users must initiate manual snapshots themselves.\
+ These snapshots are stored in one's own S3 bucket. Standard S3 charges apply.
+
+All AWS-managed OpenSearch Service domains take automated snapshots, but their frequency depends on the engine version:
+
+- Domains running OpenSearch or Elasticsearch 5.3 and later take **hourly** automated snapshots and retain up to 336 of
+ them for 14 days.
+- Domains running Elasticsearch 5.1 and earlier take **daily** automated snapshots during off-peak hours and retain up
+ to 14 of them. No snapshot data is retained for more than 30 days.
+
+> [!IMPORTANT]
+> Should a cluster enter the red status, all automated snapshots will fail for as long as that status persists.
+
+To be able to create snapshots manually:
+
+- An S3 bucket must exist to store snapshots.
+
+ > [!IMPORTANT]
+ > Manual snapshots do **not** support the S3 Glacier storage class.\
+ > Do **not** apply any S3 Glacier lifecycle rule to this bucket.
+
+- An IAM role that delegates permissions to the OpenSearch Service must be defined.\
+ This role must be able to act on the S3 bucket above.
+
+
+ Trust relationship (A.K.A. assume role policy)
+
+ ```json
+ {
+ "Version": "2012-10-17",
+ "Statement": [{
+ "Effect": "Allow",
+ "Principal": {
+ "Service": "es.amazonaws.com"
+ },
+ "Action": "sts:AssumeRole"
+ }]
+ }
+ ```
+
+
+
+
+ Policy
+
+ ```json
+ {
+ "Version": "2012-10-17",
+ "Statement": [{
+ "Effect": "Allow",
+ "Action": [
+ "s3:ListBucket",
+ "s3:GetObject",
+ "s3:PutObject",
+ "s3:DeleteObject"
+ ],
+ "Resource": [
+ "arn:aws:s3:::{{ bucket name here }}",
+ "arn:aws:s3:::{{ bucket name here }}/*"
+ ]
+ }]
+ }
+ ```
+
+
+
+- The IAM user or role whose credentials will be used to sign the requests must have permissions to:
+
+ - Pass the role above to the OpenSearch Service.
+
+
+ Policy
+
+ ```json
+ {
+ "Version": "2012-10-17",
+ "Statement": [{
+ "Effect": "Allow",
+ "Action": "iam:PassRole",
+ "Resource": "arn:aws:iam::{{ aws account id }}:role/{{ role name }}"
+ }]
+ }
+ ```
+
+
+
+ Should one use the domain's dashboards' dev tools, and should the domain use Cognito for authentication, those
+    permissions need to be added to the IAM role that Cognito uses for the user pool.
+
+ Should the user or role making the requests be missing such permissions, they might encounter this error when trying
+ to register a repository in the next step:
+
+ > User: arn:aws:iam::123456789012:user/MyUserAccount is not authorized to perform: iam:PassRole on resource:
+ > arn:aws:iam::123456789012:role/TheSnapshotRole
+
+ - Use the `es:ESHttpPut` action in the domain.
+
+
+ Policy
+
+ ```json
+ {
+ "Version": "2012-10-17",
+ "Statement": [{
+ "Effect": "Allow",
+ "Action": "es:ESHttpPut",
+          "Resource": "arn:aws:es:{{ region }}:{{ aws account id }}:domain/{{ domain name }}/*"
+ }]
+ }
+ ```
+
+
+
+Snapshots can be taken only from indices in the hot or warm storage tiers.\
+Only **one** index from warm storage is allowed at a time, and the request **cannot** contain indices in mixed tiers.
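+
+With the prerequisites above in place, registering the repository and taking a manual snapshot should look roughly like
+the sketch below.\
+The `{{ repository name }}` and `{{ snapshot name }}` placeholders are assumptions made for the example. Requests need
+to be signed with the credentials of the IAM user or role holding the permissions listed above.
+
+```plaintext
+PUT _snapshot/{{ repository name }}
+{
+  "type": "s3",
+  "settings": {
+    "bucket": "{{ bucket name here }}",
+    "region": "{{ region }}",
+    "role_arn": "arn:aws:iam::{{ aws account id }}:role/{{ role name }}"
+  }
+}
+
+PUT _snapshot/{{ repository name }}/{{ snapshot name }}
+```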
+
## Best practices
Refer [Operational best practices for Amazon OpenSearch Service] and
@@ -242,6 +410,7 @@ can manage.
## Further readings
- [OpenSearch]
+- [ElasticSearch]
- [Hot-warm architecture]
- [Supported instance types in Amazon OpenSearch Service]
@@ -267,19 +436,25 @@ can manage.
-->
+[Cold storage]: #cold-storage
+[Index State Management plugin]: #index-state-management-plugin
[migrate indexes to ultrawarm storage]: #migrate-indexes-to-ultrawarm-storage
[ultrawarm storage]: #ultrawarm-storage
-[dedicated master nodes]: #dedicated-master-nodes
-[hot-warm architecture]: ../../opensearch.md#hot-warm-architecture
-[opensearch]: ../../opensearch.md
-[s3]: s3.md
+[Dedicated master nodes]: #dedicated-master-nodes
+[Hot-warm architecture]: ../../opensearch.md#hot-warm-architecture
+[ElasticSearch]: ../../elasticsearch.md
+[OpenSearch]: ../../opensearch.md
+[OpenSearch index state management]: ../../opensearch.md#index-state-management-plugin
+[OpenSearch snapshots]: ../../opensearch.md#snapshots
+[S3]: s3.md
[best practices for configuring your amazon opensearch service domain]: https://aws.amazon.com/blogs/big-data/best-practices-for-configuring-your-amazon-opensearch-service-domain/
[cold storage for amazon opensearch service]: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/cold-storage.html
+[Creating index snapshots in Amazon OpenSearch Service]: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-snapshots.html
[dedicated master nodes in amazon opensearch service]: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/managedomains-dedicatedmasternodes.html
[how do i reduce the cost of using opensearch service domains?]: https://repost.aws/knowledge-center/opensearch-domain-pricing
[index state management in amazon opensearch service]: https://docs.aws.amazon.com/opensearch-service/latest/developerguide/ism.html
diff --git a/knowledge base/elasticsearch.md b/knowledge base/elasticsearch.md
index e9dd44f..63e6fd9 100644
--- a/knowledge base/elasticsearch.md
+++ b/knowledge base/elasticsearch.md
@@ -9,7 +9,8 @@ Part of the Elastic Stack along with Beats, [Kibana] and [Logstash].
Use cases: application search, log analytics, data observability, data ingestion, others.
-Forked by Amazon into OpenSearch after Elastic's [2021 license change announcement].
+[Forked by Amazon][stepping up for a truly open source elasticsearch] into OpenSearch after Elastic's
+[2021 license change announcement][elastic license update].
1. [TL;DR](#tldr)
1. [Further readings](#further-readings)
@@ -50,7 +51,7 @@ Forked by Amazon into OpenSearch after Elastic's [2021 license change announceme
## Further readings
- [Website]
-- [Main repository]
+- [Codebase]
- [OpenSearch], open source fork by Amazon
- [Beats], [Kibana] and [Logstash]: the rest of the Elastic stack
@@ -63,15 +64,16 @@ Forked by Amazon into OpenSearch after Elastic's [2021 license change announceme
-[beats]: beats.md
-[kibana]: kibana.md
-[logstash]: logstash.md
-[opensearch]: opensearch.md
+[Beats]: beats.md
+[Kibana]: kibana.md
+[Logstash]: logstash.md
+[OpenSearch]: opensearch.md
-[2021 license change announcement]: https://www.elastic.co/blog/elastic-license-update
-[main repository]: https://github.com/elastic/elasticsearch
-[website]: https://www.elastic.co/elasticsearch
+[Codebase]: https://github.com/elastic/elasticsearch
+[Elastic License Update]: https://www.elastic.co/blog/elastic-license-update
+[Website]: https://www.elastic.co/elasticsearch
+[Stepping up for a truly open source Elasticsearch]: https://aws.amazon.com/blogs/opensource/stepping-up-for-a-truly-open-source-elasticsearch/
diff --git a/knowledge base/opensearch.md b/knowledge base/opensearch.md
index 0bfbd8a..c8b4500 100644
--- a/knowledge base/opensearch.md
+++ b/knowledge base/opensearch.md
@@ -7,12 +7,14 @@ Makes it easy to ingest, search, visualize, and analyze data.
Use cases: application search, log analytics, data observability, data ingestion, others.
1. [TL;DR](#tldr)
-1. [Node types](#node-types)
-1. [Indexes](#indexes)
+1. [Concepts](#concepts)
+ 1. [Node types](#node-types)
+ 1. [Indexes](#indexes)
1. [Setup](#setup)
1. [The split brain problem](#the-split-brain-problem)
1. [Tuning](#tuning)
1. [Hot-warm architecture](#hot-warm-architecture)
+1. [Manage indexes](#manage-indexes)
1. [Index templates](#index-templates)
1. [Composable index templates](#composable-index-templates)
1. [Ingest data](#ingest-data)
@@ -20,6 +22,9 @@ Use cases: application search, log analytics, data observability, data ingestion
1. [Re-index data](#re-index-data)
1. [Data streams](#data-streams)
1. [Index patterns](#index-patterns)
+1. [Index state management plugin](#index-state-management-plugin)
+1. [Snapshots](#snapshots)
+ 1. [Take snapshots](#take-snapshots)
1. [APIs](#apis)
1. [Further readings](#further-readings)
1. [Sources](#sources)
@@ -131,7 +136,9 @@ If indexes do not already exist, OpenSearch automatically creates them while [in
-## Node types
+## Concepts
+
+### Node types
| Node type | Description | Best practices for production |
| ------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
@@ -154,9 +161,7 @@ After assessing all requirements, it is suggested to use benchmark testing tools
Provision a small sample cluster and run tests with varying workloads and configurations. Compare and analyze the system
and query metrics for these tests improve upon the architecture.
-## Indexes
-
-Refer [Managing indexes].
+### Indexes
Indexes are collections of documents that one wants to make searchable.
They organize the data for fast retrieval.
@@ -430,6 +435,16 @@ Refer [Elasticsearch Split Brain] and [Avoiding the Elasticsearch split brain pr
Refer [Set up a hot-warm architecture].
+Enables using the [Index State Management plugin] to automatically migrate indexes to lower storage states after they
+meet specific conditions.
+
+## Manage indexes
+
+Refer [Managing indexes].
+
+If using the [hot-warm architecture], leverage the [Index State Management plugin] to automatically migrate indexes to
+lower storage states after they meet specific conditions.
+
## Index templates
Refer [Index templates][documentation index templates].
@@ -542,7 +557,7 @@ If the destination index requires field mappings or custom settings, (re)create
with the desired ones.
- Reindex all documents
+ Re-index all documents
Copy **all** documents from one index to another.
@@ -583,7 +598,7 @@ POST _reindex
- Reindex only unique documents
+ Re-index only unique documents
Copy **only** documents **missing** from a destination index by setting the `op_type` option to `create`.
@@ -886,6 +901,83 @@ They require data to be indexed before creation.
+## Index state management plugin
+
+Requires the cluster to use the [hot-warm architecture].
+
+Refer [Index State Management][documentation index state management].
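+
+A minimal sketch of a policy that moves indexes to warm nodes and later deletes them could look like the following.\
+It assumes warm nodes are tagged with the custom `temp: warm` node attribute. The policy name, the `logs-*` index
+pattern, and the index ages are also made up for the example.
+
+```plaintext
+PUT _plugins/_ism/policies/hot-warm-delete
+{
+  "policy": {
+    "description": "Example only: hot -> warm -> delete",
+    "default_state": "hot",
+    "states": [
+      {
+        "name": "hot",
+        "actions": [],
+        "transitions": [{ "state_name": "warm", "conditions": { "min_index_age": "7d" } }]
+      },
+      {
+        "name": "warm",
+        "actions": [{ "allocation": { "require": { "temp": "warm" } } }],
+        "transitions": [{ "state_name": "delete", "conditions": { "min_index_age": "30d" } }]
+      },
+      {
+        "name": "delete",
+        "actions": [{ "delete": {} }],
+        "transitions": []
+      }
+    ],
+    "ism_template": {
+      "index_patterns": ["logs-*"],
+      "priority": 100
+    }
+  }
+}
+```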
+
+## Snapshots
+
+Backups of a cluster's indexes and state.
+
+Index snapshots include the affected indexes' data.\
+State snapshots include cluster settings, node information, index metadata (mappings, settings, or templates), and
+shard allocation.
+
+Snapshots are typically used for:
+
+- Recovering from failure.
+- Migrating existing data from one cluster to another.
+
+One can take and restore snapshots using the snapshot-related [API][apis].\
+If needing to automate snapshot creation, one can use the [snapshot management] feature.
+
+> [!NOTE]
+> Snapshots **do take time to complete**.\
+> While a snapshot is in progress, the cluster continues to index documents and respond to requests. New documents, and
+> updates to existing documents, will **not** be included in the snapshot.
+
+Snapshots include primary shards as they existed when the cluster initiated the snapshot.\
+Depending on the size of the snapshot thread pool, different shards might be included in a snapshot at slightly
+different times.
+
+Snapshots are incremental, meaning that they only store data that has changed since the last successful snapshot.\
+The difference in disk usage between frequent and infrequent snapshots is often minimal. Users are advised to take
+snapshots frequently, rather than only once every so often.
+
+> [!IMPORTANT]
+> When deleting a snapshot, be sure to use the [API][apis] rather than navigating to the storage location and purging
+> files.\
+> Incremental snapshots often share a lot of the same data. When using the API, OpenSearch only removes data that no
+> other snapshot is using.
+
+Snapshots are stored in snapshot repositories.\
+Repositories are just storage locations. They can be shared file systems, S3 buckets, Hadoop Distributed File System
+(HDFS), or Azure Storage.
+
+Before one can take snapshots, one **must register** a snapshot repository in the cluster.
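+
+As a sketch, registering a shared file system repository could look like the request below.\
+The repository name and the `/mnt/snapshots` location are assumptions made for the example. The location must be listed
+in the `path.repo` setting of every node's `opensearch.yml`, and be reachable from all of them.
+
+```plaintext
+PUT _snapshot/my-fs-repository
+{
+  "type": "fs",
+  "settings": {
+    "location": "/mnt/snapshots"
+  }
+}
+
+GET _snapshot/my-fs-repository
+```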
+
+### Take snapshots
+
+Refer [Take and restore snapshots].
+
+When taking snapshots, one must specify the name of the snapshot repository and the name of the snapshot.
+
+
+
+This snapshot includes all indexes **and** the cluster's state:
+
+```plaintext
+PUT _snapshot/some-repository/1
+```
+
+Add a request body to include or exclude certain indexes, or specify other settings:
+
+```plaintext
+PUT /_snapshot/my-repository/2
+{
+ "indices": "opensearch-dashboards*,my-index*,-my-index-2016",
+ "ignore_unavailable": true,
+ "include_global_state": false,
+ "partial": false
+}
+```
+
+
+
+Check snapshots' progress with `GET _snapshot/_status`.
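+
+Restoring works along the same lines. The sketch below restores only some indexes from the snapshot taken above, and
+renames them on the way in to avoid clashing with open indexes of the same name.\
+The index pattern and the `restored-` prefix are assumptions made for the example.
+
+```plaintext
+POST _snapshot/my-repository/2/_restore
+{
+  "indices": "my-index*",
+  "ignore_unavailable": true,
+  "include_global_state": false,
+  "rename_pattern": "(.+)",
+  "rename_replacement": "restored-$1"
+}
+```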
+
## APIs
OpenSearch clusters offer a REST API.
@@ -1115,10 +1207,19 @@ DELETE /students/_doc/1
- Indexes
+ Indices
- View the inferred field types in indexes
+ List indices
+
+```plaintext
+GET _list/indices
+```
+
+
+
+
+ View the inferred field types in indices
Send `GET` requests to the `_mapping` endpoint:
@@ -1243,6 +1344,20 @@ DELETE /students
+
+ Snapshots
+
+`/_snapshot` endpoint.
+
+
+ Delete snapshots
+
+```plaintext
+DELETE _snapshot/repository-name/snapshot-name
+```
+
+
+
## Further readings
- [Website]
@@ -1273,6 +1388,7 @@ DELETE /students
- [Index templates][documentation index templates]
- [OpenSearch Data Streams]
- [OpenSearch Indexes and Data streams]
+- [Snapshot Operations in OpenSearch]