chore(aws/ssm): register how ssm worked before i forget

Michele Cereda
2024-04-22 21:26:46 +02:00
parent a7bf416580
commit 39a359b7f1
7 changed files with 213 additions and 34 deletions

2
.ansible-lint-ignore Normal file

@@ -0,0 +1,2 @@
ansible/playbooks/aws.ec2.enable-ssm-agent.yml package-latest
ansible/playbooks/keybase.register-device.yml no-changed-when


@@ -0,0 +1,39 @@
---
- name: Enable SSM management through agent
hosts: all
tasks:
- name: Install the SSM Agent
tags:
- agent
- package
become: true
ansible.builtin.package:
name: amazon-ssm-agent
state: latest
register: package
- name: Enable required services
tags:
- agent
- service
become: true
ansible.builtin.service:
name: amazon-ssm-agent.service
state: started
enabled: true
register: service
post_tasks:
- name: Check everything is working from the instance
tags:
- check
when:
- package is not failed
- service is not failed
block:
- name: Run the diagnostic command
become: true
ansible.builtin.command: ssm-cli get-diagnostics --output 'json'
register: diagnostics
changed_when: false
- name: Show the results
ansible.builtin.debug:
var: diagnostics.stdout


@@ -29,6 +29,7 @@
- name: Get the tools' label - name: Get the tools' label
register: cli_tools_label register: cli_tools_label
ansible.builtin.shell: >- ansible.builtin.shell: >-
set -o pipefail && \
/usr/sbin/softwareupdate --list /usr/sbin/softwareupdate --list
| grep -B 1 -E 'Command Line Tools' | grep -B 1 -E 'Command Line Tools'
| awk -F'*' '/^ *\\*/ {print $2}' | awk -F'*' '/^ *\\*/ {print $2}'


@@ -100,9 +100,11 @@ ansible-galaxy remove 'namespace.role'
## Configuration ## Configuration
Ansible can be configured using INI files named `ansible.cfg`, environment variables, command-line options, playbook keywords, and variables. Ansible can be configured using INI files named `ansible.cfg`, environment variables, command-line options, playbook
keywords, and variables.
The `ansible-config` utility allows to see all the configuration settings available, their defaults, how to set them and where their current value comes from. The `ansible-config` utility allows to see all the configuration settings available, their defaults, how to set them and
where their current value comes from.
Ansible will process the following list and use the first file found; all the other files are ignored even if existing: Ansible will process the following list and use the first file found; all the other files are ignored even if existing:
@@ -111,7 +113,7 @@ Ansible will process the following list and use the first file found; all the ot
1. the `~/.ansible.cfg` file in the user's home directory; 1. the `~/.ansible.cfg` file in the user's home directory;
1. the `/etc/ansible/ansible.cfg` file. 1. the `/etc/ansible/ansible.cfg` file.
One can generate a fully commented-out example of the `ansible.cfg` file: Generate a fully commented-out example of the `ansible.cfg` file:
```sh ```sh
ansible-config init --disabled > 'ansible.cfg' ansible-config init --disabled > 'ansible.cfg'
@@ -199,6 +201,12 @@ Return a boolean result.
# Compare semver version numbers. # Compare semver version numbers.
- ansible.builtin.debug: - ansible.builtin.debug:
var: "'2.0.0-rc.1+build.123' is version('2.1.0-rc.2+build.423', 'ge', version_type='semver')" var: "'2.0.0-rc.1+build.123' is version('2.1.0-rc.2+build.423', 'ge', version_type='semver')"
# Find specific values in JSON objects.
- ansible.builtin.command: ssm-cli get-diagnostics --output 'json'
become: true
register: diagnostics
failed_when: diagnostics.stdout | from_json | community.general.json_query("DiagnosticsOutput[?Status=='Failed']") | length > 0
``` ```
### Loops ### Loops
@@ -293,12 +301,12 @@ stdout_callback = json
`yaml` will set tasks output only to be in the defined format: `yaml` will set tasks output only to be in the defined format:
```sh ```sh
$ ANSIBLE_STDOUT_CALLBACK='yaml' ansible-playbook --inventory='localhost.localdomain,' 'localhost.configure.yml' -vv --check $ ANSIBLE_STDOUT_CALLBACK='yaml' ansible-playbook --inventory='localhost,' 'localhost.configure.yml' -vv --check
PLAY [Configure localhost] ******************************************************************* PLAY [Configure localhost] *******************************************************************
TASK [Upgrade system packages] *************************************************************** TASK [Upgrade system packages] ***************************************************************
task path: /home/user/localhost.configure.yml:7 task path: /home/user/localhost.configure.yml:7
ok: [localhost.localdomain] => changed=false ok: [localhost] => changed=false
cmd: cmd:
- /usr/bin/zypper - /usr/bin/zypper
- --quiet - --quiet
@@ -310,7 +318,7 @@ ok: [localhost.localdomain] => changed=false
The `json` output format will be a single, long JSON file: The `json` output format will be a single, long JSON file:
```sh ```sh
$ ANSIBLE_STDOUT_CALLBACK='json' ansible-playbook --inventory='localhost.localdomain,' 'localhost.configure.yml' -vv --check $ ANSIBLE_STDOUT_CALLBACK='json' ansible-playbook --inventory='localhost,' 'localhost.configure.yml' -vv --check
{ {
"custom_stats": {}, "custom_stats": {},
"global_custom_stats": {}, "global_custom_stats": {},
@@ -323,7 +331,7 @@ $ ANSIBLE_STDOUT_CALLBACK='json' ansible-playbook --inventory='localhost.localdo
"tasks": [ "tasks": [
{ {
"hosts": { "hosts": {
"localhost.localdomain": { "localhost": {
"action": "community.general.zypper", "action": "community.general.zypper",
"changed": false, "changed": false,
@@ -397,7 +405,8 @@ Use the special `X` mode setting in the `file` plugin:
### Only run a task when another has a specific result ### Only run a task when another has a specific result
When a task executes, it also stores the two special values `changed` and `failed` in its results. You can use those as conditions to execute the next ones: When a task executes, it also stores the two special values `changed` and `failed` in its results.<br/>
One can use those as conditions to execute the next ones:
```yaml ```yaml
- name: Trigger task - name: Trigger task
@@ -463,7 +472,9 @@ Environment variables can be set at a play, block, or task level using the `envi
ansible.builtin.command: curl ifconfig.io ansible.builtin.command: curl ifconfig.io
``` ```
The `environment` keyword does not affect Ansible itself or its configuration settings, the environment for other users, or the execution of other plugins like lookups and filters; variables set with `environment` do not automatically become Ansible facts, even when set at the play level. The `environment` keyword does **not** affect Ansible itself or its configuration settings, the environment for other
users, or the execution of other plugins like lookups and filters.<br/>
Variables set with `environment` do **not** automatically become Ansible facts, even when set at the play level.
### Set variables to the value of environment variables ### Set variables to the value of environment variables
@@ -484,7 +495,8 @@ Use the `lookup()` plugin with the `env` option:
### Define different values for `true`/`false`/`null` ### Define different values for `true`/`false`/`null`
Create a test and define two values: the first will be returned when the test returns `true`, the second will be returned when the test returns `false` (Ansible 1.9+): Create a test and define two values: the first will be returned when the test returns `true`, the second will be
returned when the test returns `false` (Ansible 1.9+):
```yaml ```yaml
{{ (ansible_pkg_mgr == 'zypper') | ternary('gnu_parallel', 'parallel') }} {{ (ansible_pkg_mgr == 'zypper') | ternary('gnu_parallel', 'parallel') }}
@@ -523,11 +535,13 @@ Use the `ansible.builtin.copy` instead of `ansible.builtin.template`:
Root Cause: Root Cause:
> Mac OS High Sierra and later versions have restricted multithreading for improved security.<br/> > Mac OS High Sierra and later versions have restricted multithreading for improved security.<br/>
> Apple has defined some rules on what is allowed and not is not after forking processes, and have also added `async-signal-safety` to a limited number of APIs. > Apple has defined some rules on what is allowed and not is not after forking processes, and have also added
> `async-signal-safety` to a limited number of APIs.
Solution: Solution:
Disable fork initialization safety features as shown in [Why Ansible and Python fork break on macOS High Sierra+ and how to solve]: Disable fork initialization safety features as shown in
[Why Ansible and Python fork break on macOS High Sierra+ and how to solve]\:
```sh ```sh
export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES
@@ -553,7 +567,8 @@ For **remote** files, use the [`slurp` module][slurp]:
### Only run a task when explicitly requested ### Only run a task when explicitly requested
Leverage the [`never` tag][special tags: always and never] to never execute the task unless requested by using the `--tags 'never'` option: Leverage the [`never` tag][special tags: always and never] to never execute the task unless requested by using the
`--tags 'never'` option:
```yaml ```yaml
- tags: never - tags: never
@@ -572,7 +587,13 @@ Conversely, one can achieve the opposite by using the `always` tag and the `--sk
Message example: Message example:
> fatal: \[i-4ccab452bb7743336]: UNREACHABLE! => {"changed": false, "msg": "Failed to create temporary directory. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \"/tmp\", for more error information use -vvv. Failed command was: ( umask 77 && mkdir -p \"` echo \u001b]0;@ip-192-168-42-42:/usr/bin\u0007/home/centos/.ansible/tmp `\"&& mkdir \"` echo \u001b]0;@ip-192-168-42-42:/usr/bin\u0007/home/centos/.ansible/tmp/ansible-tmp-1708603630.2433128-49665-225488680421418 `\" && echo ansible-tmp-1708603630.2433128-49665-225488680421418=\"` echo \u001b]0;@ip-192-168-42-42:/usr/bin\u0007/home/centos/.ansible/tmp/ansible-tmp-1708603630.2433128-49665-225488680421418 `\" ), exited with result 1, stdout output: \u001b]0;@ip-192-168-42-42:/usr/bin\u0007bash: @ip-192-168-42-42:/usr/bin/home/centos/.ansible/tmp: No such file or directory\r\r\nmkdir: cannot create directory '0': Permission denied\r\r", "unreachable": true} > ```plaintext
> fatal: [i-4ccab452bb7743336]: UNREACHABLE! => {
> "changed": false,
> "msg": "Failed to create temporary directory. In some cases, you may have been able to authenticate and did not have permissions on the target directory. Consider changing the remote tmp path in ansible.cfg to a path rooted in \"/tmp\", for more error information use -vvv. Failed command was: ( umask 77 && mkdir -p \"` echo \u001b]0;@ip-192-168-42-42:/usr/bin\u0007/home/centos/.ansible/tmp `\"&& mkdir \"` echo \u001b]0;@ip-192-168-42-42:/usr/bin\u0007/home/centos/.ansible/tmp/ansible-tmp-1708603630.2433128-49665-225488680421418 `\" && echo ansible-tmp-1708603630.2433128-49665-225488680421418=\"` echo \u001b]0;@ip-192-168-42-42:/usr/bin\u0007/home/centos/.ansible/tmp/ansible-tmp-1708603630.2433128-49665-225488680421418 `\" ), exited with result 1, stdout output: \u001b]0;@ip-192-168-42-42:/usr/bin\u0007bash: @ip-192-168-42-42:/usr/bin/home/centos/.ansible/tmp: No such file or directory\r\r\nmkdir: cannot create directory '0': Permission denied\r\r",
> "unreachable": true
> }
> ```
Root cause: Root cause:


@@ -32,7 +32,7 @@ aws ec2 describe-instances --output text \
# Show images details. # Show images details.
aws ec2 describe-images --image-ids 'ami-8b8c57f8' aws ec2 describe-images --image-ids 'ami-8b8c57f8'
aws ec2 describe-images --filters \ aws ec2 describe-images --filters \
'Name=name,Values=["al2023-ami-*"]' \ 'Name=name,Values=["al2023-ami-minimal-*"]' \
'Name=owner-alias,Values=["amazon"]' \ 'Name=owner-alias,Values=["amazon"]' \
'Name=architecture,Values=["arm64","x86_64"]' \ 'Name=architecture,Values=["arm64","x86_64"]' \
'Name=block-device-mapping.volume-type,Values=["gp3"]' 'Name=block-device-mapping.volume-type,Values=["gp3"]'
@@ -49,6 +49,7 @@ See [EBS].
- [AWS EC2 Instance pricing comparison] - [AWS EC2 Instance pricing comparison]
- [EC2Instances.info on vantage.sh] - [EC2Instances.info on vantage.sh]
- [SSM] - [SSM]
- [Connect to your instances without requiring a public IPv4 address using EC2 Instance Connect Endpoint]
### Sources ### Sources
@@ -67,6 +68,7 @@ See [EBS].
<!-- Files --> <!-- Files -->
<!-- Upstream --> <!-- Upstream -->
[connect to your instances without requiring a public ipv4 address using ec2 instance connect endpoint]: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/connect-with-ec2-instance-connect-endpoint.html
[describe-images]: https://docs.aws.amazon.com/cli/latest/reference/ec2/describe-images.html [describe-images]: https://docs.aws.amazon.com/cli/latest/reference/ec2/describe-images.html
[describeimages]: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeImages.html [describeimages]: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_DescribeImages.html
[using instance profiles]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html [using instance profiles]: https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_use_switch-role-ec2_instance-profiles.html


@@ -1,30 +1,16 @@
# SSM # SSM
1. [TL;DR](#tldr) 1. [TL;DR](#tldr)
1. [Requirements](#requirements)
1. [Gotchas](#gotchas) 1. [Gotchas](#gotchas)
1. [Integrate with Ansible](#integrate-with-ansible) 1. [Integrate with Ansible](#integrate-with-ansible)
1. [Troubleshooting](#troubleshooting)
1. [Check node availability using `ssm-cli`](#check-node-availability-using-ssm-cli)
1. [Further readings](#further-readings) 1. [Further readings](#further-readings)
1. [Sources](#sources) 1. [Sources](#sources)
## TL;DR ## TL;DR
<details>
<summary>Requirements</summary>
- The IAM instance profile must have the correct permissions.<br/>
FIXME: specify.
- One's instance's security group and VPC must allow HTTPS outbound traffic on port 443 to the Systems Manager's
endpoints:
- `ssm.eu-west-1.amazonaws.com`
- `ec2messages.eu-west-1.amazonaws.com`
- `ssmmessages.eu-west-1.amazonaws.com`
If the VPC does not have internet access, one must have enabled VPC endpoints to allow that outbound traffic from the
instance.
- Also see <https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/connect-with-ec2-instance-connect-endpoint.html>
</details>
<details> <details>
<summary>Usage</summary> <summary>Usage</summary>
@@ -63,6 +49,75 @@ aws ssm send-command --instance-ids "i-08fc83ad07487d72f" \
</details> </details>
## Requirements
For an instance to be managed by Systems Manager and be available in lists of managed nodes, it must:
- Run a supported operating system.
- Have the SSM Agent installed **and running**.
```sh
sudo dnf -y install 'amazon-ssm-agent'
sudo systemctl enable --now 'amazon-ssm-agent.service'
```
- Have an AWS IAM instance profile attached with the correct permissions.<br/>
The instance profile enables the instance to communicate with the Systems Manager service.
**Alternatively**, the instance must be registered to Systems Manager using hybrid activation.
The minimum permissions required are given by the Amazon-provided `AmazonSSMManagedInstanceCore` policy
(`arn:aws:iam::aws:policy/AmazonSSMManagedInstanceCore`).
- Be able to connect to a Systems Manager endpoint through the SSM Agent in order to register with the service.<br/>
From there, the instance must stay available to the service. The service confirms this by sending a signal every
five minutes to check the instance's health.
After the status of a managed node has been `Connection Lost` for at least 30 days, the node could be removed from the
Fleet Manager console.<br/>
To restore it to the list, resolve the issues that caused the lost connection.
Check whether SSM Agent successfully registered with the Systems Manager service by executing the `aws ssm
describe-instance-associations-status` command.<br/>
It won't return results until a successful registration has taken place.
```sh
aws ssm describe-instance-associations-status --instance-id 'instance-id'
```
<details>
<summary>Failed invocation</summary>
```json
{
"InstanceAssociationStatusInfos": []
}
```
</details>
<details>
<summary>Successful invocation</summary>
```json
{
"InstanceAssociationStatusInfos": [
{
"AssociationId": "51f0ed7e-c236-4c34-829d-e8f2a7a3bb4a",
"Name": "AWS-GatherSoftwareInventory",
"DocumentVersion": "1",
"AssociationVersion": "2",
"InstanceId": "i-0123456789abcdef0",
"ExecutionDate": "2024-04-22T14:41:37.313000+02:00",
"Status": "Success",
"ExecutionSummary": "1 out of 1 plugin processed, 1 success, 0 failed, 0 timedout, 0 skipped. ",
"AssociationName": "InspectorInventoryCollection-do-not-delete"
}
]
}
```
</details>
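The registration check above can be wrapped in a small helper for scripts. This is a minimal sketch, assuming that an empty `InstanceAssociationStatusInfos` list means the agent has not registered yet, as in the failed invocation shown; `is_registered` is a hypothetical function name, not part of the AWS CLI.

```sh
#!/bin/sh
# Hypothetical helper: succeed only when the instance reports at least one
# association status.
# Assumption: an empty list means registration has not taken place yet.
is_registered() {
  # Let the CLI count the list entries client-side through a JMESPath query.
  count="$(
    aws ssm describe-instance-associations-status \
      --instance-id "$1" \
      --query 'length(InstanceAssociationStatusInfos)' \
      --output 'text'
  )" || return 2
  [ "${count:-0}" -gt 0 ]
}
```

Usage could look like `is_registered 'i-0123456789abcdef0' && echo 'registered'`. The helper returns `2` when the `aws` call itself fails, so callers can tell an API error apart from a missing registration.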
## Gotchas ## Gotchas
- SSM starts shell sessions under `/usr/bin` - SSM starts shell sessions under `/usr/bin`
@@ -129,6 +184,58 @@ Pitfalls:
This, or use the shell profiles in [SSM's preferences][session manager preferences] to change the directory when This, or use the shell profiles in [SSM's preferences][session manager preferences] to change the directory when
logged in. logged in.
## Troubleshooting
Refer to [Troubleshooting managed node availability].
1. Check the [Requirements] are satisfied.
1. [Check node availability using `ssm-cli`][check node availability using ssm-cli].
### Check node availability using `ssm-cli`
Refer to
[Troubleshooting managed node availability using `ssm-cli`][troubleshooting managed node availability using ssm-cli].
From the managed instance:
```sh
$ sudo dnf -y install 'amazon-ssm-agent'
$ sudo systemctl enable --now 'amazon-ssm-agent.service'
$ sudo ssm-cli get-diagnostics --output 'table'
┌──────────────────────────────────────┬─────────┬─────────────────────────────────────────────────────────────────────┐
│ Check │ Status │ Note │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ EC2 IMDS │ Success │ IMDS is accessible and has instance id i-0123456789abcdef0 in │
│ │ │ region eu-west-1 │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ Hybrid instance registration │ Skipped │ Instance does not have hybrid registration │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ Connectivity to ssm endpoint │ Success │ ssm.eu-west-1.amazonaws.com is reachable │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ Connectivity to ec2messages endpoint │ Success │ ec2messages.eu-west-1.amazonaws.com is reachable │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ Connectivity to ssmmessages endpoint │ Success │ ssmmessages.eu-west-1.amazonaws.com is reachable │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ Connectivity to s3 endpoint │ Success │ s3.eu-west-1.amazonaws.com is reachable │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ Connectivity to kms endpoint │ Success │ kms.eu-west-1.amazonaws.com is reachable │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ Connectivity to logs endpoint │ Success │ logs.eu-west-1.amazonaws.com is reachable │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ Connectivity to monitoring endpoint │ Success │ monitoring.eu-west-1.amazonaws.com is reachable │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ AWS Credentials │ Success │ Credentials are for │
│ │ │ arn:aws:sts::012345678901:assumed-role/managed/i-0123456789abcdef0 │
│ │ │ and will expire at 2024-04-22 18:19:48 +0000 UTC │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ Agent service │ Success │ Agent service is running and is running as expected user │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ Proxy configuration │ Skipped │ No proxy configuration detected │
├──────────────────────────────────────┼─────────┼─────────────────────────────────────────────────────────────────────┤
│ SSM Agent version │ Success │ SSM Agent version is 3.3.131.0 which is the latest version │
└──────────────────────────────────────┴─────────┴─────────────────────────────────────────────────────────────────────┘
```
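When only the problems matter, the table can be filtered down to the checks that did not pass. This is a minimal sketch, assuming the table layout shown above (cells delimited by `│`) and that unsuccessful checks are reported with the `Failed` status; `failed_checks` is a hypothetical helper name.

```sh
#!/bin/sh
# Hypothetical helper: print the name of every check whose status is 'Failed'.
# Reads the 'table' output of 'ssm-cli get-diagnostics' from standard input.
failed_checks() {
  awk -F'│' '
    # Data rows hold at least: empty, check, status, note.
    NF >= 4 {
      check = $2; status = $3
      # Trim the padding spaces around the cells.
      gsub(/^ +| +$/, "", check)
      gsub(/^ +| +$/, "", status)
      if (status == "Failed") print check
    }
  '
}
```

Usage could look like `sudo ssm-cli get-diagnostics --output 'table' | failed_checks`. Border rows and continuation rows of multi-line notes are skipped, since they have no status cell matching `Failed`.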
## Further readings ## Further readings
- [Ansible] - [Ansible]
@@ -140,23 +247,29 @@ Pitfalls:
- [Using Ansible in AWS] - [Using Ansible in AWS]
- [How can i change the session manager shell to BASH on EC2 linux instances?] - [How can i change the session manager shell to BASH on EC2 linux instances?]
- [Using Ansible in AWS] - [Using Ansible in AWS]
- [Troubleshooting managed node availability]
- [Troubleshooting managed node availability using `ssm-cli`][troubleshooting managed node availability using ssm-cli]
<!-- <!--
References References
--> -->
<!-- In-article sections --> <!-- In-article sections -->
[check node availability using ssm-cli]: #check-node-availability-using-ssm-cli
[gotchas]: #gotchas [gotchas]: #gotchas
[requirements]: #requirements
<!-- Knowledge base --> <!-- Knowledge base -->
[ansible]: ../../ansible.md [ansible]: ../../ansible.md
[ec2]: ec2.md [ec2]: ec2.md
<!-- Upstream --> <!-- Upstream -->
[start a session]: https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-working-with-sessions-start.html
[session manager preferences]: https://console.aws.amazon.com/systems-manager/session-manager/preferences
[aws_ssm connection plugin notes]: https://docs.ansible.com/ansible/latest/collections/community/aws/aws_ssm_connection.html#notes [aws_ssm connection plugin notes]: https://docs.ansible.com/ansible/latest/collections/community/aws/aws_ssm_connection.html#notes
[community.aws.aws_ssm connection]: https://docs.ansible.com/ansible/latest/collections/community/aws/aws_ssm_connection.html [community.aws.aws_ssm connection]: https://docs.ansible.com/ansible/latest/collections/community/aws/aws_ssm_connection.html
[session manager preferences]: https://console.aws.amazon.com/systems-manager/session-manager/preferences
[start a session]: https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-working-with-sessions-start.html
[troubleshooting managed node availability using ssm-cli]: https://docs.aws.amazon.com/systems-manager/latest/userguide/ssm-cli.html
[troubleshooting managed node availability]: https://docs.aws.amazon.com/systems-manager/latest/userguide/troubleshooting-managed-instances.html
<!-- Others --> <!-- Others -->
[ansible temp dir change]: https://devops.stackexchange.com/questions/10703/ansible-temp-dir-change [ansible temp dir change]: https://devops.stackexchange.com/questions/10703/ansible-temp-dir-change


@@ -0,0 +1 @@
https://www.kcl-lang.io/docs/user_docs/getting-started/intro