
The Hive - Snapshot + Restore with ES on AWS

The Hive + Elasticsearch

In this post, I will be touching on setting up ES snapshot + restore functionality for The Hive + Cortex. There’s no documentation for this on GitHub, so I thought I’d take a stab at it for those who are working on something similar. Since The Hive is an incident management platform holding plenty of information and IOCs, we want to make sure it’s regularly backed up in case of failure. I will also touch on setting up backups in a shared folder.

According to Elasticsearch, you can take a snapshot of individual indices or of the entire cluster and store it in a repository on a shared filesystem. You just have to set up a mounted folder that all nodes can read from and write to, and register it as a repository. ES avoids copying data that already exists in the repository, so it’s fine to take snapshots of your cluster as frequently as needed. For remote repository support, you can save to S3, HDFS, Azure, GCS and more.

If you’ve installed ES on Linux, the default data folder is /var/lib/elasticsearch (CentOS) or /var/lib/elasticsearch/data (Ubuntu). To understand more about ES data storage, read this article: https://www.elastic.co/blog/found-dive-into-elasticsearch-storage.

You can also look at stats here:

curl http://127.0.0.1:9200/_nodes/stats/fs?pretty

Goals:

  1. Take a snapshot of our Hive cluster every 7 days and store it in either a folder (a) or an S3 bucket (b). (A snapshot is simply a backup of the data.)
  2. Successfully restore our data as needed.

Store backup data in an S3 bucket in AWS

1. Install the repository-s3 plugin.

This plugin lets us use the snapshot + restore functionality with an S3 bucket.

Install the plugin:

cd /usr/share/elasticsearch/bin
sudo ./elasticsearch-plugin install repository-s3

Check to make sure you have the plugin:
./elasticsearch-plugin list
  • The 3 available commands for elasticsearch-plugin are: install, remove, list

1a. Restart and check to make sure the plugin shows up

You will need this plugin on every single one of your ES nodes.

systemctl restart elasticsearch

curl 'http://localhost:9200/_nodes?filter_path=nodes.*.plugins' | jq .

2. Create a bucket in Amazon S3

  • Name it hive-es-s3-repository

3. Set up IAM Role + Policy

  • You need to create an IAM role with a policy that’ll allow ES to write to your bucket.
  • Policies -> Create Policy -> Create your own policy -> hive-es-s3-repository
{
    "Version":"2012-10-17",
    "Statement": [
        {
            "Action": [ "s3:ListBucket" ],
            "Effect": "Allow",
            "Resource": [ "arn:aws:s3:::hive-es-s3-repository" ]
        },
        {
            "Action": ["s3:GetObject",
                       "s3:PutObject",
                       "s3:DeleteObject",
                       "iam:PassRole"
                      ],
            "Effect": "Allow",
            "Resource": [
                "arn:aws:s3:::hive-es-s3-repository/*"
            ]
        }
    ]
}

You can also use the AWS CLI to create the role, with a trust policy that allows the ES service to assume it:

aws iam create-role --role-name hive-es-s3-repository --assume-role-policy-document '{"Version": "2012-10-17", "Statement": [{"Sid": "", "Effect": "Allow", "Principal": {"Service": "es.amazonaws.com"}, "Action": "sts:AssumeRole"}]}'
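
If you’d rather stay on the CLI for the policy too, something like the following should work (the file name hive-es-s3-policy.json and the account ID 123456789012 are placeholders; save the JSON policy above into that file first):

# Create the managed policy from the JSON document above (hypothetical local file name)
aws iam create-policy --policy-name hive-es-s3-repository --policy-document file://hive-es-s3-policy.json

# Attach it to the role created above (replace the account ID with your own)
aws iam attach-role-policy --role-name hive-es-s3-repository --policy-arn arn:aws:iam::123456789012:policy/hive-es-s3-repository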

4. Back in the IAM console -> choose the Roles tab and refresh:

  • Copy the role’s Amazon Resource Name (ARN)
  • Choose Attach Policy -> select the hive-es-s3-repository policy -> choose Attach Policy again
  • You have now set up a trust relationship that allows Amazon ES to assume the role and write to your S3 bucket

5. Set up Repository in Amazon ES

curl -XPUT 'http://localhost:9200/_snapshot/hive-es-s3-repository' -H 'Content-Type: application/json' -d' {
    "type": "s3",
    "settings": {
        "bucket": "hive-es-s3-repository",
        "region": "us-west-2",
        "role_arn": "arn:aws:iam::123456789012:role/hive-es-s3-repository"
    }
}'
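
To double-check that the repository registered correctly, you can read its settings back and ask ES to verify that it can actually write to it:

curl -XGET 'http://localhost:9200/_snapshot/hive-es-s3-repository?pretty'
curl -XPOST 'http://localhost:9200/_snapshot/hive-es-s3-repository/_verify?pretty'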

6. Back up an index

  • Use the _snapshot API to create a snapshot of your index
  • By default, ES snapshots the entire cluster’s indexes

curl -XPUT localhost:9200/_snapshot/hive-es-s3-repository/hive_snapshot -H 'Content-Type: application/json' -d '{
  "indices": "the_hive_15",
  "ignore_unavailable": true,
  "include_global_state": false
}'

7. Check the status of your backup

curl -XGET localhost:9200/_snapshot/hive-es-s3-repository/_status
curl -XGET localhost:9200/_snapshot/hive-es-s3-repository/_all

8. Restore your index to the same cluster

curl -XPOST localhost:9200/_snapshot/hive-es-s3-repository/hive_snapshot/_restore -H 'Content-Type: application/json' -d '{
  "indices": "the_hive_15",
  "ignore_unavailable": false,
  "include_global_state": false
}'

When you restore, you specify the snapshot, and also any indexes that are in the snapshot. You do this in much the same way as when you took the snapshot in the first place.
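
One thing to keep in mind: ES will not restore into an index that already exists and is open. If you don’t want to close the live index, a workaround is to restore the snapshot under a different name using rename_pattern/rename_replacement (the restored_ prefix below is just an example):

curl -XPOST localhost:9200/_snapshot/hive-es-s3-repository/hive_snapshot/_restore -H 'Content-Type: application/json' -d '{
  "indices": "the_hive_15",
  "rename_pattern": "the_hive_(.+)",
  "rename_replacement": "restored_the_hive_$1"
}'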

9. Delete snapshots you don’t need

curl -XDELETE localhost:9200/_snapshot/hive-es-s3-repository/hive_snapshot
curl -XDELETE localhost:9200/_snapshot/hive-es-s3-repository/_all

Store backup data in an NFS folder

1. Create a folder that’s accessible by ALL ES nodes at the same mount point

mkdir -p /opt/backup

1a. Give Elasticsearch permissions to write to this directory

chmod 777 /opt/backup/
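
chmod 777 works, but it leaves the directory wide open. A tighter option, assuming ES was installed from the official packages and runs as the elasticsearch user (the default), is to hand ownership to that user:

# Assumes the service runs as user/group "elasticsearch" (package-install default)
chown -R elasticsearch:elasticsearch /opt/backup
chmod 750 /opt/backup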

1b. If you have multiple nodes/clusters, create a shared NFS folder

Note: you will need a shared volume, e.g. an NFS volume, behind the repository path that all nodes can access. That way, if Node1 writes a file, it will be visible to Node2 and Node3. A directory on the local filesystem will therefore not work, even if the path is identical on all machines.

“In order to register the shared file system repository it is necessary to mount the same shared filesystem to the same location on all master and data nodes. This location (or one of its parent directories) must be registered in the path.repo setting on all master and data nodes.”

If you don’t do that, you will get an error. So depending on your setup, if you have many nodes, an S3 bucket might be the way to go; if you only have one node, then the folder might be easier.
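
For reference, a minimal NFS setup might look like the sketch below. The server IP (10.0.0.10) and export subnet are placeholders, and it assumes the NFS server and client packages (nfs-kernel-server / nfs-common on Ubuntu, nfs-utils on CentOS) are already installed:

# On the NFS server: export the backup directory to the ES subnet
echo '/opt/backup 10.0.0.0/24(rw,sync,no_subtree_check)' >> /etc/exports
exportfs -ra

# On every ES node: mount the share at the same path
mkdir -p /opt/backup
mount -t nfs 10.0.0.10:/opt/backup /opt/backup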

2. Add path.repo to the Elasticsearch configuration file:

cat >> /etc/elasticsearch/elasticsearch.yml  << EOF
path.repo: ["/opt/backup/"]
EOF

OR edit your `elasticsearch.yml` directly and add the path to “path.repo”:

path.repo: ["/opt/backup/"]

# Shared repo
# path.repo: ["/mount/backups", "/mount/longterm_backups"]

# Microsoft Windows UNC paths
# path.repo: ["\\\\MY_SERVER\\Snapshots"]

3. Restart the Elasticsearch service and create a repo we are going to use to store our snapshot. Check whether there’s an existing repo first:

curl -XGET 'http://localhost:9200/_snapshot/_all?pretty'

{}  = blank response indicating there’s no repo set up yet.

4. Create a repo on cluster

curl -X PUT "localhost:9200/_snapshot/hive-es-s3-repository?pretty" -H 'Content-Type: application/json' -d'
{
   "type": "fs",
   "settings": {
      "location": "/opt/backup",
      "compress": true
   }
}
'

Check that repo was created successfully:

curl -XGET 'http://localhost:9200/_snapshot/_all?pretty'

5. Create backup of entire cluster/specific index:

Entire cluster:

curl -XPUT  "http://localhost:9200/_snapshot/hive-es-s3-repository/hive?wait_for_completion=true" -H 'Content-Type: application/json'

Specific Index:

curl -XPUT 'http://localhost:9200/_snapshot/hive-es-s3-repository/snapshot_1?wait_for_completion=true&pretty'  -H 'Content-Type: application/json' -d '{
  "indices": "the_hive_15"
}'

6. List snapshots:

curl -XGET "http://localhost:9200/_snapshot/hive-es-s3-repository/_all?pretty"

7. Restore Process (moving our data from cluster1 to cluster2)

cd /opt/
ls
  backup

Compress our backup folder:
tar czf backup.tar.gz backup

Now scp this tar.gz file to cluster 2 and then decompress it on cluster2:
scp /opt/backup.tar.gz root@10.0.0.2:/root/
  • Remember to replace the 10.0.0.2 with your own ACTUAL IP ;)

8. ES Restore from Cluster 2 - 10.0.0.2

Log in to cluster2 and decompress the archive in the ES snapshot directory:

mkdir -p /backup2
mv ~/backup.tar.gz /backup2/
cd /backup2
tar xzf backup.tar.gz
chown -R elasticsearch:elasticsearch /backup2/backup

Add the path.repo in the Elasticsearch configuration file (elasticsearch.yml):

cat >> /etc/elasticsearch/elasticsearch.yml << EOF
path.repo: ["/backup2/backup"]
EOF

9. Restart ES

service elasticsearch  restart

10. Register it as a repository:

curl -XPUT -H "Content-Type: application/json;charset=UTF-8" 'http://10.0.0.2:9200/_snapshot/esrestore' -d '{
  "type": "fs",
  "settings": {
     "location": "/backup2/backup",
     "compress": true
  }
}'

Double check that it worked:

curl -XGET 'http://10.0.0.2:9200/_snapshot/_all?pretty'
Response:

{
    "esrestore": {
        "type": "fs",
        "settings": {
            "compress": "true",
            "location": "/backup2/backup"
        }
    }
}

11. Restore it from the repository

curl  -XPOST "http://10.0.0.2:9200/_snapshot/esrestore/hive/_restore?wait_for_completion=true"
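
To sanity-check the restore, list the indices on cluster2 and confirm the Hive index is back (index names will match whatever was in the snapshot):

curl -XGET 'http://10.0.0.2:9200/_cat/indices?v'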

12. ES Snapshots of specific indices (log in to cluster1)

curl -X PUT -H "Content-Type: application/json" "http://localhost:9200/_snapshot/hive-es-s3-repository/hive_2?wait_for_completion=true" -d '
{
  "indices": "index_1,index_2,index_3",
  "ignore_unavailable": true,
  "include_global_state": false
}'

Cron jobs

This will create a snapshot in the “the_hive_backup” repository (register that repo first, the same way as the ones above) with a name like “hive-20-12-2019-12-10-00” – the current date and time.

cd /etc/cron.monthly
crontab -e

You can also use a similar command to create a new ES repository every month, for example, so you can periodically take a complete snapshot of the cluster.

0,30 * * * * curl -XPUT  "http://localhost:9200/_snapshot/the_hive_backup/hive-$(date +\%d-\%m-\%Y-\%H-\%M-\%S)?wait_for_completion=true" -H 'Content-Type: application/json'
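
As written, that entry fires at minute 0 and 30 of every hour. Since the stated goal was a snapshot every 7 days, a weekly variant (same repository name, firing Sundays at midnight) might look like this:

0 0 * * 0 curl -XPUT "http://localhost:9200/_snapshot/the_hive_backup/hive-$(date +\%d-\%m-\%Y-\%H-\%M-\%S)?wait_for_completion=true" -H 'Content-Type: application/json'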

Once done, you can list your cron jobs by:

crontab -l

Sources / Resources:

I’ve read a ton of articles to get this to work and want to make sure you have all the resources you need to do your thing as well!