Elasticsearch exercises

Introduction

This is in a very early stage. Please get in touch with feedback.

I find lab-based exercises to be the best way to cement information as knowledge. Applying what you have read about any new technology or piece of functionality in a mix of settings is what will really help it become second nature. Many tutorials and documentation pages on Elasticsearch (as with other technologies) lack exercises that let the reader put what they have read into practice.

I am addressing this problem by building a collection of lab exercises for anybody learning Elasticsearch so they can apply and test their knowledge of various APIs and techniques.

About the exercises

Each collection of exercises is focused on one dataset. The data will be indexed, queried, restructured, moved and exported, using as many of the APIs available in Elasticsearch as possible. Datasets must be appropriately licenced (free to use, reformat and redistribute), relevant, and interesting. I am always on the lookout for new datasets, so please get in touch if you find one you think could form the foundation of a new exercise collection.

The environment used here is similar to that of the Elastic Certified Engineer exam but the exercises aren’t meant to be part of a mock Certification exam; they’re for someone learning Elasticsearch. At the same time, they aren’t dissimilar from the type of questions in the exam. If you are studying for the exam, these exercises may help. I’m using an approximation of the exam environment because it is familiar to most people and only requires basic Linux knowledge.

Requirements

You will need a machine capable of running a small, single-node Elasticsearch v7.2 cluster and Kibana. A modern laptop with 8 or 16 GB of RAM should suffice. If you are running your nodes in AWS, a t2.medium instance is plenty. Later exercises will require a multi-node cluster; we will discuss how best to create a suitable lab environment nearer the time, or you can read how to do this using Vagrant.

Elasticsearch will need to be run directly on a host, virtual machine or cloud instance where you can access the shell directly, ideally over SSH. Elasticsearch and Kibana distributions (extracted from the .tar.gz or .zip archive) are required on the node, as well as the data files used in these exercises.

All REST calls to Elasticsearch in these exercises assume that Elasticsearch is running on localhost. If your cluster is running elsewhere, you will need to replace localhost in those addresses with the host of your cluster.
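
As a rough illustration of what changing the host means in practice, here is a minimal Python sketch that builds the address for a REST call and asks the cluster for its health. It is not part of the exercises themselves. It assumes Elasticsearch is listening on its default HTTP port (9200) over plain HTTP with no security enabled; the helper names es_url and cluster_health are made up for this example.

```python
import json
from urllib.request import urlopen

# Assumptions: default HTTP port 9200, plain HTTP, no authentication.
ES_HOST = "localhost"  # change this if your cluster runs elsewhere
ES_PORT = 9200


def es_url(path, host=ES_HOST, port=ES_PORT):
    """Build the full URL for a REST endpoint such as '/_cluster/health'."""
    return f"http://{host}:{port}/{path.lstrip('/')}"


def cluster_health(host=ES_HOST, port=ES_PORT):
    """Send GET /_cluster/health and return the parsed JSON response."""
    with urlopen(es_url("_cluster/health", host, port), timeout=5) as resp:
        return json.load(resp)
```

With a node running, calling cluster_health() returns a dictionary whose "status" field should be green or yellow on a healthy single-node cluster. To target a remote node, pass its address instead, for example cluster_health(host="10.0.0.5"), where the IP is only a placeholder.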

The list

August 2020
July 2020