I have posted the last batch of Elasticsearch lab exercises using the Olympic events dataset. This last set contains two more exercises on the search DSL and a test on using the enrich API.
I will post my solutions soon, with some background on each feature being used, and how they work. There are some fundamental concepts that are important to understand to make sure your searches return the documents you need and - more importantly - don’t return documents you’re not expecting.
I’ve encountered more people than I would expect who don’t have a good enough understanding of Elasticsearch’s architecture when they begin building their clusters. Performance is fine when the cluster first goes into production but starts creaking as data volume or search load increases. By then, the problems can be difficult or expensive to fix.
Some of these situations can be avoided with better planning. It’s not always possible to get an accurate estimate of search and indexing requirements, but it’s important to know the fundamentals of how Elasticsearch works in order to scale it out effectively.
Elasticsearch architecture is the most important topic for any Elasticsearch engineer to understand. Even a high-level awareness will help you recognize what might be causing problems with your cluster.
I’ve been wanting to release a video about this for too long; I wrote a rough script almost a year ago. It will be done soon and I’m looking forward to releasing it.
There will be another round of exercises starting in late September. This round will cover some of the same topics as the Olympics labs, with the addition of full-text queries.