April 16, 2020

Part 1: Elasticsearch Interview Questions and Answers

What is Elasticsearch, in brief?
Elasticsearch is an open-source, RESTful, scalable, document-based search engine built on the Apache Lucene library. It stores, retrieves, and manages textual, numerical, geospatial, structured, and unstructured data in the form of JSON documents, using its CRUD REST API or ingestion tools such as Logstash.
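
To make the CRUD REST API a little more concrete, here is a minimal sketch that only builds the HTTP method, path, and JSON body for each operation and prints them; it does not contact a server. The index name "products" and the sample document are made up for illustration, and the paths assume the typeless endpoint layout of Elasticsearch 7 or later.

```python
import json


def crud_requests(index, doc_id, document, changes):
    """Return (method, path, body) tuples for a create/read/update/delete cycle."""
    doc_path = f"/{index}/_doc/{doc_id}"
    return [
        ("PUT", doc_path, document),  # create (or replace) the document
        ("GET", doc_path, None),  # read it back
        ("POST", f"/{index}/_update/{doc_id}", {"doc": changes}),  # partial update
        ("DELETE", doc_path, None),  # delete it
    ]


requests_to_send = crud_requests(
    index="products",  # hypothetical index name
    doc_id=1,
    document={"name": "laptop", "price": 999.0},
    changes={"price": 899.0},
)

for method, path, body in requests_to_send:
    print(method, path, json.dumps(body) if body is not None else "")
```

Any HTTP client (curl, a browser plugin, or a language client) could send these requests to a running node, for example at http://localhost:9200.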

Every JSON document stored in Elasticsearch is indexed by Apache Lucene, which makes searching faster. Thanks to this indexing, a text search over the stored JSON documents typically completes in well under 10 seconds.
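
Why indexing speeds up search can be illustrated with a toy inverted index, the kind of structure Lucene builds. This is a simplified sketch, not Lucene's implementation: it only lowercases text and splits on whitespace, and the sample documents are invented.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each lowercase term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, doc in documents.items():
        for value in doc.values():
            if isinstance(value, str):
                for term in value.lower().split():
                    index[term].add(doc_id)
    return index


def search(index, term):
    """Look the term up directly instead of scanning every document."""
    return sorted(index.get(term.lower(), set()))


docs = {
    1: {"title": "Distributed search engine"},
    2: {"title": "Relational database basics"},
    3: {"title": "Search and analytics at scale"},
}
inverted = build_inverted_index(docs)
print(search(inverted, "search"))  # [1, 3]
```

The lookup cost depends on the term, not on how many documents are stored, which is the reason an indexed search stays fast as the data grows.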

It is a distributed NoSQL database: it stores data as documents rather than in tables with a fixed schema.

Uber, Instacart, Slack, Shopify, Stack Overflow, DigitalOcean, Udemy, 9GAG, Wikipedia, and Netflix are some of the companies that use Elasticsearch along with Logstash and Kibana.

Elasticsearch terms map roughly onto relational database terms as follows:
  • Elasticsearch -- Relational database
  • Index -- Database
  • Type -- Table
  • Document -- Row/Record
  • Field -- Column
  • Mapping -- Schema
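
What software is required as a prerequisite to install Elasticsearch?
Elasticsearch runs on the Java Virtual Machine, so Java is the prerequisite. Older releases require Java 7 or higher to be installed on the machine; since version 7.0, the Elasticsearch distribution bundles its own JDK.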

Why do we need Elasticsearch?

  • To implement simple and fuzzy search
  • To implement analytics
  • To implement autocomplete (as in Google search) and instant search
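
The three use cases above correspond to different request bodies in the Elasticsearch query DSL. The sketch below builds one plausible body for each as a Python dictionary; the field names "title" and "category" are hypothetical, and other query types could serve the same purposes.

```python
import json


def fuzzy_search(field, text):
    """Match query that tolerates small typos in the search text."""
    return {"query": {"match": {field: {"query": text, "fuzziness": "AUTO"}}}}


def terms_analytics(field, size=5):
    """Aggregation that counts documents per distinct value of a field."""
    return {
        "size": 0,  # return only the aggregation, no individual hits
        "aggs": {"by_" + field: {"terms": {"field": field, "size": size}}},
    }


def autocomplete(field, typed_so_far):
    """Treat the last typed word as a prefix, as in search-as-you-type."""
    return {"query": {"match_phrase_prefix": {field: {"query": typed_so_far}}}}


bodies = {
    "fuzzy": fuzzy_search("title", "elasticsaerch"),  # note the deliberate typo
    "analytics": terms_analytics("category"),
    "autocomplete": autocomplete("title", "elastic se"),
}
for name, body in bodies.items():
    print(name, json.dumps(body))
```

Each body would be sent to the _search endpoint of an index.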

What are the steps to start an Elasticsearch server?
  • Open the command prompt and change directory to the bin folder inside the folder that was created when Elasticsearch was installed.
  • Type elasticsearch.bat and press Enter to start the Elasticsearch server (on Linux or macOS, run ./elasticsearch instead).
  • This starts Elasticsearch in the command prompt window, where its startup log is printed.
  • To check whether it has started, open http://localhost:9200 in a browser. It should display the Elasticsearch cluster name and other metadata about the node.
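
The last step can also be scripted instead of using a browser. The following is a rough sketch that assumes an unsecured node at http://localhost:9200; a node with security enabled would need HTTPS and credentials, which this example does not handle. The helper names are my own.

```python
import http.client
import json
import urllib.request


def fetch_cluster_info(url="http://localhost:9200", timeout=2.0):
    """Return the parsed JSON from the root endpoint, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError, http.client.HTTPException):
        # OSError covers refused connections, timeouts, and HTTP error statuses;
        # ValueError covers a response body that is not valid JSON.
        return None


def describe(info):
    """Summarize the root endpoint response in one line."""
    if info is None:
        return "Elasticsearch is not reachable"
    version = info.get("version", {}).get("number", "unknown")
    return f"cluster {info.get('cluster_name', 'unknown')!r} is up (version {version})"


print(describe(fetch_cluster_info()))
```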

What is an Elasticsearch cluster?
A cluster is a group of one or more connected node instances that together distribute tasks, searching, and indexing across all the nodes.

What is a Node in Elasticsearch?
A node is an instance of Elasticsearch. Here are the different node types:
  • Data nodes hold data and perform data operations such as CRUD (Create/Read/Update/Delete), search, and aggregations.
  • Master nodes handle cluster-wide configuration and management, such as adding and removing nodes.
  • Client nodes forward cluster-level requests to the master node and data-related requests to data nodes.
  • Ingest nodes pre-process documents before indexing.

How does an ingest node in Elasticsearch function?
An ingest node processes documents before they are indexed. It does so with a series of processors that modify the document one after another; for example, one processor removes one or more fields, and the next renames a field. This normalizes the documents before indexing, so that they are stored consistently and are easier to search.
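
A remove-then-rename sequence like the one described can be sketched as follows. The pipeline dictionary has the shape of an ingest pipeline definition, while run_pipeline is only a local simulation written for this example; it handles just these two processors and, unlike a real node, silently skips fields that are missing. The field names are made up.

```python
import copy

# Pipeline definition with two processors that run in order.
pipeline = {
    "description": "drop a debug field, then rename hostname to host",
    "processors": [
        {"remove": {"field": "debug"}},
        {"rename": {"field": "hostname", "target_field": "host"}},
    ],
}


def run_pipeline(pipeline, document):
    """Local simulation: apply each processor in order to a copy of the document."""
    doc = copy.deepcopy(document)
    for processor in pipeline["processors"]:
        kind, options = next(iter(processor.items()))
        if kind == "remove":
            doc.pop(options["field"], None)
        elif kind == "rename":
            if options["field"] in doc:
                doc[options["target_field"]] = doc.pop(options["field"])
        else:
            raise ValueError(f"processor {kind!r} is not simulated here")
    return doc


raw = {"hostname": "web-01", "message": "login ok", "debug": "trace-123"}
print(run_pipeline(pipeline, raw))
```

The original document is left untouched, and the processed copy contains "message" and "host" only.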

What is the difference between a master node and a master-eligible node in Elasticsearch?
The master node is responsible for cluster-wide actions such as creating and deleting indices and keeping track of the nodes that form the cluster. It also decides which shards are allocated to which nodes, which keeps the Elasticsearch cluster healthy and stable.

Master-eligible nodes, by contrast, are the nodes that can be elected to become the master node.

What is an index in an Elasticsearch cluster?
An index is comparable to a database in a relational system, and an Elasticsearch cluster can contain multiple indices. Each index contains multiple types (tables), each type contains multiple documents (records/rows), and each document contains properties (columns).

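What is a type in Elasticsearch?
A type is comparable to a table in a relational database. Each type (table) holds multiple documents (rows), and each document has properties (columns). Note that since Elasticsearch 6.0 an index can hold only a single type, and types were removed entirely in 8.0.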

What is a mapping in Elasticsearch?
A mapping is the outline (schema) of the documents stored in an index. It defines how a document and its fields are indexed and stored by Lucene.
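
As an illustration, a mapping can be written as the request body used when creating an index. The sketch below assumes the typeless mapping format of Elasticsearch 7 or later, and the field names are invented; the field types shown ("text", "keyword", "float", "date", "geo_point") are standard ones.

```python
import json

# Request body for creating an index with an explicit mapping.
mapping_body = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},  # analyzed for full-text search
            "category": {"type": "keyword"},  # exact values, used for filters and aggregations
            "price": {"type": "float"},
            "created": {"type": "date"},
            "location": {"type": "geo_point"},
        }
    }
}


def field_types(body):
    """Return a flat {field name: type} view of the mapping."""
    return {name: spec["type"] for name, spec in body["mappings"]["properties"].items()}


print(json.dumps(mapping_body, indent=2))
print(field_types(mapping_body))
```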

What is a document in Elasticsearch?
A document is a JSON object stored in Elasticsearch. It is equivalent to a row in a relational database table.

What are shards in Elasticsearch?
As the number of documents grows, the disk capacity and processing power of a single machine are no longer sufficient, and responses to client requests are delayed. To address this, the indexed data is divided into smaller chunks called shards, which improves the fetching of results during a search.
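
How documents end up in different shards can be sketched with a hash-and-modulo rule: hash a routing value (by default the document id) and take the remainder by the number of primary shards. This is a simplification of what Elasticsearch does; in particular, zlib.crc32 is used here only as a stand-in hash, and the document ids are made up.

```python
import zlib
from collections import Counter


def pick_shard(routing_value, number_of_primary_shards):
    """Choose a shard deterministically from the routing value."""
    hashed = zlib.crc32(str(routing_value).encode("utf-8"))  # stand-in hash function
    return hashed % number_of_primary_shards


doc_ids = [f"doc-{n}" for n in range(1000)]
placement = Counter(pick_shard(doc_id, 5) for doc_id in doc_ids)
for shard in sorted(placement):
    print(f"shard {shard}: {placement[shard]} documents")
```

Because the same id always maps to the same shard, a lookup by id only has to visit one shard, while a search can run on all shards in parallel.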

What is the advantage of creating a replica?
A replica is an exact copy of a shard. Replicas increase query throughput and provide high availability under heavy load, which helps the cluster manage requests efficiently.
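
The number of shards and replicas is set per index. The sketch below builds an index settings body and works out how many shard copies the cluster has to hold in total; the helper names are my own, and the values are arbitrary.

```python
import json


def index_settings(primary_shards, replicas_per_primary):
    """Build the settings body used when creating an index."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas_per_primary,
        }
    }


def total_shard_copies(settings_body):
    """Each primary shard exists once, plus once per replica."""
    settings = settings_body["settings"]
    return settings["number_of_shards"] * (1 + settings["number_of_replicas"])


body = index_settings(primary_shards=3, replicas_per_primary=2)
print(json.dumps(body))
print(total_shard_copies(body))  # 9
```

With 3 primary shards and 2 replicas each, a search for any part of the data can be served by any of 3 copies, which is where the extra throughput and availability come from.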

-K Himaanshu Shuklaa..
