MIABIS converter tutorial
By following this tutorial you will learn:
- How to install, configure and run all the required software.
- How to format your data.
- How to index samples and access them through HTTP.
Please make sure that your system has a recent version of Java (1.8). You can find the latest Java version for your system at www.java.com.
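To check which Java version is currently installed, a small shell sketch like the one below may help. The helper name java_version_string is made up for this example; it only extracts the quoted version string that `java -version` prints.

```shell
# Hypothetical helper (not part of any tool in this tutorial):
# reads `java -version` output on stdin and prints the quoted version string.
java_version_string() {
  sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1
}

if command -v java >/dev/null 2>&1; then
  # `java -version` writes to stderr, so redirect it into the pipe.
  echo "Installed Java version: $(java -version 2>&1 | java_version_string)"
else
  echo "java was not found on your PATH"
fi
```

For a Java 8 installation, the printed version string should start with 1.8.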
In order to properly follow this tutorial you will need to download the following tools:
- MIABIS converter: A tool that facilitates sample indexing. It reads and processes the input files. (download it here)
- Elasticsearch: A search server that provides a distributed full-text search engine with an HTTP interface. (download it here)
For this tutorial, we have prepared a sample dataset. (download it here)
Before you can continue, please unzip the downloaded file elasticsearch-1.7.3.zip.
Before we start Elasticsearch, we need to tweak the configuration file. In particular, we are interested in two configuration flags:
- cluster.name: The name you assign to your cluster. This flag matters because Elasticsearch uses it to discover and auto-join other nodes. Make sure that your cluster name is unique and reflects the name of your biobank.
- node.name: The name you assign to your node. It identifies the node within the cluster.
To change the default configuration, open the file elasticsearch-1.7.3/config/elasticsearch.yml. Look for the properties cluster.name and node.name and replace the default values. In this case, we are going to use elixir as our cluster name and node1 as our node name:
################################### Cluster ###################################
# Cluster name identifies your cluster for auto-discovery. If you're running
# multiple clusters on the same network, make sure you're using unique names.
#
cluster.name: elixir
#################################### Node #####################################
# Node names are generated dynamically on startup, so you're relieved
# from configuring them manually. You can tie this node to a specific name:
#
node.name: node1
Keep in mind that, depending on your setup, you may need or want to tweak other settings. Here is a list with all the configuration flags available.
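If you prefer to script the change to cluster.name and node.name instead of editing the file by hand, the following is a minimal sketch. The function set_es_property is a hypothetical helper written for this example, not something shipped with Elasticsearch. It assumes each property sits on its own line, possibly commented out with a leading #, and that the value is a simple word without | or & characters.

```shell
# Hypothetical helper: set KEY to VALUE in a flat YAML config file.
# Usage: set_es_property FILE KEY VALUE
set_es_property() {
  file=$1
  key=$2
  value=$3
  if grep -Eq "^#?[[:space:]]*${key}:" "$file"; then
    # Replace the existing line, uncommenting it if necessary.
    # A backup of the original file is kept as FILE.bak.
    sed -i.bak -E "s|^#?[[:space:]]*${key}:.*|${key}: ${value}|" "$file"
  else
    # The property is not present yet, so append it.
    printf '%s: %s\n' "$key" "$value" >> "$file"
  fi
}

# Apply the values used in this tutorial, if the config file is present.
CONFIG="elasticsearch-1.7.3/config/elasticsearch.yml"
if [ -f "$CONFIG" ]; then
  set_es_property "$CONFIG" cluster.name elixir
  set_es_property "$CONFIG" node.name node1
fi
```

Run it from the directory where you unzipped Elasticsearch, and review the resulting file before starting the node.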
- Unzip the downloaded file
unzip elasticsearch-1.7.3.zip
- Go into the elasticsearch folder
cd elasticsearch-1.7.3
- Start Elasticsearch
bin/elasticsearch
- Test Elasticsearch by running
curl -X GET http://localhost:9200/
The response should be similar to the following; the name and version values will reflect your own node name and Elasticsearch release:
{
  "ok" : true,
  "status" : 200,
  "name" : "Terminatrix",
  "version" : {
    "number" : "0.90.7",
    "build_hash" : "36897d07dadcb70886db7f149e645ed3d44eb5f2",
    "build_timestamp" : "2013-11-13T12:06:54Z",
    "build_snapshot" : false,
    "lucene_version" : "4.5.1"
  },
  "tagline" : "You Know, for Search"
}
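The check above can also be automated. The sketch below assumes that the response contains a "status" : 200 field, as in the sample output; other Elasticsearch releases may not include that field, in which case the check would need adjusting. The helper name es_response_ok is made up for this example.

```shell
# Hypothetical helper: succeeds if the JSON read on stdin reports "status" : 200.
es_response_ok() {
  grep -Eq '"status"[[:space:]]*:[[:space:]]*200'
}

ES_URL="http://localhost:9200/"
# -s silences the progress meter; --max-time avoids waiting indefinitely.
if curl -s --max-time 5 "$ES_URL" | es_response_ok; then
  echo "Elasticsearch is up at $ES_URL"
else
  echo "Elasticsearch did not answer with status 200 at $ES_URL"
fi
```

Elasticsearch can take a few seconds to start, so if the check fails right after launching bin/elasticsearch, wait a moment and run it again.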