CouchDB is a NoSQL database that stores data as JSON documents. It is extremely helpful in situations where a schema would cause headaches and a flexible data model is required. CouchDB also supports master-master continuous replication, which means data can be continuously replicated between two databases without having to set up a complex system of master and slave databases.
ElasticSearch is a full-text search engine that indexes everything and makes pretty much anything searchable. This works extremely well with CouchDB because one of the limitations of CouchDB is that for all queries you have to either know the document ID or you have to use map/reduce.
We will be installing CouchDB from source in order to get the latest version. A more thorough tutorial on this can be viewed here.
Update the package manager:
apt-get update
Install the tools to compile couch:
apt-get install -y build-essential
Install Erlang, the programming language that CouchDB is written in:
apt-get install -y erlang-base erlang-dev erlang-nox erlang-eunit
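Before compiling, it can help to confirm that the packaged Erlang is a version this CouchDB release accepts. According to the configure error a reader quotes in the comments at the end of this article, CouchDB 1.5.0 wants an ERTS version of at least 5.8.1 (R14B) and below 5.11. The following is a minimal sketch of such a check, not part of the CouchDB build itself. The helper names version_lt and erts_supported are hypothetical, and sort -V assumes GNU coreutils as shipped with Ubuntu.

```shell
# version_lt A B: succeeds when dotted version A sorts strictly before B.
# Relies on GNU sort's -V (version sort) option.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# erts_supported VERSION: succeeds when 5.8.1 <= VERSION < 5.11.
# The range comes from the configure error message quoted by a reader;
# treat it as an assumption for any other CouchDB release.
erts_supported() {
    ! version_lt "$1" 5.8.1 && version_lt "$1" 5.11
}

# Usage, once Erlang is installed:
#   erts=$(erl -noshell -eval 'io:format("~s", [erlang:system_info(version)]), halt().')
#   erts_supported "$erts" || echo "Erlang ERTS $erts is outside the supported range"
```

If the check fails, ./configure is likely to stop with the same version error, so it is cheaper to find out before downloading and unpacking the source.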
Install the rest of the libraries that CouchDB needs:
apt-get install -y libmozjs185-dev libicu-dev libcurl4-gnutls-dev libtool
Go to the directory where the CouchDB source files will reside:
cd /usr/local/src
Get the source files:
curl -O http://apache.mirrors.tds.net/couchdb/source/1.5.0/apache-couchdb-1.5.0.tar.gz
Untar the source files:
tar xvzf apache-couchdb-1.5.0.tar.gz
Go to the new directory:
cd apache-couchdb-1.5.0
Configure the source and install it:
./configure
make && make install
Note: This step can take a while. Once it is done, CouchDB will be fully installed. Now we need to create the appropriate user and assign permissions.
Create a CouchDB user:
adduser --disabled-login --disabled-password --no-create-home couchdb
Note: The prompts asking for things such as Name can be ignored if you would like. You can use the default values for each one.
Assign the appropriate permissions to the CouchDB user:
chown -R couchdb:couchdb /usr/local/var/log/couchdb /usr/local/var/lib/couchdb /usr/local/var/run/couchdb
Set up CouchDB as a service so that it does not have to be started manually:
ln -s /usr/local/etc/init.d/couchdb /etc/init.d
update-rc.d couchdb defaults
Start CouchDB:
service couchdb start
Verify that CouchDB is running:
curl localhost:5984
You should see a response that starts with:
{"couchdb":"Welcome"...
Install the latest version of the headless OpenJDK runtime:
apt-get install openjdk-7-jre-headless
Get the latest version of ElasticSearch:
wget https://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.90.8.deb
Install the package:
dpkg -i elasticsearch-0.90.8.deb
Before continuing, you will want to configure Elasticsearch so it is not accessible to the public Internet. Elasticsearch has no built-in security and can be controlled by anyone who can access the HTTP API. This can be done by editing elasticsearch.yml. Assuming you installed with the package, open the configuration with this command:
sudo vi /etc/elasticsearch/elasticsearch.yml
Then find the line that specifies network.bind_host, uncomment it, and change the value to localhost so it looks like the following:
network.bind_host: localhost
Then insert the following line somewhere in the file, to disable dynamic scripts:
script.disable_dynamic: true
Save and exit. Now restart Elasticsearch to put the changes into effect:
sudo service elasticsearch restart
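The two manual edits to elasticsearch.yml could also be scripted, which is handy when provisioning more than one server. Here is a rough sketch, assuming a flat "key: value" layout like the stock configuration file. The function name set_yaml_option is hypothetical, and the sketch treats the key as a regular expression and does not escape special characters in the value.

```shell
# set_yaml_option FILE KEY VALUE
# If KEY already appears at the start of a line (commented out or not),
# rewrite that line as "KEY: VALUE"; otherwise append a new line.
set_yaml_option() {
    file=$1
    key=$2
    value=$3
    if grep -Eq "^[#[:space:]]*${key}:" "$file"; then
        # Write to a temporary file and move it into place, which avoids
        # depending on a particular sed's in-place flag.
        sed -E "s|^[#[:space:]]*${key}:.*|${key}: ${value}|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        printf '%s: %s\n' "$key" "$value" >> "$file"
    fi
}

# Usage (as root, then restart Elasticsearch):
#   set_yaml_option /etc/elasticsearch/elasticsearch.yml network.bind_host localhost
#   set_yaml_option /etc/elasticsearch/elasticsearch.yml script.disable_dynamic true
```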
Verify that ElasticSearch is running (If the request fails the first time, try again. It can take a bit of time for it to start):
curl http://127.0.0.1:9200
You should see a response that starts with:
{ "ok" : true, "status" : 200,
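Because ElasticSearch can take a while to start answering requests, retrying by hand can be replaced with a small polling loop. The sketch below is one possible way to do that. wait_for_http is a hypothetical helper name, and the attempt count and delay are arbitrary defaults.

```shell
# wait_for_http URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until it answers successfully or the attempts run out.
wait_for_http() {
    url=$1
    attempts=${2:-10}
    delay=${3:-2}
    i=1
    while [ "$i" -le "$attempts" ]; do
        # -f: treat HTTP errors as failures, -s: no progress output,
        # -o /dev/null: discard the body, --max-time: do not hang forever
        if curl -fs -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Usage:
#   wait_for_http http://127.0.0.1:9200 15 2 && echo "ElasticSearch is up"
```

The same helper works for the CouchDB check earlier by pointing it at http://127.0.0.1:5984.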
Stop ElasticSearch:
/etc/init.d/elasticsearch stop
Create a new directory to hold the ElasticSearch data:
mkdir /var/data/
mkdir /var/data/elasticsearch
Change ownership of the directory to the ‘elasticsearch’ user:
chown elasticsearch /var/data/elasticsearch
Change the group:
chgrp elasticsearch /var/data/elasticsearch
Use nano to open the ElasticSearch configuration file:
nano /etc/default/elasticsearch
Change the line containing:
DATA_DIR=
to
DATA_DIR=/var/data/elasticsearch
Save and close the file.
Navigate to the ElasticSearch directory:
cd /usr/share/elasticsearch/
Install the plugin:
./bin/plugin -install elasticsearch/elasticsearch-river-couchdb/1.2.0
Start ElasticSearch:
/etc/init.d/elasticsearch start
Create the CouchDB database:
curl -X PUT http://127.0.0.1:5984/testdb
Create some test documents:
curl -X PUT 'http://127.0.0.1:5984/testdb/1' -d '{"name":"My Name 1"}'
curl -X PUT 'http://127.0.0.1:5984/testdb/2' -d '{"name":"My Name 2"}'
curl -X PUT 'http://127.0.0.1:5984/testdb/3' -d '{"name":"My Name 3"}'
curl -X PUT 'http://127.0.0.1:5984/testdb/4' -d '{"name":"My Name 4"}'
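The four PUT requests above follow one pattern, so a loop can generate them, which is useful if you want more test data. This is a minimal sketch. create_test_docs is a hypothetical helper, and the CURL variable is an assumption added here so the requests can be previewed with CURL=echo instead of being sent.

```shell
# create_test_docs BASE_URL COUNT
# PUTs documents with IDs 1..COUNT and bodies {"name":"My Name <id>"}.
create_test_docs() {
    base_url=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        # ${CURL:-curl} runs curl unless CURL is set to something else.
        ${CURL:-curl} -X PUT "$base_url/$i" -d "{\"name\":\"My Name $i\"}"
        i=$((i + 1))
    done
}

# Usage:
#   create_test_docs http://127.0.0.1:5984/testdb 4
```

Note that CouchDB rejects a PUT to an existing document ID unless the current revision is supplied, so rerunning the loop against the same database will report conflicts rather than overwrite anything.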
Create the index:
curl -X PUT '127.0.0.1:9200/_river/testdb/_meta' -d '{ "type" : "couchdb", "couchdb" : { "host" : "localhost", "port" : 5984, "db" : "testdb", "filter" : null }, "index" : { "index" : "testdb", "type" : "testdb", "bulk_size" : "100", "bulk_timeout" : "10ms" } }'
Do a test query with ElasticSearch:
curl 'http://127.0.0.1:9200/testdb/testdb/_search?pretty=true'
You should see something similar to this:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 4,
"max_score" : 1.0,
"hits" : [ {
"_index" : "testdb",
"_type" : "testdb",
"_id" : "4",
"_score" : 1.0, "_source" : {"_rev":"1-7e9376fc8bfa6b8c8788b0f408154584","_id":"4","name":"My Name 4"}
}, {
"_index" : "testdb",
"_type" : "testdb",
"_id" : "1",
"_score" : 1.0, "_source" : {"_rev":"1-87386bd54c821354a93cf62add449d31","_id":"1","name":"My Name 1"}
}, {
"_index" : "testdb",
"_type" : "testdb",
"_id" : "2",
"_score" : 1.0, "_source" : {"_rev":"1-194582c1e02d84ae36e59f568a459633","_id":"2","name":"My Name 2"}
}, {
"_index" : "testdb",
"_type" : "testdb",
"_id" : "3",
"_score" : 1.0, "_source" : {"_rev":"1-62a53c50e7df02ec22973fc802fb9fc0","_id":"3","name":"My Name 3"}
} ]
}
}
Now, rather than being limited to using map/reduce or the _id of each document, you can do full-text queries on your data by using ElasticSearch.
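As an illustration of such a query, ElasticSearch's _search endpoint accepts a q parameter in query-string syntax. The sketch below wraps that in a hypothetical es_search helper. The host and port are the defaults used in this tutorial, and the CURL override is an assumption added so the request can be previewed with CURL=echo.

```shell
# es_search INDEX TYPE QUERY
# Runs a query-string search against a local ElasticSearch instance.
es_search() {
    # -G sends the data as URL query parameters; --data-urlencode escapes
    # spaces and quotes that may appear in the query.
    ${CURL:-curl} -s -G "http://127.0.0.1:9200/$1/$2/_search" \
        --data-urlencode "q=$3" \
        --data-urlencode "pretty=true"
}

# Usage:
#   es_search testdb testdb 'name:3'
```

With the test documents created earlier, a query such as name:3 should narrow the hits to the document named "My Name 3", assuming the default analyzer and that the river has finished indexing.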
Submitted by: Cooper Thompson (http://blog.opendev.io)
:( Hello!
It seems that this tutorial is no longer valid, or is incompatible with an Ubuntu 14.10 x64 droplet.
When I get to the ./configure step I get the following error: Erlang version compatibility… configure: error: The installed Erlang version must be >= R14B (erts-5.8.1) and <R17 (erts-5.11)
Then if I go on, at the make && make install step I get the following error: make: *** No targets specified and no makefile found. Stop.
Does anybody know how to fix this, or is there an updated version of this tutorial?
Thanks!
Great tutorial! Works like a charm on the droplet. BUT… I can’t work out how to make a query from JavaScript. I’ve tried the elasticsearch.js library and also a plain old HTTP request. That results in a cross-domain error. I don’t have a lot of knowledge about this, but after reading a bit on Wikipedia and elsewhere I found out that a different port on the same URL is considered a different domain. So I tried to allow requests in Nginx. Elasticsearch has CORS enabled by default and I’m still stuck. What am I missing?
I am new to this Elasticsearch thing, and sorry if the question seems stupid, but can you describe a scenario where this is useful? According to this http://db-engines.com/en/system/CouchDB%3BElasticsearch Elasticsearch can do almost the same as CouchDB. Thanks in advance! :)
This tutorial was really helpful. Works perfect! Thanks!
Awesome tutorial!!