HarperDB, a globally distributed data and application platform, is now available on the DigitalOcean Marketplace, giving DigitalOcean users a fast way to bootstrap HarperDB.
HarperDB is unique because it combines a high-performance database, user-built custom applications, and real-time data streaming into a single platform. The technology was built with a focus on performance and ease of use. With flexible user-defined APIs, simple HTTP/S interface, and a high-performance single-model data store that accommodates both NoSQL and SQL workloads, HarperDB scales with your application from proof of concept to production.
HarperDB’s replication engine replicates data between instances using a highly performant, bi-directional pub/sub model on a per-table basis. This can span an unlimited number of nodes, ultimately enabling limitless global scale. HarperDB also blows the competition out of the water when it comes to performance, with upwards of 20K writes per second and 120K reads per second per node.
In this tutorial, we’ll go through setting up HarperDB on multiple regions via DigitalOcean Droplets and demonstrate creating a simple API layer on top with Custom Functions.
After logging into DigitalOcean, navigate to the marketplace and click “Create HarperDB Droplet”:
We will create two instances: one in New York and one in San Francisco. Choose the Droplet type (I’ll be using the Basic tier for demo purposes) and attach a volume. You can choose the automatic format-and-mount option unless you want to manage the LVM configuration yourself.
Wait for the two Droplets to be created and note their IP addresses. We will also need to configure firewall rules so that HarperDB Studio can interact with our instances.
Navigate to the Networking > Firewall section and open up ports 9925-9926 and 9932:
Finally, SSH into each Droplet and inspect the contents of `~/.harperdb` for the credentials that were automatically created. You’ll need these details for the next section.
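Since the HTTP API we’ll use later expects Basic authentication, it’s handy to base64-encode the credentials now. A minimal sketch, using placeholder credentials (substitute the real ones from your `~/.harperdb` file):

```shell
# Show the auto-generated settings from the marketplace image
# (skipped silently if the file does not exist on this machine).
cat ~/.harperdb 2>/dev/null || true

# The HarperDB HTTP API uses Basic auth, so base64-encode "user:pass".
# These credentials are placeholders -- use the ones from ~/.harperdb.
CREDS='HDB_ADMIN:password'
TOKEN=$(printf '%s' "$CREDS" | base64)
echo "Authorization: Basic $TOKEN"
```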
HarperDB Studio is an online portal for managing HarperDB instances. The Studio has a generous free tier, with the option to upgrade to a paid plan as needed. See pricing info here.
Navigate to studio.harperdb.io, click “Create New HarperDB Cloud Instance”, and choose the “Enterprise” option. Then fill out the instance information accordingly:
The username and password come from the `.harperdb` file mentioned above, and the host is the IP address of your Droplet. It is important to click the SSL button, as the marketplace installation has SSL enabled.
Once you click Instance Details, you may be warned about self-signed certificates. Click the link in your browser and accept the certificate. After a few seconds, your instance will show up on the dashboard.
Now that our databases are set up, we can create schemas and tables. Let’s create a schema called `dev` and a table called `dog` with the hash attribute `id`:
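Studio performs these steps through the UI; if you prefer the command line, the same steps can be sent to HarperDB’s operations API. A minimal sketch, assuming a placeholder IP and token:

```shell
# Placeholder endpoint and credentials -- substitute your Droplet IP and
# the base64-encoded user:pass from ~/.harperdb.
HDB_URL='https://203.0.113.10:9925'
AUTH='Basic SERCX0FETUlOOnBhc3N3b3Jk'

CREATE_SCHEMA='{"operation": "create_schema", "schema": "dev"}'
CREATE_TABLE='{"operation": "create_table", "schema": "dev", "table": "dog", "hash_attribute": "id"}'

# -k accepts the self-signed certificate; "|| true" keeps the sketch
# non-fatal when no instance is reachable.
for BODY in "$CREATE_SCHEMA" "$CREATE_TABLE"; do
  curl -k --connect-timeout 5 "$HDB_URL" \
    --header 'Content-Type: application/json' \
    --header "Authorization: $AUTH" \
    --data-raw "$BODY" || true
done
```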
We can then use curl commands to add some data:
```
curl -k --location 'https://<my-ip>:9925' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Basic <YourBase64EncodedInstanceUser:Pass>' \
  --data-raw '{
    "operation": "insert",
    "schema": "dev",
    "table": "dog",
    "records": [
      {
        "dog_name": "Charlie",
        "age": 2
      }
    ]
  }'
```
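To confirm the insert landed, the same endpoint can run a SQL query. Another hedged sketch, again with a placeholder IP and token:

```shell
# Placeholder endpoint and credentials, as in the insert command above.
HDB_URL='https://203.0.113.10:9925'
AUTH='Basic SERCX0FETUlOOnBhc3N3b3Jk'

QUERY='{"operation": "sql", "sql": "SELECT * FROM dev.dog"}'

# "|| true" keeps the sketch non-fatal when no instance is reachable.
curl -k --connect-timeout 5 "$HDB_URL" \
  --header 'Content-Type: application/json' \
  --header "Authorization: $AUTH" \
  --data-raw "$QUERY" || true
```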
One of the nice features of HarperDB is the first-class replication support built into the database. Navigate to the `replication` tab in HarperDB Studio and create a cluster user to enable clustering.
Do this for both Droplets and wait for the databases to restart. Then you can add each instance under the clustering tab with publish/subscribe capabilities:
HarperDB achieves replication in an asynchronous pub/sub model. In our example, we’ll want to set up a multi-region application that can fetch data, so enable both publish/subscribe capabilities.
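For reference, the Studio clicks above roughly map onto the operations API’s `add_node` call. This is only a sketch: the field names (`name`, `host`, `port`, `subscriptions`) come from the HarperDB 3.x clustering docs and may differ in your version, and the IPs and token are placeholders:

```shell
# Placeholder endpoint and credentials -- substitute your own.
HDB_URL='https://203.0.113.10:9925'
AUTH='Basic SERCX0FETUlOOnBhc3N3b3Jk'

# Subscribe the other region's node on the clustering port we opened
# earlier (9932), replicating dev.dog in both directions.
ADD_NODE='{
  "operation": "add_node",
  "name": "sfo-node",
  "host": "203.0.113.20",
  "port": 9932,
  "subscriptions": [
    { "schema": "dev", "table": "dog", "publish": true, "subscribe": true }
  ]
}'

# "|| true" keeps the sketch non-fatal when no instance is reachable.
curl -k --connect-timeout 5 "$HDB_URL" \
  --header 'Content-Type: application/json' \
  --header "Authorization: $AUTH" \
  --data-raw "$ADD_NODE" || true
```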
You can try adding another data point, and you should see data show up under the browse tab for both instances.
HarperDB allows users to define a light API layer via a feature called Custom Functions. Custom Functions combine serverless functions with the underlying database, collapsing the stack into a single solution with the ability to define custom API endpoints that have direct access to HarperDB core operations. HarperDB’s serverless Custom Functions, powered by Fastify, are similar in spirit to AWS Lambda functions or stored procedures. Functions are low maintenance and easy to develop: define your logic and choose when to execute it.
We’ll use Custom Functions to deploy a simple function that fetches data along with region information.
To do so, navigate to the `/opt/hdb/custom_functions` directory on each HarperDB Droplet. Then create a new directory called “digitalocean” and initialize an npm project inside it:

```
npm init -y
npm install @fastify/env
```
We are utilizing the `@fastify/env` library to load environment variables, in which we will encode region information.
Create an `index.js` file under a new `routes` directory:

```
mkdir routes
touch routes/index.js
```
Then paste the following code:
```js
import fastifyEnv from '@fastify/env'
import { fileURLToPath } from 'url'
import path from 'path'

const schema = {
  type: 'object',
  required: ['DO_REGION'],
  properties: {
    DO_REGION: {
      type: 'string'
    },
  }
}

const __filename = fileURLToPath(import.meta.url)
const __dirname = path.dirname(__filename)

const options = {
  schema,
  dotenv: {
    path: `${__dirname}/.env`,
    debug: true
  },
  data: process.env
}

export default async (server, { hdbCore, logger }) => {
  await server.register(fastifyEnv, options);

  server.route({
    url: '/',
    method: 'GET',
    handler: async () => {
      const body = {
        operation: 'sql',
        sql: 'SELECT * FROM dev.dog ORDER BY dog_name',
      };
      const results = await hdbCore.requestWithoutAuthentication({ body });

      const response = {
        region: process.env.DO_REGION,
        results
      }

      return response
    },
  });
};
```
Then create a `.env` file in the same directory and add the `DO_REGION` information, like:

```
DO_REGION=new-york
```
Finally, we’re ready to test our application, which fetches all records from the `dev.dog` table ordered by the `dog_name` attribute.
By default, a Custom Function endpoint is served at `<ip-address>:9926/<name-of-function>`. So in our case, it would be `<ip-address>:9926/digitalocean`.
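The request itself is a plain GET; a sketch with a placeholder IP (note that `-k` is still needed for the self-signed certificate):

```shell
# Placeholder IP -- substitute one of your Droplet addresses.
FN_URL='https://203.0.113.10:9926/digitalocean'

# "|| true" keeps the sketch non-fatal when no instance is reachable.
curl -k --connect-timeout 5 "$FN_URL" || true
```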
Choose a Droplet IP that’s close to you. Once we curl that endpoint, we get back:
```json
{
  "region": "new-york",
  "results": [
    {
      "age": 2,
      "dog_name": "Charlie"
    },
    {
      "age": 4,
      "dog_name": "Coco"
    }
  ]
}
```
Note that in my case, the request was served by the New York region (and that I had previously added another record, Coco).
You will receive the same output from the other region, except with that region’s value (e.g. `san-francisco`).
If you would like to route traffic based on geolocation, you can integrate these endpoints to a global load balancer or use a DNS service that can route traffic accordingly.
In this article, we saw how to set up a multi-region configuration of HarperDB with replication on DigitalOcean. With the new marketplace template, it’s trivial to spin up new instances of HarperDB on demand. We also went over how to enable replication and added a sample Custom Function to show how an API layer can be added with little overhead.
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.