Carefully declaring the duties of each element of an application deployment stack brings many benefits, including simpler diagnosis of problems when they occur, the capacity to scale rapidly, and a clearer scope of management for the components involved.
In web services engineering today, messaging and work (or task) queues are a key component in achieving this. These usually resilient and flexible applications are easy to implement and set up. They are perfect for splitting the business logic between different parts of your application bundle when it comes to production.
In this DigitalOcean article, continuing our series on application level communication solutions, we will be looking at Beanstalkd to create this separation of pieces.
Beanstalkd was first developed to meet the needs of a popular web application (Causes on Facebook). Today, it is a reliable, easy-to-install messaging service which is perfect to get started with and use.
As mentioned earlier, Beanstalkd’s main use case is to manage the workflow between different parts and workers of your application deployment stack through work queues and messages, similar to other popular solutions such as RabbitMQ. However, the way Beanstalkd is designed to work sets it apart from the rest.
Since its inception, unlike other solutions, Beanstalkd was intended to be a work queue and not an umbrella tool to cover many needs. To achieve this purpose, it was built as a lightweight and fast application written in the C programming language. Its lean architecture also allows it to be installed and used very simply, making it a good fit for the majority of use cases.
Being able to monitor a job through the ID returned upon its creation is only one of the features of Beanstalkd that sets it apart from the rest. Some other interesting features offered are:
Persistence - Beanstalkd operates in-memory but offers persistence support as well.
Prioritisation - unlike most alternatives, Beanstalkd lets you prioritise tasks so that urgent ones are handled first.
Distribution - different server instances can be distributed similarly to how Memcached works.
Burying - it is possible to indefinitely postpone a job (i.e. a task) by burying it.
Third party tools - Beanstalkd comes with a variety of third-party tools including CLIs and web-based management consoles.
Expiry - jobs can be given a time to run (TTR); if a worker does not finish a reserved job within that time, the job is automatically put back in the queue.
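To make the prioritisation feature concrete, here is a minimal in-memory sketch (not Beanstalkd itself) of how waiting jobs are handed out. It assumes Beanstalkd's convention that a smaller priority number is more urgent, with 0 being the most urgent, and that jobs of equal priority come out in the order they went in. The `ToyPriorityQueue` class and its default priority of 1024 are hypothetical and exist only for illustration.

```python
import heapq
import itertools


class ToyPriorityQueue(object):
    """Illustrative stand-in for a single tube; not part of Beanstalkd."""

    def __init__(self):
        self._heap = []
        # A running counter breaks ties so equal priorities stay first-in, first-out.
        self._counter = itertools.count()

    def put(self, body, priority=1024):
        # Smaller numbers are more urgent, mirroring Beanstalkd's convention.
        heapq.heappush(self._heap, (priority, next(self._counter), body))

    def reserve(self):
        # Hand out the most urgent job that is still waiting.
        priority, _, body = heapq.heappop(self._heap)
        return body


queue = ToyPriorityQueue()
queue.put("send newsletter", priority=2000)
queue.put("reset password email", priority=0)
queue.put("resize avatar", priority=2000)

# Drain the queue to see the order in which jobs would be processed.
order = [queue.reserve() for _ in range(3)]
print(order)
```

The urgent password e-mail is handed out first even though it was enqueued second; the two lower-priority jobs follow in their original order.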
Some exemplary use cases for Beanstalkd are:
Allowing web servers to respond to requests quickly instead of being forced to perform resource-heavy procedures on the spot
Performing certain jobs at certain intervals (e.g. crawling the web)
Distributing a job to multiple workers for processing
Letting offline clients (e.g. a disconnected user) fetch data at a later time instead of having it lost permanently through a worker
Introducing fully asynchronous functionality to the backend systems
Ordering and prioritising tasks
Balancing application load between different workers
Greatly increasing the reliability and uptime of your application
Processing CPU intensive jobs (videos, images etc.) later
Sending e-mails to your lists
and more.
Just like most applications, Beanstalkd comes with its own jargon to explain its parts.
Beanstalkd tubes correspond to queues in other messaging applications. They are the channels through which jobs (or messages) are transferred to consumers (i.e. workers).
Since Beanstalkd is a “work queue”, what is transferred through tubes is referred to as jobs, which are similar to messages being sent.
Producers, similar to the Advanced Message Queuing Protocol’s definition, are applications which create and send a job (or a message). The jobs they create are meant to be picked up by the consumers.
Consumers (receivers) are the different applications of the stack which get a job, created by a producer, from the tube for processing.
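The vocabulary above can be pictured with a small in-memory model. This is only a sketch of the roles involved, under the assumption that each tube behaves like a simple first-in, first-out list; the `ToyBroker` class and the tube names are hypothetical and do not talk to a real Beanstalkd server.

```python
from collections import defaultdict, deque
import itertools


class ToyBroker(object):
    """Hypothetical in-memory model of tubes; it only illustrates the vocabulary."""

    def __init__(self):
        self._tubes = defaultdict(deque)   # tube name -> waiting jobs
        self._ids = itertools.count(1)     # every job gets a numeric ID

    def put(self, tube, body):
        # Called by a producer: place a job in a tube and get its ID back.
        job_id = next(self._ids)
        self._tubes[tube].append((job_id, body))
        return job_id

    def reserve(self, tube):
        # Called by a consumer: take the oldest waiting job from a tube.
        return self._tubes[tube].popleft()


broker = ToyBroker()
first_id = broker.put("emails", "welcome user 42")       # producer side
second_id = broker.put("thumbnails", "resize cat.png")   # another producer
job_id, body = broker.reserve("emails")                  # consumer side
print(job_id, body)
```

A consumer that reads from the "emails" tube never sees the thumbnail job, which is how tubes keep different kinds of work apart.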
Beanstalkd can be obtained very simply through the aptitude package manager. However, with a few commands, you can also download it and install it from source.
Note: We will perform the installation and the steps listed here on a freshly created droplet. If you are actively serving clients and may have modified your system, you are strongly advised to try these instructions on a new system first, so as not to break anything that is working or run into issues.
Run the following command to download and install Beanstalkd:
aptitude install -y beanstalkd
Edit the default configuration using nano to launch Beanstalkd at system boot:
nano /etc/default/beanstalkd
After opening the file, scroll down to the bottom and find the line #START=yes. Change it to:
START=yes
Press CTRL+X and confirm with Y to save and exit.
To start using the application, please skip to the next section or follow along to see how to install Beanstalkd from source.
We are going to need a key tool for the installation process from source - Git.
Run the following to get Git on your droplet:
aptitude install -y git
Download the essential development tools package:
aptitude install -y build-essential
Using Git, clone (download) the official repository:
git clone https://github.com/kr/beanstalkd
Enter the downloaded directory:
cd beanstalkd
Build the application from source:
make
Install:
make install
Upon installing, you can start working with the Beanstalkd server. Here are the options for running the daemon:
-b DIR wal directory
-f MS fsync at most once every MS milliseconds (use -f0 for "always fsync")
-F never fsync (default)
-l ADDR listen on address (default is 0.0.0.0)
-p PORT listen on port (default is 11300)
-u USER become user and group
-z BYTES set the maximum job size in bytes (default is 65535)
-s BYTES set the size of each wal file (default is 10485760)
(will be rounded up to a multiple of 512 bytes)
-c compact the binlog (default)
-n do not compact the binlog
-v show version information
-V increase verbosity
-h show this help
# Usage: beanstalkd -l [ip address] -p [port #]
# For local only access:
beanstalkd -l 127.0.0.1 -p 11301 &
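After starting the daemon in the background, it can be handy to confirm from Python that something is accepting connections on the chosen address. The helper below is a minimal sketch using only the standard library; the name `is_listening` is hypothetical, and the defaults simply mirror the local-only example above (127.0.0.1, port 11301).

```python
import socket


def is_listening(host="127.0.0.1", port=11301, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        conn = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        # Connection refused, unreachable host, or no answer in time.
        return False
    conn.close()
    return True
```

Note that this only shows that a TCP port is open; it does not prove the process behind it is Beanstalkd.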
If installed through the package manager (i.e. aptitude), you will be able to manage the Beanstalkd daemon as a service.
# To start the service:
service beanstalkd start
# To stop the service:
service beanstalkd stop
# To restart the service:
service beanstalkd restart
# To check the status:
service beanstalkd status
Beanstalkd comes with a long list of supported client libraries to work with many different application deployments. The supported languages and frameworks include:
Python
Django
Go
Java
Node.js
Perl
PHP
Ruby
and more.
For a full list of supported languages and installation instructions for your favourite, check out the client libraries page on GitHub for Beanstalkd.
In this section, before completing the article, let’s quickly go over the basic usage of Beanstalkd. In our examples, we will be working with the Python language and Beanstalkd’s Python bindings, beanstalkc.
To install beanstalkc, run the following commands:
pip install pyyaml
pip install beanstalkc
In every Python file that works with Beanstalkd, you need to import beanstalkc and connect:
import beanstalkc
# Connection
beanstalk = beanstalkc.Connection(host='localhost', port=11301)
To enqueue a job:
beanstalk.put('job_one')
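Under the hood, a client library turns a call like the one above into a short text command sent over the TCP connection. The sketch below builds that command following the Beanstalkd protocol's `put <pri> <delay> <ttr> <bytes>` format. The helper name `build_put_command` and the default values chosen here are illustrative assumptions, not necessarily what beanstalkc sends.

```python
def build_put_command(body, priority=1024, delay=0, ttr=60):
    """Return the bytes a client would send to enqueue `body` (a byte string)."""
    header = "put {0} {1} {2} {3}\r\n".format(priority, delay, ttr, len(body))
    # The job body follows the header line and is itself terminated by CRLF.
    return header.encode("ascii") + body + b"\r\n"


command = build_put_command(b"job_one")
print(repr(command))
```

On success the server answers with a line of the form `INSERTED <id>`, which is the job ID that the client library hands back to the caller.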
To receive a job:
job = beanstalk.reserve()
# job.body == 'job_one'
To delete a job after processing it:
job.delete()
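In practice, a worker usually deletes a job only when processing succeeded and sets it aside otherwise. The following is a minimal sketch of that pattern, assuming `job` is an object like the one returned by beanstalkc's `reserve()` (offering `body`, `delete()` and `bury()`); the names `process_job` and `handler` are hypothetical.

```python
def process_job(job, handler):
    """Run handler on the job body, then delete the job or bury it on failure."""
    try:
        handler(job.body)
    except Exception:
        # Keep the failed job on the server so it can be inspected later.
        job.bury()
        return False
    # Processing succeeded, so the job can be removed for good.
    job.delete()
    return True
```

With a live connection this could be called as `process_job(beanstalk.reserve(), my_handler)`, where `my_handler` is whatever function does the actual work.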
To use a specific tube (i.e. queue / list):
beanstalk.use('tube_a')
To list all available tubes:
beanstalk.tubes()
# ['default', 'tube_a']
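One detail worth knowing: in Beanstalkd, `use` selects the tube that subsequent `put` calls write to, while a consumer chooses the tubes that `reserve` reads from with `watch`. The sketch below separates the two roles. It assumes `conn` is a beanstalkc connection (or any object offering the same `use`, `put`, `watch`, `ignore` and `reserve` methods); the helper names are hypothetical.

```python
def send_to_tube(conn, tube, body):
    """Producer side: select the tube for writing, then enqueue the job."""
    conn.use(tube)
    return conn.put(body)


def receive_from_tube(conn, tube, timeout=5):
    """Consumer side: watch only `tube`, then wait up to `timeout` seconds."""
    conn.watch(tube)
    if tube != "default":
        # Stop listening on the default tube so only `tube` is consumed.
        conn.ignore("default")
    return conn.reserve(timeout=timeout)
```

A producer would call `send_to_tube(beanstalk, 'tube_a', 'job_one')`, and a worker `receive_from_tube(beanstalk, 'tube_a')`. With beanstalkc, `reserve` returns `None` when the timeout passes without a job becoming available.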
Final example (nano btc_ex.py):
import beanstalkc
# Connect
beanstalk = beanstalkc.Connection(host='localhost', port=11301)
# See all tubes:
beanstalk.tubes()
# Switch to the default (tube):
beanstalk.use('default')
# To enqueue a job:
beanstalk.put('job_one')
# To receive a job:
job = beanstalk.reserve()
# Work with the job:
print(job.body)
# Delete the job:
job.delete()
Press CTRL+X and confirm with Y to save and exit.
When you run the above script, you should see the job’s body being printed:
python btc_ex.py
# job_one
To see more about beanstalkd (and beanstalkc) operations, check out its Getting Started tutorial.
Submitted by: O.S. Tezer (https://twitter.com/ostezer)
Thanks you for this tutorial. I installed Beanstalkd in my CentOS system using yum install beanstalkd. How can I make this persistent so Beanstalkd runs even during reboots or when it crashes? I read your section that says to edit the default configuration for launch at system boot but there is no file in /etc/default/beanstalkd. Can someone tell me how to auto-restart Beanstalkd that is installed on CentOS using yum install beanstalkd?
@kamaln7 Hello, I have already get the Beanstalkd installed on My VPS, but I need to increase the job size limit, could you point me on this?
Thanks.
i have install beanstalkd but its giving error while i am trying to start “service beanstalkd start”, error :
Job for beanstalkd.socket failed. See “systemctl status beanstalkd.socket” and “journalctl -xe” for details.
Could i access beanstalkd server from remote address? I mean have an application deployed to a sharing hosting and i want to install beanstalkd on digitalocean VPS, so would have access from my VPS IP address?
Hi Now I wan to work with the PHP language How to set up add a job to - beanstalk Thanks
@jesse.a.gordon: Thanks Jesse! I’ve updated the article.
I believe you have a typo
To start the service:
service beanstalks start
should be
To start the service:
service beanstalkd start
Just in case people are copying & pasting to their console, I would never do such a thing!