Effective web server log management is crucial for maintaining your website’s performance, troubleshooting issues, and gaining insights into user behavior. Apache is one of the most popular web servers. It generates access and error logs that contain valuable information. To efficiently manage and analyze these logs, you can use Logstash to process and forward them to DigitalOcean’s Managed OpenSearch for indexing and visualization.
In this tutorial, we will guide you through installing Logstash on a Droplet, configuring it to collect your Apache logs, and sending them to Managed OpenSearch for analysis.
Logstash can be installed from the binary files or via the package repositories. For easier management and updates, using package repositories is generally recommended.
In this section, we’ll guide you through installing Logstash on your Droplet using either the APT or the YUM package manager, depending on your distribution.
Let’s identify the OS:
cat /etc/os-release
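The ID and ID_LIKE fields in that file indicate which of the two installation paths applies. A minimal sketch, assuming a POSIX shell; the pick_pkg_manager helper and its list of distribution IDs are illustrative assumptions, not an exhaustive mapping:

```shell
# Hypothetical helper: map os-release ID / ID_LIKE values to a package manager.
# The distribution list is an illustrative assumption, not a complete one.
pick_pkg_manager() {
  case " ${1:-} ${2:-} " in
    *" debian "*|*" ubuntu "*) echo "apt" ;;
    *" rhel "*|*" centos "*|*" fedora "*) echo "yum" ;;
    *) echo "unknown" ;;
  esac
}

# Read the real values in a subshell so they do not leak into your session.
if [ -r /etc/os-release ]; then
  (
    . /etc/os-release
    echo "Package manager: $(pick_pkg_manager "${ID:-}" "${ID_LIKE:-}")"
  )
fi
```

If the result is unknown, check the output of cat /etc/os-release by hand and follow whichever set of steps matches your distribution family.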
On APT-based systems (such as Debian and Ubuntu), download and install the public signing key:
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg
You may need to install the apt-transport-https package on Debian before proceeding:
sudo apt-get install apt-transport-https
Save the repository definition to /etc/apt/sources.list.d/elastic-8.x.list:
echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-8.x.list
Use the echo method described above to add the Logstash repository. Do not use add-apt-repository, as it will also add a deb-src entry, and Elastic does not provide a source package. If you have added the deb-src entry, you will see an error like the following:
Unable to find expected entry 'main/source/Sources' in Release file (Wrong sources.list entry or malformed file)
Delete the deb-src entry from the /etc/apt/sources.list file and the installation should work as expected.
Run sudo apt-get update and the repository is ready for use. You can install Logstash with:
sudo apt-get update && sudo apt-get install logstash
On YUM-based systems (such as CentOS, RHEL, and Fedora), download and install the public signing key:
sudo rpm --import https://artifacts.elastic.co/GPG-KEY-elasticsearch
Add the following to your /etc/yum.repos.d/logstash.repo file. You can use tee to create the file and write its contents in one step:
sudo tee /etc/yum.repos.d/logstash.repo > /dev/null <<EOF
[logstash-8.x]
name=Elastic repository for 8.x packages
baseurl=https://artifacts.elastic.co/packages/8.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
EOF
Your repository is ready for use. You can install Logstash with:
sudo yum install logstash
For further information, please refer to the Installing Logstash guide.
A Logstash pipeline consists of three main stages: input, filter, and output. Logstash pipelines make use of plugins. You can make use of community plugins or create your own.
Input: This stage collects data from various sources. Logstash supports numerous input plugins to handle data sources like log files, databases, message queues, and cloud services.
Filter: This stage processes and transforms the data collected in the input stage. Filters can modify, enrich, and structure the data to make it more useful and easier to analyze.
Output: This stage sends the processed data to a destination. Destinations can include databases, files, and data stores like OpenSearch.
The OpenSearch output plugin can be installed by running the following command:
sudo /usr/share/logstash/bin/logstash-plugin install logstash-output-opensearch
More information can be found in the logstash-output-opensearch plugin repository.
Now let’s create a pipeline.
Create a new file called apache_pipeline.conf in /etc/logstash/conf.d/ and copy the following contents into it:
input {
  file {
    path => "/var/log/apache2/access.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => "apache_access"
  }
  file {
    path => "/var/log/apache2/error.log"
    start_position => "beginning"
    sincedb_path => "/dev/null"
    tags => "apache_error"
  }
}
filter {
  if "apache_access" in [tags] {
    grok {
      match => { "message" => "%{HTTPD_COMBINEDLOG}" }
    }
    mutate {
      remove_field => [ "message","[log][file][path]","[event][original]" ]
    }
  } else {
    grok {
      match => { "message" => "%{HTTPD24_ERRORLOG}" }
    }
  }
}
output {
  if "apache_access" in [tags] {
    opensearch {
      hosts => "https://<OpenSearch-Hostname>:25060"
      user => "doadmin"
      password => "<your_password>"
      index => "apache_access"
      ssl_certificate_verification => true
    }
  } else {
    opensearch {
      hosts => "https://<OpenSearch-Hostname>:25060"
      user => "doadmin"
      password => "<your_password>"
      index => "apache_error"
      ssl_certificate_verification => true
    }
  }
}
Replace <OpenSearch-Hostname> with your OpenSearch server’s hostname and <your_password> with your OpenSearch password.
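Before starting the service, it can help to check the file for syntax errors. A sketch, assuming the package install placed the binary at /usr/share/logstash/bin/logstash, the settings directory at /etc/logstash, and created a logstash service user; the LOGSTASH_BIN and PIPELINE variable names are only for illustration:

```shell
# Assumed default locations for a package-based install.
LOGSTASH_BIN=/usr/share/logstash/bin/logstash
PIPELINE=/etc/logstash/conf.d/apache_pipeline.conf

if [ -x "$LOGSTASH_BIN" ]; then
  # --config.test_and_exit parses the configuration, reports any errors,
  # and exits without starting the pipeline.
  sudo -u logstash "$LOGSTASH_BIN" --path.settings /etc/logstash \
    -f "$PIPELINE" --config.test_and_exit
else
  echo "Logstash binary not found at $LOGSTASH_BIN"
fi
```

This check only covers the configuration syntax. It does not confirm that the hostname or password is correct, so the connectivity checks later in this tutorial are still needed.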
Let’s break down the above configuration.
INPUT: This section configures the source of the events. The file input plugin is used here.
path => "/var/log/apache2/access.log": Specifies the path to the Apache access log file that Logstash will read from.
Make sure the user that the Logstash service runs as can read the input path.
start_position => "beginning": Defines where Logstash should start reading the log file. "beginning" indicates that Logstash should process the file from the start, rather than only picking up new lines appended at the end.
sincedb_path => "/dev/null": Specifies the path to a sincedb file. Logstash uses sincedb files to keep track of its current position in each log file, so it can resume where it left off after a restart or failure. Pointing it at /dev/null discards that position, so the files are re-read from the beginning on every restart. This is convenient for testing but can index the same events more than once.
tags => "apache_access": Assigns a tag to events read from this input. Tags are useful for identifying and routing events within Logstash. This pipeline uses them in both the filter and output stages.
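One way to check the read-access requirement is to test the files as the user the service runs as. A rough sketch, assuming a POSIX shell and that the service user is named logstash (the usual name for package installs); report_readable is a hypothetical helper name:

```shell
# Hypothetical helper: report whether each given file is readable by the
# user running this script.
report_readable() {
  for f in "$@"; do
    if [ -r "$f" ]; then
      echo "readable: $f"
    else
      echo "NOT readable: $f"
    fi
  done
}

# To test as the service user, save this in a script (the file name is up to
# you) and run it with: sudo -u logstash sh <script>
report_readable /var/log/apache2/access.log /var/log/apache2/error.log
```

On Debian and Ubuntu, the Apache logs are typically readable only by root and the adm group, so adding the logstash user to that group is a common fix. Confirm the ownership on your system with ls -l before changing anything.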
FILTER: This section processes the events.
Starting with the conditional:
if "apache_access" in [tags]
This checks whether the tag apache_access exists in the [tags] field of the incoming log events. We use this conditional to apply the appropriate grok filter to Apache access logs and error logs.
Grok Filter (for Apache Access Logs):
grok {
match => { "message" => "%{HTTPD_COMBINEDLOG}" }
}
The pattern %{HTTPD_COMBINEDLOG} is a predefined grok pattern in Logstash for parsing the Apache combined access log format. It extracts fields like IP address, timestamp, HTTP method, URI, and status code from the message field of incoming events.
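To make the extracted fields concrete, here is a rough shell approximation of what the pattern pulls out of one line. The sample log line is made up, and the awk field splitting is a simplification for illustration only. It is not the actual grok pattern, and the key names are not the field names Logstash emits:

```shell
# Made-up sample line in Apache combined log format.
line='203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "curl/8.5.0"'

# Hypothetical helper: split on whitespace, then strip the surrounding
# brackets and the leading quote from the timestamp and request fields.
parse_combined() {
  printf '%s\n' "$1" | awk '{
    sub(/^\[/, "", $4); sub(/\]$/, "", $5); sub(/^"/, "", $6)
    printf "client=%s\ntimestamp=%s %s\nmethod=%s\nuri=%s\nstatus=%s\nbytes=%s\n", $1, $4, $5, $6, $7, $9, $10
  }'
}

parse_combined "$line"
```

Grok does the same kind of decomposition with named regular expressions, which is why the parsed events can later be filtered by status code or client address in OpenSearch.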
Mutate Filter (optional): After the Apache logs are parsed, we use the remove_field option of the mutate filter to remove certain fields.
mutate {
remove_field => [ "message","[log][file][path]","[event][original]" ]
}
Else Condition: The else block is executed if the apache_access tag is not present in [tags]. This else block contains another grok filter for Apache error logs.
grok {
match => { "message" => "%{HTTPD24_ERRORLOG}" }
}
The pattern %{HTTPD24_ERRORLOG} parses messages that match the Apache error log format. It extracts fields relevant to error logs, such as the timestamp, log level, and error message.
GROK patterns can be found at: https://github.com/logstash-plugins/logstash-patterns-core/tree/main/patterns.
OUTPUT: The output plugin sends events to a particular destination.
The output block begins with an if conditional:
if "apache_access" in [tags] {}
This conditional routes logs to two separate OpenSearch indexes, apache_access and apache_error.
Let’s explore the OpenSearch Output plugin:
hosts => "https://XXX:25060": Your OpenSearch hostname and port.
user => "doadmin": Your OpenSearch username.
password => "XXXXX": Your OpenSearch password.
index => "apache_error": The index name in OpenSearch.
ssl_certificate_verification => true: Enables SSL certificate verification.
Once the pipeline is configured, enable and start the Logstash service, then check its status:
sudo systemctl enable logstash.service
sudo systemctl start logstash.service
sudo systemctl status logstash.service
You can verify that Logstash can connect to OpenSearch by testing connectivity:
curl -u your_username:your_password -X GET "https://your-opensearch-server:25060/_cat/indices?v"
Replace your-opensearch-server with your OpenSearch server’s hostname, and your_username and your_password with your OpenSearch credentials.
Ensure that data is properly indexed in OpenSearch:
curl -u your_username:your_password -X GET "https://your-opensearch-server:25060/<your-index-name>/_search?pretty"
Replace your-opensearch-server with your OpenSearch server’s hostname, your_username and your_password with your OpenSearch credentials, and <your-index-name> with the index name (apache_access or apache_error).
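To check both indexes in one pass, the _count endpoint returns the number of documents in an index. A sketch, assuming the hostname and credentials are supplied through the hypothetical environment variables OPENSEARCH_HOST, OPENSEARCH_USER, and OPENSEARCH_PASS; the count_url helper is also illustrative:

```shell
# Hypothetical helper: build the _count URL for an index on port 25060.
count_url() {
  printf 'https://%s:25060/%s/_count\n' "$1" "$2"
}

# Only call the cluster when a hostname has been provided.
if [ -n "${OPENSEARCH_HOST:-}" ]; then
  for index in apache_access apache_error; do
    echo "== $index =="
    curl -sS -u "${OPENSEARCH_USER:-doadmin}:${OPENSEARCH_PASS:-}" \
      "$(count_url "$OPENSEARCH_HOST" "$index")"
    echo
  done
fi
```

A count that grows as you request pages from the web server suggests that events are flowing from Apache through Logstash into OpenSearch.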
Ensure that firewall rules and network settings allow traffic between Logstash and OpenSearch on port 25060.
The logs for Logstash can be found at /var/log/logstash/logstash-plain.log.
For details, refer to Troubleshooting.
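When something is not working, filtering the Logstash log file for warnings and errors is usually a quick first look. A sketch, assuming the default plain-text log layout in which each line carries a bracketed level such as [ERROR] or [WARN ]; recent_problems is a hypothetical helper:

```shell
# Hypothetical helper: show the last N (default 20) WARN/ERROR lines of a log.
recent_problems() {
  { grep -E '\[(ERROR|WARN) *\]' "$1" || true; } | tail -n "${2:-20}"
}

LOG=/var/log/logstash/logstash-plain.log
if [ -r "$LOG" ]; then
  recent_problems "$LOG"
fi
```

Errors that mention the opensearch output usually point at the hostname, credentials, or network path rather than at the grok filters.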
In this guide, we walked through setting up Logstash to collect and forward Apache logs to OpenSearch. Here’s a quick recap of what we covered:
Installing Logstash: We covered how to use either APT or YUM package managers, depending on your Linux distribution, to install Logstash on your Droplet.
Configuring Logstash: We created and adjusted the Logstash configuration file to ensure that Apache logs are correctly parsed and sent to OpenSearch.
Verifying in OpenSearch: We used curl to confirm connectivity to OpenSearch and to check that your logs are being indexed properly.
With these steps completed, you should now have a functional setup where Logstash collects Apache logs and sends them to OpenSearch.