Welcome to Day 11 of the “12 Days of DigitalOcean” series! Yesterday, we set up DigitalOcean Spaces to securely store attachments from inbound emails. Today, we’ll take it a step further by integrating Google Sheets to store both the extracted receipt details and the URLs of uploaded attachments.
Google Sheets is simple, familiar, and ideal for managing and tracking receipt data. By the end of this tutorial, your app will seamlessly store extracted receipt details and attachment URLs in a Google Sheet for easy organization. Let’s get started!
Note: While we’re using Google Sheets for this project, you can use any database, such as PostgreSQL or MongoDB, for a more robust storage solution. If you prefer PostgreSQL, check out these tutorials from the Birthday Reminder series to get started:
To get the most out of this tutorial, we assume the following:
Receipts
: Ensure you have a blank Google Sheet titled Receipts
(or any other name) ready to store the extracted data. You’ll share this sheet with the service account created during this tutorial.Note: Even if you don’t have everything set up yet, you’ll still learn how to:
gspread
library to update Google Sheets programmatically.Before we can save anything to Google Sheets, we need to make sure our app has permission to interact with it. This involves creating a Google Cloud Project, enabling the necessary APIs, and setting up a service account to handle the heavy lifting.
A Google Cloud Project is the foundation for everything you’ll do with Google APIs. It acts as your app’s home base for managing resources and configurations.
Go to the Google Cloud Console and click New Project.
Name your project (e.g., Receipt Processor
) and click Create.
The Google Sheets API lets your app write to Google Sheets, and the Google Drive API gives your app access to files stored in Google Drive.
Your app can’t log in to Google Sheets the same way you do. Instead, it needs a service account—a special bot that handles authentication and permissions for your app. This ensures your app can securely access Google Sheets without requiring manual logins or user interaction.
Go to APIs & Services > Credentials and click + Create Credentials.
Select Service Account
Give your service account a name (e.g., Receipt Bot
) and a description like “Handles communication with Google Sheets for receipt tracking.”
<$> [info]
Note: Your service account will generate an email address (e.g., something@project-id.iam.gserviceaccount.com
). You’ll need this later to share access to your Google Sheet.
<$>
Click Create and Continue until you’re back at the credentials screen. You can skip assigning roles if you don’t need extra permissions.
The service account needs credentials to prove its identity. These come in the form of a JSON file.
In the Credentials screen, locate your service account and click the pencil icon to edit it.
Go to the Keys tab and click Add Key > Create New Key.
Select JSON and download the file.
<$> [info] Note: This file contains everything your app needs to authenticate with Google Sheets. Treat it like a password—don’t share it or commit it to Git. <$>
Finally, give the service account permission to access your Google Sheet.
Get the client_email from the JSON file you downloaded
Open your Google Sheet. Click Share, paste the email address, and give it Editor access.
Now your app can read from and write to the Google Sheet securely.
We don’t want sensitive credentials like the service account JSON lying around in our codebase—it’s a security risk. Instead, we’ll store the credentials as an environment variable in DigitalOcean App Platform, where they’ll be safe and accessible only at runtime.
The tricky part is that Google’s credentials JSON file is formatted for readability, but environment variables need a single-line string. Let’s fix that first.
Google’s credentials JSON is pretty, but we need it compact for storage as an environment variable. Here’s how to convert it into a single-line string:
Run this Python snippet on your local machine:
import json
with open("path/to/service-account.json", "r") as file:
creds = json.load(file)
print(json.dumps(creds))
This will output the JSON as a single line. Copy the result.
Now, let’s store the single-line JSON securely in DigitalOcean App Platform:
Go to your app’s Settings > Environment Variables.
Add a new variable:
GOOGLE_CREDS={"type": "service_account", "project_id": "receipt-processor-45a6b", ......}
Check the Encrypt option to keep the credentials safe. Click Save.
This action will trigger an automatic redeployment of your app.
That’s it! Your credentials are now securely stored, and your app is ready to use them at runtime.
Now that your credentials are securely stored, it’s time to connect your app to Google Sheets and update it to handle receipt data and attachments.
Before diving into the code, let’s make sure you’ve installed the necessary dependencies and updated your requirements.txt
file. This ensures your app has all the libraries it needs to run seamlessly.
Run the following command to install all required Python libraries:
pip install flask boto3 python-dotenv gspread oauth2client openai
Next, freeze your dependencies into a requirements.txt
file:
pip freeze > requirements.txt
This step captures all your app’s dependencies, making it easier to deploy and manage in DigitalOcean App Platform.
This step allows your app to authenticate with Google Sheets using the credentials stored in GOOGLE_CREDS
. Once authenticated, the app can read and write to the sheet programmatically.
Add this code to your app:
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import os
import json
# Load credentials from environment variables
# GOOGLE_CREDS is the single-line JSON string that contains the service account credentials.
creds_json = os.getenv("GOOGLE_CREDS")
creds_dict = json.loads(creds_json) # Convert the JSON string back into a dictionary.
# Define the required scopes for accessing Google Sheets and Google Drive.
# The "spreadsheets" scope allows the app to read/write Sheets, and the "drive" scope allows access to Sheets stored in Drive.
scope = ["https://spreadsheets.google.com/feeds", "https://www.googleapis.com/auth/drive"]
# Authenticate using the service account credentials.
credentials = ServiceAccountCredentials.from_json_keyfile_dict(creds_dict, scope)
# Create a gspread client to interact with Google Sheets.
sheets_client = gspread.authorize(credentials)
# Open the Google Sheet by name. Replace "Receipts" with the name of your Sheet.
# This provides access to the first worksheet in the Sheet, which you can use to read/write rows.
sheet = sheets_client.open("Receipts").sheet1
Your app processes email data and attachments—this function ensures both are saved to Google Sheets. Each row includes receipt details (e.g., vendor, amount, currency) and attachment URLs.
Add this function to your app:
def save_to_google_sheets(extracted_data, attachment_urls):
"""
Save extracted receipt data and attachment URLs to Google Sheets.
"""
try:
# Combine all attachment URLs into a single string, separated by commas.
# This ensures all URLs are stored in one cell in the Sheet.
attachments_str = ", ".join([attachment["url"] for attachment in attachment_urls])
# Append a new row with extracted data and attachment URLs to the Google Sheet.
# Each element in the list corresponds to a column in the Sheet.
sheet.append_row([
extracted_data.get("vendor", ""),
extracted_data.get("amount", ""),
extracted_data.get("currency", ""),
extracted_data.get("date", ""),
attachments_str # Store all attachment URLs in a single column
])
# Log a success message to confirm data was saved.
logging.info("Data and attachments saved to Google Sheets.")
except Exception as e:
# Log an error if something goes wrong while saving to the Sheet.
logging.error(f"Failed to save data to Google Sheets: {e}")
Here’s the complete code for your app, consolidating all the pieces we’ve worked on so far.
from flask import Flask, request, jsonify
import os
import base64
import uuid
import boto3
from dotenv import load_dotenv
from openai import OpenAI
import logging
import json
import gspread
from oauth2client.service_account import ServiceAccountCredentials
# Load environment variables
load_dotenv()
# Initialize Flask app
app = Flask(__name__)
# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
# Initialize DigitalOcean GenAI client
SECURE_AGENT_KEY = os.getenv("SECURE_AGENT_KEY")
AGENT_BASE_URL = os.getenv("AGENT_BASE_URL")
AGENT_ENDPOINT = f"{AGENT_BASE_URL}/api/v1/"
client = OpenAI(base_url=AGENT_ENDPOINT, api_key=SECURE_AGENT_KEY)
# DigitalOcean Spaces credentials
SPACES_ACCESS_KEY = os.getenv("SPACES_ACCESS_KEY")
SPACES_SECRET_KEY = os.getenv("SPACES_SECRET_KEY")
SPACES_BUCKET = os.getenv("SPACES_BUCKET_NAME")
SPACES_REGION = os.getenv("SPACES_REGION")
SPACES_ENDPOINT = f"https://{SPACES_BUCKET}.{SPACES_REGION}.digitaloceanspaces.com"
# Initialize DigitalOcean Spaces client
session = boto3.session.Session()
s3_client = session.client(
's3',
region_name=SPACES_REGION,
endpoint_url=SPACES_ENDPOINT,
aws_access_key_id=SPACES_ACCESS_KEY,
aws_secret_access_key=SPACES_SECRET_KEY
)
# Google Sheets API setup
creds_json = os.getenv("GOOGLE_CREDS")
if not creds_json:
raise ValueError("Google credentials not found in environment variables.")
creds_dict = json.loads(creds_json)
scope = ["https://spreadsheets.google.com/feeds", "https://www.googleapis.com/auth/drive"]
credentials = ServiceAccountCredentials.from_json_keyfile_dict(creds_dict, scope)
sheets_client = gspread.authorize(credentials)
sheet = sheets_client.open("Receipts").sheet1 # Replace "Receipts" with your sheet name
def extract_text_from_email(email_content):
"""Extract relevant details from the email content using DigitalOcean GenAI."""
logging.debug("Extracting details from email content.")
prompt = (
"Extract the following details from the email:\n"
"- Date of transaction\n"
"- Amount\n"
"- Currency\n"
"- Vendor name\n\n"
f"Email content:\n{email_content}\n\n"
"Ensure the output is in JSON format with keys: date, amount, currency, vendor."
)
response = client.chat.completions.create(
model="your-model-id", # Replace with your GenAI model ID
messages=[{"role": "user", "content": prompt}]
)
logging.debug("GenAI processing completed.")
return json.loads(response.choices[0].message.content)
def decode_and_save_attachment(attachment):
"""Decode base64-encoded attachment and save it locally with a unique name."""
file_name = attachment.get("Name")
encoded_content = attachment.get("Content")
if not file_name or not encoded_content:
logging.warning("Invalid attachment, skipping.")
return None
unique_file_name = f"{uuid.uuid4()}_{file_name}"
file_path = os.path.join("/tmp", unique_file_name)
try:
with open(file_path, "wb") as file:
file.write(base64.b64decode(encoded_content))
logging.info(f"Attachment saved locally: {file_path}")
return file_path
except Exception as e:
logging.error(f"Failed to decode and save attachment {file_name}: {e}")
return None
def upload_attachment_to_spaces(file_path):
"""Upload a file to DigitalOcean Spaces and return its public URL."""
file_name = os.path.basename(file_path)
object_name = f"email-receipt-processor/{file_name}"
try:
s3_client.upload_file(file_path, SPACES_BUCKET, object_name, ExtraArgs={"ACL": "public-read"})
file_url = f"https://{SPACES_BUCKET}.{SPACES_REGION}.cdn.digitaloceanspaces.com/{object_name}"
logging.info(f"Attachment uploaded to Spaces: {file_url}")
return file_url
except Exception as e:
logging.error(f"Failed to upload attachment {file_name} to Spaces: {e}")
return None
def process_attachments(attachments):
"""Process all attachments and return their URLs."""
attachment_urls = []
for attachment in attachments:
file_path = decode_and_save_attachment(attachment)
if file_path:
file_url = upload_attachment_to_spaces(file_path)
if file_url:
attachment_urls.append({"file_name": os.path.basename(file_path), "url": file_url})
os.remove(file_path) # Clean up local file
return attachment_urls
def save_to_google_sheets(extracted_data, attachment_urls):
"""Save extracted receipt data and attachment URLs to Google Sheets."""
try:
# Combine all attachment URLs into a single string (comma-separated)
attachments_str = ", ".join([attachment["url"] for attachment in attachment_urls])
# Append a new row with extracted data and attachment URLs
sheet.append_row([
extracted_data.get("vendor", ""),
extracted_data.get("amount", ""),
extracted_data.get("currency", ""),
extracted_data.get("date", ""),
attachments_str # Store all attachment URLs in a single column
])
logging.info("Data and attachments saved to Google Sheets.")
except Exception as e:
logging.error(f"Failed to save data to Google Sheets: {e}")
@app.route('/inbound', methods=['POST'])
def handle_inbound_email():
"""Process inbound emails and return extracted JSON."""
logging.info("Received inbound email request.")
data = request.json
email_content = data.get("TextBody", "")
attachments = data.get("Attachments", [])
if not email_content:
logging.error("No email content provided.")
return jsonify({"error": "No email content provided"}), 400
extracted_data = extract_text_from_email(email_content)
attachment_urls = process_attachments(attachments)
# Save extracted data and attachment URLs to Google Sheets
save_to_google_sheets(extracted_data, attachment_urls)
response_data = {
"extracted_data": extracted_data,
"attachments": attachment_urls
}
# Log the final combined data
logging.info("Final Response Data: %s", response_data)
return jsonify(response_data)
if __name__ == "__main__":
logging.info("Starting Flask application.")
app.run(port=5000)
To deploy the updated Flask app, follow the steps from Day 7: Building and Deploying the Email-Based Receipt Processor. Here’s a quick summary:
Push Your Updated Code to GitHub: After making the necessary changes to your Flask app, commit and push the updated code to GitHub. This will trigger an automatic deployment in DigitalOcean’s App Platform.
git add .
git commit -m "Add attachment processing with DigitalOcean Spaces"
git push origin main
Monitor Deployment: You can track the progress in the Deployments section of your app’s dashboard.
Verify Your Deployment: After the deployment completes, navigate to your app’s public URL and test its functionality. You can also check the runtime logs in the dashboard to confirm that the app started successfully.
Now that your app is fully configured and ready, it’s time to test the entire workflow. We’ll ensure that the email body is processed, attachments are decoded and uploaded to DigitalOcean Spaces, and the final output includes receipt details and attachment URLs, all saved in Google Sheets.
Here’s how you can test step by step:
Send a Test Email: Send an email to Postmark with a text body and an attachment. If you’re unsure how to configure Postmark, check Day 8: Connecting Postmark to Your Flask App where we walked through setting up Postmark to forward emails to your app.
Check Postmark Activity JSON: In the Postmark dashboard, navigate to the Activity tab. Locate the email you sent, and ensure that the JSON payload includes the text body and Base64-encoded attachment data. This confirms Postmark is correctly forwarding the email data to your app.
Monitor the Logs: Check the runtime logs in your DigitalOcean App Platform dashboard to ensure the app processes the JSON payload. We covered how to access runtime logs in Day 9: Automating Receipt Parsing with DigitalOcean’s GenAI Agent.
Verify Spaces Upload: Visit your DigitalOcean Space to confirm that the files were uploaded successfully. You should see the attachments in your bucket.
Check Google Sheets: Open your Google Sheet and confirm that receipt details and attachment URLs are saved as a new row. The details should include:
By the end of these steps, your app will have successfully completed the full email-to-Google Sheets workflow, setting the stage for further automation.
Amazing work! Today, you’ve learned how to seamlessly integrate Google Sheets into your app to manage receipt data. Specifically, you:
Set up a Google Cloud Project and enabled the Sheets API.
Created a service account and securely stored its credentials in DigitalOcean App Platform.
Used gspread
to programmatically update Google Sheets with receipt details and attachment URLs.
Up Next: In the final tutorial, we’ll complete the automation by enabling your app to send confirmation emails whenever a receipt is processed successfully. These emails will include the extracted receipt details and a direct link to the Google Sheet for easy access and verification.
See you in the last chapter of the 12 Days of DigitalOcean series! 🚀
Thanks for learning with the DigitalOcean Community. Check out our offerings for compute, storage, networking, and managed databases.
This textbox defaults to using Markdown to format your answer.
You can type !ref in this text area to quickly search our full set of tutorials, documentation & marketplace offerings and insert the link!