Note: This tutorial is updated for Python 3.12+. The techniques shown work across most modern Python 3 versions.
Python string equality can be checked with the == operator or the __eq__() method. Python strings are case sensitive, so both of these checks are case sensitive as well.
Let’s look at some examples to check if two strings are equal or not.
s1 = 'Apple'
s2 = 'Apple'
s3 = 'apple'
# case sensitive equals check
if s1 == s2:
    print(f"s1 and s2 are equal: {s1 == s2}")
if s1.__eq__(s2):
    print(f"s1 and s2 are equal: {s1.__eq__(s2)}")
You can also encapsulate string comparisons in a function:
def compare_strings(a: str, b: str) -> None:
    print(f"{a!r} == {b!r}: {a == b}")
    print(f"{a!r} != {b!r}: {a != b}")
compare_strings(s1, s2)
compare_strings(s1, s3)
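Output:
s1 and s2 are equal: True
s1 and s2 are equal: True
'Apple' == 'Apple': True
'Apple' != 'Apple': False
'Apple' == 'apple': False
'Apple' != 'apple': True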
If you want to perform an inequality check, you can use the != operator.
if s1 != s3:
    print(f"s1 and s3 are not equal: {s1 != s3}")
Output: s1 and s3 are not equal: True
Sometimes case doesn’t matter when checking whether two strings are equal. In that situation, use the casefold(), lower(), or upper() methods for a case-insensitive equality check.
if s1.casefold() == s3.casefold():
    print(f"s1.casefold(): {s1.casefold()}")
    print(f"s3.casefold(): {s3.casefold()}")
    print(f"s1 and s3 are equal in case-insensitive comparison: {s1.casefold() == s3.casefold()}")
if s1.lower() == s3.lower():
    print(f"s1.lower(): {s1.lower()}")
    print(f"s3.lower(): {s3.lower()}")
    print(f"s1 and s3 are equal in case-insensitive comparison: {s1.lower() == s3.lower()}")
if s1.upper() == s3.upper():
    print(f"s1.upper(): {s1.upper()}")
    print(f"s3.upper(): {s3.upper()}")
    print(f"s1 and s3 are equal in case-insensitive comparison: {s1.upper() == s3.upper()}")
Output:
s1.casefold(): apple
s3.casefold(): apple
s1 and s3 are equal in case-insensitive comparison: True
s1.lower(): apple
s3.lower(): apple
s1 and s3 are equal in case-insensitive comparison: True
s1.upper(): APPLE
s3.upper(): APPLE
s1 and s3 are equal in case-insensitive comparison: True
Let’s look at some examples where strings contain special characters.
s1 = '$#ç∂'
s2 = '$#ç∂'
print(f"s1 == s2? {s1 == s2}")
print(f"s1 != s2? {s1 != s2}")
print(f"s1.lower() == s2.lower()? {s1.lower() == s2.lower()}")
print(f"s1.upper() == s2.upper()? {s1.upper() == s2.upper()}")
print(f"s1.casefold() == s2.casefold()? {s1.casefold() == s2.casefold()}")
Output:
s1 == s2? True
s1 != s2? False
s1.lower() == s2.lower()? True
s1.upper() == s2.upper()? True
s1.casefold() == s2.casefold()? True
That’s all for checking if two strings are equal or not in Python.
You can check out the complete script and more Python string examples in our GitHub repository.
== vs .lower() vs .casefold()
The == operator checks for exact equality, so it is case sensitive. Calling .lower() or .casefold() on both strings before comparing them gives a case-insensitive comparison. The difference between the two is that .lower() does not cover every Unicode case mapping (for example, it leaves the German ß unchanged), while .casefold() applies full Unicode case folding.
Example:
s1 = 'Apple'
s2 = 'apple'
# Using ==
print(s1 == s2) # Output: False
# Using .lower()
print(s1.lower() == s2.lower()) # Output: True
# Using .casefold()
print(s1.casefold() == s2.casefold()) # Output: True
is for strings
The is operator checks for object identity, not equality. Avoid using is for string comparison, as it can lead to unexpected results. Instead, use the == operator.
Example:
s1 = 'Hello'
s2 = 'Hello'
# Using is
print(s1 is s2) # Output: True in CPython because identical literals share one object, but this checks object identity, not equality
# Using ==
print(s1 == s2) # Output: True, this checks for equality
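The two literals above happen to share one object, which hides the problem. The following is a minimal sketch of a case where the two operators disagree. It assumes CPython behavior for strings built at runtime; identity results are an implementation detail and may differ on other interpreters.

```python
import sys

literal = "hello world"
# Build an equal string at runtime so it is a separate object from the literal.
built = " ".join(["hello", "world"])

print(literal == built)   # True: same characters
print(literal is built)   # typically False in CPython: two distinct objects

# sys.intern() returns one shared object per distinct string value,
# so identity only lines up after explicit interning.
print(sys.intern(literal) is sys.intern(built))  # True
```

Because the result of is depends on how each string object was created, it is not a dependable signal of equal content.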
When comparing strings with special characters or Unicode, it’s important to use the appropriate method for your use case. If you need a case-insensitive comparison, use .casefold(). If you need a case-sensitive comparison, use ==.
Example:
s1 = 'ççç'
s2 = 'ÇÇÇ'
# Case-sensitive comparison using ==
print(s1 == s2) # Output: False
# Case-insensitive comparison using .casefold()
print(s1.casefold() == s2.casefold()) # Output: True
is instead of == for string comparison
This is a common mistake when comparing strings in Python. The is operator checks for object identity, not equality. To fix this, use the == operator for string comparison.
Example of the mistake:
s1 = 'Hello'
s2 = 'Hello'
print(s1 is s2) # This will return True if s1 and s2 refer to the same object, not if they have the same value.
Corrected code:
s1 = 'Hello'
s2 = 'Hello'
print(s1 == s2) # This will return True if s1 and s2 have the same value.
None or non-string types
Comparing a string to None or to a non-string type with == does not raise an error; it simply evaluates to False, which can hide a missing value. Ordering comparisons such as < or > between a string and None, on the other hand, do raise a TypeError. To avoid surprises, check for None explicitly or verify the type before comparing.
Example of the mistake:
s1 = 'Hello'
s2 = None
print(s1 == s2) # Prints False without any error, which can mask the fact that s2 is missing.
Corrected code:
s1 = 'Hello'
s2 = None
if isinstance(s2, str):
    print(s1 == s2)
else:
    print("s2 is not a string.")
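As a small illustration of the behavior described here, the sketch below shows that == with None quietly returns False while an ordering comparison raises. The helper name same_text is hypothetical and only one possible way to make the None handling explicit.

```python
from typing import Optional


def same_text(a: Optional[str], b: Optional[str]) -> bool:
    """Hypothetical helper: treat None as a missing value, not as text."""
    if a is None or b is None:
        # Two missing values match; a missing value never matches real text.
        return a is None and b is None
    return a == b


print("Hello" == None)            # False, no exception
print(same_text("Hello", None))   # False
print(same_text(None, None))      # True

try:
    "Hello" < None                # ordering, unlike ==, is not defined here
except TypeError as exc:
    print(f"TypeError: {exc}")
```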
Trailing or leading whitespace can lead to misleading results when comparing strings. To fix this, use the strip() method to remove any leading or trailing whitespace before comparing strings.
Example of the mistake:
s1 = 'Hello'
s2 = ' Hello '
print(s1 == s2) # This will return False because of the leading and trailing whitespace in s2.
Corrected code:
s1 = 'Hello'
s2 = ' Hello '
print(s1 == s2.strip()) # This will return True after removing the leading and trailing whitespace from s2.
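strip() only touches the ends of a string. If inner spacing can also vary (tabs, doubled spaces, stray newlines), one option is to collapse every whitespace run first. The helper below is a hypothetical sketch, and it assumes that differences in inner whitespace are not meaningful for your data.

```python
def collapse_whitespace(s: str) -> str:
    """Hypothetical helper: trim the ends and squeeze inner whitespace runs."""
    # str.split() with no arguments splits on any run of whitespace and
    # drops empty pieces, so re-joining with one space normalizes spacing.
    return " ".join(s.split())


a = "Hello   World"
b = "  Hello\tWorld\n"

print(a == b)                                            # False
print(a.strip() == b.strip())                            # False: inner spacing differs
print(collapse_whitespace(a) == collapse_whitespace(b))  # True
```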
Text often arrives from files, databases, or external APIs with unspecified or incorrect encodings. A common pitfall is comparing a correctly UTF-8-decoded str to text that was decoded as Latin-1 (ISO-8859-1), which maps every byte in 0x80–0xFF to its own code point and therefore never raises an error. The strings may look similar yet never match. Always standardize on UTF-8 at boundaries and fail fast on decode errors. If you must salvage data, re-encode it with the wrong codec and decode it again with the right one, deterministically.
raw = b"Stra\xc3\x9fe" # UTF-8 bytes for "Straße"
wrong = raw.decode('latin-1') # Mis-decoded
right = raw.decode('utf-8') # Correct
print(wrong == right) # False
print(wrong.encode('latin-1').decode('utf-8') == right) # True after repair
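To make "fail fast on decode errors" concrete, here is a minimal sketch of a strict boundary decoder. The function name decode_utf8_strict is an assumption for illustration; the only library behavior it relies on is that bytes.decode raises UnicodeDecodeError on invalid input.

```python
def decode_utf8_strict(raw: bytes) -> str:
    """Hypothetical boundary helper: accept valid UTF-8 only."""
    # errors="strict" is already the default; it is spelled out to show intent.
    return raw.decode("utf-8", errors="strict")


good = b"Stra\xc3\x9fe"   # valid UTF-8 for "Straße"
bad = b"Stra\xdfe"        # Latin-1 bytes for "Straße", not valid UTF-8

print(decode_utf8_strict(good) == "Straße")  # True

try:
    decode_utf8_strict(bad)
except UnicodeDecodeError as exc:
    print(f"rejected input: {exc.reason}")
```

Rejecting the bad payload at the boundary is usually easier to debug than a comparison that silently fails much later.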
Invisible characters such as the Zero-Width Space (ZWSP, \u200b), the Byte Order Mark (BOM, \ufeff), and the Zero-Width Joiner/Non-Joiner can sneak in via copy/paste, web forms, or PDFs. They derail equality checks, sorting, and deduplication. Detect them by inspecting code points or lengths and remove them as part of your input sanitation pipeline. Consider logging sanitized and raw values separately for diagnostics so that cleanup does not mask the original issue during investigations.
s = "hello\u200bworld" # visually "helloworld"
print(len("helloworld"), len(s)) # 10 vs 11
print("helloworld" == s) # False
CLEAN = ''.join(ch for ch in s if ch not in "\u200b\u200c\u200d\ufeff")
print("helloworld" == CLEAN) # True
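For the detection step, one possible approach is to report every character in the Unicode "Cf" (format) category, which includes the four characters filtered above. The helper name find_format_chars is hypothetical. Note that "Cf" is broader than those four (it also covers characters such as the soft hyphen), so this sketch reports findings rather than deleting them.

```python
import unicodedata


def find_format_chars(s: str) -> list[tuple[int, str, str]]:
    """Hypothetical helper: report (index, code point, name) for 'Cf' characters."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"))
        for i, ch in enumerate(s)
        if unicodedata.category(ch) == "Cf"
    ]


print(find_format_chars("hello\u200bworld"))  # [(5, 'U+200B', 'ZERO WIDTH SPACE')]
print(find_format_chars("helloworld"))        # []
```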
unicodedata.normalize?
Visually identical characters can have different underlying representations (precomposed vs. combining forms). Normalization converts them into a canonical form so that equality is meaningful. Prefer NFKC for compatibility comparisons (it folds certain look-alikes) or NFC when preserving canonical text is essential. Normalize, then casefold if performing case-insensitive checks. Apply normalization once at boundaries to avoid repeated costs.
import unicodedata
s1 = "e\u0301" # "e" + combining acute
s2 = "é" # precomposed
print(s1 == s2) # False in raw form
n1 = unicodedata.normalize('NFKC', s1)
n2 = unicodedata.normalize('NFKC', s2)
print(n1 == n2) # True after normalization
print(n1.casefold() == n2.casefold()) # Robust, case-insensitive
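The example above behaves the same under NFC and NFKC. To show where the two forms actually differ, here is a short sketch using compatibility characters (a ligature and fullwidth letters), which only NFKC rewrites.

```python
import unicodedata

ligature = "\ufb01le"      # "file" written with the single-character "fi" ligature
plain = "file"

print(unicodedata.normalize("NFC", ligature) == plain)    # False: NFC keeps the ligature
print(unicodedata.normalize("NFKC", ligature) == plain)   # True: NFKC expands it to "fi"

fullwidth = "\uff21\uff22\uff23"   # fullwidth "ABC"
print(unicodedata.normalize("NFC", fullwidth) == "ABC")   # False
print(unicodedata.normalize("NFKC", fullwidth) == "ABC")  # True
```

NFKC discards these distinctions permanently, which is why NFC is the safer choice when the original text must be preserved.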
Practice | Description | Code Example | When to Use |
---|---|---|---|
Normalize at boundaries | Convert text to canonical form before comparison | unicodedata.normalize('NFKC', s).strip().casefold() | User-visible text; use NFC if data fidelity matters |
Standardize encoding | Pick one encoding (UTF-8) and enforce it on all I/O | raw_bytes.decode('utf-8') | All input/output boundaries; fail fast on decode errors |
Remove invisible characters | Strip ZWSP/ZWNJ/ZWJ/BOM before comparison | ''.join(ch for ch in s if ch not in "\u200b\u200c\u200d\ufeff") | When semantics allow removal of formatting characters |
Type consistency | Never mix bytes and str; decode/encode explicitly | b.decode('utf-8') == s | Always; decode bytes to strings before comparison |
Pre-normalize once | Transform text once, then compare many times | normalized = text.casefold(); if normalized == target | Hot loops, validators, ETL jobs |
Use casefold() over lower() | Prefer Unicode-aware case transformation | s1.casefold() == s2.casefold() | Case-insensitive Unicode comparisons |
Log both raw and canonical | Keep original and processed values for debugging | logger.debug(f"Raw: {raw}, Canonical: {canonical}") | During debugging to avoid over-sanitizing evidence |
Exact canonical forms | Compare exact forms, avoid locale-dependent behavior | unicodedata.normalize('NFC', s) | Security-sensitive paths |
casefold() over lower()
When you operate in multilingual contexts (German, Turkish, accented Latin), simple lower() can quietly fail. For example, German ß should compare equal to ss in a case-insensitive match, but lower() leaves ß unchanged. Python’s casefold() performs a more aggressive, Unicode-aware transformation (ß becomes ss) that aligns better with user expectations and search semantics. Note that casefold() is locale-independent: Turkish İ (capital dotted I) becomes "i\u0307" (i + combining dot above) under both lower() and casefold(), so Turkish-specific matching needs additional, locale-aware handling. In short: if your readers or data aren’t strictly ASCII, reach for casefold().
s1 = "straße"
s2 = "STRASSE"
print(s1.lower() == s2.lower()) # False: lower() keeps "ß"
print(s1.casefold() == s2.casefold()) # True: "straße" → "strasse"
s3 = "İstanbul" # Capital dotted I
s4 = "istanbul"
print(s3.lower() == s4) # False: "İ" → "i\u0307"
print(s3.casefold() == s4) # Also False: casefold() is locale-independent and yields "i\u0307" too
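Neither lower() nor casefold() applies Turkish casing rules, because both are locale-independent. The sketch below first confirms what the two methods return for İ, then shows one hypothetical workaround. The helper turkish_fold is an assumption for illustration, not a standard-library feature, and it is only appropriate when the text is known to be Turkish.

```python
dotted_capital_i = "\u0130"   # "İ"

# Both methods produce "i" followed by U+0307 (combining dot above).
print(dotted_capital_i.lower() == "i\u0307")     # True
print(dotted_capital_i.casefold() == "i\u0307")  # True


def turkish_fold(s: str) -> str:
    """Hypothetical helper: apply Turkish I rules first, then generic folding."""
    # Assumption: in Turkish, "İ" pairs with "i" and "I" pairs with
    # dotless "ı" (U+0131), so map those two letters before casefold().
    return s.replace("\u0130", "i").replace("I", "\u0131").casefold()


print(turkish_fold("\u0130stanbul") == "istanbul")               # True
print(turkish_fold("ISPARTA") == turkish_fold("\u0131sparta"))   # True
```

For production use, a locale-aware library is likely a better fit than a hand-written mapping like this one.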
Equality checks are O(n) in string length; the operator itself is fast, but repeated comparisons in hot loops, validators, or ETL jobs add up. The biggest win is to pre‑normalize once (strip, normalize, casefold) rather than transforming on every comparison. Also, compare canonical forms of data at the edges (I/O boundaries) so your core code paths operate on clean, comparable text.
import timeit
# Naïve: transform inside the loop
setup = "a='Straße'; b='STRASSE'"
naive = timeit.timeit("a.casefold()==b.casefold()", setup=setup, number=2_000_000)
# Better: precompute once, compare many times
setup2 = "a='Straße'.casefold(); b='STRASSE'.casefold()"
optimized = timeit.timeit("a==b", setup=setup2, number=2_000_000)
print({"naive_s": naive, "optimized_s": optimized}) # Optimized is typically faster
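The same "transform once" idea extends to lookups against a collection. As a rough sketch, assuming a simple canonical form of strip plus casefold (the helper name canonical and the sample values are made up for illustration), you can canonicalize the stored values a single time and keep them in a set:

```python
def canonical(s: str) -> str:
    """Hypothetical canonical form for this sketch: trim, then casefold."""
    return s.strip().casefold()


allowed_raw = ["Stra\u00dfe", "Apple ", "\u00c9COLE"]   # "Straße", "Apple ", "ÉCOLE"

# Pay the transformation cost once per stored value...
allowed = {canonical(name) for name in allowed_raw}

# ...and once per incoming value; each lookup is then a hash plus a plain == check.
candidates = ["STRASSE", "apple", "\u00e9cole", "banana"]
results = [canonical(c) in allowed for c in candidates]
print(results)  # [True, True, True, False]
```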
str and bytes are different types. Direct equality between them is always False, even if the visible characters match, and mixed encodings can create subtle bugs. Normalize at the boundary: decode incoming bytes to str or encode strings to bytes with a known charset (usually UTF-8). Keep comparisons within the same type and normalized form.
b = b"hello"
s = "hello"
print(b == s) # False: bytes vs str
print(b.decode('utf-8') == s) # True: decode to text first
payload = b"Stra\xc3\x9fe" # UTF-8 for "Straße"
text = payload.decode('utf-8')
print(text.casefold() == "strasse") # True
Production data often carries invisible artifacts (zero-width spaces, non-breaking spaces, mixed normalization forms). A robust comparison pipeline should trim, remove invisibles, normalize Unicode, and only then compare (often with casefold()). This is especially important for log de-duplication, idempotent APIs, or user-input reconciliation.
import unicodedata
ZW_INVISIBLES = "\u200b\u200c\u200d\ufeff" # ZWSP, ZWNJ, ZWJ, BOM
def canonicalize(s: str) -> str:
    s = ''.join(ch for ch in s if ch not in ZW_INVISIBLES)
    s = unicodedata.normalize('NFKC', s)
    return s.strip().casefold()
api_val = " Stra\u00dfe\u200b " # "Straße" + zero-width space + padding
user_val = "strasse"
print(canonicalize(api_val) == canonicalize(user_val)) # True
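For the de-duplication use case mentioned above, one possible pattern is to group raw values under their canonical key, so that the original spellings stay available for diagnostics. The canonicalize function is repeated here so the sketch runs on its own; the dedupe helper and the sample values are hypothetical.

```python
import unicodedata

ZW_INVISIBLES = "\u200b\u200c\u200d\ufeff"  # ZWSP, ZWNJ, ZWJ, BOM


def canonicalize(s: str) -> str:
    s = "".join(ch for ch in s if ch not in ZW_INVISIBLES)
    s = unicodedata.normalize("NFKC", s)
    return s.strip().casefold()


def dedupe(values: list[str]) -> dict[str, list[str]]:
    """Hypothetical helper: group raw values under their canonical key."""
    groups: dict[str, list[str]] = {}
    for raw in values:
        groups.setdefault(canonicalize(raw), []).append(raw)
    return groups


groups = dedupe([" Stra\u00dfe\u200b ", "strasse", "STRASSE", "Berlin"])
print(sorted(groups))            # ['berlin', 'strasse']
print(len(groups["strasse"]))    # 3 raw spellings kept under one key
```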
Unicode contains many “confusable” characters: letters and digits that look nearly identical but have different code points. Attackers exploit these for spoofing domains, usernames, and phishing (homograph attacks). For example, Cyrillic “а” (U+0430) and Latin “a” (U+0061) are visually indistinguishable, and “rn” resembles “m”. To guard against this, canonicalize and normalize strings with unicodedata.normalize and, because normalization alone does not merge look-alikes from different scripts, consider using Unicode Security Mechanisms (UTS #39) to detect confusables. Here’s a demo:
import unicodedata
# Cyrillic 'а' (U+0430) vs Latin 'a'
latin = "apple"
cyrillic = "аpple" # first 'a' is Cyrillic
print(latin == cyrillic) # False, but looks identical
print([ord(c) for c in cyrillic]) # [1072, 112, 112, 108, 101]
# 'rn' vs 'm'
print("rn" == "m") # False, but can look similar in some fonts
For critical systems, apply normalization and check for confusables using libraries like confusable_homoglyphs to reduce spoofing risks.
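If adding a dependency is not an option, a very rough first filter can be built from the standard library alone. The sketch below flags strings whose letters come from more than one script, using the first word of each character’s Unicode name as a stand-in for its script. This is an assumption-laden heuristic, not an implementation of UTS #39: it will not catch spoofs written entirely in one script, and it may flag legitimate mixed-script text.

```python
import unicodedata


def scripts_used(s: str) -> set[str]:
    """Rough heuristic: first word of each letter's Unicode name, e.g. 'LATIN'."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in s
        if ch.isalpha()
    }


def looks_mixed_script(s: str) -> bool:
    """Hypothetical check: True when letters from more than one script appear."""
    return len(scripts_used(s)) > 1


print(scripts_used("apple"))             # {'LATIN'}
print(looks_mixed_script("apple"))       # False
print(looks_mixed_script("\u0430pple"))  # True: Cyrillic 'а' followed by Latin letters
```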
While == is the fastest way to compare strings for equality, alternatives like re.fullmatch (regular expressions) or string methods (startswith, endswith) are slower and not intended for strict equality. Regex incurs extra overhead from pattern compilation (or a cache lookup) and matching logic, making it noticeably slower at scale. For bulk comparisons, normalize strings once and use direct ==. Here’s a timing demo:
import re, timeit, unicodedata
s1 = unicodedata.normalize('NFC', "Straße")
s2 = unicodedata.normalize('NFC', "STRAßE")
# Direct equality after normalization + casefold
def eq(): return s1.casefold() == s2.casefold()
# Regex fullmatch
def regex(): return re.fullmatch(re.escape(s1.casefold()), s2.casefold()) is not None
# startswith/endswith (not equality, but for illustration)
def starts(): return s2.casefold().startswith(s1.casefold())
print("eq:", timeit.timeit(eq, number=1_000_000))
print("regex:", timeit.timeit(regex, number=1_000_000))
print("startswith:", timeit.timeit(starts, number=1_000_000))
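Direct equality on normalized strings is typically much faster than the regex route and more robust for equality checks, especially in hot code paths or large datasets. Exact ratios depend on the Python version and the inputs, so measure on your own data.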
The following table summarizes the main methods for comparing strings in Python, with expert recommendations for robust, secure, and internationalized code. These guidelines are based on the official Python documentation, Unicode standards, and best practices from experienced Python developers.
Method | Summary & Expert Guidance |
---|---|
== (equality) | Recommended for most use cases. Compares the values of two strings for exact equality, character by character. This operation is fast (O(n)), highly optimized in CPython, and reliable for both ASCII and Unicode, provided you standardize inputs first (e.g., strip whitespace, normalize Unicode, and casefold if needed). Avoid locale-dependent assumptions. For critical applications (authentication, deduplication, data validation), always canonicalize inputs before using ==. |
is (identity) | Do not use for string equality. Checks if two variables point to the exact same object in memory, not if their contents are equal. May appear to work for short literals due to Python’s string interning, but this is an implementation detail and not guaranteed. Use is only for singleton checks (e.g., x is None). For string content comparison, always use ==. |
.lower() | Basic case-insensitive comparison for ASCII. Converts all cased characters to lowercase. Works for many English-only or simple cases, but does not apply full Unicode case folding; for example, German ß is left unchanged and so never matches ss. For multilingual or internationalized applications, .lower() is insufficient; prefer .casefold(). |
.casefold() | Best practice for Unicode-aware, case-insensitive comparison. .casefold() is more aggressive than .lower(), handling special cases such as ß→ss. It is locale-independent, so Turkish dotted/dotless I still needs locale-specific handling. For maximum reliability, combine with Unicode normalization (unicodedata.normalize) to ensure visually identical strings are truly equal. Slightly slower than .lower(), but essential for robust, global-ready code. |
regex (re.fullmatch) | Not recommended for equality checks. Regular expressions are powerful for pattern matching and validation, but are much slower and more complex than == for simple equality. Use regex only when you need to match a pattern or schema, not for direct string comparison. For large-scale or performance-critical code, normalize and use ==. |
Expert tip: For security-sensitive or user-facing applications, always normalize and casefold user input before comparison, and consider using libraries such as confusable_homoglyphs to detect visually similar but different Unicode characters.
== for strings in Python?
Yes, you can use the == operator to compare strings in Python. This operator checks whether the values of the strings are equal.
Example:
s1 = 'Hello'
s2 = 'Hello'
print(s1 == s2) # Output: True
You can check whether a string is equal to another string in Python by using the == operator. This operator checks whether the values of the strings are equal.
Example:
s1 = 'Hello'
s2 = 'Hello'
print(s1 == s2) # Output: True
You can check whether a variable is equal to a string in Python by using the == operator. This operator checks whether the value of the variable is equal to the string.
Example:
var = 'Hello'
print(var == 'Hello') # Output: True
== for strings?
Yes, you use the == operator for strings in Python. This operator checks whether the values of the strings are equal.
Example:
s1 = 'Hello'
s2 = 'Hello'
print(s1 == s2) # Output: True
You can check whether two strings are equal in Python by using the == operator. This operator checks whether the values of the strings are equal.
Example:
s1 = 'Hello'
s2 = 'Hello'
print(s1 == s2) # Output: True
The == operator checks whether the values of two objects are equal, while the is operator checks whether two names refer to the same object (i.e., the same memory location).
Example:
s1 = 'Hello'
s2 = 'Hello'
print(s1 == s2) # Output: True
print(s1 is s2) # Output: True, but this checks object identity, not equality
You can do a case-insensitive string comparison in Python by converting both strings to the same case with the lower() or upper() method. For non-ASCII text, casefold() is the more reliable choice.
Example:
s1 = 'Hello'
s2 = 'hello'
print(s1.lower() == s2.lower()) # Output: True
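Your string comparison might not be working for several reasons, such as differences in case, leading or trailing whitespace, invisible characters, or differing Unicode normalization forms.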
Example:
s1 = 'Hello'
s2 = 'hello'
print(s1 == s2) # Output: False, due to case sensitivity
Yes, you can compare a string to None in Python. Using == with None never raises an error; it simply returns False. To make the intent explicit, check for None with is None, or confirm that the value is a string before comparing.
Example:
s1 = 'Hello'
s2 = None
if isinstance(s2, str):
    print(s1 == s2)
else:
    print("s2 is not a string.")
You can compare multi-line strings in Python by using the == operator. However, ensure that the strings are formatted correctly, including any newline characters.
Example:
s1 = """Hello
World"""
s2 = """Hello
World"""
print(s1 == s2) # Output: True
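Newline characters are a frequent reason why two multi-line strings that look the same fail to match. As a minimal sketch, assuming the only difference is Unix versus Windows line endings, comparing the result of splitlines() sidesteps the issue:

```python
unix_text = "Hello\nWorld"
windows_text = "Hello\r\nWorld"

print(unix_text == windows_text)                            # False: "\n" vs "\r\n"
print(unix_text.splitlines() == windows_text.splitlines())  # True: compared line by line
```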
In this comprehensive guide, we have delved into the intricacies of string comparison in Python. We began by exploring the fundamental methods for checking string equality, including the use of the == operator and the __eq__() method. We also highlighted the importance of considering case sensitivity in string comparisons and demonstrated how to perform case-insensitive comparisons using the lower(), upper(), and casefold() methods.
Furthermore, we discussed the key differences between the == operator and the is operator, emphasizing the importance of using == for string comparison to ensure accurate results. Additionally, we touched upon common errors that can occur during string comparison, such as ignoring case sensitivity or comparing strings to non-string objects, and provided debugging tips to overcome these issues.
To further enhance your understanding of string manipulation in Python, we recommend exploring the following tutorials:
By reading these tutorials, you’ll gain a deeper understanding of string manipulation and comparison in Python, enabling you to write more efficient and effective code.
Java and Python Developer for 20+ years, Open Source Enthusiast, Founder of https://www.askpython.com/, https://www.linuxfordevices.com/, and JournalDev.com (acquired by DigitalOcean). Passionate about writing technical articles and sharing knowledge with others. Love Java, Python, Unix and related technologies. Follow my X @PankajWebDev
I help Businesses scale with AI x SEO x (authentic) Content that revives traffic and keeps leads flowing | 3,000,000+ Average monthly readers on Medium | Sr Technical Writer @ DigitalOcean | Ex-Cloud Consultant @ AMEX | Ex-Site Reliability Engineer(DevOps)@Nutanix
Building future-ready infrastructure with Linux, Cloud, and DevOps. Full Stack Developer & System Administrator @ DigitalOcean | GitHub Contributor | Passionate about Docker, PostgreSQL, and Open Source | Exploring NLP & AI-TensorFlow | Nailed over 50+ deployments across production environments.