Introduction
You tin comparison strings successful Python utilizing the equality (==) and comparison (<, >, !=, <=, >=) operators. There are nary typical methods to comparison 2 strings. The == usability is the correct prime for almost each drawstring comparison: it checks whether 2 strings incorporate the aforesaid characters, and it useful the aforesaid measurement everyplace Python runs.
Python drawstring comparison compares the characters successful some strings 1 by one. When different characters are found, Python looks astatine each character’s Unicode codification point, which is simply the number Unicode assigns to that character. The characteristic pinch the little number is considered to beryllium smaller. The charismatic Python connection reference describes it for illustration this: “Strings (instances of str) comparison lexicographically utilizing the numerical Unicode codification points (the consequence of the built-in usability ord()) of their characters.”
In this article, you’ll study really each of the operators useful erstwhile comparing strings. You’ll besides study really == differs from is, why is gives unreliable results for strings, and really to comparison strings while ignoring precocious and little lawsuit utilizing lower() and casefold(). Finally, you’ll screen the trickier cases: checking for substrings, measuring really akin 2 strings are pinch difflib, comparing multiline strings, and handling accented characters.
Key Takeaways:
- Use == and != to cheque whether 2 strings person the aforesaid value. This is the recommended measurement to comparison strings successful Python.
- Python compares strings characteristic by character, utilizing each character’s Unicode codification point, which is the aforesaid number the built-in ord() usability returns.
- Uppercase letters ever count arsenic smaller than lowercase letters because their codification points are lower: ord('A') is 65 while ord('a') is 97.
- The is usability checks whether 2 names constituent to the nonstop aforesaid entity successful memory, not whether the values match, truthful it should ne'er beryllium utilized to comparison drawstring values.
- Python sometimes reuses (interns) drawstring objects down the scenes, which makes is return True for immoderate adjacent strings and False for others. You cannot trust connected this behavior.
- To comparison strings while ignoring precocious and little case, telephone .casefold() connected some strings. It handles typical characters, specified arsenic the German ß, that .lower() misses.
- Use the successful usability to cheque whether 1 drawstring appears wrong another, and startswith() and endswith() to cheque really a drawstring originates aliases ends, arsenic recommended by PEP 8.
- Use difflib.SequenceMatcher to measurement really akin 2 strings are, and unicodedata.normalize() earlier comparing matter that whitethorn incorporate accented characters.
- The Python 2 cmp() usability was removed successful Python 3, truthful codification that relies connected it must usage the comparison operators instead.
Python equality and comparison operators
Declare the drawstring variable:
fruit1 = 'Apple'The pursuing array shows the results of comparing identical strings (Apple to Apple) utilizing different operators.
| Equality | print(fruit1 == 'Apple') | True |
| Not adjacent to | print(fruit1 != 'Apple') | False |
| Less than | print(fruit1 < 'Apple') | False |
| Greater than | print(fruit1 > 'Apple') | False |
| Less than aliases adjacent to | print(fruit1 <= 'Apple') | True |
| Greater than aliases adjacent to | print(fruit1 >= 'Apple') | True |
Both the strings are precisely the same. In different words, they’re equal. The equality usability and the different equal to operators return True.
If you comparison strings of different values, past you get the nonstop other output.
If you comparison strings that incorporate the aforesaid substring, specified arsenic Apple and ApplePie, past the longer drawstring is considered larger. You tin publication much astir really these operators behave pinch different information types successful this Python operators reference.
Comparing personification input to measure equality utilizing operators
This illustration codification takes and compares input from the user. Then the programme uses the results of the comparison to people further accusation astir the alphabetical bid of the input strings. In this case, the programme assumes that the smaller drawstring comes earlier the larger string.
fruit1 = input('Enter the sanction of the first fruit:\n') fruit2 = input('Enter the sanction of the 2nd fruit:\n') if fruit1 < fruit2: print(fruit1 + " comes earlier " + fruit2 + " successful the dictionary.") elif fruit1 > fruit2: print(fruit1 + " comes aft " + fruit2 + " successful the dictionary.") else: print(fruit1 + " and " + fruit2 + " are the same.")Here’s an illustration of the imaginable output erstwhile you participate different values:
Output
Enter the sanction of the first fruit: Apple Enter the sanction of the 2nd fruit: Banana Apple comes earlier Banana in the dictionary.Here’s an illustration of the imaginable output erstwhile you participate identical strings:
Output
Enter the sanction of the first fruit: Orange Enter the sanction of the 2nd fruit: Orange Orange and Orange are the same.Note: For this illustration to work, some strings must usage accordant casing throughout. For example, if the personification enters the strings pome and Banana, past the output will beryllium pome comes aft Banana successful the dictionary, which is incorrect.
This discrepancy occurs because the Unicode codification constituent values of uppercase letters are ever smaller than the Unicode codification constituent values of lowercase letters: the worth of a is 97 and the worth of B is 66. You tin trial this yourself by utilizing the ord() function to people the Unicode codification constituent worth of the characters.
How Python orders strings: lexicographic comparison
Python compares strings lexicographically, which is simply a general measurement of saying “like a dictionary orders words”. Python walks some strings from near to right, comparing 1 characteristic astatine a time. The first position wherever the characters disagree decides the result, based connected the Unicode codification constituent of each character. If 1 drawstring runs retired of characters first and each earlier characters matched, the shorter drawstring is the smaller one.
You tin spot the numbers down the ordering by passing characters to ord():
print(ord('A'), ord('Z'), ord('a'), ord('z'))Running this prints the pursuing output:
Output
65 90 97 122Uppercase A done Z usage codification points 65 done 90, while lowercase a done z usage 97 done 122. Because each uppercase missive has a little codification constituent than each lowercase letter, "Zebra" < "apple" evaluates to True moreover though “apple” comes first alphabetically. The aforesaid norm explains really Python sorts lists of strings:
print("apple" < "banana") print("apple" == "Apple") print("Zebra" < "apple") print(sorted(["banana", "Apple", "cherry"]))Running this prints the pursuing output:
Output
True False True ['Apple', 'banana', 'cherry']The capitalized Apple sorts earlier the lowercase strings for the aforesaid codification constituent reason. If you request existent alphabetical sorting that ignores case, benignant pinch a cardinal usability specified arsenic sorted(words, key=str.casefold). For matter successful languages pinch accented characters, the modular room locale module tin benignant matter utilizing the rules of a circumstantial language, which gives much earthy results than plain codification constituent ordering.
Understanding == vs is erstwhile comparing strings
Use == erstwhile you want to cognize whether 2 strings incorporate the aforesaid characters, and prevention is for checks for illustration x is None. The == usability compares values, while is compares identity: it returns True only erstwhile some names constituent to the nonstop aforesaid entity successful memory. Two strings tin clasp the aforesaid matter while still being 2 abstracted objects.
The pursuing illustration shows the difference:
str1 = "hello" str2 = "hello" print(str1 == str2) print(str1 is str2) str3 = "".join(["he", "llo"]) print(str3 == str1) print(str3 is str1)Running this prints the pursuing output:
Output
True True True FalseAll 3 variables clasp the worth hello, truthful == returns True successful some comparisons. But is returns True for the first 2 strings and False for the drawstring built pinch join(). The adjacent conception explains why.
CPython drawstring interning and why it creates confusion
CPython, the modular type of Python that astir group install, saves representation by reusing immoderate drawstring objects alternatively of creating copies. This is called drawstring interning. Strings written straight successful your codification (string literals), and short strings that look for illustration adaptable names, are often interned. That is why str1 and str2 supra extremity up pointing astatine the aforesaid entity and is happens to return True.
Strings created while the programme runs, specified arsenic the consequence of join(), joining variables pinch +, aliases personification input, are usually not interned. That is why str3 is str1 returns False moreover though the values are equal.
You cannot trust connected these rules. They are soul specifications of CPython: they person changed betwixt Python versions and activity otherwise successful different versions of Python specified arsenic PyPy. Code that uses is for drawstring values whitethorn walk your tests pinch short literals and past neglect later pinch existent data. This is precisely why the shape is simply a predominant root of beginner bugs and a recurring mobility connected Stack Overflow.
PEP 8 guidance connected utilizing is
Python’s charismatic style guideline says is belongs to typical one-of-a-kind objects for illustration None. PEP 8 states: “Comparisons to singletons for illustration None should ever beryllium done pinch is aliases is not, ne'er the equality operators.” In different words, worth comparisons, including each drawstring comparisons, beryllium to == and !=. Following this norm keeps your codification correct nary matter really interning behaves.
Performance of drawstring comparison methods
For mundane code, the velocity quality betwixt drawstring comparison methods is tiny, truthful take the method that is correct alternatively than the 1 that looks fastest successful a benchmark. Under the hood, the == usability successful CPython first checks whether some sides are the aforesaid object, past compares their lengths, and only past compares the existent characters. This intends moreover agelong adjacent strings comparison quickly.
The pursuing benchmark uses the timeit module to comparison 3 approaches connected 2 adjacent 1,000-character strings:
import timeit str1 = "a" * 1000 str2 = "a" * 1000 equality_time = timeit.timeit(lambda: str1 == str2, number=100000) identity_time = timeit.timeit(lambda: str1 is str2, number=100000) casefold_time = timeit.timeit(lambda: str1.casefold() == str2.casefold(), number=100000) print(f"Equality (==): {equality_time:.4f} seconds") print(f"Identity (is): {identity_time:.4f} seconds") print(f"Casefolded equality: {casefold_time:.4f} seconds")Running this prints output akin to the pursuing (exact timings alteration by instrumentality and Python version):
Output
Equality (==): 0.0025 seconds Identity (is): 0.0024 seconds Casefolded equality: 0.0621 secondsThe is cheque is somewhat faster than == because it only compares representation addresses, but it answers a different mobility and gives incorrect results for adjacent strings that are abstracted objects, truthful the mini redeeming is ne'er worthy it. The existent costs successful this benchmark is the case-insensitive comparison: calling casefold() creates 2 caller strings connected each comparison, making it astir 25 times slower here. If you comparison the aforesaid strings galore times while ignoring case, casefold them once, shop the result, and comparison the stored values.
Best practices for comparing case-insensitive and Unicode strings
When comparing strings, you request to deliberation astir 2 things: lawsuit (the quality betwixt uppercase and lowercase characters) and language-specific characters specified arsenic accented letters. The pursuing champion practices support drawstring comparisons meticulous and efficient.
Case-insensitive comparisons pinch lower()
To comparison strings while ignoring precocious and little case, usage the .lower() method to person some strings to lowercase earlier comparing them. This attack is elemental and useful good for astir cases. Here’s an example:
str1 = "Hello World" str2 = "HELLO WORLD" print(str1.lower() == str2.lower())Running this prints the pursuing output:
Output
TrueHowever, .lower() whitethorn not beryllium capable for languages that person much analyzable lawsuit rules, specified arsenic German aliases Turkish.
Unicode-safe comparisons pinch casefold()
For much precocious lawsuit handling, usage the .casefold() method. The Python documentation describes it this way: “Casefolding is akin to lowercasing but much fierce because it is intended to region each lawsuit distinctions successful a string.” The classical illustration is the German lowercase missive ß, which is balanced to ss erstwhile lawsuit is ignored. The .lower() method leaves ß unchanged, while .casefold() converts it to ss:
str1 = "Straße" str2 = "STRASSE" print(str1.lower() == str2.lower()) print(str1.casefold() == str2.casefold())Running this prints the pursuing output:
Output
False TrueThe .lower() comparison fails because "Straße".lower() is "straße" while "STRASSE".lower() is "strasse". The .casefold() comparison succeeds because some strings casefold to "strasse". If your app handles matter successful galore languages, for illustration .casefold(); the Python archiving states that “casefolded strings whitethorn beryllium utilized for caseless matching.”
Unicode normalization
When moving pinch world text, you request to grip typical characters and accents correctly. This includes characters for illustration umlauts (ü), accents (é), and different accent marks. The tricky portion is that the aforesaid visible characteristic tin beryllium stored successful different ways: é tin beryllium a azygous codification constituent (U+00E9) aliases the plain missive e followed by a abstracted accent people (U+0301). They look identical connected screen, but the == usability sees them arsenic different strings.
To comparison specified strings correctly, first person some to the aforesaid modular shape (this is called Unicode normalization) pinch the unicodedata module:
import unicodedata str1 = "caf\u00e9" # "café" arsenic a azygous precomposed codification point str2 = "cafe\u0301" # "café" arsenic "e" positive a combining accent print(str1 == str2) nfc1 = unicodedata.normalize("NFC", str1) nfc2 = unicodedata.normalize("NFC", str2) print(nfc1 == nfc2)Running this prints the pursuing output:
Output
False TrueBeyond normalization, 2 much strategies thief pinch world text:
- Language-aware comparison: Use comparison functions aliases libraries that understand the circumstantial connection being used. These tin grip language-specific rules for sorting and comparison.
- Preprocessing: Clean up strings earlier comparing them, depending connected what your exertion needs. This tin see removing accent marks aliases converting accented characters to their plain equivalents.
Here’s an illustration that removes an accent earlier comparing:
str7 = "café" # String pinch accent str8 = "cafe" # String without accent # Preprocess to region diacritical marks preprocessed_str7 = str7.replace('é', 'e') print(preprocessed_str7 == str8)Running this prints the pursuing output:
Output
TrueBy pursuing these champion practices, you tin guarantee that your drawstring comparisons are accurate, efficient, and culturally sensitive, moreover erstwhile moving pinch ample strings and world text.
Comparing parts of strings: in, startswith(), and endswith()
Sometimes you don’t request to comparison full strings. You only request to cheque whether 1 drawstring appears inside, astatine the commencement of, aliases astatine the extremity of another. Use the successful usability to cheque whether a drawstring contains different string, and the startswith() and endswith() methods to cheque really a drawstring originates aliases ends:
message = "Hello, World!" print("World" in message) print(message.startswith("Hello")) print(message.endswith("World!"))Running this prints the pursuing output:
Output
True True TrueAll 3 checks attraction astir case, truthful harvester them pinch casefold() erstwhile lawsuit should not matter, for illustration "world" successful message.casefold(). PEP 8 recommends these methods complete cutting the drawstring up pinch slicing: “Use ''.startswith() and ''.endswith() alternatively of drawstring slicing to cheque for prefixes aliases suffixes.” They are clearer and debar mini indexing mistakes. For much analyzable patterns, specified arsenic “starts pinch a digit followed by a dash”, usage regular expressions from the modular room re module instead.
Fuzzy drawstring comparison pinch difflib
Sometimes you request to cognize really akin 2 strings are, not conscionable whether they are precisely equal. For that, usage the difflib module from the modular library. Its SequenceMatcher people calculates a similarity people betwixt 0 and 1, wherever 1 intends the strings are identical:
from difflib import SequenceMatcher str1 = "Hello, World!" str2 = "Hello, Universe!" print(SequenceMatcher(None, str1, str2).ratio())Running this prints the pursuing output:
Output
0.6206896551724138A people of astir 0.62 tells you the 2 strings are mostly akin but not the same. This method is useful for spotting near-duplicate records, suggesting corrections for misspelled input, aliases matching user-entered names against a known list. Pick a cutoff worth that fits your information (0.8 is simply a communal starting point) and dainty thing supra it arsenic a apt match. For nonstop equality checks, enactment pinch ==; SequenceMatcher is overmuch slower and is meant for measuring similarity, not testing equality.
Comparing multiline strings
Multiline strings are compared precisely for illustration single-line strings: == looks astatine each character, and that intends spaces, newline characters, and indentation each count. Two strings that look the aforesaid erstwhile printed tin still beryllium unequal because 1 uses Unix-style \n statement endings and the different uses Windows-style \r\n:
text1 = "first line\nsecond line" text2 = "first line\r\nsecond line" print(text1 == text2) print(text1.splitlines() == text2.splitlines())Running this prints the pursuing output:
Output
False TrueThe nonstop comparison fails because of the invisible \r character, while splitlines() splits connected immoderate benignant of statement ending and produces adjacent lists. When formatting differences should beryllium ignored, cleanable some strings up earlier comparing: usage strip() to region spaces astatine the commencement and end, splitlines() to disregard line-ending differences, aliases " ".join(text.split()) to compression each repeated whitespace down to azygous spaces. The correct cleanup depends connected whether the whitespace carries meaning successful your data.
Handling Unicode, ASCII, and byte strings successful Python
Python 3 has 2 different types for text-like data, str and bytes, and knowing which 1 you are comparing helps you debar galore bugs. The pursuing subsections locomotion done each one.
Unicode strings
Unicode strings are the modular measurement to correspond matter successful Python. They are sequences of Unicode characters, which are represented by the str type. Unicode strings are the default drawstring type successful Python 3. They tin incorporate characters from immoderate language, including non-ASCII characters for illustration accents, umlauts, and non-Latin scripts.
Here’s an illustration of creating a Unicode drawstring successful Python:
unicode_str = "Hëllo, Wørld!" print(unicode_str)Running this prints the pursuing output:
Output
Hëllo, Wørld!Notice really the drawstring contains non-ASCII characters for illustration the accented ‘e’ (ë) and the slashed ‘o’ (ø). These characters are correctly represented and tin beryllium manipulated for illustration immoderate different drawstring successful Python.
ASCII strings
ASCII strings are a subset of Unicode strings that only incorporate characters from the ASCII characteristic set. ASCII strings are typically utilized erstwhile moving pinch bequest systems aliases erstwhile there’s a request to guarantee compatibility pinch systems that only support ASCII characters.
In Python, ASCII strings are besides represented by the str type, but they are constricted to characters pinch ASCII codification points (0-127). Here’s an illustration of creating an ASCII drawstring successful Python:
ascii_str = "Hello, World!" print(ascii_str)Running this prints the pursuing output:
Output
Hello, World!Notice really the drawstring only contains characters from the ASCII characteristic set.
Byte strings
Byte strings, connected the different hand, are sequences of bytes, which are represented by the bytes type successful Python. Byte strings are typically utilized erstwhile moving pinch binary data, specified arsenic reference aliases penning files, web communication, aliases cryptographic operations.
Here’s an illustration of creating a byte drawstring successful Python:
byte_str = b"Hello, World!" print(byte_str)Running this prints the pursuing output:
Output
b'Hello, World!'Notice the b prefix earlier the drawstring literal, which indicates that it’s a byte string. Byte strings tin beryllium converted to Unicode strings utilizing the decode() method, and vice versa utilizing the encode() method.
For example, to person a Unicode drawstring to a byte string:
unicode_str = "Hëllo, Wørld!" byte_str = unicode_str.encode('utf-8') print(byte_str)Running this prints the pursuing output:
Output
b'H\xc3\xabllo, W\xc3\xb8rld!'And to person a byte drawstring backmost to a Unicode string:
byte_str = b'H\xc3\xabllo, W\xc3\xb8rld!' unicode_str = byte_str.decode('utf-8') print(unicode_str)Running this prints the pursuing output:
Output
Hëllo, Wørld!By knowing the differences betwixt Unicode, ASCII, and byte strings successful Python, you tin efficaciously activity pinch various types of matter information and guarantee that your applications grip matter correctly, sloppy of the connection aliases characteristic group used. If you want a refresher connected drawstring fundamentals first, spot this article connected working pinch strings successful Python.
How Python drawstring comparison compares to different languages
If you activity successful much than 1 programming language, Python’s attack is simple: 1 operator, ==, compares drawstring values, and location is nary easy measurement to accidentally comparison representation addresses erstwhile you meant to comparison values. The array beneath shows really the aforesaid task looks successful different languages.
| Python | a == b | a is b | is checks entity personality and should not beryllium utilized for worth comparison |
| JavaScript | a === b | (Not abstracted for primitives) | === compares primitive drawstring values without type coercion; == coerces types first |
| Java | a.equals(b) | a == b | == connected String objects compares references—a classical root of bugs |
| C | strcmp(a, b) == 0 | a == b | == compares pointers; strcmp() returns negative, zero, aliases positive |
Java’s == trap is the closest lucifer to misusing is successful Python: some comparison entity references, and some sometimes look to activity because the connection reuses drawstring objects down the scenes, past neglect unpredictably. C’s strcmp() return style (negative, zero, aliases positive) is besides what Python 2’s removed cmp() usability copied. Keeping these differences successful mind helps you debar carrying habits from 1 connection into another.
Which method should you use?
The correct comparison method depends connected the mobility you are asking astir the strings. This array brings together the recommendations from this tutorial:
| ==, != | Checking if 2 strings are adjacent aliases not equal | The default prime for comparing strings |
| <, >, <=, >= | Ordering strings | Based connected Unicode codification points; uppercase sorts earlier lowercase |
| is, is not | Checking against typical objects for illustration None | Never usage for drawstring worth comparison |
| casefold() + == | Equality while ignoring precocious and little case | Handles each languages; for illustration complete lower() for world text |
| in | Checking if a drawstring contains different string | Cares astir case; harvester pinch casefold() if needed |
| startswith(), endswith() | Checking really a drawstring originates aliases ends | Recommended by PEP 8 complete slicing |
| difflib.SequenceMatcher | Measuring really akin 2 strings are | Returns a people betwixt 0 and 1; slower than == |
| unicodedata.normalize() + == | Text pinch accented characters | Convert some sides to the aforesaid shape (NFC aliases NFD) first |
One much bully wont is worthy adopting: erstwhile a worth mightiness beryllium None, cheque worth is not None earlier comparing it to a string, for illustration if worth is not None and worth == "expected". Comparing None to a drawstring pinch == is safe (it simply returns False), but a clear None cheque shows your intent and prevents errors from calls for illustration value.casefold(), which raise an AttributeError erstwhile worth is None.
FAQs
1. How do I comparison 2 strings successful Python?
The equality usability == is utilized to comparison 2 strings successful Python. It checks if the values of the strings are equal, characteristic by character. This intends that the comparison is done based connected the existent characters successful the strings, not their representation locations. For example:
str1 = "Hello, World!" str2 = "Hello, World!" print(str1 == str2)This prints True because some strings incorporate precisely the aforesaid characters.
2. What is the quality betwixt == and is successful Python drawstring comparison?
The equality usability == is utilized to comparison the values of 2 strings, while the personality usability is checks if some strings are the aforesaid entity successful memory. This favoritism is important because 2 strings tin person the aforesaid worth but beryllium different objects successful memory. For example:
str1 = "".join(["Hello, ", "World!"]) str2 = "Hello, World!" print(str1 == str2) print(str1 is str2)This prints True and past False: the strings person the aforesaid worth but are different objects successful memory, truthful == returns True while is returns False. Always usage == for drawstring worth comparison.
3. How tin I comparison strings case-insensitively successful Python?
To comparison strings while ignoring precocious and little case, you tin usage the .lower() method to person some strings to lowercase earlier comparing, aliases the much thorough .casefold() method for world text. This way, the lawsuit of the characters does not impact the result. For example:
str1 = "Hello, World!" str2 = "HELLO, WORLD!" print(str1.lower() == str2.lower())This prints True. For strings that whitethorn incorporate non-ASCII characters, for illustration str1.casefold() == str2.casefold().
4. What is the champion measurement to cheque if a drawstring starts aliases ends pinch a circumstantial substring?
You tin usage the .startswith() and .endswith() methods to cheque if a drawstring starts aliases ends pinch a circumstantial substring. These methods return True if the drawstring starts aliases ends pinch the specified substring, and False otherwise. For example:
str1 = "Hello, World!" print(str1.startswith("Hello")) print(str1.endswith("World!"))Both checks people True. PEP 8 recommends these methods complete drawstring slicing because they are cleaner and little correction prone.
5. How do I comparison aggregate strings astatine once?
You tin usage the == usability to comparison aggregate strings astatine once. This tin beryllium done by chaining aggregate == operators together. For example:
str1 = "Hello, World!" str2 = "Hello, World!" str3 = "Hello, World!" print(str1 == str2 == str3)This prints True only if each drawstring successful the concatenation is adjacent to the adjacent one. Chained comparison is balanced to str1 == str2 and str2 == str3.
6. What are the capacity differences betwixt different drawstring comparison methods?
The velocity differences betwixt drawstring comparison methods successful Python are very mini for astir usage cases. The is usability is somewhat faster than == because it only compares representation addresses alternatively than drawstring contents, but it answers a different mobility and must not beryllium utilized for worth comparison. The slow operations are the ones that create caller strings, specified arsenic calling .lower() aliases .casefold() connected each comparison. If you comparison the aforesaid values galore times while ignoring case, casefold them erstwhile and reuse the result. The .startswith() and .endswith() methods are besides faster and clearer than slicing the drawstring and comparing the pieces.
7. Can I comparison strings successful different encodings successful Python?
Yes, you tin comparison strings that came from different encodings. However, you must first person some byte strings to regular str matter utilizing the .decode() method truthful you are comparing matter pinch text. For example:
str1 = b"Hello, World!".decode('utf-8') str2 = b"Hello, World!".decode('utf-8') print(str1 == str2)This prints True. Comparing a bytes entity straight to a str entity ever returns False successful Python 3, truthful decode first and past compare.
8. How do I cheque if 2 strings are astir identical aliases similar?
You tin usage the difflib module to cheque if 2 strings are astir identical aliases similar. The difflib.SequenceMatcher people provides a measurement to measurement the similarity betwixt 2 sequences, including strings. For example:
from difflib import SequenceMatcher str1 = "Hello, World!" str2 = "Hello, Universe!" print(SequenceMatcher(None, str1, str2).ratio())This prints 0.6206896551724138. The ratio() method returns a measurement of the sequences’ similarity arsenic a float successful the scope [0, 1]. A ratio of 1 intends the sequences are identical, and a ratio of 0 intends they person thing successful common.
9. What does lexicographic drawstring comparison mean successful Python?
Lexicographic comparison intends Python compares strings characteristic by character, for illustration a dictionary orders words, utilizing the Unicode codification constituent worth of each characteristic (the aforesaid worth returned by ord()). The first characteristic that differs decides the result, truthful "apple" < "banana" evaluates to True because a has a little codification constituent than b. Uppercase letters count arsenic smaller than lowercase letters because their codification points are lower, which is why "Zebra" < "apple" is besides True.
10. How do I cheque if a drawstring contains different drawstring successful Python?
Use the successful operator: "world" successful "hello world" returns True. The cheque cares astir case, truthful casefold some sides erstwhile lawsuit should not matter. To cheque really a drawstring originates aliases ends, usage str.startswith() and str.endswith(), and to find the position of the substring usage str.find() aliases str.index().
11. What is the quality betwixt lower() and casefold() for drawstring comparison?
The lower() method converts uppercase letters to lowercase utilizing elemental rules, while casefold() goes further and handles typical characters that lower() misses. The quality shows up pinch characters specified arsenic the German ß: "Straße".lower() keeps the ß, but "Straße".casefold() converts it to "strasse", which correctly matches "STRASSE".casefold(). If your app handles matter successful galore languages, for illustration casefold().
Conclusion
In this article, you learned really to comparison strings successful Python utilizing the equality (==) and comparison (<, >, !=, <=, >=) operators. You besides learned why is checks personality alternatively than worth and should beryllium saved for checks for illustration x is None. Finally, you covered the harder cases: ignoring lawsuit pinch casefold(), checking for substrings pinch in, startswith(), and endswith(), measuring similarity pinch difflib, and comparing accented matter safely pinch Unicode normalization. This is simply a basal accomplishment successful Python programming, and mastering drawstring comparison is basal for moving pinch matter data.
To further grow your knowledge of Python strings, we urge exploring the pursuing tutorials:
- How to Check if Two Strings Are Equal successful Python (With Examples)
- How To Check If a String Contains Another String successful Python
- Python Find String successful List: Complete Guide pinch Examples
- Python String Functions: Complete Guide pinch Examples
- An Introduction to Working pinch Strings successful Python 3
This activity is licensed nether a Creative Commons Attribution-NonCommercial- ShareAlike 4.0 International License.
English (US) ·
Indonesian (ID) ·