Built-in Types: Str, List, Dict, and Set
TL;DR
Python’s built-in types (str, list, dict, set) come with dozens of methods for manipulation, searching, and transformation. Dictionaries have preserved insertion order as a language guarantee since Python 3.7, and lesser-known features such as dict.setdefault(), str.removeprefix(), and set operations on dictionary views can simplify common patterns.
Interesting!
Dictionary views support set operations. You can apply intersection, union, and difference directly to dict.keys() to find the keys two dictionaries share, the keys in either, or the keys unique to one, without converting to sets first. Each operation returns an ordinary set.
python code snippet start
d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'b': 4, 'c': 5, 'd': 6}
# Common keys
d1.keys() & d2.keys() # {'b', 'c'}
# All keys
d1.keys() | d2.keys() # {'a', 'b', 'c', 'd'}
# Keys only in d1
d1.keys() - d2.keys() # {'a'}
python code snippet end
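The same idea extends to dict.items(), which is also set-like as long as every value is hashable. Here is a minimal sketch that finds which settings changed between two snapshots. The dictionaries old and new are made-up data for illustration, not taken from any real project.

```python
# Hypothetical configuration snapshots, for illustration only
old = {"host": "localhost", "port": 8080, "debug": True}
new = {"host": "localhost", "port": 9090, "debug": True}

# (key, value) pairs present in new but not in old
changed = dict(new.items() - old.items())

# (key, value) pairs that are identical in both snapshots
unchanged = dict(old.items() & new.items())

print(changed)    # {'port': 9090}
print(unchanged)  # the two unchanged pairs (order may vary)
```

If any value is unhashable (a list, for example), these operations on items() raise TypeError, so this approach fits flat dictionaries best.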
String Power Methods
Beyond basic slicing and concatenation, strings have specialized methods for common tasks:
python code snippet start
# Remove prefix/suffix (Python 3.9+)
url = "https://example.com"
url.removeprefix("https://") # "example.com"
filename = "report.pdf"
filename.removesuffix(".pdf") # "report"
# Partition - split into exactly 3 parts
email = "user@example.com"
user, sep, domain = email.partition("@")
# user='user', sep='@', domain='example.com'
# Split on line boundaries (handles \n, \r\n, \r)
text = "line1\nline2\r\nline3"
text.splitlines() # ['line1', 'line2', 'line3']
# Zero-fill numbers for fixed-width formatting
"42".zfill(5) # "00042"
"-42".zfill(5) # "-0042" - sign aware
# Center/justify text
"Title".center(20, "=") # "=======Title========"
"Left".ljust(10, ".") # "Left......"
python code snippet end
String formatting has evolved through several generations. F-strings arrived in Python 3.6, and since Python 3.8 they also support the = debug specifier:
python code snippet start
name = "Alice"
age = 30
# Debug format - includes variable name
print(f"{name=}, {age=}") # name='Alice', age=30
python code snippet end
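The = specifier can be combined with a format spec or a conversion flag, and it keeps any whitespace written around the expression. A short sketch, assuming Python 3.8 or newer; the variable names are illustrative:

```python
name = "Alice"
price = 3.14159
a, b = 2, 3

# With a format spec, the value is formatted instead of repr()'d
print(f"{price=:.2f}")  # price=3.14

# !s forces str() instead of the default repr()
print(f"{name=!s}")     # name=Alice

# Whitespace around '=' is preserved, and expressions are allowed
print(f"{a + b = }")    # a + b = 5
```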
Unicode Strings
Python 3 strings are Unicode by default, with specialized methods for international text handling.
Case folding for comparisons: The casefold() method applies Unicode case folding for case-insensitive string matching. It is more aggressive than lower(): besides ordinary lowercasing, it applies the full case mappings defined in the Unicode standard, such as expanding German ß to “ss” and folding Greek final sigma ς to “σ”. Use casefold() when comparing user input, search terms, or any international text where case should be ignored.
python code snippet start
# casefold() handles Unicode case folding rules
"Maße".casefold() # "masse" (ß becomes ss)
"Maße".lower() # "maße" (ß unchanged)
# Critical for case-insensitive matching
"Straße".casefold() == "STRASSE".casefold() # True
"Straße".lower() == "STRASSE".lower() # False
# Greek example
"ΣΊΣΥΦΟΣ".casefold() == "σίσυφος".casefold() # True
python code snippet end
Encoding and decoding: Strings encode to bytes, bytes decode to strings.
python code snippet start
# Encode to different formats
text = "Hello, 世界"
text.encode('utf-8') # b'Hello, \xe4\xb8\x96\xe7\x95\x8c'
text.encode('utf-16') # b'\xff\xfeH\x00e\x00l\x00l\x00o\x00...' (BOM first; byte order follows the platform)
# Decode with error handling
data = b'caf\xe9'
data.decode('latin-1') # "café"
data.decode('utf-8', errors='ignore') # "caf" - skip invalid
data.decode('utf-8', errors='replace') # "caf�" - replacement char
python code snippet end
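Error handlers work in the encoding direction too. This sketch encodes a string containing é to ASCII with several of the standard handlers. The variable names are illustrative.

```python
text = "caf\u00e9"  # "café" with a precomposed é (U+00E9)

ascii_replace = text.encode("ascii", errors="replace")           # b'caf?'
ascii_ignore = text.encode("ascii", errors="ignore")             # b'caf'
ascii_escape = text.encode("ascii", errors="backslashreplace")   # b'caf\\xe9'
ascii_xml = text.encode("ascii", errors="xmlcharrefreplace")     # b'caf&#233;'

# The default handler ("strict") raises instead of losing data
try:
    text.encode("ascii")
except UnicodeEncodeError as exc:
    failed_at = exc.start  # index of the first unencodable character: 3
```

Which handler fits depends on whether losing or rewriting characters is acceptable. The escaping handlers keep enough information to tell what the original character was.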
Unicode character access: Every character has a code point accessible via ord() and chr().
python code snippet start
# Character to code point
ord('A') # 65
ord('€') # 8364
ord('🐍') # 128013
# Code point to character
chr(65) # 'A'
chr(128013) # '🐍'
# Useful for character ranges
"".join(chr(i) for i in range(0x1F600, 0x1F610)) # Emoji range
python code snippet end
Reference: Unicode HOWTO - Python Documentation
List Methods and Pitfalls
Lists provide in-place modification methods. Key distinction: methods that only mutate the list, such as sort(), reverse(), append(), and extend(), return None, not the modified list.
python code snippet start
numbers = [3, 1, 4, 1, 5]
# In-place operations return None
result = numbers.sort() # result is None
print(numbers) # [1, 1, 3, 4, 5]
# Use sorted() for a new list
numbers = [3, 1, 4, 1, 5]
result = sorted(numbers) # result is [1, 1, 3, 4, 5]
python code snippet end
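Both sorted() and list.sort() accept key and reverse arguments, and both sorts are stable. A small sketch with an illustrative word list:

```python
words = ["banana", "Cherry", "apple"]

# Default ordering compares code points, so uppercase sorts first
by_default = sorted(words)                        # ['Cherry', 'apple', 'banana']

# key= transforms each element for comparison only
ignoring_case = sorted(words, key=str.casefold)   # ['apple', 'banana', 'Cherry']

# Stable sort: 'banana' and 'Cherry' tie on length and keep their input order
longest_first = sorted(words, key=len, reverse=True)  # ['banana', 'Cherry', 'apple']

print(words)  # original list is untouched: ['banana', 'Cherry', 'apple']
```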
Common pitfall with list repetition:
python code snippet start
# Wrong - all sublists are the same object
matrix = [[]] * 3
matrix[0].append(1)
# Result: [[1], [1], [1]]
# Correct - create separate lists
matrix = [[] for _ in range(3)]
matrix[0].append(1)
# Result: [[1], [], []]
python code snippet end
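The is operator makes the aliasing visible. In this sketch (rows and cols are arbitrary illustrative sizes), repeating an inner list copies references, while repeating immutable values such as 0 is harmless:

```python
rows, cols = 3, 2

# [0] * cols is fine: integers are immutable.
# The outer * rows repeats a reference to the SAME inner list.
shared = [[0] * cols] * rows
separate = [[0] * cols for _ in range(rows)]

print(shared[0] is shared[1])      # True  - one list, three references
print(separate[0] is separate[1])  # False - three distinct lists

shared[0][0] = 9
separate[0][0] = 9
print(shared)    # [[9, 0], [9, 0], [9, 0]]
print(separate)  # [[9, 0], [0, 0], [0, 0]]
```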
Dictionary Convenience Methods
The setdefault() method combines get-or-create logic in one call:
python code snippet start
# Without setdefault
counts = {}
for word in ["apple", "banana", "apple"]:
    if word not in counts:
        counts[word] = 0
    counts[word] += 1
# With setdefault
counts = {}
for word in ["apple", "banana", "apple"]:
    counts[word] = counts.setdefault(word, 0) + 1
python code snippet end
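setdefault() is arguably most useful when the default is a mutable container, because the returned object can be modified in the same expression. A sketch that groups an illustrative word list by first letter:

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

groups = {}
for word in words:
    # Returns the existing list for this letter, or inserts and returns a new one
    groups.setdefault(word[0], []).append(word)

print(groups)
# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```

Note that the default argument (here a new empty list) is built on every call, even when the key already exists.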
Dictionary merging became cleaner in Python 3.9:
python code snippet start
defaults = {'color': 'blue', 'size': 'medium'}
custom = {'size': 'large', 'style': 'bold'}
# Merge operator (newer wins)
merged = defaults | custom
# {'color': 'blue', 'size': 'large', 'style': 'bold'}
# In-place merge
defaults |= custom
python code snippet end
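On interpreters older than 3.9, the same result can be had with dictionary unpacking or copy-then-update. A sketch reusing the dictionaries from the example above in their original, unmerged form:

```python
defaults = {'color': 'blue', 'size': 'medium'}
custom = {'size': 'large', 'style': 'bold'}

# Unpacking: later entries win, just like the | operator
merged_unpack = {**defaults, **custom}

# Copy, then update in place
merged_copy = defaults.copy()
merged_copy.update(custom)

print(merged_unpack)  # {'color': 'blue', 'size': 'large', 'style': 'bold'}
print(defaults)       # unchanged: {'color': 'blue', 'size': 'medium'}
```

In every variant, a key that appears in both dictionaries keeps its position from the left-hand dictionary and takes its value from the right-hand one.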
Since Python 3.7, dictionaries preserve insertion order. The popitem() method leverages this by removing in LIFO order:
python code snippet start
d = {"first": 1, "second": 2, "third": 3}
d.popitem() # ('third', 3) - most recently added
d.popitem() # ('second', 2)
python code snippet end
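Because popitem() removes from the end, a loop can drain a dictionary like a stack, and iteration still gives the oldest key first. A sketch with a made-up tasks dictionary:

```python
tasks = {"write": 1, "review": 2, "publish": 3}

# Iteration starts at the oldest entry
oldest = next(iter(tasks))  # 'write'

# Drain in LIFO order
done = []
while tasks:
    name, _priority = tasks.popitem()
    done.append(name)

print(done)  # ['publish', 'review', 'write']

# popitem() on an empty dict raises KeyError
try:
    tasks.popitem()
except KeyError:
    empty_raises = True
```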
Set Operations
Sets excel at membership testing and eliminating duplicates. They support mathematical set operations with intuitive operators:
python code snippet start
evens = {2, 4, 6, 8}
primes = {2, 3, 5, 7}
evens | primes # Union: {2, 3, 4, 5, 6, 7, 8}
evens & primes # Intersection: {2}
evens - primes # Difference: {4, 6, 8}
evens ^ primes # Symmetric difference: {3, 4, 5, 6, 7, 8}
# Test relationships
{1, 2} <= {1, 2, 3} # Subset: True
{1, 2} < {1, 2} # Proper subset: False
evens.isdisjoint({1, 3, 5}) # No overlap: True
python code snippet end
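One caveat when using a set to eliminate duplicates: sets are unordered, so the original sequence order is lost. A sketch of two common alternatives, with an illustrative list:

```python
items = ["b", "a", "b", "c", "a"]

# Set-based dedupe: sort afterwards if a deterministic order is needed
unique_sorted = sorted(set(items))          # ['a', 'b', 'c']

# dict.fromkeys keeps first-seen order (dicts preserve insertion order)
unique_ordered = list(dict.fromkeys(items))  # ['b', 'a', 'c']
```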
Safe removal:
python code snippet start
s = {1, 2, 3}
s.remove(4) # Raises KeyError
s.discard(4) # No error, silent no-op
python code snippet end
Lesser-Known Type Features
String translation for character mapping:
python code snippet start
# Form 1: Two equal-length strings (character-to-character mapping)
trans = str.maketrans('aeiou', '12345')
'hello world'.translate(trans) # 'h2ll4 w4rld'
# Form 2: Three arguments (mapping + deletion)
# Third argument specifies characters to delete; deletion overrides the mapping
trans = str.maketrans('aeiou', '12345', 'world')
'hello world'.translate(trans) # 'h2 ' ('l' and 'o' are deleted, not mapped)
# Form 3: Dictionary mapping (most flexible)
# Maps ordinals or characters to ordinals/strings/None
trans = str.maketrans({
'h': 'H', # Character to character
ord('e'): '3', # Ordinal to string
'o': None, # Character to None (delete)
108: 'L' # Ordinal (for 'l') to character
})
'hello world'.translate(trans) # 'H3LL wrLd'
python code snippet end
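A common use of the three-argument form is deletion only, with empty strings for the first two arguments. This sketch strips ASCII punctuation using string.punctuation from the standard library. The sample sentence is made up.

```python
import string

# Empty mapping, delete every ASCII punctuation character
strip_punct = str.maketrans("", "", string.punctuation)

cleaned = "Hello, world! (It's me.)".translate(strip_punct)
print(cleaned)  # Hello world Its me
```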
Numeric type conversions:
python code snippet start
# Float to exact rational representation
(3.14).as_integer_ratio() # (7070651414971679, 2251799813685248)
# Bit manipulation on integers
(42).bit_length() # 6 (needs 6 bits)
(42).bit_count() # 3 (three 1-bits) - Python 3.10+
# Hexadecimal round-trip for exact float storage
h = (3.14159).hex() # '0x1.921f9f01b866ep+1'
float.fromhex(h) # 3.14159
python code snippet end
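The ratio returned by as_integer_ratio() is the exact value the float stores, which is usually not the decimal literal that was typed. A sketch using fractions.Fraction from the standard library to make that visible:

```python
from fractions import Fraction

x = 0.1
num, den = x.as_integer_ratio()
print(num, den)  # 3602879701896397 36028797018963968

# The denominator of a finite float is always a power of two
is_power_of_two = (den & (den - 1)) == 0

exact = Fraction(num, den)
print(exact == Fraction(x))      # True  - same exact binary value
print(exact == Fraction(1, 10))  # False - 0.1 is not exactly one tenth

# hex() round-trips without any rounding
print(float.fromhex(x.hex()) == x)  # True
```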
Bytes to hex conversion:
python code snippet start
data = b'\xde\xad\xbe\xef'
data.hex() # 'deadbeef'
bytes.fromhex('deadbeef') # b'\xde\xad\xbe\xef'
# With separators (Python 3.8+)
data.hex(' ') # 'de ad be ef'
data.hex(':', 2) # 'dead:beef'
python code snippet end
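Going the other way, bytes.fromhex() skips whitespace between byte pairs but does not accept other separators, so those need to be removed first. A sketch that also converts the bytes to and from an integer:

```python
# Spaces between byte pairs are ignored
raw = bytes.fromhex("de ad be ef")

# Other separators must be stripped before parsing
from_colons = bytes.fromhex("dead:beef".replace(":", ""))

# Interpret the four bytes as a big-endian unsigned integer, and back
as_int = int.from_bytes(raw, "big")  # 3735928559 (0xdeadbeef)
back = as_int.to_bytes(4, "big")

print(raw == from_colons == back)  # True
```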
Python’s built-in types form the foundation for all data manipulation. Understanding their methods, operators, and performance characteristics enables writing more concise and efficient code.
For specialized container types beyond these basics, check out the collections module with defaultdict, Counter, and deque. If you’re working with data structures more generally, the data structures tutorial provides comprehensive coverage. Dictionary comprehensions are explored in detail in the PEP 274 article.
Reference: Built-in Types - Python Documentation