Built-in Types: Str, List, Dict, and Set

December 30, 2025

TL;DR

Python’s built-in types (str, list, dict, set) come packed with dozens of methods for manipulation, searching, and transformation. Dictionaries maintain insertion order since Python 3.7, and many lesser-known methods like dict.setdefault(), str.removeprefix(), and set operations on dictionary views can simplify common patterns.

Interesting!

Dictionary views support set operations. You can perform intersection, union, and difference operations directly on dict.keys() to find common keys between dictionaries, unique keys, or differences without converting to sets first.

d1 = {'a': 1, 'b': 2, 'c': 3}
d2 = {'b': 4, 'c': 5, 'd': 6}

# Common keys
d1.keys() & d2.keys()  # {'b', 'c'}

# All keys
d1.keys() | d2.keys()  # {'a', 'b', 'c', 'd'}

# Keys only in d1
d1.keys() - d2.keys()  # {'a'}
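
The same operators also work on dict.items() when the values are hashable, which gives a quick way to compare two mappings. A minimal sketch, assuming two hypothetical config dicts named old and new (the names and values are illustrative only):

```python
old = {'host': 'localhost', 'port': 8080, 'debug': True}
new = {'host': 'localhost', 'port': 9090, 'tls': True}

# Items views act like sets of (key, value) pairs when values are hashable
unchanged = dict(old.items() & new.items())   # {'host': 'localhost'}

# Keys present in both dicts whose values differ
changed = {k for k in old.keys() & new.keys() if old[k] != new[k]}  # {'port'}

added = new.keys() - old.keys()     # {'tls'}
removed = old.keys() - new.keys()   # {'debug'}
```

The results of these operations are plain set objects, so they can be stored or combined further without touching the original dictionaries.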

String Power Methods

Beyond basic slicing and concatenation, strings have specialized methods for common tasks:

# Remove prefix/suffix (Python 3.9+)
url = "https://example.com"
url.removeprefix("https://")  # "example.com"

filename = "report.pdf"
filename.removesuffix(".pdf")  # "report"

# Partition - split into exactly 3 parts
email = "user@example.com"
user, sep, domain = email.partition("@")
# user='user', sep='@', domain='example.com'

# Split on line boundaries (handles \n, \r\n, \r)
text = "line1\nline2\r\nline3"
text.splitlines()  # ['line1', 'line2', 'line3']

# Zero-fill numbers for fixed-width formatting
"42".zfill(5)      # "00042"
"-42".zfill(5)     # "-0042" - sign aware

# Center/justify text
"Title".center(20, "=")  # "=======Title========"
"Left".ljust(10, ".")    # "Left......"

String formatting has evolved through several generations. F-strings arrived in Python 3.6, and Python 3.8 added the = debug specifier:

name = "Alice"
age = 30

# Debug format - includes variable name
print(f"{name=}, {age=}")  # name='Alice', age=30
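
The = specifier can also be combined with a format spec or a full expression. A small sketch with illustrative variable names (requires Python 3.8+):

```python
price = 1234.5
ratio = 0.256

# A format spec after '=' formats the value instead of using repr()
line1 = f"{price=:,.2f}"    # 'price=1,234.50'

# Whitespace around '=' is preserved in the output
line2 = f"{ratio = :.1%}"   # 'ratio = 25.6%'

# Any expression works, and its source text is echoed as written
line3 = f"{price * 2=}"     # 'price * 2=2469.0'
```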

Unicode Strings

Python 3 strings are Unicode by default, with specialized methods for international text handling.

Case folding for comparisons: The casefold() method applies Unicode case folding for case-insensitive string matching. Unlike lower(), which only converts characters to their lowercase forms, casefold() applies the more aggressive mappings defined in the Unicode standard - converting German ß to “ss” and Greek final sigma ς to “σ”, including cases where one character expands into several. Use casefold() when comparing user input, search terms, or any international text where case should be ignored.
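
One way to apply this in practice is a case-insensitive lookup that indexes stored names by their folded form. This is only a sketch: find_user and the users list are hypothetical names invented for the example.

```python
def find_user(name, users):
    """Return the stored spelling that matches name ignoring case, or None."""
    # Index every stored name by its case-folded form
    index = {user.casefold(): user for user in users}
    return index.get(name.casefold())

users = ["Straße", "Alice"]

match_german = find_user("STRASSE", users)   # 'Straße'
match_ascii = find_user("alice", users)      # 'Alice'
no_match = find_user("bob", users)           # None
```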

# casefold() handles Unicode case folding rules
"Maße".casefold()  # "masse" (ß becomes ss)
"Maße".lower()     # "maße" (ß unchanged)

# Critical for case-insensitive matching
"Straße".casefold() == "STRASSE".casefold()  # True
"Straße".lower() == "STRASSE".lower()        # False

# Greek example: final sigma ς folds to σ
"ΣΊΣΥΦΟΣ".casefold() == "σίσυφος".casefold()  # True

Encoding and decoding: Strings encode to bytes, bytes decode to strings.

# Encode to different formats
text = "Hello, 世界"
text.encode('utf-8')      # b'Hello, \xe4\xb8\x96\xe7\x95\x8c'
text.encode('utf-16')     # b'\xff\xfeH\x00e\x00l\x00l\x00o\x00...'

# Decode with error handling
data = b'caf\xe9'
data.decode('latin-1')              # "café"
data.decode('utf-8', errors='ignore')    # "caf" - skip invalid
data.decode('utf-8', errors='replace')   # "caf�" - replacement char
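
The errors parameter works in the encoding direction too. A brief sketch of the standard handlers when the target encoding cannot represent a character (the sample text is illustrative):

```python
text = "café €5"

# Drop characters that ASCII cannot represent
ignored = text.encode('ascii', errors='ignore')              # b'caf 5'

# Substitute a question mark for each unencodable character
replaced = text.encode('ascii', errors='replace')            # b'caf? ?5'

# Keep the information as Python-style escape sequences
escaped = text.encode('ascii', errors='backslashreplace')    # b'caf\\xe9 \\u20ac5'

# Keep the information as XML/HTML character references
xml_refs = text.encode('ascii', errors='xmlcharrefreplace')  # b'caf&#233; &#8364;5'

# UTF-8 can represent every code point, so the round trip is lossless
round_trip = text.encode('utf-8').decode('utf-8')
```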

Unicode character access: Every character has a code point accessible via ord() and chr().

# Character to code point
ord('A')      # 65
ord('€')      # 8364
ord('🐍')     # 128013

# Code point to character
chr(65)       # 'A'
chr(128013)   # '🐍'

# Useful for character ranges
"".join(chr(i) for i in range(0x1F600, 0x1F610))  # Emoji range

Reference: Unicode HOWTO - Python Documentation

List Methods and Pitfalls

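Lists provide in-place modification methods. Key distinction: most methods that modify a list in place, such as sort(), reverse(), append(), and extend(), return None rather than the modified list. pop() is the notable exception, since it returns the removed element.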

numbers = [3, 1, 4, 1, 5]

# In-place operations return None
result = numbers.sort()  # result is None
print(numbers)           # [1, 1, 3, 4, 5]

# Use sorted() for a new list
numbers = [3, 1, 4, 1, 5]
result = sorted(numbers)  # result is [1, 1, 3, 4, 5]
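
The same split between mutating methods and copy-returning built-ins shows up elsewhere. A short sketch with illustrative names:

```python
items = [3, 1, 2]

appended = items.append(4)   # None; items is now [3, 1, 2, 4]
flipped = items.reverse()    # None; items is now [4, 2, 1, 3]

# pop() mutates the list and also returns the removed element
last = items.pop()           # 3; items is now [4, 2, 1]

# reversed() leaves the list alone and yields the elements in reverse
copy_reversed = list(reversed(items))  # [1, 2, 4]
```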

Common pitfall with list repetition:

# Wrong - all sublists are the same object
matrix = [[]] * 3
matrix[0].append(1)
# Result: [[1], [1], [1]]

# Correct - create separate lists
matrix = [[] for _ in range(3)]
matrix[0].append(1)
# Result: [[1], [], []]
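
An identity check makes the cause visible: repetition copies references, not objects. A small sketch, with illustrative variable names:

```python
# Every row is the same list object, so one write shows up in all rows
shared = [[0] * 2] * 3
shared[0][0] = 9                       # [[9, 0], [9, 0], [9, 0]]
same_object = shared[0] is shared[1]   # True

# Repeating immutable values is safe: assignment rebinds a single slot
flat = [0] * 3
flat[0] = 9                            # [9, 0, 0]

# A comprehension builds a fresh inner list on each iteration
grid = [[0] * 2 for _ in range(3)]
grid[0][0] = 9                         # [[9, 0], [0, 0], [0, 0]]
```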

Dictionary Convenience Methods

The setdefault() method combines get-or-create logic in one call:

# Without setdefault
counts = {}
for word in ["apple", "banana", "apple"]:
    if word not in counts:
        counts[word] = 0
    counts[word] += 1

# With setdefault
counts = {}
for word in ["apple", "banana", "apple"]:
    counts[word] = counts.setdefault(word, 0) + 1
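
setdefault() is arguably most useful when the default is a mutable container, because the returned object can be modified in the same expression. A sketch that groups words by first letter (the word list is illustrative):

```python
words = ["apple", "avocado", "banana", "blueberry", "cherry"]

by_letter = {}
for word in words:
    # Returns the existing list, or inserts and returns a new empty one
    by_letter.setdefault(word[0], []).append(word)

# {'a': ['apple', 'avocado'], 'b': ['banana', 'blueberry'], 'c': ['cherry']}
```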

Dictionary merging became cleaner in Python 3.9:

defaults = {'color': 'blue', 'size': 'medium'}
custom = {'size': 'large', 'style': 'bold'}

# Merge operator (right-hand operand wins on duplicate keys)
merged = defaults | custom
# {'color': 'blue', 'size': 'large', 'style': 'bold'}

# In-place merge
defaults |= custom

Since Python 3.7, dictionaries preserve insertion order. The popitem() method leverages this by removing in LIFO order:

d = {"first": 1, "second": 2, "third": 3}
d.popitem()  # ('third', 3) - most recently added
d.popitem()  # ('second', 2)
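
Insertion order tracks when a key was first added, not when its value last changed. A brief sketch of how updates and re-insertion affect the order (names are illustrative):

```python
d = {"first": 1, "second": 2}

d["first"] = 10    # Updating an existing key keeps its position
d["third"] = 3     # New keys go to the end
order_after_update = list(d)     # ['first', 'second', 'third']

# Deleting and re-adding a key moves it to the end
del d["first"]
d["first"] = 1
order_after_reinsert = list(d)   # ['second', 'third', 'first']

popped = d.popitem()             # ('first', 1)
```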

Set Operations

Sets excel at membership testing and eliminating duplicates. They support mathematical set operations with intuitive operators:

evens = {2, 4, 6, 8}
primes = {2, 3, 5, 7}

evens | primes    # Union: {2, 3, 4, 5, 6, 7, 8}
evens & primes    # Intersection: {2}
evens - primes    # Difference: {4, 6, 8}
evens ^ primes    # Symmetric difference: {3, 4, 5, 6, 7, 8}

# Test relationships
{1, 2} <= {1, 2, 3}     # Subset: True
{1, 2} < {1, 2}         # Proper subset: False
evens.isdisjoint({1, 3, 5})  # No overlap: True
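
For duplicate removal, note that a set does not remember the order of the input. When order matters, dict.fromkeys() is a common alternative that relies on dictionary insertion order. A sketch with an illustrative list:

```python
names = ["bob", "alice", "bob", "carol", "alice"]

# Duplicates are removed, but iteration order is arbitrary
unique = set(names)

# Membership tests on a set do not scan the whole collection
has_carol = "carol" in unique                 # True

# Keeps the first occurrence of each name in its original position
ordered_unique = list(dict.fromkeys(names))   # ['bob', 'alice', 'carol']
```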

Safe removal:

s = {1, 2, 3}

s.remove(4)    # Raises KeyError
s.discard(4)   # No error, silent no-op

Lesser-Known Type Features

String translation for character mapping:

# Form 1: Two equal-length strings (character-to-character mapping)
trans = str.maketrans('aeiou', '12345')
'hello world'.translate(trans)  # 'h2ll4 w4rld'

# Form 2: Three arguments (mapping + deletion)
# Third argument specifies characters to delete; deletion overrides the mapping
trans = str.maketrans('aeiou', '12345', 'world')
'hello world'.translate(trans)  # 'h2 ' - every w, o, r, l, d is removed

# Form 3: Dictionary mapping (most flexible)
# Maps ordinals or characters to ordinals/strings/None
trans = str.maketrans({
    'h': 'H',           # Character to character
    ord('e'): '3',      # Ordinal to string
    'o': None,          # Character to None (delete)
    108: 'L'            # Ordinal (for 'l') to character
})
'hello world'.translate(trans)  # 'H3LL wrLd'
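
A common use of the deletion form is stripping punctuation in a single pass. A minimal sketch using string.punctuation from the standard library (the sample sentence is illustrative):

```python
import string

# Empty mapping strings plus a deletion set: remove ASCII punctuation only
strip_punct = str.maketrans('', '', string.punctuation)

cleaned = "Hello, world! (It's me.)".translate(strip_punct)
# 'Hello world Its me'
```

string.punctuation covers ASCII punctuation only, so characters such as curly quotes would pass through unchanged.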

Numeric type conversions:

# Float to exact rational representation
(3.14).as_integer_ratio()  # (7070651414971679, 2251799813685248)

# Bit manipulation on integers
(42).bit_length()   # 6 (needs 6 bits)
(42).bit_count()    # 3 (three 1-bits) - Python 3.10+

# Hexadecimal round-trip for exact float storage
h = (3.14159).hex()         # '0x1.921f9f01b866ep+1'
float.fromhex(h)            # 3.14159

Bytes to hex conversion:

data = b'\xde\xad\xbe\xef'
data.hex()                  # 'deadbeef'
bytes.fromhex('deadbeef')   # b'\xde\xad\xbe\xef'

# With separators (Python 3.8+)
data.hex(' ')               # 'de ad be ef'
data.hex(':', 2)            # 'dead:beef'

Python’s built-in types form the foundation for all data manipulation. Understanding their methods and operators enables writing more concise and readable code.

For specialized container types beyond these basics, check out the collections module with defaultdict, Counter, and deque. If you’re working with data structures more generally, the data structures tutorial provides comprehensive coverage. Dictionary comprehensions are explored in detail in the PEP 274 article.

Reference: Built-in Types - Python Documentation
