With __getattr__ and __setattr__, and then descriptors, we have seen how to add custom behavior to specific attributes. Now we’ll take the next step: metaprogramming entire classes.
What happens when Python sees a class statement?
If you’ve ever tried to reference a class by its own name within the body you’ve run into an error:
class MyClass:
    # NOT ALLOWED! MyClass isn't defined yet
    def func(a: MyClass) -> None: ...
This is because, on the def line, the class is still being defined.
(If you need to do what the example shows, you can write the annotation as the string "MyClass" to create a forward reference, or use typing.Self, available from Python 3.11, as the type.)
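As a minimal sketch of the string forward reference mentioned above (the parameter name `other` is only illustrative), the quoted name is stored as text rather than looked up while the body runs:

```python
class MyClass:
    # The quoted annotation is kept as a plain string, so the name MyClass
    # does not have to be bound yet when this def statement executes.
    def func(self, other: "MyClass") -> None: ...


# The annotation is stored unevaluated
print(MyClass.func.__annotations__["other"])
```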
A class is created in steps:
First, the code inside the class body executes one statement at a time, and the names it defines are gathered into a “namespace”, essentially a dictionary:
class Icon(Widget):
    WIDTH = 32

    def __init__(self): ...

    def show(self): ...
This dictionary is then passed to type(), a constructor that creates new types.
# `type` takes three arguments:
# - name of the new type
# - a tuple of base classes
# - the namespace dictionary with all members
type("Icon", (Widget,), namespace_dict)
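To make the call above concrete, here is a small runnable sketch that builds the namespace dictionary by hand. The `Widget` stand-in and the body of `show` are assumptions made for illustration only:

```python
class Widget:  # stand-in base class, assumed for this sketch
    pass


def show(self):
    return f"icon {self.WIDTH}px wide"


# The same mapping a class body would have produced
namespace_dict = {"WIDTH": 32, "show": show}

Icon = type("Icon", (Widget,), namespace_dict)

print(Icon.__name__, issubclass(Icon, Widget))
print(Icon().show())
```

The result behaves like a class written with an ordinary class statement.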
We can call this function ourselves to create dynamic types:
class Unit:
    def __init__(self, num):
        self.num = num

    def __str__(self):
        return f"{self.num}{self.symbol}"

UNITS = [
    ("Meter", "m"),
    ("Second", "s"),
    ("Watt", "W"),
]
TYPES = {}

# dynamically create subclasses for each unit
for name, symbol in UNITS:
    TYPES[name] = type(name, (Unit,), {"symbol": symbol})

# expose subclasses as module-level names
# (globals() rather than locals(): writes to locals() only stick at module level)
globals().update(TYPES)

print(Meter(3))
print(Second(10))
3m
10s
Class Decorators
Class decorators are similar to function decorators: they are functions that take a class and return a class, usually the same class with modifications.
Part of understanding how this works is recognizing that a class is composed of the namespace we saw above. This is accessible on a class cls as cls.__dict__, which is a read-only view of that namespace.
With this, we can inspect a class definition, and by assigning or deleting attributes on cls we can add to, remove from, or otherwise manipulate it.
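A short sketch of inspecting and changing a class through its namespace. The `Point` class and its members are hypothetical, chosen only for illustration. Note that the `__dict__` of a class rejects direct item assignment, so changes go through `setattr` and `delattr` (or plain attribute assignment):

```python
class Point:
    dims = 2

    def norm(self): ...


# Names defined in the class body, skipping the dunder entries Python adds
member_names = [k for k in Point.__dict__ if not k.startswith("__")]
print(member_names)

# Writing through the __dict__ view is refused...
try:
    Point.__dict__["origin"] = (0, 0)
    direct_write_allowed = True
except TypeError:
    direct_write_allowed = False

# ...but setattr/delattr change the class definition
setattr(Point, "origin", (0, 0))
delattr(Point, "dims")

print(direct_write_allowed, Point.origin, hasattr(Point, "dims"))
```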
Recall that without a __repr__ a class prints an ugly version of itself that isn’t very useful:
class Vector3:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

print(Vector3(1, 2, 3))
<__main__.Vector3 object at 0x7f8b98b2cad0>
# A class decorator, takes a class, returns a class
def autorepr(cls):
    def __repr__(self):
        attrs = ", ".join(
            f"{k}={v!r}" for k, v in self.__dict__.items()
        )
        return f"{cls.__name__}({attrs})"

    cls.__repr__ = __repr__
    return cls

@autorepr
class Vector2:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# @ syntax is still doing the same thing it did with functions:
# Vector2 = autorepr(Vector2)

@autorepr
class Vector3:
    def __init__(self, x, y, z):
        self.x = x
        self.y = y
        self.z = z

print(Vector2(1, 2))  # Vector2(x=1, y=2)
print(Vector3(1, 2, 3))  # Vector3(x=1, y=2, z=3)
Vector2(x=1, y=2)
Vector3(x=1, y=2, z=3)
__init_subclass__
Python 3.6 added a powerful new dunder method that is invoked on the parent class whenever a subclass is created (that is, when the subclass's class statement runs, not when the subclass is instantiated).
It is common to want to register child classes with their parent, and this gives a way to do so automatically without an additional call or decorator:
class Serializer:
    _registry = {}

    # __init_subclass__ takes the argument cls, the subclass being created
    # as well as any number of optional kwargs. here we add format as
    # an argument
    def __init_subclass__(cls, format, **kwargs):
        super().__init_subclass__(**kwargs)
        Serializer._registry[format] = cls

    @classmethod
    def get(cls, format):
        if format not in cls._registry:
            raise ValueError(f"No serializer for {format!r}")
        return cls._registry[format]()

# when we subclass Serializer, __init_subclass__ will be called.
# our optional additional parameter (format) is passed here
class JSONSerializer(Serializer, format="json"):
    def dumps(self, data): ...

class CSVSerializer(Serializer, format="csv"):
    def dumps(self, data): ...

# at this point, Serializer._registry contains two entries
# from two calls to __init_subclass__, one for each of the above classes
print(f"{Serializer._registry=}")
s = Serializer.get("json")  # returns a JSONSerializer instance
print(f'{Serializer.get("json")=}')
Serializer._registry={'json': <class '__main__.JSONSerializer'>, 'csv': <class '__main__.CSVSerializer'>}
Serializer.get("json")=<__main__.JSONSerializer object at 0x7f8b4c8e9810>
We can of course inspect & modify the cls reference here as well, just like we did in a decorator. This gives us the ability to have all subclasses of a given type have behavior enforced.
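As a sketch of enforcing behavior on every subclass, the hypothetical `Plugin` base below rejects subclasses that do not define a `name` and derives a `label` attribute for those that do. All names here are illustrative assumptions, not part of the example above:

```python
class Plugin:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # cls.__dict__ only holds what this subclass itself defined
        if "name" not in cls.__dict__:
            raise TypeError(f"{cls.__name__} must define a 'name' attribute")
        # modify the finished class, just as a decorator could
        cls.label = cls.name.upper()


class Resize(Plugin):
    name = "resize"


try:
    class Broken(Plugin):  # no name: rejected at class creation time
        pass
except TypeError as exc:
    error = str(exc)
else:
    error = None

print(Resize.label)
print(error)
```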
Limitations
The biggest limitation with __init_subclass__ is that the method is called after the namespace is created & passed to type(). The received cls is the fully-realized type already.
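The ordering can be observed with a small sketch (the `events` list and class names are illustrative): the subclass body has already run, and its attributes are already on the class, by the time `__init_subclass__` is called.

```python
events = []


class Base:
    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # cls is already a complete class here, so cls.x is available
        events.append(f"__init_subclass__ sees x={cls.x}")


class Child(Base):
    events.append("body running")
    x = 1


print(events)
```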
Sometimes we want to intercept the collected namespace and modify it before it is handed off to the type() constructor, which finally brings us to metaclasses.
Aside: type vs. object
This can be hard to reason about, so let’s review what type and object are:
object
Every object in Python is an instance of object, and every class is a subclass of it. It is the base class that provides common functionality (memory management, that ugly default repr, etc.).
When we create a new class, it is a subclass of object:
class MyClass:
    pass

# same as
class MyClass(object):
    pass
An instance of MyClass is an instance of object, since isinstance considers instances of a subclass to also be instances of its parent types:
class MyClass:
    pass

myobj = MyClass()

# both True!
print(isinstance(myobj, MyClass))
print(isinstance(myobj, object))
True
True
type, on the other hand, is the type of the class itself, not of its instances:
# not a type!
isinstance(myobj, type)
False
# is a type!
isinstance(MyClass, type)
True
Where this can be somewhat confusing is that MyClass is also an object; everything in Python is.
As we saw above, the type() function is a constructor that makes a new class, and that class is itself an instance of type.
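The relationships between instances, classes, type, and object can be checked directly. This sketch only uses built-in functions:

```python
class MyClass:
    pass


myobj = MyClass()

checks = {
    "type(myobj) is MyClass": type(myobj) is MyClass,
    "type(MyClass) is type": type(MyClass) is type,
    "type(type) is type": type(type) is type,
    "isinstance(MyClass, object)": isinstance(MyClass, object),
    "issubclass(MyClass, object)": issubclass(MyClass, object),
    "issubclass(type, object)": issubclass(type, object),
    "isinstance(object, type)": isinstance(object, type),
}

for description, result in checks.items():
    print(f"{description}: {result}")
```

Every check is True: classes are objects whose type is type, and type is itself a class deriving from object.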
Metaclasses
Class creation goes through three phases:
Prepare: create the namespace the body will execute in.
Execute: run the body, storing names into that namespace.
Build: call type(name, bases, namespace) to produce the class object.
This process expanded out for class Icon(Widget) looks like:
namespace = type.__prepare__("Icon", (Widget,))  # prepare
exec(body, namespace)  # execute in namespace (body stands for the class body's code)
Icon = type.__new__(type, "Icon", (Widget,), namespace)  # call type constructor
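The three phases can be reproduced by hand. This is only an approximation of what the interpreter does: the class body is given here as a source string, and `Widget` and the contents of `body` are assumptions made for the sketch.

```python
class Widget:  # stand-in base class
    pass


# Source text standing in for the body of `class Icon(Widget): ...`
body = '''
WIDTH = 32

def show(self):
    return f"{type(self).__name__} is {self.WIDTH}px wide"
'''

namespace = type.__prepare__("Icon", (Widget,))  # prepare: a plain empty dict
exec(body, {}, namespace)                        # execute: names are stored in namespace
Icon = type("Icon", (Widget,), namespace)        # build: create the class object

print(sorted(namespace))
print(Icon().show())
```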
A metaclass replaces type in these operations: it is the class whose __prepare__ and __new__ are called to create the new type.
# simple metaclass that just extends `type`'s existing implementation
class MyMeta(type):
    def __new__(mcs, name, bases, namespace):
        print(f"Building class {name} with attrs: {list(namespace)}")
        return super().__new__(mcs, name, bases, namespace)

class Foo(metaclass=MyMeta):
    x = 1
    y = 2
mcs vs. cls vs. self
The first parameter is conventionally named to help you understand the type:
mcs for __new__, it is the user-defined metaclass
cls for @classmethods as it will be the user-defined class
self for instances of classes
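A small sketch that records what each of the three first parameters is bound to. The `seen` dictionary and the class names are illustrative only:

```python
seen = {}


class Meta(type):
    def __new__(mcs, name, bases, namespace):
        seen["mcs"] = mcs  # the metaclass itself
        return super().__new__(mcs, name, bases, namespace)


class Thing(metaclass=Meta):
    @classmethod
    def make(cls):
        seen["cls"] = cls  # the class built by the metaclass
        return cls()

    def touch(self):
        seen["self"] = self  # an instance of that class
        return self


thing = Thing.make().touch()
print(seen["mcs"].__name__, seen["cls"].__name__, type(seen["self"]).__name__)
```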
__new__
__new__ is a metaconstructor, a function that creates new types.
Typically, an implementation of __new__ modifies the name, base classes, and/or namespace (most often the latter), then passes them along to super().__new__ on our parent, type.
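As a minimal sketch of this pattern, the hypothetical `ConstantsMeta` below adds a `CONSTANTS` tuple to the namespace, listing the upper-case names from the class body, before handing everything to `type`:

```python
class ConstantsMeta(type):
    def __new__(mcs, name, bases, namespace):
        new_namespace = dict(namespace)
        # record the upper-case names defined in the class body
        new_namespace["CONSTANTS"] = tuple(k for k in namespace if k.isupper())
        return super().__new__(mcs, name, bases, new_namespace)


class Limits(metaclass=ConstantsMeta):
    MAX_SIZE = 10
    MIN_SIZE = 1

    def clamp(self, n):
        return max(self.MIN_SIZE, min(self.MAX_SIZE, n))


print(Limits.CONSTANTS)
print(Limits().clamp(50))
```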
__prepare__(name, bases, **kwargs)
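If defined, __prepare__ runs even earlier: it returns the namespace object used for the rest of the class definition.
Note that it does not receive a cls or self argument for the new class, because that class does not exist yet! (The Python documentation recommends declaring __prepare__ as a @classmethod, in which case its first argument is the metaclass.)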
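Because __prepare__ chooses the namespace object, it can return something other than a plain dict. The sketch below, with hypothetical names `NoDuplicates` and `UniqueMeta`, returns a dict subclass that rejects a name defined twice in one class body. It uses the @classmethod form of __prepare__:

```python
class NoDuplicates(dict):
    def __setitem__(self, key, value):
        if key in self:
            raise TypeError(f"duplicate definition of {key!r}")
        super().__setitem__(key, value)


class UniqueMeta(type):
    @classmethod
    def __prepare__(mcs, name, bases, **kwargs):
        return NoDuplicates()  # the class body will store names in this object

    def __new__(mcs, name, bases, namespace, **kwargs):
        # hand type a plain dict copy of the collected names
        return super().__new__(mcs, name, bases, dict(namespace), **kwargs)


class Shape(metaclass=UniqueMeta):
    def area(self):
        return 1


try:
    class BadShape(metaclass=UniqueMeta):
        def area(self):
            return 1

        def area(self):  # second definition of the same name
            return 2
except TypeError as exc:
    duplicate_error = str(exc)
else:
    duplicate_error = None

print(duplicate_error)
```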
__prepare__ allows you to inject names into the namespace that can then be used in the class definition. Here we add a field() helper that is only available inside class bodies:
from dataclasses import dataclass
from pprint import pprint

@dataclass
class _Field:
    kind: type
    default: object = None
    required: bool = False

class SchemaMeta(type):
    def __prepare__(name, bases, **kwargs):
        return {"field": _Field}  # inject field as a name in the class body

    def __new__(mcs, name, bases, namespace, **kwargs):
        fields = {k: v for k, v in namespace.items() if isinstance(v, _Field)}
        cls = super().__new__(mcs, name, bases, dict(namespace))
        cls._fields = fields
        return cls

# common to set metaclass on a base class & use inherited classes
class Schema(metaclass=SchemaMeta):
    pass

class UserSchema(Schema):
    # where is this `field` function?
    # it is coming from the __prepare__d dict
    # and only in scope within the class body
    name = field(str, required=True)
    email = field(str, required=True)
    age = field(int, default=0)
    bio = field(str, default="")

pprint(UserSchema._fields)
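# a metaclass that validates each subclass as it is built
class StrictTaskMeta(type):
    required = {"run", "email"}
    run_return_type = str

    def __new__(mcs, clsname, bases, namespace):
        # only subclasses are checked (the base Task has no bases);
        # run/email inherited from base classes do not satisfy the
        # constraint, each subclass must define them itself
        if bases:
            missing = mcs.required - namespace.keys()
            if missing:
                raise TypeError(f"{clsname} must define: {missing}")
        if "run" in namespace:
            # enforce the return annotation on run()
            returns = namespace["run"].__annotations__.get("return")
            if returns != mcs.run_return_type:
                raise TypeError(f"{clsname}.run must be annotated -> {mcs.run_return_type.__name__}")
        return super().__new__(mcs, clsname, bases, namespace)

class Task(metaclass=StrictTaskMeta):
    pass

class WorkingTask(Task):
    email = "admin@example.com"

    def run(self) -> str:
        return "ok"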
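To see this kind of check reject a class, here is a reduced, self-contained sketch of the same idea. `RequireRunMeta`, `Job`, and the subclasses are hypothetical names, and only the presence of `run` is checked. It also shows that subclasses inherit the metaclass of their base:

```python
class RequireRunMeta(type):
    def __new__(mcs, clsname, bases, namespace):
        # skip the base class itself, which has no bases
        if bases and "run" not in namespace:
            raise TypeError(f"{clsname} must define: run")
        return super().__new__(mcs, clsname, bases, namespace)


class Job(metaclass=RequireRunMeta):
    pass


class GoodJob(Job):
    def run(self):
        return "ok"


try:
    class BadJob(Job):  # rejected while the class statement executes
        pass
except TypeError as exc:
    message = str(exc)
else:
    message = None

print(type(GoodJob).__name__)
print(message)
```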