Typing select queries in Python

The recently accepted PEP 646 introduced variadic generics! While the main motivation for this feature was to improve typing for numerical libraries it also means that is now possible to accurately type the results of an SQL select query statement in Python!

There are some limitations to this however, unfortunately we cannot return named properties and instead must use a tuple form.

This post will walk through the process of typing and implementing the equivalent of an SQL select function, if you just want to see the final results then click here.

Getting Started

This post assumes that you are already familiar with generics in python, if you are not then I suggest you read through this tutorial.

You will also need a recent version of typing-extensions and pyright.

Introduction to PEP 646

The acceptance of PEP 646 means we can now create generics with an arbitrary number of type variables instead of being limited to a pre-defined number!

The new features that this introduces is TypeVarTuple and Unpack which must be used in combination with each other to represent variadic generics.

A not particularly useful example of this is inserting an integer to the start of a tuple (yes tuples are immutable but lets assume for this example that we would return a new tuple):

from typing import Tuple
from typing_extensions import TypeVarTuple, Unpack


T = TypeVarTuple('T')


def insert_int(seq: Tuple[Unpack[T]]) -> Tuple[int, Unpack[T]]:
    ...


insert_int(('a', 1))  # return type: Tuple[int, str, int]
from typing import Tuple
from typing_extensions import TypeVarTuple, Unpack


T = TypeVarTuple('T')


def insert_int(seq: Tuple[Unpack[T]]) -> Tuple[int, Unpack[T]]:
    ...


insert_int(('a', 1))  # return type: Tuple[int, str, int]

Minimal example

The minimum code required to represent a select satement statically is actually fairly small.

from typing import Tuple
from typing_extensions import Unpack, TypeVarTuple, reveal_type


FieldT = TypeVarTuple('FieldT')


class User:
    id: int
    name: str
    email: str
    hashed_password: str

    @classmethod
    def select(cls, *fields: Unpack[FieldT]) -> Tuple[Unpack[FieldT]]:
        ...


user = User.select(User.id, User.name)
reveal_type(user)
from typing import Tuple
from typing_extensions import Unpack, TypeVarTuple, reveal_type


FieldT = TypeVarTuple('FieldT')


class User:
    id: int
    name: str
    email: str
    hashed_password: str

    @classmethod
    def select(cls, *fields: Unpack[FieldT]) -> Tuple[Unpack[FieldT]]:
        ...


user = User.select(User.id, User.name)
reveal_type(user)

Now running this code through pyright will give us the following output:

$ pyright typed_select.py
typed_select.py:20:13 - information: Type of "user" is "Tuple[int, str]"
$ pyright typed_select.py
typed_select.py:20:13 - information: Type of "user" is "Tuple[int, str]"

Wow, this is exactly what we were looking for!

Now lets try actually running this code and see what happens:

$ python typed_select.py
Traceback (most recent call last):
  File "/Users/robert/code/craigie.dev/sources/typing-select/001.py", line 19, in <module>
    user = User.select(User.id, User.name)
AttributeError: type object 'User' has no attribute 'id'
$ python typed_select.py
Traceback (most recent call last):
  File "/Users/robert/code/craigie.dev/sources/typing-select/001.py", line 19, in <module>
    user = User.select(User.id, User.name)
AttributeError: type object 'User' has no attribute 'id'

Hmmmm that is annoying.

This is happening because we've given the User class type hints but haven't actually given these fields any values.

Let's assign some default values for the fields we are selecting:

from typing import Tuple
from typing_extensions import Unpack, TypeVarTuple, reveal_type


FieldT = TypeVarTuple('FieldT')


class User:
    id: int = 0
    name: str = ''
    email: str
    hashed_password: str

    @classmethod
    def select(cls, *fields: Unpack[FieldT]) -> Tuple[Unpack[FieldT]]:
        ...


user = User.select(User.id, User.name)
reveal_type(user)
from typing import Tuple
from typing_extensions import Unpack, TypeVarTuple, reveal_type


FieldT = TypeVarTuple('FieldT')


class User:
    id: int = 0
    name: str = ''
    email: str
    hashed_password: str

    @classmethod
    def select(cls, *fields: Unpack[FieldT]) -> Tuple[Unpack[FieldT]]:
        ...


user = User.select(User.id, User.name)
reveal_type(user)

We can now sucessfuly run this code:

$ python typed_select.py
Runtime type is 'NoneType'
$ python typed_select.py
Runtime type is 'NoneType'

Implementation

Now let's implement the select() function so that we can actually use it at runtime.

Lets start by defining a Field class so that we don't have to assign weird defaults for each field and so we can store a reference to the field's name in the database.

class Field:
    name: str

    def __init__(self, name: str) -> None:
        self.name = name
class Field:
    name: str

    def __init__(self, name: str) -> None:
        self.name = name

We now have to define a wrapper function over the Field class so that type checkers don't complain that we're assigning a Field instance when we can only assign strings for example.
This works as we tell the type checker that we are returning Any which disables type checking.

def field(*, name: str) -> Any:
    return Field(name=name)
def field(*, name: str) -> Any:
    return Field(name=name)

Now lets update our User model to use the new Field class

class User:
    id: int = field(name='id')
    name: str = field(name='name')
    email: str = field(name='email')
    hashed_password: str = field(name='hashed_password')
class User:
    id: int = field(name='id')
    name: str = field(name='name')
    email: str = field(name='email')
    hashed_password: str = field(name='hashed_password')

To make this easier to implement we'll define a list of records as our database instead of using a real database driver.

fake_db = [
    {
        'id': 1,
        'name': 'Robert',
        'email': 'robert@craigie.dev',
        'hashed_password': 'foo',
    },
    {
        'id': 2,
        'name': 'Tegan',
        'email': 'tegan@craigie.dev',
        'hashed_password': 'bar',
    },
]
fake_db = [
    {
        'id': 1,
        'name': 'Robert',
        'email': 'robert@craigie.dev',
        'hashed_password': 'foo',
    },
    {
        'id': 2,
        'name': 'Tegan',
        'email': 'tegan@craigie.dev',
        'hashed_password': 'bar',
    },
]

Now all that we need to do is implement the select() function

    @classmethod
    def select(cls, *fields: Unpack[FieldT]) -> Tuple[Unpack[FieldT]]:
        if any(not isinstance(field, Field) for field in fields):
            raise TypeError(
                'Expected all select arguments to be an instance of Field'
            )

        # for simplicites sake we'll just take the first record
        user = fake_db[0]

        # we have to add a type: ignore comment here as the type checker
        # cannot understand the relationship between this expression
        # and the input.
        return tuple(  # type: ignore
            user[field.name] for field in fields if isinstance(field, Field)
        )
    @classmethod
    def select(cls, *fields: Unpack[FieldT]) -> Tuple[Unpack[FieldT]]:
        if any(not isinstance(field, Field) for field in fields):
            raise TypeError(
                'Expected all select arguments to be an instance of Field'
            )

        # for simplicites sake we'll just take the first record
        user = fake_db[0]

        # we have to add a type: ignore comment here as the type checker
        # cannot understand the relationship between this expression
        # and the input.
        return tuple(  # type: ignore
            user[field.name] for field in fields if isinstance(field, Field)
        )

Let's type check it and make sure we haven't made any mistakes.

$ pyright typed_select.py
typed_select.py:60:13 - information: Type of "user" is "Tuple[str, int]"
$ pyright typed_select.py
typed_select.py:60:13 - information: Type of "user" is "Tuple[str, int]"

All good! Now lets run it and see what we get!

$ python typed_select.py
Runtime type is 'tuple'
('Robert', 1)
$ python typed_select.py
Runtime type is 'tuple'
('Robert', 1)

Amazing! We've just managed to statically type and implement a select query!

There are some potential improvements that could be made to this, for example, returning a custom object that provides some helper methods instead of a raw tuple.

Final Implementation

from typing import Tuple, Any
from typing_extensions import Unpack, TypeVarTuple, reveal_type


FieldT = TypeVarTuple('FieldT')


class Field:
    name: str

    def __init__(self, name: str) -> None:
        self.name = name


def field(*, name: str) -> Any:
    return Field(name=name)


fake_db = [
    {
        'id': 1,
        'name': 'Robert',
        'email': 'robert@craigie.dev',
        'hashed_password': 'foo',
    },
    {
        'id': 2,
        'name': 'Tegan',
        'email': 'tegan@craigie.dev',
        'hashed_password': 'bar',
    },
]


class User:
    id: int = field(name='id')
    name: str = field(name='name')
    email: str = field(name='email')
    hashed_password: str = field(name='hashed_password')

    @classmethod
    def select(cls, *fields: Unpack[FieldT]) -> Tuple[Unpack[FieldT]]:
        if any(not isinstance(field, Field) for field in fields):
            raise TypeError(
                'Expected all select arguments to be an instance of Field'
            )

        # for simplicites sake we'll just take the first record
        user = fake_db[0]

        # we have to add a type: ignore comment here as the type checker
        # cannot understand the relationship between this expression
        # and the input.
        return tuple(  # type: ignore
            user[field.name] for field in fields if isinstance(field, Field)
        )


user = User.select(User.name, User.id)
reveal_type(user)  # Revealed type is Tuple[str, int]
print(user)        # ('Robert', 1)
from typing import Tuple, Any
from typing_extensions import Unpack, TypeVarTuple, reveal_type


FieldT = TypeVarTuple('FieldT')


class Field:
    name: str

    def __init__(self, name: str) -> None:
        self.name = name


def field(*, name: str) -> Any:
    return Field(name=name)


fake_db = [
    {
        'id': 1,
        'name': 'Robert',
        'email': 'robert@craigie.dev',
        'hashed_password': 'foo',
    },
    {
        'id': 2,
        'name': 'Tegan',
        'email': 'tegan@craigie.dev',
        'hashed_password': 'bar',
    },
]


class User:
    id: int = field(name='id')
    name: str = field(name='name')
    email: str = field(name='email')
    hashed_password: str = field(name='hashed_password')

    @classmethod
    def select(cls, *fields: Unpack[FieldT]) -> Tuple[Unpack[FieldT]]:
        if any(not isinstance(field, Field) for field in fields):
            raise TypeError(
                'Expected all select arguments to be an instance of Field'
            )

        # for simplicites sake we'll just take the first record
        user = fake_db[0]

        # we have to add a type: ignore comment here as the type checker
        # cannot understand the relationship between this expression
        # and the input.
        return tuple(  # type: ignore
            user[field.name] for field in fields if isinstance(field, Field)
        )


user = User.select(User.name, User.id)
reveal_type(user)  # Revealed type is Tuple[str, int]
print(user)        # ('Robert', 1)