Slide 1

The Next Generation SQLAlchemy with asyncio
Joongi Kim / Lablup Inc.

Slide 2

No content

Slide 3

Agenda

§ SQLAlchemy in 1 minute – Core API & ORM
§ Python databases in the asyncio ecosystem – aiopg + aiopg.sa / asyncpg + asyncpgsa
§ Technical issues when using aiopg & SQLAlchemy
§ Landing of SQLAlchemy v1.4
§ Technical issues during the migration

Slide 4

SQLAlchemy in 1 minute

§ "The de-facto standard DBAPI & ORM for Python programs"
§ Layered architecture (diagram): the ORM on top of the Core (Schema/Types, SQL expression language, Engine, Dialect, Connection Pool), on top of the DBAPI and the database

https://speakerdeck.com/zzzeek/introduction-to-sqlalchemy-pycon-2013?slide=8

Slide 5

SQLAlchemy Core API

§ 1-to-1 translation of Python's chained constructor calls into SQL statements

>>> from sqlalchemy import Table, Column, Integer, String, MetaData
>>> metadata = MetaData()
>>> users = Table('users', metadata,
...     Column('id', Integer, primary_key=True),
...     Column('name', String),
...     Column('fullname', String),
... )
>>> metadata.create_all(engine)

https://docs.sqlalchemy.org/en/14/core/tutorial.html

Slide 6

SQLAlchemy Core API

§ 1-to-1 translation of Python's chained constructor calls into SQL statements

>>> ins = users.insert().values(name='jack', fullname='Jack Jones')
>>> str(ins)
'INSERT INTO users (name, fullname) VALUES (:name, :fullname)'
>>> s = select(users.c.name, users.c.fullname)
>>> result = conn.execute(s)
>>> for row in result:
...     print(row)
('jack', 'Jack Jones')
('wendy', 'Wendy Williams')

https://docs.sqlalchemy.org/en/14/core/tutorial.html
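To see how "chained calls compile to a SQL string" works without a database at hand, here is a stdlib-only toy sketch of the idea. This is not SQLAlchemy's actual implementation; the class and method names are made up for illustration:

```python
# Toy query builder: chained method calls accumulate state that is
# rendered into a SQL string on demand, in the spirit of SQLAlchemy Core.
class Select:
    def __init__(self, table, columns):
        self.table = table
        self.columns = list(columns)
        self.conditions = []

    def where(self, condition: str) -> "Select":
        # Returning the builder lets calls compose: .where(...).where(...)
        self.conditions.append(condition)
        return self

    def __str__(self) -> str:
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(self.conditions)
        return sql

s = Select("users", ["name", "fullname"]).where("id = :id")
print(str(s))  # SELECT name, fullname FROM users WHERE id = :id
```

One difference from this sketch: real SQLAlchemy statement objects are generative, so each chained call returns a new immutable statement object instead of mutating in place.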

Slide 7

SQLAlchemy ORM

§ 1-to-1 mapping of Python objects to SQL table rows

>>> from sqlalchemy.orm import declarative_base
>>> from sqlalchemy import Column, Integer, String
>>> Base = declarative_base()
>>> class User(Base):
...     __tablename__ = 'users'
...     id = Column(Integer, primary_key=True)
...     name = Column(String)
...     fullname = Column(String)
...     nickname = Column(String)
...
...     def __repr__(self):
...         return "<User(name='%s', fullname='%s', nickname='%s')>" % (
...             self.name, self.fullname, self.nickname,
...         )
>>> Base.metadata.create_all(engine)

https://docs.sqlalchemy.org/en/14/orm/tutorial.html

Slide 8

SQLAlchemy ORM

§ 1-to-1 mapping of Python objects to SQL table rows

>>> from sqlalchemy.orm import sessionmaker
>>> Session = sessionmaker(bind=engine)
>>> session = Session()
>>> ed_user = User(name='ed', fullname='Ed Jones', nickname='edsnickname')
>>> session.add(ed_user)
>>> session.commit()
>>> for instance in session.query(User).order_by(User.id):
...     print(instance.name, instance.fullname)
ed Ed Jones
wendy Wendy Williams
mary Mary Contrary
fred Fred Flintstone

https://docs.sqlalchemy.org/en/14/orm/tutorial.html

Slide 9

Python ORMs in the asyncio ecosystem

§ SQLAlchemy Core bindings for asyncio
  – aiopg + aiopg.sa
  – asyncpg + asyncpgsa
§ ORM on the greenlet ecosystem
  – peewee
§ Yet another ORM with multi-database backends
  – databases + orm

Slide 10

aiopg + aiopg.sa (Core only!)

import asyncio
import sqlalchemy as sa
from aiopg.sa import create_engine

metadata = sa.MetaData()
tbl = sa.Table(
    'tbl', metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('val', sa.String(255)),
)

async def create_table(conn):
    await conn.execute('DROP TABLE IF EXISTS tbl')
    await conn.execute('CREATE TABLE tbl (id serial PRIMARY KEY, val varchar(255))')

async def main():
    async with create_engine(
        user='postgres', password='helloworld',
        host='127.0.0.1', port=15432, database='testing',
    ) as engine:
        async with engine.acquire() as conn:
            await create_table(conn)
        async with engine.acquire() as conn:
            await conn.execute(tbl.insert().values(val='abc'))
            async for row in conn.execute(tbl.select()):
                print(row.id, row.val)

asyncio.run(main())

Slide 11

asyncpg + asyncpgsa (Core only!)

import asyncio
import sqlalchemy as sa
import asyncpgsa

metadata = sa.MetaData()
tbl = sa.Table(
    'tbl', metadata,
    sa.Column('id', sa.Integer, primary_key=True),
    sa.Column('val', sa.String(255)),
)

async def create_table(conn):
    await conn.execute('DROP TABLE IF EXISTS tbl')
    await conn.execute('CREATE TABLE tbl (id serial PRIMARY KEY, val varchar(255))')

async def main():
    pool = await asyncpgsa.create_pool(
        host='127.0.0.1', port=15432, database='testing',
        user='postgres', password='helloworld',
        min_size=5, max_size=10,
    )
    async with pool.acquire() as conn:
        await create_table(conn)
    async with pool.acquire() as conn:
        await conn.execute(tbl.insert().values(val='abc'))
        for row in await conn.fetch(tbl.select()):
            print(row['id'], row['val'])

asyncio.run(main())

Slide 12

databases + SQLAlchemy (Core only)

import asyncio
import sqlalchemy as sa
from databases import Database

metadata = sa.MetaData()
tbl = sa.Table(…)

async def create_table(conn):
    await conn.execute('DROP TABLE IF EXISTS tbl')
    await conn.execute('CREATE TABLE tbl (id serial PRIMARY KEY, val varchar(255))')

async def main():
    database = Database(
        'postgresql://postgres:helloworld@127.0.0.1:15432/testing'
    )
    await database.connect()
    async with database.connection() as conn:
        async with conn.transaction():
            await create_table(conn)
    async with database.connection() as conn:
        async with conn.transaction():
            await conn.execute(query=tbl.insert(), values={'val': 'abc'})
            for row in await conn.fetch_all(query=tbl.select()):
                print(row['id'], row['val'])
    await database.disconnect()

asyncio.run(main())

Slide 13

Backend.AI + SQLAlchemy

§ So, we could not use the ORM. 😭
  – Introduced lots of boilerplate code to convert between Row objects and actual Python objects.
§ Still, SQLAlchemy offers:
  – Structured schema version management via alembic
  – Seamless custom type conversion for complex-typed columns (e.g., JSON)
§ Binding
  – Tried asyncpgsa at first but later moved to aiopg...
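The boilerplate in question, converting Row mappings into typed Python objects by hand, looks roughly like this stdlib-only sketch (the User type and its fields are hypothetical examples, not Backend.AI's actual models):

```python
from dataclasses import dataclass
from typing import Any, Mapping

@dataclass
class User:
    id: int
    name: str
    fullname: str

    @classmethod
    def from_row(cls, row: Mapping[str, Any]) -> "User":
        # One hand-written converter like this per table or query shape.
        return cls(id=row["id"], name=row["name"], fullname=row["fullname"])

row = {"id": 1, "name": "jack", "fullname": "Jack Jones"}
user = User.from_row(row)
print(user.name)  # jack
```

With an async-capable ORM, the mapper does this conversion for you, which is exactly the boilerplate the talk later removes.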

Slide 14

Technical issues with aiopg

§ Unable to cancel the async task while beginning a transaction
  – The timing window is slit-thin, but it happens in production!
  – Solution: had to wrap many DB transactions used in long-running websocket connection handlers with asyncio.shield()...
    ▹ asyncio.shield() itself is not an ideal API since the shielded work may still get cancelled upon event loop shutdown.
  – Patched 1.5 years after the first report (aio-libs/aiopg#332)
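The asyncio.shield() workaround can be sketched with the stdlib alone: the outer handler task gets cancelled, but the shielded inner coroutine still runs to completion (until the event loop itself shuts down). The names below are illustrative stand-ins for a DB commit inside a websocket handler:

```python
import asyncio

done = []

async def commit_transaction():
    # Stand-in for a DB commit that must not be interrupted mid-way.
    await asyncio.sleep(0.01)
    done.append("committed")

async def handler():
    # Shield the critical section: cancelling `handler` cancels the
    # await, but not the inner commit_transaction() task itself.
    await asyncio.shield(commit_transaction())

async def main():
    task = asyncio.create_task(handler())
    await asyncio.sleep(0)       # let the handler reach the shield
    task.cancel()                # simulate a client disconnect
    try:
        await task
    except asyncio.CancelledError:
        pass                     # the handler itself was cancelled...
    await asyncio.sleep(0.05)    # ...but the shielded commit finishes
    assert done == ["committed"]

asyncio.run(main())
print(done)  # ['committed']
```

This also shows the caveat from the slide: the protection only holds while the loop keeps running, so shutdown can still cancel the shielded work.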

Slide 15

Technical issues with aiopg

§ No Windows support!
  – Backend.AI has a plan to support Windows setups.
  – AbstractEventLoop.{add_reader(), remove_reader()} are not implemented on ProactorEventLoop.

Slide 16

Technical issues with Alembic

§ Handling schema migrations
  – One answer for SQLAlchemy: alembic
  – Tricky case: adding/removing enum constants
    ▹ Need to cast enum constants to text and back
    ▹ Need to handle removal for downgrades
  – Easy-to-get-wrong case: referring to another table's schema in migrations
    ▹ Migrations may run at arbitrary git commits & branches!
    ▹ Referenced table schemas must be duplicated in the migration script to reflect their definitions at the time the migration was written.

Slide 17

Table definitions in Alembic migrations

Slide 18

Technical issues with SQLAlchemy

§ Incremental updates of PostgreSQL JSONB columns
  – How to "update" a hierarchy of nested objects by merging with the existing value?
  – Solution: wrote a custom helper that generates the SQL value expression.

def sql_json_merge(
    col,
    key: Tuple[str, ...],
    obj: Mapping[str, Any],
    *,
    _depth: int = 0,
):
    expr = sa.func.coalesce(
        col if _depth == 0 else col[key[:_depth]],
        sa.text("'{}'::jsonb"),
    ).concat(
        sa.func.jsonb_build_object(
            key[_depth],
            (
                sa.func.coalesce(col[key], sa.text("'{}'::jsonb"))
                .concat(sa.func.cast(obj, psql.JSONB))
                if _depth == len(key) - 1
                else sql_json_merge(col, key, obj=obj, _depth=_depth + 1)
            ),
        )
    )
    return expr

Usage:

sa.update(tbl).values(status_data=sql_json_merge(
    tbl.c.status_data,
    ("scheduler",),
    {
        "last_try": datetime.now(tzutc()).isoformat(),
        "passed_predicates": ["concurrency", "keypair_resource_policy"],
    },
))

Example status_data value:

{
    "scheduler": {
        "first_try": "2021-05-01T02:01:01.00Z",
        "last_try": "2021-05-01T02:03:04.55Z",
        "passed_predicates": ["concurrency"],
    }
}
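To sanity-check what the helper computes, here is the equivalent merge in plain Python: a stdlib-only model of the intended semantics (not the SQL expression itself) that overlays the new object onto the existing sub-object at the key path while keeping untouched sibling keys:

```python
from typing import Any, Mapping, Tuple

def json_merge(
    doc: Mapping[str, Any],
    key: Tuple[str, ...],
    obj: Mapping[str, Any],
) -> dict:
    # Walk down the key path, then merge `obj` over the existing
    # sub-object, mirroring PostgreSQL's `jsonb || jsonb` concatenation
    # (keys of the right operand overwrite those of the left).
    result = dict(doc or {})
    inner = dict(result.get(key[0]) or {})
    if len(key) == 1:
        inner.update(obj)
        result[key[0]] = inner
    else:
        result[key[0]] = json_merge(inner, key[1:], obj)
    return result

current = {
    "scheduler": {
        "first_try": "2021-05-01T02:01:01.00Z",
        "passed_predicates": ["concurrency"],
    }
}
merged = json_merge(current, ("scheduler",), {
    "last_try": "2021-05-01T02:03:04.55Z",
    "passed_predicates": ["concurrency", "keypair_resource_policy"],
})
print(merged["scheduler"]["first_try"])  # untouched sibling key survives
```

The key property, matching the talk's use case: updating "scheduler.last_try" must not wipe out "scheduler.first_try" written by an earlier transaction.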

Slide 19

Landing of SQLAlchemy v1.4

§ https://www.sqlalchemy.org/blog/2021/03/15/sqlalchemy-1.4.0-released/
  – "This is the most extensive release for SQLAlchemy in at least ten years"
  – Native asyncio support for both Core and ORM!
  – Type annotation support (sqlalchemy2-stubs)
  – Universal & transparent SQL compilation cache
  – 2.0-style transaction models (optional)
  – Core & ORM queries with unified query composition

Slide 20

SQLAlchemy Roadmap

§ Recommended upgrade path: 1.3 → 1.4 → 2.0
  – https://docs.sqlalchemy.org/en/14/changelog/migration_20.html
§ 1.4 is a stepping stone for existing apps:
  ✓ Ensure no deprecation warnings are raised
  ✓ Subtle changes: the behavior of SELECT and join() methods has changed
  ✓ Python 3.6+ required
  ✓ Turn on "RemovedIn20Warning"
  ✓ Explicit connection context manager required
  ✓ Explicit binding for metadata required
  ✓ Add the "future" flag to Engine and Session objects for the new transaction behaviors
  ✓ No library-level autocommit
  ✓ Rows are now named tuples instead of dicts
  ✓ session.execute() everywhere in the ORM
  ✓ ...
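The "ensure no deprecation warnings" step can be enforced mechanically: SQLAlchemy 1.4 emits RemovedIn20Warning for 2.0-incompatible usage (e.g., when the SQLALCHEMY_WARN_20 environment variable is set), and the stdlib warnings filter can escalate such warnings into hard errors in a test suite. A minimal stdlib-only sketch of the pattern, using a generic DeprecationWarning as a stand-in:

```python
import warnings

# In a test-suite entry point (e.g., conftest.py): escalate
# deprecation-style warnings into exceptions so that no
# 2.0-incompatible usage slips through CI unnoticed.
warnings.filterwarnings("error", category=DeprecationWarning)

def legacy_call():
    # Stand-in for a library call that warns about removed-in-2.0 usage.
    warnings.warn("old API", DeprecationWarning)

try:
    legacy_call()
    ok = False
except DeprecationWarning:
    ok = True  # the warning now surfaces as a hard error
print(ok)  # True
```

For SQLAlchemy specifically, the same filter would target sqlalchemy.exc.RemovedIn20Warning instead of the generic category used here.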

Slide 21

SQLAlchemy v1.4 Core

import asyncio
import sqlalchemy as sa
from sqlalchemy.ext.asyncio import create_async_engine

metadata = sa.MetaData()
tbl = sa.Table(…)

async def create_table(conn):
    await conn.run_sync(metadata.drop_all)
    await conn.run_sync(metadata.create_all)

async def main():
    engine = create_async_engine(
        "postgresql+asyncpg://postgres:helloworld@127.0.0.1:15432/testing"
    )
    async with engine.begin() as conn:
        await create_table(conn)
    async with engine.begin() as conn:
        await conn.execute(sa.insert(tbl).values(val='abc'))
        result = await conn.stream(sa.select(tbl))
        async for row in result:
            print(row.id, row.val)
    await engine.dispose()

asyncio.run(main())

Slide 22

SQLAlchemy v1.4 ORM

import asyncio
import sqlalchemy as sa
from sqlalchemy.ext.asyncio import (
    AsyncSession, create_async_engine,
)
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class A(Base):
    __tablename__ = "a"
    __mapper_args__ = {"eager_defaults": True}
    id = sa.Column(sa.Integer, primary_key=True)
    val = sa.Column(sa.String(255))
    created_at = sa.Column(
        sa.DateTime,
        server_default=sa.func.now(),
    )

async def create_table(conn):
    await conn.run_sync(Base.metadata.drop_all)
    await conn.run_sync(Base.metadata.create_all)

async def main():
    engine = create_async_engine(
        "postgresql+asyncpg://...",
    )
    async with engine.begin() as conn:
        await create_table(conn)
    async_session = sessionmaker(
        bind=engine, expire_on_commit=False, class_=AsyncSession,
    )
    async with async_session() as session:
        async with session.begin():
            session.add_all(
                [A(val="abc"), A(val="def")]
            )
        await session.commit()
    async with async_session() as session:
        result = await session.execute(sa.select(A))
        for a in result.scalars():
            print(a.id, a.val, a.created_at)
    await engine.dispose()

asyncio.run(main())

Slide 23

Technical issues during SA 1.4 migration

§ Invisible errors
  – Transactions silently hang!
    ▹ It should have reported an "operation in progress" DBAPI error
  – Patched through a series of v1.4 patch releases (v1.4.13+)
    ▹ a nested transaction begin() was missing

Slide 24

Technical issues during SA 1.4 migration

§ Swallowing of task cancellation
  – Patched in v1.4.19 (discovered but not yet released at the time of writing this CFP...)
  – Yes, "proper" cancellation is a difficult problem in asyncio...
§ greenlet is missing!
  – pip install "SQLAlchemy[postgresql_asyncpg]"
  – Update poetry to 1.1.8 or later!
  – Some environments may still need to install greenlet manually (e.g., M1 Mac)...
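This class of bug is easy to reproduce with the stdlib alone: an overly broad exception handler around an await can swallow CancelledError (a BaseException since Python 3.8) and return normally, so the caller never observes the cancellation at all. The function names below are illustrative, not SQLAlchemy code:

```python
import asyncio

outcome = {}

async def buggy_fetch():
    try:
        await asyncio.sleep(3600)   # stand-in for a slow DB call
    except BaseException:
        # BUG: this swallows asyncio.CancelledError and returns
        # normally, so the caller believes the operation succeeded.
        return "stale result"

async def main():
    task = asyncio.create_task(buggy_fetch())
    await asyncio.sleep(0)          # let the task reach the sleep
    task.cancel()
    result = await task             # no CancelledError escapes!
    outcome["cancelled"] = task.cancelled()
    outcome["result"] = result

asyncio.run(main())
print(outcome)  # {'cancelled': False, 'result': 'stale result'}
```

The fix is to catch only the exceptions you can actually handle, or to re-raise CancelledError after cleanup, so cancellation propagates to the awaiting caller.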

Slide 25

Technical issues during SA 1.4 migration

§ Transaction isolation levels
  – Status transitions of Backend.AI sessions1) involve many async tasks on multiple different nodes
    ▹ They should be globally synchronized to guarantee resource constraints!
  – READ COMMITTED vs. REPEATABLE READ vs. SERIALIZABLE

1) A Backend.AI session is a collection of containers running on one or more computing agent nodes in GPU clusters.

Slide 26

Technical issues during SA 1.4 migration

§ Transaction isolation levels
  – Solution: SERIALIZABLE + "proper" retrying of transactions + advisory locks + explicitly marking "read-only" transactions for performance

class ExtendedAsyncSAEngine(sqlalchemy.ext.asyncio.AsyncEngine):

    @aiotools.actxmgr
    async def begin_readonly(self, deferrable: bool = False) -> AsyncIterator[AsyncConnection]:
        async with self.connect() as conn:
            conn_with_exec_opts = await conn.execution_options(
                postgresql_readonly=True,
                postgresql_deferrable=deferrable,
            )
            async with conn_with_exec_opts.begin():
                yield conn_with_exec_opts

    @aiotools.actxmgr
    async def advisory_lock(self, lock_id: int) -> AsyncIterator[None]:
        async with self.connect() as lock_conn:
            await lock_conn.exec_driver_sql(f"SELECT pg_advisory_lock({lock_id:d})")
            try:
                yield
            finally:
                await lock_conn.exec_driver_sql(f"SELECT pg_advisory_unlock({lock_id:d})")
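The advisory_lock() context manager above follows a pattern that can be modeled with the stdlib: acquire on entry, yield, and always release in finally. Here is a process-local sketch using asyncio.Lock as a stand-in for pg_advisory_lock() (the real thing serializes across processes and hosts through PostgreSQL, which no in-process lock can do):

```python
import asyncio
from contextlib import asynccontextmanager

_locks = {}  # process-local stand-in for PostgreSQL advisory locks

@asynccontextmanager
async def advisory_lock(lock_id: int):
    # Models the SELECT pg_advisory_lock()/pg_advisory_unlock() pair:
    # acquire on entry, release in `finally` even if the body raises.
    lock = _locks.setdefault(lock_id, asyncio.Lock())
    await lock.acquire()
    try:
        yield
    finally:
        lock.release()

events = []

async def worker(name):
    async with advisory_lock(42):
        events.append(f"{name}:enter")
        await asyncio.sleep(0.01)   # critical section spanning an await
        events.append(f"{name}:exit")

async def main():
    await asyncio.gather(worker("a"), worker("b"))

asyncio.run(main())
print(events)  # enter/exit pairs never interleave between workers
```

The try/finally placement is the important part: releasing in finally guarantees the lock is dropped even when the guarded transaction fails or is cancelled.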

Slide 27

Technical issues during SA 1.4 migration

§ Transaction isolation levels
  – Solution: SERIALIZABLE + "proper" retrying of transactions + advisory locks + explicitly marking "read-only" transactions for performance

async def enumerate_instances(self, check_shadow=True):
    async with self.db.begin_readonly() as conn:
        query = (sa.select('*').select_from(agents))
        ...
        async for row in (await conn.stream(query)):
            yield row

async def update_instance(self, inst_id, updated_fields):
    async def _update() -> None:
        async with self.db.begin() as conn:
            query = (
                sa.update(agents)
                .values(**updated_fields)
                .where(agents.c.id == inst_id)
            )
            await conn.execute(query)
    await execute_with_retry(_update)

TQueryResult = TypeVar('TQueryResult')

async def execute_with_retry(
    txn_func: Callable[[], Awaitable[TQueryResult]],
) -> TQueryResult:
    max_retries: Final = 10
    num_retries = 0
    while True:
        if num_retries == max_retries:
            raise RuntimeError("serialization failed")
        try:
            return await txn_func()
        except DBAPIError as e:
            num_retries += 1
            if getattr(e.orig, 'pgcode', None) == '40001':
                await asyncio.sleep((num_retries - 1) * 0.02)
                continue
            raise
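The retry loop can be exercised without a database by simulating PostgreSQL's serialization-failure error code (SQLSTATE 40001). Below is a slightly simplified stdlib-only model of execute_with_retry(); the SerializationError class is a stand-in for a DBAPIError whose e.orig.pgcode == '40001':

```python
import asyncio

class SerializationError(Exception):
    # Stand-in for a DBAPIError carrying PostgreSQL SQLSTATE 40001.
    pgcode = "40001"

async def execute_with_retry(txn_func, max_retries=10):
    num_retries = 0
    while True:
        try:
            # Retry the WHOLE transaction function: under SERIALIZABLE,
            # a failed transaction must be re-run from the beginning.
            return await txn_func()
        except SerializationError:
            num_retries += 1
            if num_retries == max_retries:
                raise RuntimeError("serialization failed")
            # Linear backoff before retrying (short, for demonstration).
            await asyncio.sleep((num_retries - 1) * 0.001)

attempts = 0

async def flaky_txn():
    global attempts
    attempts += 1
    if attempts < 3:
        raise SerializationError()  # first two attempts "conflict"
    return "committed"

result = asyncio.run(execute_with_retry(flaky_txn))
print(attempts, result)  # 3 committed
```

Passing a zero-argument transaction function (rather than a result) is what makes re-execution possible: each retry opens a fresh transaction with fresh reads.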

Slide 28

SA 1.4 ORM adoption

§ Planning removal of boilerplate like below, which was needed because there was no async ORM...
  – Still ongoing!

@attr.s(auto_attribs=True, slots=True)
class PendingSession:
    kernels: List[KernelInfo]
    access_key: AccessKey
    session_id: SessionId
    ...
    internal_data: Optional[MutableMapping[str, Any]]
    preopen_ports: List[int]

    @classmethod
    def from_row(cls, row: Row) -> PendingSession:
        return cls(
            kernels=[],
            access_key=row['access_key'],
            session_creation_id=row['session_creation_id'],
            session_id=row['session_id'],
            ...
            internal_data=row['internal_data'],
            preopen_ports=row['preopen_ports'],
        )

Slide 29

Technical issues during SA 1.4 migration

§ Transaction isolation levels
  – Bigger view of the problem
    ▹ Is an RDBMS suitable for strongly-consistent distributed state management?
    ▹ In early versions of Backend.AI we used Redis, but migrated to PostgreSQL in favor of well-structured data schemas as more features were added.
  – Why now (with SQLAlchemy v1.4 and not with aiopg)?
    ▹ Maybe due to differences in their internal connection pooling mechanisms...
    ▹ SA reuses the underlying DBAPI connection but recreates the connection wrapper ("fairy") objects every time

Slide 30

Summary

§ SQLAlchemy v1.4 is out!
  – with asyncio-native ORM and Core APIs based on asyncpg
§ Bleeding edge is bleeding edge...
  – It takes time for everyone (even experienced developers) to get deeply accustomed to asyncio (+ cancellation)
§ Strongly-consistent state management on an RDBMS is difficult.
  – Consider non-RDBMS solutions if possible
  – Use a proper transaction isolation level and retry serialization failures

Slide 31

Bonus: What it's like replacing the DB engine of a live product

EDS TV Commercial (2000)
https://www.youtube.com/watch?v=L2zqTYgcpfg

Slide 32

References

§ https://github.com/achimnol/pycon-kr2021-samples
§ https://docs.sqlalchemy.org/en/14/
  – https://docs.sqlalchemy.org/en/14/_static/dragons.png
§ https://aiopg.readthedocs.io/en/stable/examples.html
§ https://asyncpgsa.readthedocs.io/en/latest/