Python Bytes podcast

#473 A clean room rewrite?

16.03.2026
Topics covered in this episode:
Watch on YouTube

About the show

Sponsored by us! Support our work through:

Connect with the hosts

Join us on YouTube at pythonbytes.fm/live to be part of the audience. Usually Monday at 10am PT. Older video versions available there too.

Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form, add your name and email to our friends of the show list. We'll never share it.

Michael #1: chardet, AI, and licensing

  • Thanks Ian Lessing
  • Wow, where to start?
  • A bit of legal precedent research.
  • Chardet dispute shows how AI will kill software licensing, argues Bruce Perens on The Register
  • Also see this GitHub issue.
  • Dan Blanchard, maintainer of a Python character encoding detection library called chardet, released a new version of the library under a new software license. (LGPL → MIT)
  • Dan is allowed to make this change because v7 is a complete “clean room” rewrite using AI
  • BTW, v7 is WAY better:
    • The result is a 48x increase in detection speed for a project that lives in the hot loops of many projects. That will lead to noticeable performance increases for literally millions of users (the package gets ~130M downloads per month).
    • It paves a path towards inclusion in the standard library (assuming they don’t institute policies against using AI tools).
    • Thread-safe detect() and detect_all() with no measurable overhead; scales on free-threaded Python 3.13t+
  • An individual claiming to be Mark Pilgrim, the original creator of the library, opened an issue in the project's GitHub repo arguing that Blanchard had no right to change the software license, citing the LGPL requirement that the license remain unchanged.
  • The issue argues that calling v7 a 'complete rewrite' is irrelevant, since the maintainers had ample exposure to the originally licensed code (i.e. this is not a 'clean room' implementation).
  • Blanchard disagreed, citing how version 7.0.0 and 6.0.0 compare when subjected to JPlag, a library for detecting plagiarism.
  • Blanchard told The Register he had wanted to get chardet added to the Python standard library for more than a decade, since it’s a core dependency of most Python projects.
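
To make the thread-safety point above concrete, here is a minimal sketch of fanning detection out over a thread pool. The helper name `detect_many` and its `detect` parameter are hypothetical, introduced only for illustration. The sketch assumes a detector shaped like `chardet.detect` (bytes in, result dict out) and makes no claim about how chardet v7 behaves internally.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def detect_many(
    payloads: Iterable[bytes],
    detect: Callable[[bytes], dict],
    max_workers: int = 4,
) -> list[dict]:
    """Run a detector over many byte strings from a pool of threads.

    `detect` is any callable shaped like chardet.detect: bytes in,
    result dict out. Sharing it across threads is only safe if the
    detector itself is thread-safe, which is what the show notes say
    about chardet v7.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with payloads.
        return list(pool.map(detect, payloads))
```

With chardet installed, passing `chardet.detect` as the `detect` argument would run the real detector on each payload. Whether that gives a speedup depends on the interpreter build (free-threaded or not), so treat any scaling numbers as something to measure yourself.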

Brian #2: refined-github

  • Suggested by Matthias Schöttle
  • A browser plugin that improves the GitHub experience
  • A sampling
    • Adds a build/CI status icon next to the repo’s name.
    • Adds a link back to the PR that ran the workflow.
    • Enables tab and shift tab for indentation in comment fields.
    • Auto-resizes comment fields to fit their content, so they no longer show scroll bars.
    • Highlights the most useful comment in issues.
    • Changes the default sort order of issues/PRs to Recently updated.
  • But really, it’s a huge list of improvements

Michael #3: pgdog: PostgreSQL connection pooler, load balancer and database sharder

  • PgDog is a proxy for scaling PostgreSQL.
  • It supports connection pooling, load balancing queries and sharding entire databases.
  • Written in Rust, PgDog is fast, secure and can manage thousands of connections on commodity hardware.
  • Features
    • PgDog is an application layer load balancer for PostgreSQL
    • Health Checks: PgDog maintains a real-time list of healthy hosts. When a database fails a health check, it's removed from the active rotation and queries are re-routed to other replicas
    • Single Endpoint: PgDog can detect writes (e.g. INSERT, UPDATE, CREATE TABLE, etc.) and send them to the primary, leaving the replicas to serve reads
    • Failover: PgDog monitors Postgres replication state and can automatically redirect writes to a different database if a replica is promoted
    • Sharding: PgDog is able to manage databases with multiple shards
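
The health-check and single-endpoint ideas above can be sketched as a toy router. This is not PgDog's implementation (PgDog is written in Rust, and nothing here is taken from its code). The class `ToyRouter`, the keyword list, and the fallback to the primary when no replica is healthy are all assumptions made for illustration.

```python
import itertools

# Hypothetical, simplified set of leading keywords treated as writes.
WRITE_KEYWORDS = {"INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "DROP", "TRUNCATE"}


def is_write(sql: str) -> bool:
    """Classify a statement by its first keyword (a toy heuristic, not a SQL parser)."""
    words = sql.strip().split(None, 1)
    return bool(words) and words[0].upper() in WRITE_KEYWORDS


class ToyRouter:
    """Send writes to the primary and spread reads over healthy replicas."""

    def __init__(self, primary: str, replicas: list[str]) -> None:
        self.primary = primary
        self.replicas = list(replicas)
        self.healthy = set(replicas)
        self._counter = itertools.count()

    def mark_unhealthy(self, host: str) -> None:
        # A failed health check takes the host out of rotation.
        self.healthy.discard(host)

    def mark_healthy(self, host: str) -> None:
        # Only known replicas can rejoin the rotation.
        if host in self.replicas:
            self.healthy.add(host)

    def route(self, sql: str) -> str:
        if is_write(sql):
            return self.primary
        candidates = [r for r in self.replicas if r in self.healthy]
        if not candidates:
            # Assumption: with no healthy replica, reads go to the primary.
            return self.primary
        # Round-robin over whichever replicas are currently healthy.
        return candidates[next(self._counter) % len(candidates)]
```

A first-keyword check misclassifies statements such as `WITH ... INSERT` or `SELECT ... FOR UPDATE`, which is why a real proxy needs an actual SQL parser for this decision.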

Brian #4: Agentic Engineering Patterns

Extras

Brian:

Michael:

Joke: Ergonomic keyboard

Also pretty good and related:

Links
