Thoughts on PostgreSQL in 2024
A question I often hear, and also ask myself, is “where is PostgreSQL going?” This is a deep question: it’s not limited to the work on the core database engine, but encompasses everything going on in the community, including related open source projects, events, and community development. Even with the popularity of PostgreSQL, which was selected as DB-Engines’ “DBMS of the Year” for the fourth time, it’s a good idea to step back at times and reflect on what PostgreSQL will look like in the future. While that reflection may not necessarily lead to immediate changes, it does help give context to all the work going on in the community.
The new year is a great opportunity to ask that question, and it’s one I’ve been personally reflecting on. So here are some of my thoughts on where PostgreSQL is going as we enter 2024. This is not meant to be a roadmap, but rather my personal perspective on the direction of the project.
PostgreSQL feature development
At the PGCon 2023 Developer Meeting, I proposed a topic entitled “What are the big challenges for PostgreSQL users?”. The goal of this was to talk about common user requests and understand where database workloads were heading, to determine if we are building PostgreSQL towards where database workloads are going. Based on many conversations and observations, I proposed three broad feature buckets to look at:
- Availability features
- Performance features
- Developer features
These are all ongoing areas of work for 2024 and beyond, but there are definitely steps PostgreSQL can take in the coming year to make improvements in all of these areas. Below I dive into more details about each of these feature groups.
Continuing to improve the availability of a PostgreSQL cluster is the first, second, and third most requested “feature” I hear about from both current and prospective PostgreSQL users. I’m not exaggerating either: while restarting PostgreSQL can be nearly instantaneous, there are use cases where that can be too much time (though those are at the extreme). Additionally, operations with locks that block writes for prolonged periods of time can be considered as “downtime.”
While most PostgreSQL users can currently achieve their uptime requirements, there is a class of workloads with critical uptime requirements that we can better support in PostgreSQL with additional development effort. Most of this section (and blog post) focuses on one feature area where continued improvements will allow PostgreSQL to be deployed in even more environments that have these requirements.
How logical replication can help with active-active, blue/green, zero-downtime upgrading, and other workflows
For existing PostgreSQL users and users looking to migrate to PostgreSQL, features around availability are the biggest ask. Typically, this centers around high availability, or the ability to continue to have access to the database (especially read/write access) during a planned (update) or unplanned (outage) disruption. PostgreSQL provides many features to support high availability, including streaming replication. However, maximizing HA still requires the use of an additional service or a utility like Patroni to achieve many uptime goals.
Many users I talk to are happy with the availability they can get with PostgreSQL: it works for most of their use cases. However, I’ve been seeing an emerging trend of workloads on PostgreSQL that need even higher availability, where a 15-30s offline window isn’t good enough. This is both for planned outages (e.g. minor version upgrades, major version upgrades) and unplanned outages. I’ve even talked to users who have workloads that can only be unavailable for 1s – and while I was initially skeptical, when I heard what the workloads were for, I did agree that 1s was a reasonable requirement for them!
A key feature for PostgreSQL that will continue to improve availability is logical replication. Logical replication allows the real-time streaming of changes from a database into any system that can understand the PostgreSQL logical replication protocol. Logical replication in PostgreSQL has been around for a while, but recent releases have added significant enhancements that can better support availability use cases, including functionality and performance features.
One advantage this has over physical (or binary) replication is that you can use logical replication to stream changes from a PostgreSQL 15 to a PostgreSQL 16 system as part of a major version upgrade. This can help reduce the amount of downtime it takes to perform a major version upgrade (here is an example of how Instacart used logical replication to get to zero-downtime on major version upgrades), but there is still work to be done in PostgreSQL to improve this use case and other high availability use cases. Additional features will help unlock more seamless ways of supporting blue-green deployments in PostgreSQL.
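As a sketch of how this workflow looks (publication/subscription names and connection details are hypothetical), a major version upgrade via logical replication follows this general shape:

```sql
-- On the PostgreSQL 15 (source) instance: publish all changes
CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

-- On the PostgreSQL 16 (target) instance, after restoring the schema:
CREATE SUBSCRIPTION upgrade_sub
    CONNECTION 'host=pg15-primary dbname=app user=replicator'
    PUBLICATION upgrade_pub;

-- Once the target has caught up, point the application at the new
-- instance and clean up:
DROP SUBSCRIPTION upgrade_sub;
```

The downtime then shrinks to the window needed to switch application traffic over, rather than the duration of the upgrade itself.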
Logical replication can also be used as part of the high availability mechanism itself. One technique, “active-active replication,” lets multiple databases simultaneously accept writes and replicate the changes amongst themselves. This technique is typically used in systems that have that “no more than 1s of unavailability” requirement: if a writer database is unavailable, the application can switch its database traffic to a different writer database without waiting for a promotion. While this sounds ideal, building and managing an active-active system is extremely complicated: it impacts application design, requires a write-conflict management and resolution strategy, and requires careful fault-tolerance monitoring to protect data integrity (e.g. against a “conflict storm”) and replication health (e.g. what happens if an instance can’t replicate changes for several hours?).
However, both the major version upgrade and active-active cases do present a roadmap for how we can continue to improve logical replication in PostgreSQL. Amit Kapila, who has led many of the logical replication feature efforts, and I developed a talk this year called The journey towards active-active replication in PostgreSQL (video of us co-presenting one version) that discusses why solving for these use cases is important, the current state of the art of PostgreSQL logical replication, and what work we need to do to get PostgreSQL to better support these cases. The good news: as of PostgreSQL 16, we have most of the foundational blocks for supporting active-active, blue-green deployments, and zero-downtime major version upgrades – and even where they are not in core, there are PostgreSQL extensions that can provide this functionality (disclosure: I’ve been involved with one such extension).
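One of those PostgreSQL 16 building blocks is the subscription-level origin option, which stops a node from re-replicating changes it received from elsewhere – a prerequisite for avoiding infinite loops in an active-active topology. A minimal two-node sketch (all names and connection strings hypothetical) might look like:

```sql
-- Run on each node: publish local changes
CREATE PUBLICATION active_pub FOR ALL TABLES;

-- On node A, subscribe to node B (and symmetrically on node B):
CREATE SUBSCRIPTION from_node_b
    CONNECTION 'host=node-b dbname=app user=replicator'
    PUBLICATION active_pub
    WITH (origin = none,      -- only apply changes that originated on B
          copy_data = false); -- assume the nodes start in sync
```

This alone does not give you conflict management or replication-health monitoring – the hard parts described above – but it shows the foundational pieces landing in core.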
There are multiple efforts in 2024 to help close these feature gaps. Targeted for PostgreSQL 17 (usual disclaimer that these may not be included), there has been a focus on ensuring logical replication works with key workflows, such as pg_upgrade and high availability systems, and on supporting replication of additional changes (e.g. sequences). Beyond that, we must continue to support more commands in logical replication (e.g. DDL), continue to improve performance (more parallelism support, worker optimizations), and add features that simplify management of logical replication (node synchronization/resynchronization).
All of these efforts will make it possible to use PostgreSQL in more workloads that have very high uptime requirements, and simplify how users roll out new changes to their production environments. While there’s still more work to do with enhancing logical replication in PostgreSQL, it looks like 2024 will give us more features that help users run PostgreSQL in critical environments.
Unblocking the locks
Another area of availability to consider is schema maintenance operations (i.e. DDL statements), such as an ALTER TABLE that takes an ACCESS EXCLUSIVE lock on a table and blocks all other write operations on that table. For many users, this is the same thing as being unavailable, even if it’s only for a subset of their data. The lack of full support for nonblocking/online schema maintenance operations in PostgreSQL has become more noticeable as other relational databases add support for this feature.
There are various utilities and extensions that let you run nonblocking schema updates, but it would be more convenient, and likely more performant, to support more nonblocking schema changes natively in PostgreSQL. Based on the design, we may already have the foundation to build out this feature, but it will take some time. While I’m not aware of active implementation efforts, I do think that in 2024 we need to make more progress towards letting users run most, if not all, DDL commands without blocking writes, if they so choose.
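To make the locking behavior concrete, here’s an illustrative contrast (table and column names are hypothetical):

```sql
-- Takes an ACCESS EXCLUSIVE lock: all other reads and writes on the
-- table queue behind this statement until it finishes
ALTER TABLE orders ADD COLUMN status text;

-- By contrast, some maintenance operations already have nonblocking
-- variants that allow concurrent writes:
CREATE INDEX CONCURRENTLY orders_status_idx ON orders (status);

-- A common mitigation today: bound how long DDL will wait for its
-- lock, so it fails fast instead of stalling a busy table
SET lock_timeout = '2s';
```

The goal would be for more DDL commands to gain CONCURRENTLY-style behavior natively, rather than relying on external tooling to fake it.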
Performance is very much a “what have you done for me lately” feature: we can always go faster! The good news is that PostgreSQL has a reputation for vertical scalability, or being able to scale as you provide more hardware resources to a single instance. While there are use cases where horizontally scaling both reads and writes makes sense, we need to ensure PostgreSQL can continue to scale as compute and memory resources continue to grow.
Here’s a more “practical” way of putting it: there is an Amazon EC2 instance that has 448 vCPU and 24TB of RAM – is PostgreSQL able to fully maximize its use of all of those resources on a single instance? Looking at the current and upcoming hardware that PostgreSQL users will use gives us a measured target for how we can continue to improve PostgreSQL performance.
As we enter 2024, there are already multiple efforts that will help make it possible to continue to vertically scale PostgreSQL. One of the biggest efforts, and one that’s been an ongoing multi-year project, is to support direct IO (DIO) and asynchronous IO (AIO) in PostgreSQL. For details, I’ll defer to Andres Freund’s PGConf.EU slides on the status of adding AIO to PostgreSQL, but it looks like in 2024 that we’ll be much closer to full AIO support.
Another effort I’m intrigued by is parallel recovery. PostgreSQL users with heavy write workloads tend to postpone checkpoints to defer I/O workload. This can be problematic on a busy system if PostgreSQL crashes and a checkpoint has not occurred for a while. When PostgreSQL restarts, it enters “crash recovery,” where it replays every change since the last checkpoint so it can reach a consistent state. During crash recovery, PostgreSQL cannot accept reads or writes, which means that it’s unavailable. This is problematic for busy systems: while PostgreSQL can accept concurrent writes, it can only replay changes with a single process. If a crash on a busy system occurred an hour after the last checkpoint, it could take several more hours to reach a consistent state while the system is offline!
One way to help overcome this limitation is to support “parallel recovery,” or being able to replay changes in parallel. At PGCon 2023, Koichi Suzuki gave a detailed presentation on how PostgreSQL can support parallel recovery. This would apply not only to crash recovery, but how PostgreSQL can replay any WAL changes (e.g. point-in-time-recovery). While this is a very challenging problem to solve, supporting parallel recovery helps PostgreSQL to continue to scale vertically, as users can further optimize for heavy write workloads and mitigate the risk of a crash causing an untenable delay in coming back online.
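The trade-off that makes parallel recovery attractive shows up in common checkpoint tuning. Settings like these (illustrative values, not recommendations) defer checkpoint I/O on write-heavy systems, but also lengthen the window of WAL that today’s single-process crash recovery must replay:

```
# postgresql.conf
max_wal_size = 64GB                  # allow more WAL to accumulate between checkpoints
checkpoint_timeout = 30min           # checkpoint less frequently
checkpoint_completion_target = 0.9   # spread checkpoint I/O across the interval
```

With parallel replay, users could keep tuning for write throughput without the crash-recovery time growing in lockstep.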
This is not an exhaustive list of performance-related features. There are many more efforts around PostgreSQL server performance, including indexing optimizations, locking improvements, leveraging hardware acceleration, and more. This is in addition to work on clients, such as drivers and connection poolers, that can bring additional performance gains to how apps interact with PostgreSQL. Looking at what the community is working on in 2024, I do believe we’ll continue to see general performance gains across all areas of PostgreSQL.
I view “developer features” as a fairly broad category around how users can architect and build their apps around PostgreSQL. This includes SQL syntax, functions, procedural language support, and other features that help users both build apps and transition from other database systems. One example of such an innovation is the multirange data type, added in PostgreSQL 14, which lets users group non-contiguous ranges together. This has many practical purposes, such as in scheduling, and personally let me reduce hundreds of lines of PL/pgSQL code to roughly three lines. Developer features are also a way to keep track of how PostgreSQL can support emergent workloads, such as JSON or vectors.
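As a small illustration of the scheduling use case, a multirange can hold several non-contiguous booked time slots as a single value and answer containment queries directly:

```sql
-- Two non-contiguous booked slots for a meeting room, as one value;
-- check whether a proposed time falls inside either slot
SELECT tsmultirange(
         tsrange('2024-01-08 09:00', '2024-01-08 10:00'),
         tsrange('2024-01-08 13:00', '2024-01-08 14:00')
       ) @> '2024-01-08 09:30'::timestamp;  -- contained in the first slot
```

Before multiranges, expressing “a set of disjoint ranges” typically meant extra tables or procedural code.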
Currently, a lot of innovation on PostgreSQL developer features is occurring in extensions, which is an advantage of PostgreSQL’s extensible model. In the server itself, there are areas where PostgreSQL is lagging behind its previous pace of releasing developer features. For example, PostgreSQL was the first relational database to support JSON as a queryable data type, but has been lagging on implementing syntax and features specified in the SQL/JSON standard. PostgreSQL 16 released several of the SQL/JSON syntax features, and there are multiple efforts targeted for 2024 that will include more of the SQL/JSON specification.
With that said, we should be investing in developer features that cannot be added via extensions, such as SQL standard features. I suggest a focus on features that are already available in other databases, such as more of the SQL/JSON standard (e.g. JSON_TABLE), system-versioned tables (useful for auditing and “flashback” / bitemporal queries to view data at a specific point in time), and module support (useful for “packaging” stored procedures).
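For a sense of what JSON_TABLE provides, here is a sketch following the SQL/JSON standard syntax (the exact form that lands in PostgreSQL may differ; the document and column names are made up):

```sql
-- Project a JSON array into a relational rowset
SELECT jt.*
  FROM JSON_TABLE(
         '[{"name": "alice", "logins": 42}, {"name": "bob", "logins": 7}]',
         '$[*]'
         COLUMNS (
           name   text PATH '$.name',
           logins int  PATH '$.logins'
         )
       ) AS jt;
```

Today this kind of shredding is done with chains of json_array_elements and casts; the standard syntax is both more portable and easier to read.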
Additionally, with the previously mentioned focus on availability and performance, we should continue to simplify how users can migrate from other databases to PostgreSQL. As part of my day job, I had the opportunity to read through a lot of content around migration strategies from commercial databases to PostgreSQL, and there’s still ample opportunity to simplify the process while enhancing PostgreSQL capabilities. This includes features available in other databases (e.g. global temporary tables, global partitioned indexes, autonomous transactions) and adding more functionality and performance optimizations in PL/pgSQL (bulk data processing functions, schema variables, caching function metadata). All these things improve the PostgreSQL developer experience while making it easier for users coming from other relational databases to adopt PostgreSQL.
Finally, we need to see how we can continue to support the emergent workload coming from AI/ML data, specifically vector storage and search. At PGCon 2023, while folks wanted to see native vector support in PostgreSQL itself, there was consensus that implementing functionality in an extension like pgvector would let us support these workloads more quickly (and this strategy seems to have worked with great performance results on vector data). However, given many of the properties of vector workloads, there are additions we can make to PostgreSQL to further support them, including planner optimizations for working with TOAST’d data that’s in the active query path, and exploring how we can better support queries where the bulk filtering step occurs in the ORDER BY clause.
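A typical nearest-neighbor query illustrates that pattern: there may be no WHERE clause at all, and the real work of narrowing the result set happens in the ORDER BY. This sketch assumes the pgvector extension and a hypothetical items table:

```sql
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));

-- The "filter" is the distance ordering itself: the five rows
-- closest (by L2 distance, the <-> operator) to the query vector
SELECT id
  FROM items
 ORDER BY embedding <-> '[0.1, 0.2, 0.3]'
 LIMIT 5;
```

Since embeddings are often large enough to be TOAST’d, the planner work mentioned above directly affects how fast these queries run.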
I do think we can make a lot of progress on all of these areas in 2024 and continue to add features directly to PostgreSQL that make it easier to build applications, even as we see a boon of functionality in extensions around PostgreSQL.
But what about security?
I do want to quickly discuss security features. PostgreSQL has a strong reputation for enabling workloads in security-focused environments, but there is always more to do. Over the past several years, adding native support for transparent data encryption (TDE) in PostgreSQL has received a lot of attention, but there are other areas where we can continue to innovate. This includes adding support for additional authentication methods or mechanisms (OIDC is amongst the biggest asks) and exploring the possibility of a federated authorization model to allow PostgreSQL to inherit permissions from other systems. And while this is challenging today, I’d suggest we look at how we can support TDE on a per-database level. I’m keeping this discussion short as there are ways to satisfy these requirements with PostgreSQL today, but we can certainly continue to build towards full native support.
And with that, let’s look at other areas where PostgreSQL can make progress in 2024.
PostgreSQL was designed to be extensible: you can add functionality to PostgreSQL without having to fork it. This includes new data types, indexing methods, ways to work with other database systems, utilities that make it easier to manage PostgreSQL features, additional programming languages, and even extensions that let you write your own extensions. People have built open source communities and companies around specific PostgreSQL extensions (e.g. PostGIS), and PostgreSQL extensions have made it possible to support all kinds of workloads (geospatial, timeseries, analytical, AI) from a single database. With thousands of available PostgreSQL extensions, they truly are a “force multiplier” for PostgreSQL and help drive significant adoption while letting users quickly build functionality for their databases!
The side effect of all this is we’re now seeing “extension sprawl.” How do I know which extension to use? What is the level of support of an extension? How do I know an extension will continue to be actively maintained? How can I help contribute a feature to an extension? Even “where can I download an extension?” has become a big question: while postgresql.org has an incomplete list of extensions and the community packages maintain a set of extensions, there are now multiple PostgreSQL extension repositories available (PGXN, dbdev, Trunk, pgxman).
One of the strengths of the PostgreSQL community is that it is widely distributed, but we can make it easier to guide users through the sprawl and help them make informed choices about how they manage their data. I see 2024 as an opportunity to put more central resources into how we represent PostgreSQL extensions, helping users understand when to use certain extensions and their development maturity level, and likewise helping extension builders with both governance and maintenance resources.
I wanted to round out my thoughts for 2024 with community building. The PostgreSQL contributor community has significantly grown since I first started, and the community has done a better job of recognizing contributors to all parts of the project, not just the code base (noting that there is still room for improvement here). But we can continue to do better, and there are three areas I’d like to specifically highlight: mentorship, DEI, and transparency, which will help in all areas of the project.
During the developer meeting at PGCon 2023, Melanie Plageman gave a very detailed analysis of the experience of being a newer contributor to PostgreSQL and the challenges of ramping up. Melanie identified many problems: the time it takes to learn the basics of contributing (getting started on the codebase, communicating on the mailing list), the effort to get a patch to a committable state and to find a committer interested in it, guidance that may be given with the best of intentions (“get started by reviewing patches!”) but may actually be more challenging than writing code, and how feedback, when it comes, is delivered.
On the last point, how to give feedback, I want to call out an excellent blog post by Robert Haas that specifically addresses the power of giving praise while delivering feedback – these things do make a difference and it’s a good reminder in general that we should be supportive even while we’re being critical.
Back to Melanie’s points, mentorship is something we can do better across the community. Personally, I admit I have been bad at this in areas around project advocacy, including helping to get more people to contribute to the web infrastructure and the release process. This doesn’t mean PostgreSQL lacks mentorship – I can count numerous folks in the community as mentors – but we can be better in terms of how we can help people get started with contributing and finding mentors who can guide them on their journey.
2024 serves as a gateway to building better mentorship processes, and we’re looking to test some of these ideas at PGConf.dev 2024 this May in Vancouver.
(Some history: Before PGConf.dev, PGCon was the event where PostgreSQL contributors gathered to discuss strategic projects for the upcoming development cycle. PGCon was organized by Dan Langille from 2007 to 2023, and after umpteen years of organizing, he was ready to extend the efforts to a group of folks and helped to establish PGConf.dev).
PGConf.dev is a conference for folks who want to contribute to PostgreSQL, and covers topics around PostgreSQL development (both core server and all open source projects around PostgreSQL such as extensions and drivers), community building, and open source thought leadership. A big portion of PGConf.dev is dedicated to mentorship, and is planning to include workshops around how to contribute to PostgreSQL. If you’re looking for ways to help contribute to PostgreSQL, I strongly suggest attending or submitting a talk!
This leads into how the PostgreSQL community can improve in DEI. I strongly suggest reading the slides and watching the video (when it’s available) of Karen Jex and Lætitia AVROT’s PGConf.eu 2023 talk Trying to be Barbie in Ken’s Mojo Dojo Casa House, as it’s an insightful presentation on how we can continue to make the PostgreSQL community more inclusive. The community has made progress in this area (and Karen and Lætitia point to initiatives that have helped with this), but we can still be better, and we should actively and proactively work to address feedback to help ensure contributing to PostgreSQL is a welcoming experience. There are actions we can all take, for example, calling out an inappropriate (e.g. sexist) behavior as it happens and providing guidance on why it’s not appropriate.
Finally, there’s transparency. This might seem odd in open source, given, well, it’s open. But there are quite a few governance issues that are discussed not in the open, and it helps to understand how decisions are made. The PostgreSQL Code of Conduct Committee provides an excellent example of how the community can be transparent about issues that require sensitivity. Each year, the Code of Conduct committee publishes a report (here is the one from 2022) of its work, including high level descriptions of cases and overall statistics. This is a practice we can reproduce across many of the PostgreSQL teams that are involved in tasks that may require privacy due to their sensitivity.
Conclusion: This was originally supposed to be a shorter post
When I originally started writing this, I thought it’d be a pithy post that I’d finish in a few hours. A few days later…
In all seriousness, PostgreSQL is in a good place. It remains popular, and its reputation for reliability, robustness, and performance remains sound. But we can still do better, and the good news is that the community is actively working towards improving in every which way.
While these are thoughts for what PostgreSQL can do in 2024 and beyond, there’s so much PostgreSQL already does today. In fact, asking questions like “where is PostgreSQL going” does give us an opportunity to step back and reflect on all the progress PostgreSQL has made over the past several years while looking ahead on what is to come!