PostgreSQL is a data management framework with the potential to engulf the entire database realm. Using Postgres for everything is becoming a mainstream best practice. ParadeDB and DuckDB add more performance, propelling PostgreSQL's analysis capabilities to the top tier of OLAP. Pigsty is a PostgreSQL distribution that aims to harness the collective power of PostgreSQL ecosystem extensions and democratize access to production-grade database services.
Wednesday, March 20, 2024
This walkthrough teaches readers how to build a simple PostgreSQL server impersonator in Python. It's a useful exercise in understanding the PostgreSQL protocol and studying attack patterns. The server mimics the initial PostgreSQL handshake sequence, including authentication, and successfully fools the psql client into thinking it's a PostgreSQL server.
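To make the handshake concrete, here is a minimal sketch of the message framing such an impersonator would need. It is not the walkthrough's code. The function names and the `server_version` value are hypothetical, and the socket loop is left out. The byte layouts follow the PostgreSQL v3 frontend/backend protocol: a startup packet is an Int32 length, an Int32 version code, then NUL-terminated key/value pairs. Every backend message is a type byte followed by an Int32 length that counts itself plus the payload.

```python
import struct

PROTOCOL_V3 = 196608         # 3 << 16, i.e. protocol version 3.0
SSL_REQUEST_CODE = 80877103  # sent in place of a version when the client asks for TLS
SSL_DECLINED = b"N"          # single-byte reply meaning "no TLS, continue in plaintext"


def build_startup(params: dict) -> bytes:
    """Client-side helper (used here only to exercise the parser)."""
    body = struct.pack("!I", PROTOCOL_V3)
    for key, value in params.items():
        body += key.encode() + b"\x00" + value.encode() + b"\x00"
    body += b"\x00"  # terminator after the last key/value pair
    return struct.pack("!I", len(body) + 4) + body


def parse_startup(packet: bytes) -> dict:
    """Parse the first packet a client sends: an SSLRequest or a StartupMessage."""
    length, code = struct.unpack("!II", packet[:8])
    if length != len(packet):
        raise ValueError("length field does not match packet size")
    if code == SSL_REQUEST_CODE:
        return {"ssl_request": True, "params": {}}
    if code != PROTOCOL_V3:
        raise ValueError(f"unsupported protocol code {code}")
    # Drop the final terminator, then split the NUL-terminated strings.
    fields = packet[8:-1].split(b"\x00")[:-1]
    params = {k.decode(): v.decode() for k, v in zip(fields[0::2], fields[1::2])}
    return {"ssl_request": False, "params": params}


def message(type_byte: bytes, payload: bytes) -> bytes:
    """Frame a backend message: type byte + Int32 length (self-inclusive) + payload."""
    return type_byte + struct.pack("!I", len(payload) + 4) + payload


def authentication_cleartext_password() -> bytes:
    return message(b"R", struct.pack("!I", 3))  # auth code 3: send a cleartext password


def authentication_ok() -> bytes:
    return message(b"R", struct.pack("!I", 0))  # auth code 0: authentication succeeded


def parameter_status(name: str, value: str) -> bytes:
    return message(b"S", name.encode() + b"\x00" + value.encode() + b"\x00")


def ready_for_query() -> bytes:
    return message(b"Z", b"I")  # "I" = idle, not inside a transaction


def reply_after_password() -> bytes:
    """What a fake server might send once it has received any password.
    The version string is a made-up placeholder."""
    return authentication_ok() + parameter_status("server_version", "16.0") + ready_for_query()


startup = parse_startup(build_startup({"user": "alice", "database": "app"}))
print(startup["params"])
```

A real client may send an SSLRequest first. A plaintext-only impersonator would answer it with `SSL_DECLINED` and then wait for the actual StartupMessage.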
Levels.fyi developed a scalable fuzzy search solution using PostgreSQL to help users quickly find relevant salary information. The team's initial approach used simple LIKE queries, then evolved to use materialized views for optimization and full-text search with tsvector for improved results. To further improve search accuracy, they implemented a custom relevance algorithm that weighs factors such as exact matches, popularity, and similarity.
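As a rough illustration of how exact matches, popularity, and similarity can be blended into one score, here is a small sketch. It is not the Levels.fyi algorithm. The weights, sample rows, and function names are all hypothetical, and the similarity function is a simplified take on trigram similarity in the style of pg_trgm (lowercase, pad each word with two leading spaces and one trailing space, then divide shared trigrams by total distinct trigrams).

```python
def trigrams(text: str) -> set:
    """Simplified trigram extraction (does not strip punctuation the way pg_trgm does)."""
    grams = set()
    for word in text.lower().split():
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by distinct trigrams across both strings."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical weights: an exact match dominates, then similarity, then popularity.
W_EXACT, W_SIMILARITY, W_POPULARITY = 1.0, 0.6, 0.3


def relevance(query: str, name: str, popularity: int, max_popularity: int) -> float:
    exact = 1.0 if name.lower() == query.lower() else 0.0
    return (
        W_EXACT * exact
        + W_SIMILARITY * similarity(query, name)
        + W_POPULARITY * (popularity / max_popularity)
    )


# Made-up sample rows: (name, number of views).
rows = [("Glacier", 1200), ("Globe Labs", 40), ("Globex", 9000)]
max_views = max(views for _, views in rows)

query = "globx"  # a typo for "Globex"
ranked = sorted(rows, key=lambda r: relevance(query, r[0], r[1], max_views), reverse=True)
print([name for name, _ in ranked])
```

In a database the same idea would typically be an ORDER BY expression over precomputed columns. The Python version only shows how the three signals combine.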
PostgreSQL would be easier to develop with if it had versioned schemas, better online schema migrations, and declarative, state-based migrations.
PostgreSQL's query optimizer has improved massively over the past decade. Using the Join Order Benchmark (JOB), the author shows that tail latency was nearly halved between PostgreSQL versions 8 and 16, with each major version delivering an average 15% performance improvement. One of the simplest ways for teams to speed up their database queries is to keep their Postgres instances up to date.
A developer discovered a significant performance issue in a database query that Mattermost uses to index posts. The query was initially slow because it filtered too many rows, but was sped up by using PostgreSQL's row constructor comparisons. To find the fix, the developer used the BUFFERS option of EXPLAIN for detailed insight and favored plans that show an Index Cond rather than a Filter.
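The blurb does not show the query, so the sketch below assumes a common shape for this kind of problem: paging through rows ordered by two columns, here given the hypothetical names `created_at` and `id`. A row constructor comparison such as `(created_at, id) > (c, i)` is evaluated left to right, like a lexicographic comparison, which is also how Python compares tuples. The code checks that it selects the same rows as the longer `created_at > c OR (created_at = c AND id > i)` form.

```python
from itertools import product


def expanded_predicate(row: tuple, cursor: tuple) -> bool:
    """Mirrors: created_at > c OR (created_at = c AND id > i)"""
    created_at, post_id = row
    c, i = cursor
    return created_at > c or (created_at == c and post_id > i)


def row_constructor_predicate(row: tuple, cursor: tuple) -> bool:
    """Mirrors: (created_at, id) > (c, i)
    Python tuples compare element by element, left to right."""
    return row > cursor


# Hypothetical data: 5 timestamps x 3 ids, sorted like an index on (created_at, id).
rows = sorted(product(range(5), ["a", "b", "c"]))
cursor = (2, "b")  # last row seen on the previous page

page_expanded = [r for r in rows if expanded_predicate(r, cursor)]
page_row = [r for r in rows if row_constructor_predicate(r, cursor)]

print(page_row[0], len(page_row))
```

The two forms are logically equivalent when no NULLs are involved. The practical difference lies in planning: PostgreSQL can treat the row comparison as a single condition on a matching multi-column B-tree index, which is one way a predicate can move from Filter to Index Cond in the EXPLAIN output.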
This article describes a pattern for geographically distributing PostgreSQL databases for multi-tenant applications using only standard PostgreSQL functionality. The pattern involves separating per-tenant data from control plane data, placing tenant data in the nearest region, and creating a global view using Foreign Data Wrappers and partitioning, while keeping authentication and control plane data centralized. This approach lowers latency, helps comply with data residency laws, and enables edge computing while retaining most PostgreSQL features and ACID guarantees within each tenant.
PostgreSQL with the pgvector extension offers an efficient way to store and query embeddings. It offers simplified querying, data consistency, and better performance compared to using separate databases for relational and vector data.
The 2024 Stack Overflow Developer Survey revealed that JavaScript and PostgreSQL remain the most popular technologies, while Rust and Markdown are the most admired. Developers are increasingly frustrated by technical debt at work, but they don't perceive AI as a threat to their jobs. Although 76% of developers are using or planning to use AI tools, many remain skeptical about their accuracy and ability to handle complex tasks.
PostgreSQL can be used as a search engine. Combining full-text search, semantic search with pgvector, and fuzzy matching with pg_trgm makes PostgreSQL a good-enough search engine for the majority of use cases. This article covers more advanced techniques for personalizing search experiences, adjusting for document length, debugging rankings, and more.