Date: 2023-05-31
Time: 11:00–11:45
Room: DMS 1160
Level: Intermediate
Welp, I went and did it. I designed a database outside of the database, converting Postgres to a relatively dumb storage engine.
Why? The New York Times needed to develop a service in which all data in the database is fully-encrypted, such that in the event of a breach the data would be all but useless. That means we can’t take advantage of so many of PostgreSQL’s amazing semantic tools for data management an indexing.
We adopted a fairly simple key/value pattern, with a UUID primary key and an encrypted payload. But, crucially, we needed to provide an interface for looking up records by secondary IDs without actually storing those values. What did we do instead?
We took advantage of the emerging patterns of JSON Schema, JSON Path, and JSONB indexing to create hashed JSON objects for querying secondary keys. The result is a system that puts the power of entity schema management in the hands of our users while optimizing the search performance without compromising personal data.
This talk will describe the overall design in detail, highlighting the JSON schema definition, data encryption, indexing, and the query performance enabled by PostgreSQL’s GIN index and JSONPath search capabilities.
The following slides have been made available for this session: