PostgreSQL Specific Information

PostgreSQL Proxy Restrictions

  • PostgreSQL allows the user to send multiple statements in one query. Such query is treated as a transaction: if any of statements failed the query is roll backed entirely even in auto commit mode. The PostgreSQL proxy splits such queries to single statement queries and execute them separately via JDBC. This leads to the fact that if one of statements fails and auto commit mode is enabled all statements before it won't be roll backed. To use such a syntax, please use a true transaction start and commit to wrap the queries to avoid any issues.
  • The vdb property of suppressNoResult=true is available. With Postgres, if an update query is executed via the executeQuery result, it will generate an exception on return saying "No results were returned by the query". In some frameworks, this is detected and suppressed when using the native Postgres driver, but not with the Heimdall driver. In order to work around this behavior, this option can trigger us returning a null instead of a resultset, which appears to allow the calling code to work fine. This applies when using Heimdall as a JDBC driver, but should not apply when as a proxy.

Prepared Statements

Postgres (and other databases) have the option to use prepared statements. When used however, this triggers "state" on the connection and this state will potentially result in sub-optimal behavior of Heimdall or other proxies. In particular, it will break multiplexing behavior, which is needed to dramatically reduce connection counts. In the case of Java applications, there is a parameter that can be set, "preferQueryMode". If set to "simple" it will effectively "de-prepare" the statements in a safe way, and allow multiplexing to work. One caution should be noted however: Many applications rely on an undocumented behavior of Postres with prepared statements where typecasts that are otherwise required may not be required when prepared statements are used. If setting preferQueryMode to simple results in a typecast error, then this is effectively exposing a bug in the application code (or the framework generating the query).

Search_path support

When search paths are used, it is suggested that the search path be configured on the application user instead of dynamically changing the path at runtime. This ensures that a consistent path is used. If inconsistent search paths are used, this can break cache consistency and read/write split handling. To set the search path on a user, use the command:

alter user user1 set search_path='my_schema, "$user", public';

If this is not sufficient, or if overlapping table names are used across schemas, another resolution can be done to ensure read/write split and multiplexing works. In this case, you can add a JDBC URL parameter of currentSchema=${schema} to the URLs. This will fill in any value from the set search_path into the currentSchema value, and when we pull connections from the connection pools, we will use a connection connected with the proper value. You can then inspect the connections made with the "show pools" command.

Schema Assist for Dependencies

When using Postgres or a Postgres derived database, parsing is performed to detect view dependencies, in order to automate invalidation. This is only performed at proxy start at this time.

PG Notify Support

Postgres supports notification support, i.e. it provides a built-in pub/sub interface. When Postgres is used, this option can be explicitly controlled with a connection property of "hdUsePGNotify" and a value of true or false. To specify the database(s) to listen to, use "hdNotifyDBs" with a comma separated list of database names. Also, * can be used as a wildcard, with -dbname to remove from the list pulled from the database, i.e. "*,-template0,-template1" would be a solid starting point for configuring this feature. The catalogs are cached for up to one minute between attempts to poll, so if new catalogs are added, we will start listening for invalidations within a minute.

When used, this will override the need to have a grid cache for invalidations, although it does not automatically enable multi-primary Heimdall management nodes as a grid cache does. This also enables the below trigger invalidation that leverages the pg_notify interface. The purpose of this option is to allow a simplified deployment without the use of an external cache, but still maintain cache coherency between multiple proxies.

Trigger Invalidation with Postgres

Postgres provides a listen/notify interface that can be used to push invalidation messages directly from the database to the proxy. In order to configure this, use the following:

Connection property of hdUsePGNotify, value of true, then configure the following trigger function and trigger on each table:


CREATE OR REPLACE FUNCTION heimdall.changed()
RETURNS TRIGGER AS $$
BEGIN
  PERFORM pg_notify('heimdall', 'invalidate-trigger,'||current_database()||'.'||TG_TABLE_SCHEMA||'.'||TG_TABLE_NAME||','||((date_part('epoch', now())*1000)::bigint)::varchar);
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- for each table on the system, perform the following:
CREATE TRIGGER trigger_table_changed
  BEFORE INSERT OR UPDATE OR DELETE ON test
  EXECUTE PROCEDURE heimdall.changed();

The result of this is near instant invalidation of tables, even if a write is done to PostgreSQL without going through Heimdall. Log messages with debug mode will indicate when this is functional.

When this is in place, this adjusts a few other behaviors. When active, invalidations will be issued through the DB notify interface vs. the grid cache as well. This enables local only cache configurations to effectively work without a grid cache. Additionally, if a table is invalidated via the grid cache with the above configuration, then additional notification messages will not be generated for that table by Heimdall, but other tables that have not been invalidated by the trigger will continue to have invalidations from Heimdall. This allows a mixed setup, so that frequently updated tables may have the trigger, while infrequently updated tables (or new tables) are invalidated directly. This will reduce the overhead of maintaining the triggers as the DB schema changes, while optimizing the number of invalidation messages needed.

This also supports invalidation of tables that are changed via stored procedures, other triggers, or are modified in other ways. It is still a requirement that stored procedures be tagged with the tables they are reading, but writes will be automatically accounted for.

Alternate Trigger Invalidation with Postgres (via polling)

This technique requires polling against the database, which adds load and latency to the invalidation process, but is documented for completion.

Connection properties:

  • dbTriggerTracking=true
  • dbChanges=select ts,name from heimdall.tableupdates where ts > @ts
  • dbTimeMs=select (date_part('epoch', now())*1000)::bigint

Note: The following example is drafted for use with older Postgres, so doesn't use the ON CONFLICT syntax available in PG 9.5+


CREATE SCHEMA heimdall;

CREATE TABLE heimdall.tableupdates (
name text PRIMARY KEY,
ts bigint not null );

CREATE OR REPLACE FUNCTION heimdall.changed()
RETURNS TRIGGER AS $$
BEGIN
  LOOP
    UPDATE heimdall.tableupdates SET ts=(date_part('epoch', now())*1000)::bigint, name=current_database()||'.'||TG_TABLE_SCHEMA||'.'||TG_TABLE_NAME;
    IF found THEN
      RETURN NEW;
    END IF;
    BEGIN
       INSERT INTO heimdall.tableupdates VALUES ((date_part('epoch', now())*1000)::bigint, current_database()||'.'||TG_TABLE_SCHEMA||'.'||TG_TABLE_NAME);
       RETURN NEW;
    EXCEPTION WHEN unique_violation THEN
    END;
  END LOOP;
  RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE OR REPLACE TRIGGER trigger_table_changed
  BEFORE INSERT OR UPDATE OR DELETE ON test
  FOR EACH ROW
  EXECUTE PROCEDURE heimdall.changed();

Active Directory Password & Group Synchronization

CREATE OR REPLACE FUNCTION heimdall.sync_user(username text, password text, ldapgroups text)
    RETURNS Void
    LANGUAGE plpgsql
AS
$function$
DECLARE
    sql       text;
    grouplist text[];
    g         text;
    groupsToRevoke text[];

BEGIN
    -- prevent syncing special users (if necessary)
    IF username IN ('api') THEN
        RAISE NOTICE 'heimdall.sync_user: skip syncing for special user "%"', username;
        RETURN;
    END IF;

    -- do not synchronize any user that is flagged as superuser
    IF EXISTS(SELECT usesuper FROM pg_user WHERE user = pg_catalog.quote_literal(username)) THEN
        RAISE NOTICE 'heimdall.sync_user: User % is superuser, not synchronizing', username;
        RETURN;
    END IF;

    -- sync user and password
    BEGIN
        sql := 'CREATE USER ' || quote_ident(username) || ' WITH PASSWORD ''' || 'REDACTED' || ''';';
        RAISE NOTICE 'heimdall.sync_user: %', sql;
        sql := 'CREATE USER ' || quote_ident(username) || ' WITH PASSWORD ''' || password || ''';';
        EXECUTE sql;
    EXCEPTION
        WHEN Duplicate_Object THEN -- alter password when user exists
            sql := 'ALTER USER ' || quote_ident(username) || ' WITH PASSWORD ''' || 'REDACTED' || ''';';
            RAISE NOTICE 'heimdall.sync_user: user exists, %', sql;
            sql := 'ALTER USER ' || quote_ident(username) || ' WITH PASSWORD ''' || password || ''';';
            EXECUTE sql;
    END;

   -- sync group/role membership
    grouplist := string_to_array(ldapgroups, ',');
    groupsToRevoke := array(SELECT b.rolname 
                            FROM (SELECT rolname, oid 
                                  FROM pg_roles 
                                  WHERE rolname != ALL(grouplist) EXCEPT SELECT usename, usesysid 
                                                                         FROM pg_user) b 
                                                                         WHERE pg_has_role(username, b.oid, 'member'));
    IF array_length(groupsToRevoke, 1) > 0 THEN
        FOR i IN array_lower(groupsToRevoke, 1) .. array_upper(groupsToRevoke, 1) LOOP
            g := groupsToRevoke[i];
            IF position('=' in g) > 0 THEN -- capture the name if ldapgroups is passed in as cn=groupname
                g := trim(TRAILING '"' FROM split_part(groupsToRevoke[i], '=', 2));
            ELSE
                g := trim(BOTH '"' FROM trim(BOTH FROM groupsToRevoke[i]));
            END IF;

            -- now revoke membership of the role from the user
            sql := 'REVOKE ' || quote_ident(g) || ' FROM ' || quote_ident(username) || ';';
            RAISE NOTICE 'heimdall.sync_user: %', sql;
            IF sql IS NOT NULL THEN
                EXECUTE sql;
            END IF;
        END LOOP;
    END IF;

    IF array_length(grouplist, 1) > 0 THEN    
        FOR i IN array_lower(grouplist, 1) .. array_upper(grouplist, 1) LOOP
            g := grouplist[i];
            IF position('=' in g) > 0 THEN -- capture the name if ldapgroups is passed in as cn=groupname
                g := trim(TRAILING '"' FROM split_part(grouplist[i], '=', 2));
            ELSE
                g := trim(BOTH '"' FROM trim(BOTH FROM grouplist[i]));
            END IF;

            -- make sure groups are created
            BEGIN
                sql := 'CREATE ROLE ' || quote_ident(g);
                EXECUTE sql;
                RAISE NOTICE 'heimdall.sync_user: %', sql;
            EXCEPTION
                WHEN Duplicate_Object THEN
                    RAISE NOTICE 'heimdall.sync_user: group % already exists', g;
            END;

            -- now grant membership of the role to the user
            sql := 'GRANT ' || quote_ident(g) || ' TO ' || quote_ident(username) || ';';
            RAISE NOTICE 'heimdall.sync_user: %', sql;
            EXECUTE sql;

        END LOOP;
    END IF;
RETURN;
END;
$function$;