PostgreSQL Specific Information
PostgreSQL Proxy Restrictions
- PostgreSQL allows the user to send multiple statements in one query. Such a query is treated as a transaction: if any statement fails, the entire query is rolled back, even in autocommit mode. The PostgreSQL proxy splits such queries into single-statement queries and executes them separately via JDBC. As a result, if one of the statements fails while autocommit is enabled, the statements before it will not be rolled back. To use this syntax safely, wrap the statements in an explicit transaction start and commit to avoid any issues.
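For example, a multi-statement batch can be wrapped in an explicit transaction so that all statements commit or roll back together even after the proxy splits them (the table and column names here are illustrative):

```sql
-- An explicit transaction preserves all-or-nothing semantics for the batch,
-- even when the proxy executes each statement individually.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
```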
- The vdb property suppressNoResult=true is available. With Postgres, if an update query is executed via executeQuery, an exception is generated on return saying "No results were returned by the query". Some frameworks detect and suppress this when using the native Postgres driver, but not with the Heimdall driver. To work around this behavior, this option makes Heimdall return a null instead of a resultset, which appears to allow the calling code to work fine. This applies when using Heimdall as a JDBC driver; it should not apply when Heimdall is used as a proxy.
Prepared Statements
Postgres (and other databases) has the option to use prepared statements. When used, however, prepared statements create "state" on the connection, and this state can result in sub-optimal behavior in Heimdall or other proxies. In particular, it will break multiplexing behavior, which is needed to dramatically reduce connection counts. For Java applications, there is a parameter that can be set, "preferQueryMode". If set to "simple", it will effectively "de-prepare" the statements in a safe way and allow multiplexing to work. One caution should be noted, however: many applications rely on an undocumented behavior of Postgres with prepared statements, where typecasts that are otherwise required may be omitted when prepared statements are used. If setting preferQueryMode to simple results in a typecast error, this is effectively exposing a bug in the application code (or the framework generating the query).
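As a sketch (host, port, and database name are placeholders), the setting can be applied directly in the JDBC URL:

```
jdbc:postgresql://proxy-host:5432/mydb?preferQueryMode=simple
```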
Search_path support
When search paths are used, it is suggested that the search path be configured on the application user instead of dynamically changing the path at runtime. This ensures that a consistent path is used. If inconsistent search paths are used, this can break cache consistency and read/write split handling. To set the search path on a user, use the command:
ALTER USER user1 SET search_path = 'my_schema, "$user", public';
If this is not sufficient, or if overlapping table names are used across schemas, another approach can ensure read/write split and multiplexing work. In this case, you can add a JDBC URL parameter of currentSchema=${schema} to the URLs. This will fill in any value from the set search_path into the currentSchema value, and when connections are pulled from the connection pools, a connection connected with the proper value will be used. You can then inspect the connections made with the "show pools" command.
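A hypothetical URL with this parameter might look like the following; the literal ${schema} token is filled in by the proxy at runtime, not by the application:

```
jdbc:postgresql://proxy-host:5432/mydb?currentSchema=${schema}
```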
Schema Assist for Dependencies
When using Postgres or a Postgres-derived database, parsing is performed to detect view dependencies in order to automate invalidation. At this time, this is only performed at proxy start.
PG Notify Support
Postgres supports notifications, i.e. it provides a built-in pub/sub interface. When Postgres is used, this option can be explicitly controlled with a connection property of "hdUsePGNotify" and a value of true or false. To specify the database(s) to listen to, use "hdNotifyDBs" with a comma-separated list of database names. A * can be used as a wildcard, and -dbname removes a name from the list pulled from the database; i.e. "*,-template0,-template1" would be a solid starting point for configuring this feature. The catalogs are cached for up to one minute between polling attempts, so if new catalogs are added, listening for invalidations will start within a minute.
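As a sketch, the two connection properties might be set as follows to listen on all databases except the templates:

```
hdUsePGNotify=true
hdNotifyDBs=*,-template0,-template1
```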
When used, this removes the need for a grid cache for invalidations, although it does not automatically enable multi-primary Heimdall management nodes as a grid cache does. It also enables the trigger invalidation described below, which leverages the pg_notify interface. The purpose of this option is to allow a simplified deployment without the use of an external cache, while still maintaining cache coherency between multiple proxies.
Clear pool option
There is an option to clear a user's pool even when it is impossible to get a connection, e.g. when all connections in the pool are busy, cannot be closed, and the maximum number of connections has been reached. Two parameters can be set to issue a clear pool before making a connection.
- Application name - setting "ApplicationName" to "clear_pool" when attempting to connect will clear this user's pool of all previous connections. To do so, append the following parameters to the connection URL: "?ApplicationName=clear_pool&assumeMinServerVersion=VERSION", where VERSION is the version of the Postgres server currently in use. VERSION has to be at least 9.0 for the JDBC driver to recognize custom application name parameters. Alternatively, use psql and provide a custom application name as "application_name=clear_pool" when attempting a connection.
- Custom option - provide clear_pool=true as an option by adding the following to the connection URL: "?options=clear_pool=true". The clear behavior is the same as when setting the application name to "clear_pool".
It is recommended to close such connections after connecting, as a reconnect may occur, which would clear the user's pool again.
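As a sketch (host, database, and version are placeholders), either form can trigger the clear:

```
# via the JDBC URL (VERSION must be at least 9.0):
jdbc:postgresql://proxy-host:5432/mydb?ApplicationName=clear_pool&assumeMinServerVersion=9.4

# via psql:
psql "host=proxy-host dbname=mydb user=user1 application_name=clear_pool"
```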
Trigger Invalidation with Postgres
Postgres provides a listen/notify interface that can be used to push invalidation messages directly from the database to the proxy. In order to configure this, use the following:
Set a connection property of hdUsePGNotify with a value of true, then configure the following trigger function and trigger on each table:
CREATE OR REPLACE FUNCTION heimdall.changed()
RETURNS TRIGGER AS $$
BEGIN
    PERFORM pg_notify('heimdall', 'invalidate-trigger,'||current_database()||'.'||TG_TABLE_SCHEMA||'.'||TG_TABLE_NAME||','||((date_part('epoch', now())*1000)::bigint)::varchar);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- for each table on the system, perform the following:
CREATE TRIGGER trigger_table_changed
BEFORE INSERT OR UPDATE OR DELETE ON test
EXECUTE PROCEDURE heimdall.changed();
The result of this is near instant invalidation of tables, even if a write is done to PostgreSQL without going through Heimdall. Log messages with debug mode will indicate when this is functional.
When this is in place, a few other behaviors are adjusted. When active, invalidations will be issued through the DB notify interface instead of the grid cache. This enables local-only cache configurations to work effectively without a grid cache. Additionally, if a table is invalidated via the grid cache with the above configuration, Heimdall will not generate additional notification messages for that table, but tables that have not been invalidated by the trigger will continue to receive invalidations from Heimdall. This allows a mixed setup: frequently updated tables may have the trigger, while infrequently updated tables (or new tables) are invalidated directly. This reduces the overhead of maintaining the triggers as the DB schema changes, while optimizing the number of invalidation messages needed.
This also supports invalidation of tables that are changed via stored procedures, other triggers, or other means. It is still a requirement that stored procedures be tagged with the tables they read, but writes will be accounted for automatically.
Alternate Trigger Invalidation with Postgres (via polling)
This technique requires polling the database, which adds load and latency to the invalidation process, but is documented for completeness.
Connection properties:
- dbTriggerTracking=true
- dbChanges=select ts,name from heimdall.tableupdates where ts > @ts
- dbTimeMs=select (date_part('epoch', now())*1000)::bigint
Note: The following example is written for older Postgres versions, so it does not use the ON CONFLICT syntax available in PG 9.5+.
CREATE SCHEMA heimdall;
CREATE TABLE heimdall.tableupdates (
    name text PRIMARY KEY,
    ts bigint NOT NULL );
CREATE OR REPLACE FUNCTION heimdall.changed()
RETURNS TRIGGER AS $$
BEGIN
    LOOP
        UPDATE heimdall.tableupdates
            SET ts = (date_part('epoch', now())*1000)::bigint
            WHERE name = current_database()||'.'||TG_TABLE_SCHEMA||'.'||TG_TABLE_NAME;
        IF found THEN
            RETURN NEW;
        END IF;
        BEGIN
            INSERT INTO heimdall.tableupdates (name, ts)
                VALUES (current_database()||'.'||TG_TABLE_SCHEMA||'.'||TG_TABLE_NAME,
                        (date_part('epoch', now())*1000)::bigint);
            RETURN NEW;
        EXCEPTION WHEN unique_violation THEN
            -- do nothing, and loop to try the UPDATE again
        END;
    END LOOP;
END;
$$ LANGUAGE plpgsql;
-- for each table on the system, perform the following:
CREATE TRIGGER trigger_table_changed
BEFORE INSERT OR UPDATE OR DELETE ON test
FOR EACH ROW
EXECUTE PROCEDURE heimdall.changed();
Active Directory Password & Group Synchronization
CREATE OR REPLACE FUNCTION heimdall.sync_user(username text, password text, ldapgroups text)
RETURNS void
LANGUAGE plpgsql
AS $function$
DECLARE
sql text;
grouplist text[];
g text;
groupsToRevoke text[];
lookupTableName text;
lookupTableKey text; -- ldap_group
lookupTableValue text; -- role
lookupLdapGroups text[];
lookupRoles text[];
lg text;
current_schema_name text;
already_processed_group boolean;
create_roles_if_missing boolean;
BEGIN
current_schema_name := current_schema();
create_roles_if_missing := TRUE; -- set to false to prevent role creation
lookupTableName := 'lookup_table';
lookupTableKey := 'ldap_group';
lookupTableValue := 'role';
-- prevent syncing special users (if necessary)
IF username IN ('api') THEN
RAISE NOTICE '%.sync_user: skip syncing for special user "%"', current_schema_name, username;
RETURN;
END IF;
-- do not synchronize any user that is flagged as superuser
IF EXISTS(SELECT 1 FROM pg_user WHERE usename = username AND usesuper) THEN
RAISE NOTICE '%.sync_user: User % is superuser, not synchronizing', current_schema_name, username;
RETURN;
END IF;
IF EXISTS (
SELECT 1
FROM information_schema.tables
WHERE table_schema = current_schema_name AND table_name = lookupTableName
) THEN
-- get lookupLdapGroups
EXECUTE 'SELECT array_agg(lt.' || lookupTableKey || ') FROM ' || current_schema_name || '.' || lookupTableName || ' AS lt'
INTO lookupLdapGroups;
RAISE NOTICE '%.sync_user: Lookup table loaded: %', current_schema_name, lookupLdapGroups;
END IF;
-- sync user and password
BEGIN
sql := 'CREATE USER ' || quote_ident(username) || ' WITH PASSWORD ''' || password || ''';';
EXECUTE sql;
-- rebuild the statement with the password redacted so the notice does not leak it
sql := 'CREATE USER ' || quote_ident(username) || ' WITH PASSWORD ''' || 'REDACTED' || ''';';
RAISE NOTICE '%.sync_user: % created', current_schema_name, sql;
EXCEPTION
WHEN Duplicate_Object THEN -- alter password when user exists
sql := 'ALTER USER ' || quote_ident(username) || ' WITH PASSWORD ''' || password || ''';';
BEGIN
EXECUTE sql;
sql := 'ALTER USER ' || quote_ident(username) || ' WITH PASSWORD ''' || 'REDACTED' || ''';';
RAISE NOTICE '%.sync_user: % updated', current_schema_name, sql;
EXCEPTION
WHEN SQLSTATE 'XX000' THEN
IF SQLERRM LIKE '%tuple concurrently updated%' THEN
-- ignore, may occur as we use connection pooling for synchronization
ELSE
RAISE;
END IF;
END;
END;
-- sync group/role membership
grouplist := string_to_array(ldapgroups, ',');
groupsToRevoke := array(SELECT b.rolname
FROM (SELECT rolname, oid
FROM pg_roles
WHERE rolname != ALL(grouplist) EXCEPT SELECT usename, usesysid
FROM pg_user) b
WHERE pg_has_role(username, b.oid, 'member'));
IF array_length(groupsToRevoke, 1) > 0 THEN
FOR i IN array_lower(groupsToRevoke, 1) .. array_upper(groupsToRevoke, 1) LOOP
g := groupsToRevoke[i];
IF position('=' in g) > 0 THEN -- capture the name if ldapgroups is passed in as cn=groupname
g := trim(TRAILING '"' FROM split_part(groupsToRevoke[i], '=', 2));
ELSE
g := trim(BOTH '"' FROM trim(BOTH FROM groupsToRevoke[i]));
END IF;
-- now revoke membership of the role from the user
sql := 'REVOKE ' || quote_ident(g) || ' FROM ' || quote_ident(username) || ';';
RAISE NOTICE '%.sync_user: %', current_schema_name, sql;
IF sql IS NOT NULL THEN
BEGIN
EXECUTE sql;
EXCEPTION
WHEN SQLSTATE 'XX000' THEN
IF SQLERRM LIKE '%tuple concurrently updated%' THEN
-- ignore, may occur as we use connection pooling for synchronization
ELSE
RAISE;
END IF;
END;
END IF;
END LOOP;
END IF;
IF array_length(grouplist, 1) > 0 then
FOR i IN array_lower(grouplist, 1) .. array_upper(grouplist, 1) LOOP
g := grouplist[i];
already_processed_group := false;
RAISE NOTICE '%.sync_user: processing group: %', current_schema_name, g;
IF position('=' in g) > 0 THEN -- capture the name if ldapgroups is passed in as cn=groupname
g := trim(TRAILING '"' FROM split_part(grouplist[i], '=', 2));
ELSE
g := trim(BOTH '"' FROM trim(BOTH FROM grouplist[i]));
END IF;
-- handle additional ldap_group -> role(s) aliases
IF array_length(lookupLdapGroups, 1) > 0 then
FOR lg_i IN array_lower(lookupLdapGroups, 1) .. array_upper(lookupLdapGroups, 1) LOOP
lg := lookupLdapGroups[lg_i];
RAISE NOTICE '%.sync_user: processing group: % vs. lookupGroup: %', current_schema_name, g, lg;
IF lg = g THEN
-- get all roles mapped to given ldap_group
EXECUTE 'SELECT array_agg(lt.' || lookupTableValue || ') FROM ' || current_schema_name || '.' || lookupTableName || ' AS lt WHERE lt.' || lookupTableKey || ' LIKE ' || quote_literal(lg)
INTO lookupRoles;
RAISE NOTICE '%.sync_user: ldap_group % mappings: %', current_schema_name, lg, lookupRoles;
-- if role is an empty string then no role is mapped for given ldap_group
IF array_length(lookupRoles, 1) > 0 THEN
-- process all roles mapped to given ldap_group
FOR r_i IN array_lower(lookupRoles, 1) .. array_upper(lookupRoles, 1) LOOP
g := lookupRoles[r_i];
IF g IS NOT NULL AND g <> '' THEN
IF create_roles_if_missing THEN
BEGIN
sql := 'CREATE ROLE ' || quote_ident(g);
EXECUTE sql;
RAISE NOTICE '%.sync_user: %', current_schema_name, sql;
EXCEPTION
WHEN Duplicate_Object THEN
RAISE NOTICE '%.sync_user: role % already exists', current_schema_name, g;
END;
END IF;
-- now grant membership of the role to the user
BEGIN
sql := 'GRANT ' || quote_ident(g) || ' TO ' || quote_ident(username) || ';';
EXECUTE sql;
RAISE NOTICE '%.sync_user: %', current_schema_name, sql;
EXCEPTION
WHEN SQLSTATE '42704' THEN
RAISE NOTICE '%.sync_user: role % does not exist', current_schema_name, g;
END;
already_processed_group := true;
END IF;
END LOOP;
END IF;
END IF;
END LOOP;
END IF;
--lookup_table does not exist
IF g IS NOT NULL AND g <> '' AND already_processed_group = FALSE THEN
-- make sure groups are created
IF create_roles_if_missing THEN
BEGIN
sql := 'CREATE ROLE ' || quote_ident(g);
EXECUTE sql;
RAISE NOTICE '%.sync_user: %', current_schema_name, sql;
EXCEPTION
WHEN Duplicate_Object THEN
RAISE NOTICE '%.sync_user: role % already exists', current_schema_name, g;
END;
END IF;
BEGIN
-- now grant membership of the role to the user
sql := 'GRANT ' || quote_ident(g) || ' TO ' || quote_ident(username) || ';';
EXECUTE sql;
RAISE NOTICE '%.sync_user: %', current_schema_name, sql;
EXCEPTION
WHEN SQLSTATE '42704' THEN
RAISE NOTICE '%.sync_user: role % does not exist', current_schema_name, g;
END;
END IF;
END LOOP;
END IF;
RETURN;
END;
$function$;
Active Directory Password & Group Synchronization extra options
By default, a role with the same name is created in the database for each extracted LDAP group. By creating the table below, for each key found (ldap_group), a role with the name specified in the value (role) will be created instead. The composite primary key allows mapping more than one role to a particular ldap_group. If 'role' is empty, then no role is created and granted to the user for that particular ldap_group.
CREATE TABLE IF NOT EXISTS heimdall.lookup_table (
    ldap_group varchar(255),
    role varchar(64),
    PRIMARY KEY (ldap_group, role)
);
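As an illustrative example (the group and role names here are hypothetical), the mapping below creates two roles for one LDAP group and suppresses role creation for another:

```sql
-- map LDAP group "db-admins" to two roles;
-- an empty role suppresses role creation for "auditors"
INSERT INTO heimdall.lookup_table (ldap_group, role) VALUES
    ('db-admins', 'admin_role'),
    ('db-admins', 'readwrite_role'),
    ('auditors', '');
```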