VDB Overview

Virtual Databases (VDB) are the basic container for configurations, and aggregate all the settings that will apply to any traffic passed through the VDB. There are two ways to access a VDB, first via JDBC, which supports any JDBC compliant data source, or via a protocol level proxy. In proxy mode, a separate process will be created, and database traffic would be routed through this proxy using a specific port

Enabled: if unchecked, this will result in a proxy disabling access via the port specified. In JDBC mode, this will result in an SQL exception being thrown on a new connection attempt.

Acces/Secret Keys: These are generally used in distributed proxy or JDBC mode to provide a shared user/password to allow the remote proxies to pull the complete vdb configuration.

Access Mode: The first step in configuring a VDB is to select the access mode, which will adjust what options are presented, to simplify the configuration:

JDBC Configuration

When in JDBC mode, a JDBC URL is provided in the GUI, which is used to configure the application to access the Heimdall Data Access Layer. The JDBC URL is of the format:

JDBC:heimdall://IP:port/{vdb name}?

Optionally, several parameters can be used in conjunction with the Heimdall Connect string:

  • hduser={username}: The username to authenticate against the Heimdall server with.
  • hdpassword={password}: The password for the Heimdall server, again overriding the parameter in the data source.
  • user={username}: The username to use to authenticate against the database with. If the event the hduser is not specified, this will ALSO be used as the default hduser value, in cases where the users on both match.
  • password={password}: The password for the database server. In the case where the hdpassword is not specified, this will also be used as the default hdpassword. Note: When using the user and password, the data source credentials will still be used for health checking the application and should have appropriate access for this task.

Note: Typically, only the hduser and hdpassword is used in the Heimdall JDBC URL, as the data source can provide the user and password options. The ability to specify this here is to allow applications to specify their own username and password, in general to allow many users to access a data source at once.

Proxy Configuration

Proxy mode operates by having the management server either start a process on its own, or having a proxy on a remote system connect. Once started, the proxy will open a port that a client can connect to as if the proxy were the database itself, and access data via that process.

  • Management Server Proxy: This option allows the management server to start and manage restarts of a proxy instance on the same server as the management server itself. This option is a great way to simplify the testing of Heimdall in limited environments.

When in proxy mode, several options will be available, two options being required: the address binding type, and the proxy port. The address binding type specifies the behavior of the proxy:

  • Any: In this binding mode, all local IP addresses will be bound to, or more specifically, it binds to "0.0.0.0";
  • Localhost Only: When this option is set, the binding will be to the 127.0.0.1 IP only;
  • Specific IP: This allows a specific IP address to be bound for use. An example of this is to use 172.17.0.1 in a docker container when bridge mode is enabled, so that the proxy can bind for use by any other containers on the same host, without impacting any other proxies that may reside on another host.

Linux Note: When the Localhost Only option is enabled for a proxy when run, then the proxy will also attempt to listen on the Unix socket that matches what a database would listen to for the MySQL and Postgres database types in order to allow clients the greatest level of compatibility. To disable this behavior yet use localhost listening behavior, please use the "Specific IP" option and specify 127.0.0.1 as the Proxy Address.

Proxy port: The second required option. This specifies the value of the TCP port that the proxy will be listened to. If there is an error binding to a given port or IP, then a GUI alert will be issued when the proxy attempts to start. It is important that the ports do not conflict with other proxies being run on the same host or binding, as only one process can bind to a specific IP:port combination at once. This applies if installed on the same server as a database as well--if the database is on port 3306 for example, than the proxy can not use the same port.

There are several other options available in the proxy configuration as well:

  • Max Heap Size(MB): (only visible if the "management server proxy option is set) The setting for the Proxy java heap memory limit. Default is '600M' which with overhead, will generally consume up to about 1GB of total RAM. This will also result in the setting "Xms" to set the smallest heap size, so as to try to provide more consistency in the "free memory" graph on the dashboard and to ensure that the total memory is always available. Please note, that in distributed proxy mode, this setting has no effect--the java options need to be configured via the user data or heimdall.conf file on the remote instance.

  • multiplex: Do load balancing to the server at the transaction level vs. the connection level. This can drastically reduce the number of connections being established to the server. Requires connection pooling to avoid a dramatic performance drop, and may have side-effects that break certain apps. Test carefully before using in production. DelayedTransactions and multiplexing can be used together to improve the query distribution for apps that support them.

  • Proxy TLS Support: When enabled, and the client requests it, this option will enable TLS negotiation. Initially, a self-signed certificate will be generated for the proxy, which can then be customized in the Java keystore file in the proxy install directory.

  • Proxy TLS Required: When TLS is enabled, this option will force all connections to connect only with TLS. Any attempt to connect without TLS will be rejected. For Postgres and MySQL, warnings will be provided to the client on the disconnect, with SQL Server, only an alert on the GUI will be presented.

  • Authentication Mode: This dropdown allows selecting of the proxy authentication mechanisms. Please see the Theory->Proxy Authentication Theory->proxy authentication page for more information on authentication.

  • Authentication Test: This section allows testing of authentication at the proxy layer. As full SQL based authentication can provide options such as limiting the scope by source IP, what database, or even if you are connecting via SLS or not, this gives a convient UI to determine if an authentication request would pass or fail the proxy layer. Please note--this still requires authentication at the database level, which is not tested via this interface.

The Proxy configuration section also provides a button to "Restart Proxy", which allows all local and remove proxy nodes to be restarted at once.

Caching

This set of options configures the base cache used by a given VDB. There can be only one cache per VDB at any time, and if the cache settings for a given type are to be changed at runtime, it is required that the cache be disabled first, then enabled again with the new settings. This will allow the cache to be completely torn down, and reinitialized, allowing the entire type of cache to be changed without application restart.

Options (what is visible will vary depending on what type of cache is selected):

  • Enable query caching: Check to enable caching–if not checked, no cache rules will trigger caching, although rule processing will be done as normal.
  • PCI/HIPAA Cache: When enabled, no cached objects will actually be transmitted to the grid-cache, but the cache will be used to message invalidation requests. This is to avoid transmitting possible sensitive data across unencrypted connections, while maintaining cache synchronization integrity. This option is mutually exclusive with disabling the grid cache offload function, and will remove the benefit of the cache in cold-start scenarios.
  • Auto-tune Cache: Enabling this option will enable the cache logic to disable caching in situations where it doesn’t make sense from a performance perspective. This includes if a query pattern isn’t providing any cache hits, or the benefit of the hits that do occur isn’t sufficient to justify caching.
  • Cache Type: Select the desired cache type. Options will adjust based on the cache type selected.
  • Server: For external grid cache interfaces, specifies the server name or IP to connect to.
  • Port: When using an external grid cache, specifies the TCP port to connect to.
  • Database Number: For Redis, which database "number" is used, not supported when the Redis instance is in cluster protocol mode.
  • Cache Password: For interfaces with a password authenticated interface, the password to connect to the cache with.
  • Use SSL: When connecting to the cache, should SSL be used for the connection (Redis)
  • Verify Peer: When using SSL, verify the peer TLS certificate
  • API Cache Name: For interfaces that utilize a “named” cache interface, this is the name of the cache to use, i.e. Hazelcast.
  • Cache Configuration File: When supported, this is an external configuration file to configure the cache, i.e. a Hazelcast client or server XML configuration file. The location of the file should be relative to the current working directory of the JDBC driver or proxy, which will be printed on startup. On a proxy install, when this file is not present, it will be pulled from the Heimdall Server.
  • Grid Cache Offload: This option, if enabled, enables a first-tier local cache, which allows hot content to be served out of local heap. In the local cache, the objects will be stored in the performance optimal format (binary streaming format for a proxy, and Java objects when not a proxy).
  • Max Expiry: If set to a value above zero, this value (in seconds) is the maximum TTL the L1 cache will cache an object for. This overrides the TTL in the cache rules if it is lower than the TTL set there FOR L1 cache queries. This value does not impact the TTL in the L2 cache such as Redis.
  • Max Cached Objects: If set to a value other than 0, this limits the total number of objects in the L1 cache layer to a fixed maximum. This can assist in reducing memory garbage collection time under heavy load, and is advised under very high request volume.
  • Customize query cache key: This allows the key used to access cached objects to be customized in order to expand when the result is considered the same. By default, all queries will include a hash of the complete SQL query, but also includes the VDB name the request was made through, the database user, and the database catalog. These fields can be removed to allow, for example, multiple VDBs to share the same data in the cache.
  • Cluster manager via cache: This option allows a VDB to be managed by multiple management nodes at once, via the cache interface. For this to work, all management nodes must be populated with the vdb, and the cache settings configured to be identical. Once done, configuration changes for that VDB will be synchronized between nodes, and the cache will be used to propagate metrics as well, allowing the dashboard to be used at the same time on both nodes. Further, if one management node goes offline, the access layer nodes will remain unaffected.
  • AWS Access Key, Secret Key, Tag, Tag Value: When using Hazelcast with AWS Auto-discovery, these options may be necessary for proper discovery of other Hazelcast nodes.

Note: With Amazon’s AWS Elasticache service, the Redis parameter group option of “notify-keyspace-events” should be set to the value of “AE” in order to optimize cache behavior. This will also be instructed in the log output. For non-Elasticache Redis servers, this option will be configured automatically.

Note 2: Memory allocation for cache is dynamically controlled based on free space allocated to heap. Use the VDB setting of "xmx" to adjust this (in vdb properties). It defaults to target about 1GB of total memory used by the proxy process.

Data Source & Rule List

Specify at least one data source, as a default for data to be retrieved from. If a forward policy is specified in the rules for a vdb, it also must be selected here to insure proper connectivity is established to that data source for the forward function to work properly. A connection to the data source will only be established when used if not the primary data source. A reasonable attempt will be made to insure that the data sources for forwarding and read/write split are automatically populated here, but in some rare corner cases (with dynamically generated properties), all data sources will need to be specified here.

Like the data source, the rule list selector configures what rules should be attached to the vdb. If empty, no rule list will be used. All rule lists that are used by the initial (default) rule-list must be specified or they will not be executed. A reasonable attempt will be made to insure that the rules used in "call" actions are automatically populated here, but in some rare corner cases (with dynamically generated properties), all rules will need to be specified here. The log may generate warnings if this is not setup properly.

Advanced Features

  • delayedTransaction: (proxy only) Delay when a transaction is started until a DML is detected. This is useful when using a framework that does everything in transactions, and the "in-trans" option in a rule isn't useful. Rules set to not match in a transaction will operate until a DML is encountered, then will not match until a commit/rollback. There may be other performance benefits on the server side as well depending on server. Test carefully before production use.
  • trackQueryDistributionCount: a count (Long) of the number of query entries to track for auto cache refreshing (default of 1000, requires autoRefresh cache option).
  • paranoia=Enable paranoid logging, don't reveal the SQL patterns or queries, only their hashes in debug logs
  • healthcheck-port: If the proxy should do a self-check on itself, and expose the status via the designated port. With an HTTP connection, request /status to determine the status of the proxy.
  • healthcheck-interval:
  • JMX Port: If JMX monitoring is desired, set the port here. Typically 1099 if used. Please
  • JMX Hostname: The hostname that JMX uses to refer to itself--the monitoring clients must resolve this name to the Heimdall instance. The value of ${hostname} can be used to map to the internally detected hostname, and can be combined with other string values, i.e. "${hostname}-proxy" could be used to provide proper resolution on a remote client that doesn't have access to the actual hostname via dns.
  • JMX Username: The username the JMX client will use to connect to the proxy
  • JMX Password: The password for JMX connectivity

Logging

The following log options are available:

  • Log Connections: to log when a JDBC or TCP connection is established by the calling application
  • Log All SQL: equivalent to enabling a LOG policy with a .* wildcard. If this is enabled, it will override any log rate control configured on a log policy.
  • AWS Cloudwatch: If in AWS, and selected, then multiple metric points will be generated under the Heimdall category for the VDB and AWS instance to allow monitoring of the access layer performance. Note, these metrics are generated on a per-minute basis, so may add to the cost of cloudwatch. Additionally, console logs will be shipped to cloudwatch as well when selected.
  • Aggregate Console Logs: In order to simplify debugging, this option allows console logs from the access layers to be sent to the central manager for logging. This avoids issues where the logs may not be showing, due to Log4j configuration, for example. When using the "management server proxy" option, this option will be implicitly configured for the management server managed proxy instance, even if not set for other proxies.
  • Write Logs to Files: When performing SQL logging, the logs are by default written to an internal database for optimal Analytics speed. If the logs should also be written to detailed CSV logs for each query, please select this option. This may result in a large amount of data written to the log directory, although logs will automatically be cleaned up when that filesystem reached 90% full.

Important: When logging SQL, a large amount of data may be generated, and logging can impact the overall performance. To avoid this, you can disable sql logging here, and instead use a log policy with parameters that limit how much data is written. Please see the rules section for details on parameters and the logging section for more details on logging overall.

Debugging

  • Pass-through Enabled: When in JDBC mode, the system can be set to pass any NEW connections through to the underlying driver, bypassing all processing logic. This should only be used in rare situations to debug if the system is causing a problem directly, or the existence of the system is causing a problem.
  • Verbose Debug Mode: In order to diagnose issues with rule processing or other behavior anomalies, this option can be set in order to track the processing of a query through the access layer. This option can cause significantly performance penalties and should be used with caution.
  • Log Methods: log all JDBC method calls made by an application, excluding those relating to resultSets. Warning!
  • Log ResultSet Methods: Include ResultSet operations when logging other methods. Warning!

Important: When adding in method logging, for every query generated, it may result in a dozen records for JDBC methods, and with resultset methods, every row will likely have one or more log records. Caution should be observed when using these two log options are used, and should in general only be used in low-volume lab conditions.

VDB Properties

Allows a list of name-value pairs to be used to configure various options, in general per direction from customer support, and for use with test releases to enable a fix. Example:

  • supressNoResult=true With Postgres, if an update query is executed via the executeQuery result, it will generate an exception on return saying "No results were returned by the query". In some frameworks, this is detected and suppressed when using the native Postgres driver, but not with the Heimdall driver. In order to work around this behavior, this option can trigger us returning a null instead of a resultset, which appears to allow the calling code to work fine.

  • connectionIdleTimeout=(proxy only) Connection timeout (in ms) for an idle client-side connection, will terminate connection's thread

  • dnsCacheTTL=The cache time (in s) to use for the dns resolution cache, defaults to 5s to conform with AWS requirements. Note: This will impact the global JVM setting for this if used in JDBC mode.

  • dnsNegativeTTL=The time to cache negative DNS (non-existent) queries, defaults to 10s per Java standard. Note: This will impact the global JVM setting for this if used in JDBC mode.

  • delayCacheInit=the time in seconds to delay initializing the cache, normally not needed.

Note When using the healthcheck-port option, an additional feature is enabled, that of API based invalidation. Please see the Cache theory section for more details.