EQuIS Migration Tool (Beta)

<< Click to Display Table of Contents >>

Navigation:  Database >

EQuIS Migration Tool (Beta)

Overview

How It Works

Data Tagging for Migration (Source Database)

Data Transfer to Server (Source Database and Separate Target Servers)

Data Validation

Data Migration into EQuIS Tables

EQuIS Migration Tool Limitations

 

Overview

 

The EQuIS Migration Tool (EMT), designed to migrate data from one EQuIS database to another, is currently in beta testing. The tool will replace the functionality of the previous Facility Migration Tool, transferring the data associated with a specified facility. EarthSoft plans to finalize the EMT in a future build of EQuIS 7. The EMT (Beta) has been added to the EQuIS_By_Request_Forms_All_[Build].zip package, located in the Downloads Dashboard of the EarthSoft Community Center for an organization's Primary Point of Contact (POC) under Products > Additional Resources > v7.0.

 

Some improvements of the EMT over the Facility Migration Tool include:

Removed the requirement in EMT that the “source” and “target” databases have the exact same Schemas applied and that they be updated to the same version.

Made EMT compatible with Azure SQL.

Made EMT more robust so that it can handle larger migrations and will not fail as easily during a migration process.

 

Further transfer options, such as exporting to an interim data file or migrating modules/subsets of data, are planned for a later release.

 

 

How It Works

 

The following diagram illustrates the flow of data from the source database to the target database.

 

DB_Pro-EMT-Process-Diagram

 

 

After the user specifies both the source and target databases, the migration process performs the following functions.

 

 

Data Tagging for Migration (Source Database)

 

This process identifies what data will be moved from the source to the target. EQuIS is a relational database with Foreign Keys establishing those relationships. This process starts with the DT_FACILITY table, tagging the single row representing the facility being migrated, then data from associated “children” tables, continuing down the table-tree level by level (from parent to child) until the process has reached the end of the table associations. It then moves back up the table-tree to tag any additional data needed by the “parent” relationship(s), primarily reference table records. It is thus possible to export only the Reference Values needed for the data being migrated.

 

Certain Reference Tables will bring their children records along during migration. A table showing which Parent Table – Child Table combinations will be migrated together is below:

 

Parent Tables

Child Tables

RT_DQM_CHECK

RT_DQM_CHECK_PARAM

RT_DQM_CHECK

RT_DQM_RULE

RT_DQM_QAPP

RT_DQM_CHECK

RT_DQM_QAPP

RT_DQM_QUALIFIER

RT_DQM_QAPP

RT_DQM_REASON

RT_DQM_QUALIFIER

RT_DQM_RULE

RT_DQM_REASON

RT_DQM_RULE

RT_GROUP

RT_GROUP_MEMBER

RT_MTH_ANL_GROUP

RT_MTH_ANL_GROUP_MEMBER

RT_REMAP

RT_REMAP_DETAIL

RT_TAXONOMY

RT_TAXONOMY_PARAMETER

 

A few tables in EQuIS do not use Foreign Keys to define their associated data – specifically DT_COST, DT_FILE, DT_GEOGRAPHY, DT_NOTE, and DT_GEO_TEST and related EQuIS Geotech tables. The final step of tagging looks at these special tables to tag any rows with associated data that has already been tagged (based on the column OBJECT_EUID or DB_RECORD_ID). For example, this step would tag a DT_NOTE table record associated with a location in the database.

 

Note on Action Levels (i.e., DT_ACTION_LEVEL*): Data in the DT_ACTION_LEVEL* tables may not receive the expected tags in these situations:

The FACILITY_ID column is not populated.

There is no subfacility assigned, and the action level(s) or a child data table to these tables do not contain data (e.g., rows in DT_FINDING have no associated action level).

 

To move any action levels, use the ActionLevels format.

 

 

Data Transfer to Server (Source Database and Separate Target Servers)

 

After tagging, data are transferred into the temporary tables associated with the target database for validation and final migration into the actual EQuIS tables. Where both the source and target data are accessible to the process, this transfer process is skipped.

 

The process automatically creates the temp tables based on the same from the source database. Only tables containing data tagged for migration will have a temp table created and have the data transferred. In addition to the EQuIS tables containing the data to be migrated, several temp tables created in the source database will also be transferred as they are used in the validation and final migration processes.

 

 

Data Validation

 

A Schema/data validation process compares the source and target Schemas to identify issues where data from the source database may be lost in the migration or Schema-based requirements of the target database will not allow source data. The validation process then queries the data to determine if an actual problem exists.

 

The problems that can exist are detailed in the table below, along with how the EQuIS Migration Tool handles those.

 

Concern

Migration Tool Behavior

Example

Any tables in the target database with a Primary or Unique key (PK/UK) that does not exist in the source database

If the target database is more restrictive than the source database and the data being migrated does not violate the more restrictive key, the data migrates.

 

Data that violates the more restrictive key cannot be migrated.

Test Alternate Key (TAK) on the DT_TEST table, where the target database is more restrictive (includes more fields in the TAK) than the source database.

Tables or Fields (columns) in the source database that do not exist in the target database

Any data populated in these will not migrate.

The source database has the DQM Schema, but not the target database. Records from tables like DT_RESULT_DQM will not migrate.

Fields (columns) required in the target database but not required in (or missing from) the source database

If there is no “default” value for the target database or a fix cannot automatically be applied, data for that table cannot be migrated.

The target database has the Geotech Schema (which adds a required WORKFLOW_STATUS column to DT_LOCATION), but not the source database.

 

DT_LOCATION.WORKFLOW_STATUS in the target database populates with the default value for that field.

Data Type/Size Mismatch

The data are queried to determine if errors might occur, and the user will be informed of any potential data loss.

Migration is being performed from an EQuIS database with a newer Schema applied to a database with an older Schema, or a particular Schema update modifying the type/size of a column was applied to the source but not the target database.

 

The source database has a field that allows 80 characters, but the target only allows 40. This is only a problem when data being migrated contains more than 40 characters, where the additional characters will be truncated.

CHECK constraints (either at table or column level) on the target database but not on the source database

The validation process identifies these and informs the user of the problem. While these may seem simple, they can cause major data loss, even preventing an entire table from being loaded.

The AQS Schema adds a field called AQS_POC that is required and must contain a value between 0 and 99. If the AQS Schema was applied to the target but not the source, all rows in DT_SAMPLE will be lost without applying a “fixed” value AQS_POC field, and since the column does not exist in the source, it must be added to the temp table.

 

 

Data Migration into EQuIS Tables

 

The final step in the migration process is populating the EQuIS tables in the target database with the source data in the temp tables.

 

The system dynamically generates and executes the SQL code necessary to INSERT rows into the target database.

 

Following is a summary of how this process works:

1.Tables are processed in the order of their ranking with other tables in the database – i.e., tables with no Foreign Keys (required value(s) in certain column(s) match values in the referenced table) are processed first, and then those with Foreign Keys (in order of how many levels deep that table sits). This is necessary to ensure that the parent data are loaded prior to the child data. The VW_TABLE_ORDER view shows the ranking for each table in the database, where a higher rating indicates more dependencies on other tables.

Note: A few tables that use a EUID value (but not part of a Foreign Key) to link to related data – e.g., DT_FILE, DT_GEOGRAPHY, DT_NOTE, and certain geotechnical tables – are pushed to the end of the process, including any of their child tables.

2.New EUID values are assigned for all rows that will be added as new rows in the target database (source EUID values are not valid in the target database).

3.Each row is assigned a number from 1 to n, which is used for batching and to ensure parent data are loaded prior to child data (i.e., self-referencing Foreign Keys).

4.Some triggers can be disabled, which improves processing speed and decreases loading errors. These are re-enabled when all data for a table has been loaded (see notes on disabling triggers below).

5.The loading process is done in batches for performance purposes (see batching process below).

Note: If an error occurs during the data loading, the batch size is decreased several times, at which point the system processes the data row-by-row. This allows loading of all data that can load, while data that will fail will be identified with a specific error. This significantly slows the loading process.

6.A row is added to the ST_EDD_BATCH table for the EQuIS Migration process performed.

7.When the process is complete, statistics for each table are provided with counts of the following:

Rows from the source database that were planned to be migrated.

New rows added to the target.

Errors that occurred in the process.

Rows that were excluded because of validation issues.

A downloadable Excel spreadsheet with detailed information about the migration is available by clicking Download All Logs. The detailed migration logs include tabbed worksheets containing a summary of the migration, target and source table information, rows skipped during migration, foreign keys, etc.

 

Configuration Setting: Batch Size for Data Loading

 

Data are loaded in batches to improve performance. This can improve performance of the data-load by 10 to 100 times or more. Two batch size options are used:

Standard – Default of 100,000 rows

Tables that contain “large binary data” such as DT_FILE – Default of 1,000 rows

 

These values are stored in the ST_CONFIG table as follows:

 

CONFIG_SECTION

CONFIG_KEY

STRING_VALUE

EQuIS.Migrate

BatchSize

100000

EQuIS.Migrate

BatchSizeLargeBinary

1000

The batch size can be specified by changing the STRING_VALUE to a numeric value (i.e., an integer without commas or periods). Other fields in the ST_CONFIG table should not be changed.

Configuration Setting: Disabling Triggers

 

Certain triggers can be disabled and will be if the login to the database has the permissions to do so. This is done for both performance reasons and to ensure that as much data as possible are loaded. Only triggers like trg_verify_qualifier and trg_require_parent_sample are disabled. Others can be added if needed, or the ability to disable triggers can be disabled completely. By default, the following rows are added to the ST_CONFIG table to control how/what triggers are disabled:

 

CONFIG_SECTION

CONFIG_KEY

OBJECT_TYPE

STRING_VALUE

EQuIS.Migrate

DisableTriggers

dt_

trg_...|trg_...|trg_...

EQuIS.Migrate

DisableTriggers

rt_

trg_...|trg_...|trg_...

EQuIS.Migrate

DisableTriggers

st_

trg_...|trg_...|trg_...

 

The STRING_VALUE is a pipe-separated (|) list of the trigger names to be disabled. If a trigger is listed, but does not exist in the database, it will be ignored. The OBJECT_TYPE column is optional and allows creating multiple rows and categorizing of triggers if needed. Multiple rows may be necessary, since the STRING_VALUE column has a limit of 2,000 characters.

 

Note: Disabling triggers affects all users, not just the one performing the migration.

 

To prevent triggers from being disabled, replace all rows in the ST_CONFIG table for this option with one row where STRING_VALUE is either blank or contains something is not a trigger, like in the table below.

 

CONFIG_SECTION

CONFIG_KEY

OBJECT_TYPE

STRING_VALUE

EQuIS.Migrate

DisableTriggers

 

DO NOT DISABLE TRIGGERS

 

 

EQuIS Migration Tool Limitations

 

There are some limitations to using the beta EQuIS Migration Tool, as detailed in the table below. To increase the likelihood of success, consider doing the following:

Back up the target database prior to migration.

For point-in-time database backups, note the time prior to initiating the migration.

Perform migrations into a test (not production) database first.

Create a RefVals EDD with appropriate reference values to migrate into the target database prior to migration.

Facility-specific reference value EDDs should be loaded following migration, as the target FACILITY_ID cannot be filled in until after it has been created by the EQuIS Migration Tool.

Add groups and action levels for the facility after the migration via EDD.

 

Limitation

Workaround

Cannot migrate using an SQL Server database/compatibility version prior to 2016.

Upgrade the compatibility level or server version.

Cannot update records in an existing facility; only insert new records into a new facility.

This feature is planned for a future release of the EQuIS Migration Tool.

Losing network connection during a migration can lead to a transport-level error if retry logic fails, breaking the migration

Ensure that the connection persists during the migration. For example, with an overnight migration, make sure the machine does not automatically turn off.

Migrations that encounter errors may take a long time to run.

Try different configuration settings.

Resolving data conflicts (such as by updating schemas) may help.

Not all group members (e.g., location groups or method analyte groups) may be picked up.

Set FACILITY_ID on groups in the RT_GROUP and RT_MTH_ANL_GROUP tables that should be migrated with the facility.

 

Alternately, export an EDD of the refvals format in the source database for these tables and children, remove undesired records from the EDD, and then load into the target database.

Action Levels, which are not guaranteed to have a FACILITY_ID assigned, may not be picked up.

Export an EDD of the ActionLevels format in the source database, and then load into the target database.

SPM plans may not fully migrate, or the EMT may attempt to migrate additional rows.

Ensure that SPM records in the source database have FACILITY_ID fully populated where appropriate/available.

 

Compare record counts in SPM tables thoroughly following migration, both between the source and target database and for the target database before and after.

 

Export SPM data via EDP to load into the new facility following migration.

The EMT may attempt to migrate additional reference value rows.

Review the EMT logs and delete any unnecessary reference values that have been added.

 

Report any unexpected behavior in migrating reference values to EarthSoft.

Cannot migrate using an EQuIS 6 or earlier database.

Upgrade database(s) to EQuIS 7 prior to migration.

By design, the majority of system table (ST_...) records are excluded from the migration. Consequently, DT_ records with USER_IDs populated, such as the DT_FILE table, are skipped.

Compare record counts in the DT_FILE table for the facility prior to migration.

 

Clearing the USER_ID field for a file record, if appropriate to do so, will allow migration of the record. Nullifying this field will not remove Owner permissions from the original uploader.