Dataset To SQL Utility

Contents

  1. Introduction
  2. Command Line Usage
  3. DBMS Support
  4. Usage Notes
  5. SQL Server Options
  6. SQLite Options
  7. Solution Query Example (SQL Server)
  8. Solution Query Example - Simple Schema

1. Introduction

The Database to SQL Script utility will produce an SQL Script from any given PLEXOS Database, (both Input and Solution databases), which will create and populate an SQL Database with the corresponding schema and data. It currently creates Primary Keys, Foreign Key Relationships and Unique Constraints.

2. Command Line Usage


DatasetToSQL <database_filename> [output_filename] [--drop] [--target="dbms"] [--progress] [--compress]
[--filter=<classname>] [--periodtype=<periodtype>] [--dataonly] [--dbname=<name>]
<database_filename>can be an input database (*.xml) or a Solution database (*.zip)
--drop                       Generate SQL to DROP the database if it already exists.
--target                     Generates additional SQL supported by a specific DBMS. (default=sqlserver)
                             Valid dbms values are: sqlserver, mysql, oracle, sqlite, access (32-bit only)
--progress                   Generates progress messages in the SQL Script.
--compress                   Writes the output file to a ZIP archive.
--filter                     To split up large databases you can provide a class filter for the data files.
--periodtype                 To split up large databases you can provide a period type filter for the data files.
--dataonly                   Generates INSERTs for data-related tables only.
--dbname                     Use specific database name rather than the filename of the dataset.
--simple                     For Solution databases, creates a database with a simpler and faster schema.

Additional Options (SQL Server only):

DatasetToSQL [database_filename] <output_filename> [--drop] [--execute] [--sqlserver-server=<server>][--sqlserver-integrate}
[--sqlserver-login=<login>] [--sqlserver-password=<password>][--sqlserverdatafiles][--sqlserverdatafilepath-out=<path>]
[--sqlserverdatafilepath-in=<path>] [--overwrite]
--execute                    Execute the SQL (SQL Server only; requires server and login information)
--sqlserver-server           Server name (SQL Server only)
--sqlserver-integrated       Use Integrated Security (SQL Server only)
--sqlserver-login            Login name (SQL Server only)
--sqlserver-password         Password (SQL Server only)
--sqlserverdatafiles         Generate data files for solution data allowing for faster data import in SQL Server.
--sqlserverdatafilepath-out  The path in which to generate the solution data files for SQL Server (default=current directory).
--sqlserverdatafilepath-in   The path in which SQL Server will be able to access the solution data files
                             (default=sqlserverdatafilepath-out).
--overwrite                  Generate code to re-use an existing database, clearing its data first. 
               

Additional Options (SQLite only):

--sqlitedatafiles         Generate data files for solution data allowing for faster data import in SQLite.
--sqlitedatafilepath-out  The path in which to generate the solution data files for SQLite (default=current directory).
                

3. DBMS Support

At the present time, SQL Server, MySQL, Oracle, SQLite and Access (32-bit only) are supported. Use the --target command line argument to specify which DBMS you intend to run the script for.

To populate a Microsoft Access Database, use the 32-bit version of DataSetToSQL and ensure you have installed the 32-bit version of the Microsoft Access Database Engine.

4. Usage Notes

The sql script generated may be quite large. It may be necessary, therefore, to execute the resulting script from the command line.

SQL Server: sqlcmd -i script.sql

MySQL: mysql < script.sql

Oracle: sqlplus @script.sql

SQLite: sqlite3 -init script.sql database.db ".exit"

Access databases are created by the utility directly. There is no script output.

5. SQL Server Options

The most efficient method to import Solution Data into SQL Server is to pass the --execute flag and SQL Server credentials so that DataSetToSQL can populate the database directly. If this is not possible, similar performance can be achieved by generating data files which can later be imported using SQL Server's BULK INSERT feature. In this case you do not need to specify credentials.

An independent BULK INSERT involves generating data files using the --sqlserverdatafiles flag, and specifying --sqlserverdatafilepath-out (where the files will be written to) and --sqlserverdatafilepath-in (where the files will be read from by SQL Server. This path must be visible to the instance of SQL Server running the import script and would only be set if you plan on moving the files to this location after they are generated.) You can then execute the SQL Script to populate the database.

6. SQLite Options

To more efficiently populate SQLite with Solution data, pass --sqlite-datafiles (and optionally an --sqlite-datafilepath). DataSetToSQL will now generate CSV files for the large t_data tables, and the SQL script will import from these files. Do not move the files once generated as the SQL script references the path explicitly.

7. Solution Query Example (SQL Server)

A basic query in SQL Server of the solution database would be as follows.

SELECT t_model.name AS 'Model', t_parent_object.name AS 'Parent Object', t_collection.name AS 'Collection', t_child_object.name 
                AS 'Child Object', t_property.name AS 'Property', t_key.band_id AS 'Band', t_timeslice.name AS 'Timeslice', 
                t_sample.sample_id AS 'Sample', t_period_0.datetime AS 'Datetime', t_data_0.value AS 'Value', t_unit.value AS 'Units'
FROM t_membership
INNER JOIN t_collection ON t_membership.collection_id = t_collection.collection_id
INNER JOIN t_object t_parent_object ON t_membership.parent_object_id = t_parent_object.object_id
INNER JOIN t_object t_child_object ON t_membership.child_object_id = t_child_object.object_id
INNER JOIN t_property ON t_collection.collection_id = t_property.collection_id
INNER JOIN t_unit ON t_property.unit_id = t_unit.unit_id
INNER JOIN t_key ON t_membership.membership_id = t_key.membership_id AND t_property.property_id = t_key.property_id
INNER JOIN t_model ON t_key.model_id = t_model.model_id
INNER JOIN t_timeslice ON t_key.timeslice_id = t_timeslice.timeslice_id
INNER JOIN t_sample ON t_key.sample_id = t_sample.sample_id
INNER JOIN t_data_0 ON t_key.key_id = t_data_0.key_id
INNER JOIN t_phase_4 ON t_data_0.period_id = t_phase_4.period_id
INNER JOIN t_period_0 ON t_phase_4.interval_id = t_period_0.interval_id
WHERE t_key.phase_id = @phase_id AND t_collection.collection_id = @collection_id AND (@property_id IS NULL OR
t_property.property_id = @property_id)
ORDER BY t_model.name, t_collection.name, t_parent_object.name, t_child_object.name, t_property.name, t_key.band_id, t_timeslice.name, 
t_sample.sample_id, t_period_0.datetime

Of the various "numbered" tables, such as t_data_0, t_data_1, t_period_0, t_period_1, etc; the number refers to a period type, where the numbers correspond as follows:

0 = Interval, 1 = Day, 2 = Week, 3 = Month, 4 = Year

phase_id's correspond as follows:

1 = LT Plan, 2 = PASA, 3 = MT Schedule, 4 = ST Schedule

8. Solution Query Example - Simple Schema

A basic query in SQL Server of the solution database would be as follows. In this format, all ids for parameters can be acquired from lookup tables.

SELECT t_parent_object.name AS 'Parent Object',
       t_model.name AS 'Model',
       t_phase.name AS 'Phase',
       t_collection.name AS 'Collection',
       t_child_object.name AS 'Child Object',
       t_category.name AS 'Category',
       t_property.name AS 'Property',
       t_solution_data.band_id AS 'Band',
       t_timeslice.name AS 'Timeslice',
       t_sample.name AS 'Sample',
       _date AS 'Datetime',
       t_solution_data.value AS 'Value',
       t_unit.name AS 'Units'
FROM t_solution_data
INNER JOIN t_model ON t_solution_data.model_id = t_model.model_id
INNER JOIN t_object t_parent_object ON t_solution_data.parent_id = t_parent_object.object_id
INNER JOIN t_object t_child_object ON t_solution_data.child_id = t_child_object.object_id
INNER JOIN t_collection ON t_solution_data.collection_id = t_collection.collection_id
INNER JOIN t_category ON t_solution_data.category_id = t_category.category_id
INNER JOIN t_property ON t_solution_data.property_id = t_property.property_id
INNER JOIN t_timeslice ON t_solution_data.timeslice_id = t_timeslice.timeslice_id
INNER JOIN t_sample ON t_solution_data.sample_id = t_sample.sample_id
INNER JOIN t_unit ON t_solution_data.unit_id = t_unit.unit_id
INNER JOIN t_phase ON t_solution_data.phase_id = t_phase.phase_id
INNER JOIN t_period_type ON t_solution_data.period_type_id = t_period_type.period_type_id
WHERE t_solution_data.period_type_id = @period_type_id AND
      t_solution_data.phase_id = @phase_id AND
      t_solution_data.collection_id = @collection_id AND
      t_solution_data.property_id = @property_id AND
      t_solution_data.model_id = @model_id AND
      t_solution_data.timeslice_id = @timeslice_id AND
      t_solution_data.sample_id = @sample_id
ORDER BY t_solution_data.model_id, parent_id, t_solution_data.collection_id, child_id, t_solution_data.property_id, t_solution_data._date