Importing Tables and Views into AtScale Projects as Datasets

The first step in designing an AtScale cube is to import the physical tables and views that the cube will be based on.

About reusing datasets

Consider that reusing Datasets between the Dimension canvas and the Cube Canvas - or between multiple Dimensions - can cause the Engine to take unexpected join-paths at deployment time.

If a data architect wishes to reference the same table from multiple points in the same model, it is recommended to define separate Query Datasets (QDS) that select from the desired table. The separate datasets should be used to back each dimension or Fact model component. Doing so ensures the generation of run-time join-paths that are readily identified in the Design Center.

Note that each QDS must be created from scratch; converting an existing Dataset to a QDS does not satisfy the uniqueness requirement.

About this task

If you are not sure what datasets you need for your cube, you can drag them on to the main canvas to look at them first. Doing this adds the datasets to the project library. Consider the following:

  • Adding datasets to the library

    You can build up a dataset library by dragging all the datasets to the main cube canvas to start with. This allows you to preview columns, add calculated columns, and plan how you will model relationships between the datasets. Then you can remove all the datasets from the canvas and begin modeling by adding datasets on-by-one from the dataset library.

  • Using a table partitioned in BigQuery

    In case you are using Google BigQuery as a data warehouse, if you import as a fact dataset a table that is partitioned in BigQuery, the _PARTITIONTIME pseudo-column is listed among the columns of the dataset. When you are modeling data, you can build a Date dimension from this pseudo-column. When users working in client BI applications query your published cube, they can include WHERE clauses on _PARTITIONTIME; BigQuery will do partition pruning when serving the results.

    For more information about the _PARTITIONTIME pseudo-column, see the Google BigQuery documentation here.

Procedure

  1. In the Data Sources panel, click a data source name to expand it.

  2. Click a database or schema name to expand it.

  3. To add a table or view:

    • As a fact dataset: Drag a table or view on to the main canvas to use as a fact dataset.

    • As a dimension dataset: Drag a table or view to use as dimension datasets to the Dimensions panel. The Create a Dimension dialog opens. (See Add a Normal Dimension for more on specifying dimension properties.) After you specify the properties of the new normal dimension and click Save, the normal dimension appears on the canvas. Double-click the dimension header to open it in a dimension canvas and see the dataset that the normal dimension is based on.

      There are other types of dimensions that you can create in AtScale. However, the methods for creating them differ from the method described here. See Types of Dimensions in a Cube for information about other dimension types.

The datasets are added to the Library panel. This panel lists datasets that are available to all of the cubes that are in a project. If you dragged a dataset to the canvas by mistake, remove it from the canvas and them remove it from the Library panel.