Calculated Column FAQs

This section answers the most frequently asked questions (FAQs) about creating and using calculated columns in an AtScale cube.

What can I do with a calculated column?

Calculated columns are useful for deriving values from the original dataset columns (such as calculating someone's age based on their birth date), for data cleansing and pre-processing (such as combining column values or substituting one value for another), or for computing new data values based on a number of input variables (such as calculating a profit margin based on revenue and costs).
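For example, each of the expressions below could serve as the formula for a separate calculated column. This is a minimal sketch: the column names are hypothetical, and the exact function names (shown here in Hive-style SQL) depend on the engine you use to query your data warehouse.

    -- Derive an age in whole years from a birth-date column
    FLOOR(DATEDIFF(CURRENT_DATE, birth_date) / 365.25)

    -- Cleanse data by substituting a default for missing values
    COALESCE(region, 'UNKNOWN')

    -- Compute a profit margin from revenue and cost columns
    (revenue - cost) / revenue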

Can I refer to a calculated column in another calculated column formula?

No. Calculated column formulas can operate only on the original dataset columns.
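For instance (hypothetical column names), if one calculated column is defined as revenue - cost, a second calculated column cannot reference it by name; it must restate the logic using only the original dataset columns:

    -- Not allowed: gross_profit is itself a calculated column
    -- gross_profit / revenue

    -- Allowed: restate the expression in terms of the original dataset columns
    (revenue - cost) / revenue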

What formulas can I use to define a calculated column?

The formulas that describe a calculated column must be valid SQL expressions that:

  • Would be valid as an item in the column list of a SELECT statement (see the sketch after this list).
  • Use SQL functions and expression syntax supported by the engine you are using to query your data warehouse.
  • Do not require an input or output type that AtScale doesn't support (for example, SQL functions that operate on map or array data types).
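As a rough check against the first point above, the formula should make sense in the position marked below. The table and column names are hypothetical; the point is only that the engine must be able to evaluate the formula as an item in a SELECT list.

    SELECT
      order_id,
      (revenue - cost) / revenue    -- the calculated column formula goes here
    FROM sales_facts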

Can I use aggregate functions in a calculated column formula?

No. Although aggregate functions are valid SQL, a calculated column is not the correct place to add aggregations to an AtScale cube. Instead, define a measure or a calculated member using the aggregate calculations that AtScale supports.
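To illustrate (hypothetical columns): keep the calculated column a row-level expression, and define the aggregation as a measure on top of it.

    -- Not appropriate as a calculated column formula: an aggregate function
    -- SUM(revenue * exchange_rate)

    -- Appropriate: a row-level expression; define a SUM measure on the result in the cube
    revenue * exchange_rate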

What SQL operators and functions can I use in a calculated column?

If you are using Google BigQuery as your data warehouse, see Functions & Operators in the BigQuery documentation.

If you are using a Hadoop cluster as your data warehouse, the answer depends on the underlying SQL-on-Hadoop engine that you are using. Here are links to the documentation for the SQL-on-Hadoop engines that AtScale supports:

  • Hive
    SQL Functions: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inFunctions
    Operators: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-Built-inOperators
  • Impala
    SQL Functions: http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/v2-0-x/topics/impala_functions.html
    Operators: http://www.cloudera.com/content/cloudera/en/documentation/cloudera-impala/v2-0-x/topics/impala_operators.html
  • SparkSQL
    Spark SQL is designed to be compatible with the Hive Metastore, SerDes, and UDFs. Currently, Spark SQL is based on Hive 0.12.0 and 0.13.1. See https://spark.apache.org/docs/latest/sql-programming-guide.html#compatibility-with-apache-hive
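Because function support differs between engines, the same derivation may need to be written differently depending on your data warehouse. As a sketch (hypothetical column name, assumed to be a DATE; check your engine's documentation for exact availability):

    -- Hive and Spark SQL: number of days between today and birth_date
    DATEDIFF(CURRENT_DATE, birth_date)

    -- Google BigQuery: the date part must be stated explicitly
    DATE_DIFF(CURRENT_DATE(), birth_date, DAY)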

Is the AS statement supported?

Yes, except when using InterSystems IRIS as a data warehouse.

The AS (alias) statement is not supported by InterSystems IRIS, so do not use it in the formula for a calculated column. Using it with this data warehouse leads to failed queries with an "Unsupported CAST target specified" message.
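For example (hypothetical columns), write the bare expression and let the name you give the calculated column in the model serve as its label, rather than adding an alias:

    -- Works on most engines, but fails on InterSystems IRIS:
    -- (revenue - cost) AS gross_profit

    -- Portable form: omit the alias
    revenue - cost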