SHOW LOAD

Description

Displays information about all load jobs, or about the specified load jobs, in a database. This statement displays only load jobs that are created by using Broker Load or INSERT.

However, we now recommend that you use the SELECT statement to query the results of Broker Load or INSERT jobs from the loads table in the information_schema database. For more information, see Batch load data from Amazon S3.
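
As a minimal sketch of the recommended approach, the following query reads one job from the loads table. It assumes that the table exposes a LABEL column, and label1 is a hypothetical label; check the column names in your own cluster before relying on them.

```sql
-- Assumption: information_schema.loads has a LABEL column.
-- "label1" is a hypothetical label used only for illustration.
SELECT * FROM information_schema.loads WHERE LABEL = 'label1';
```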

In addition to the preceding loading methods, CelerData supports Stream Load and Routine Load. Stream Load is a synchronous operation and returns the result of the load job directly. Routine Load is an asynchronous operation; use the SHOW ROUTINE LOAD statement to display information about Routine Load jobs.

Syntax

SHOW LOAD [ FROM db_name ]
[
   WHERE [ LABEL { = "label_name" | LIKE "label_matcher" } ]
         [ [AND] STATE = { "PENDING" | "ETL" | "LOADING" | "FINISHED" | "CANCELLED" } ]
]
[ ORDER BY field_name [ ASC | DESC ] ]
[ LIMIT { [offset, ] limit | limit OFFSET offset } ]

Note

You can add the \G option to the statement (for example, SHOW LOAD WHERE LABEL = "label1"\G;) to display the output vertically instead of in the usual horizontal table format. For more information, see Examples.

Parameters

db_name
  Required: No
  Description: The database name. If this parameter is not specified, your current database is used by default.

LABEL = "label_name"
  Required: No
  Description: The label of the load job to query.

LABEL LIKE "label_matcher"
  Required: No
  Description: If this parameter is specified, the information of load jobs whose labels contain label_matcher is returned.

AND
  Required: No
  Description:
    • If you specify only one filter condition in the WHERE clause, do not specify this keyword. Example: WHERE STATE = "PENDING".
    • If you specify more than one filter condition in the WHERE clause, you must specify this keyword. Example: WHERE LABEL = "label_name" AND STATE = "PENDING".

STATE
  Required: No
  Description: The state of the load jobs to query. The states vary based on loading methods.
    • Broker Load
      • PENDING: The load job is created.
      • QUEUEING: The load job is in the queue waiting to be scheduled.
      • LOADING: The load job is running.
      • PREPARED: The transaction has been committed.
      • FINISHED: The load job succeeded.
      • CANCELLED: The load job failed.
    • INSERT
      • FINISHED: The load job succeeded.
      • CANCELLED: The load job failed.
    If the STATE parameter is not specified, the information of load jobs in all states is returned by default. If the STATE parameter is specified, only the information of load jobs in the given state is returned. For example, STATE = "PENDING" returns the information of load jobs in the PENDING state.

ORDER BY field_name [ASC | DESC]
  Required: No
  Description: If this parameter is specified, the output is sorted in ascending or descending order based on a field. The following fields are supported: JobId, Label, State, Progress, Type, EtlInfo, TaskInfo, ErrorMsg, CreateTime, EtlStartTime, EtlFinishTime, LoadStartTime, LoadFinishTime, URL, and JobDetails.
    • To sort the output in ascending order, specify ORDER BY field_name ASC.
    • To sort the output in descending order, specify ORDER BY field_name DESC.
    If you do not specify the field and the sort order, the output is sorted in ascending order of JobId by default.

LIMIT limit
  Required: No
  Description: The maximum number of load jobs to display. If this parameter is not specified, the information of all load jobs that match the filter conditions is displayed. If this parameter is specified, for example, LIMIT 10, only the information of 10 load jobs that match the filter conditions is returned.

OFFSET offset
  Required: No
  Description: The number of load jobs to skip. For example, OFFSET 5 skips the first five load jobs and returns the rest. The value of the offset parameter defaults to 0.
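
The parameters above can be combined in one statement. The following sketch is built only from the syntax shown in this topic; the database name example_db and the label fragment etl_orders are hypothetical.

```sql
-- Hypothetical names: example_db (database), etl_orders (label fragment).
-- Return the 10 most recently created jobs that finished successfully
-- and whose labels contain "etl_orders".
SHOW LOAD FROM example_db
WHERE LABEL LIKE "etl_orders" AND STATE = "FINISHED"
ORDER BY CreateTime DESC
LIMIT 10;

-- Skip the first 5 jobs and return the next 10.
-- Per the syntax, both LIMIT forms express the same paging.
SHOW LOAD FROM example_db ORDER BY CreateTime DESC LIMIT 5, 10;
SHOW LOAD FROM example_db ORDER BY CreateTime DESC LIMIT 10 OFFSET 5;
```

Sorting explicitly by CreateTime makes paging predictable; without ORDER BY, the output is sorted by JobId in ascending order.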

Output

+-------+-------+-------+----------+------+----------+---------+----------+----------+------------+--------------+---------------+---------------+----------------+-----+------------+
| JobId | Label | State | Progress | Type | Priority | EtlInfo | TaskInfo | ErrorMsg | CreateTime | EtlStartTime | EtlFinishTime | LoadStartTime | LoadFinishTime | URL | JobDetails |
+-------+-------+-------+----------+------+----------+---------+----------+----------+------------+--------------+---------------+---------------+----------------+-----+------------+

The output of this statement varies based on loading methods.

The following list describes each field for Broker Load jobs and for INSERT jobs.

JobId
  Broker Load: The unique ID assigned by CelerData to identify the load job in your CelerData cluster.
  INSERT: The field has the same meaning in an INSERT job as it does in a Broker Load job.

Label
  Broker Load: The label of the load job. The label of a load job is unique within a database but can be duplicated across different databases.
  INSERT: The field has the same meaning in an INSERT job as it does in a Broker Load job.

State
  Broker Load: The state of the load job.
    • PENDING: The load job is created.
    • QUEUEING: The load job is in the queue waiting to be scheduled.
    • LOADING: The load job is running.
    • PREPARED: The transaction has been committed.
    • FINISHED: The load job succeeded.
    • CANCELLED: The load job failed.
  INSERT: The state of the load job.
    • FINISHED: The load job succeeded.
    • CANCELLED: The load job failed.

Progress
  Broker Load: The stage of the load job. A Broker Load job only has the LOAD stage, whose progress ranges from 0% to 100%. When the load job enters the LOAD stage, LOADING is returned for the State field.
    Note
    • The formula to calculate the progress of the LOAD stage: Number of CelerData tables that complete data loading / Number of CelerData tables that you plan to load data into * 100%.
    • When all data is loaded into CelerData, 99% is returned for the LOAD stage. Then, the loaded data starts taking effect in CelerData. After the data takes effect, 100% is returned for the LOAD stage.
    • The progress of the LOAD stage is not linear. Therefore, the progress value may not change over a period of time even if data loading is still ongoing.
  INSERT: The stage of the load job. An INSERT job only has the LOAD stage, whose progress ranges from 0% to 100%. When the load job enters the LOAD stage, LOADING is returned for the State field.
    The notes are the same as those for Broker Load.

Type
  Broker Load: The method of the load job. The value of this field defaults to BROKER.
  INSERT: The method of the load job. The value of this field defaults to INSERT.

Priority
  Broker Load: The priority of the load job. Valid values: LOWEST, LOW, NORMAL, HIGH, and HIGHEST.
  INSERT: -

EtlInfo
  Broker Load: The metrics related to ETL.
    • unselected.rows: The number of rows that are filtered out by the WHERE clause.
    • dpp.abnorm.ALL: The number of rows that are filtered out due to data quality issues, that is, mismatches between source tables and CelerData tables in, for example, the data type or the number of columns.
    • dpp.norm.ALL: The number of rows that are loaded into your CelerData cluster.
    The sum of the preceding metrics is the total number of rows of raw data. You can use the following formula to calculate the percentage of unqualified data and check whether it exceeds the value of the max_filter_ratio parameter: dpp.abnorm.ALL/(unselected.rows + dpp.abnorm.ALL + dpp.norm.ALL).
  INSERT: The metrics related to ETL. An INSERT job does not have the ETL stage. Therefore, NULL is returned.

TaskInfo
  Broker Load: The parameters that are specified when you create the load job.
    • timeout: The time period that a load job is allowed to run. Unit: seconds.
    • max_filter_ratio: The largest percentage of rows that can be filtered out due to data quality issues.
    For more information, see BROKER LOAD.
  INSERT: The parameters that are specified when you create the load job.
    • timeout: The time period that a load job is allowed to run. Unit: seconds.
    • max_filter_ratio: The largest percentage of rows that can be filtered out due to data quality issues.
    For more information, see INSERT.

ErrorMsg
  Broker Load: The error message returned when the load job fails. When the state of the load job is PENDING, LOADING, or FINISHED, NULL is returned for the ErrorMsg field. When the state of the load job is CANCELLED, the value returned for the ErrorMsg field consists of two parts: type and msg.
    • The type part can be any of the following values:
      • USER_CANCEL: The load job was manually canceled.
      • ETL_SUBMIT_FAIL: The load job failed to be submitted.
      • ETL_QUALITY_UNSATISFIED: The load job failed because the percentage of unqualified data exceeds the value of the max_filter_ratio parameter.
      • LOAD_RUN_FAIL: The load job failed in the LOAD stage.
      • TIMEOUT: The load job failed to finish within the specified timeout period.
      • UNKNOWN: The load job failed due to an unknown error.
    • The msg part provides the detailed cause of the load failure.
  INSERT: The error message returned when the load job fails. When the state of the load job is FINISHED, NULL is returned for the ErrorMsg field. When the state of the load job is CANCELLED, the value returned for the ErrorMsg field consists of two parts: type and msg.
    • The type part can be any of the following values:
      • USER_CANCEL: The load job was manually canceled.
      • ETL_SUBMIT_FAIL: The load job failed to be submitted.
      • ETL_RUN_FAIL: The load job failed to run.
      • ETL_QUALITY_UNSATISFIED: The load job failed due to quality issues of raw data.
      • LOAD_RUN_FAIL: The load job failed in the LOAD stage.
      • TIMEOUT: The load job failed to finish within the specified timeout period.
      • UNKNOWN: The load job failed due to an unknown error.
      • TXN_UNKNOWN: The load job failed because the state of the transaction of the load job is unknown.
    • The msg part provides the detailed cause of the load failure.

CreateTime
  Broker Load: The time at which the load job was created.
  INSERT: The field has the same meaning in an INSERT job as it does in a Broker Load job.

EtlStartTime
  Broker Load: A Broker Load job does not have the ETL stage. Therefore, the value of this field is the same as the value of the LoadStartTime field.
  INSERT: An INSERT job does not have the ETL stage. Therefore, the value of this field is the same as the value of the LoadStartTime field.

EtlFinishTime
  Broker Load: A Broker Load job does not have the ETL stage. Therefore, the value of this field is the same as the value of the LoadStartTime field.
  INSERT: An INSERT job does not have the ETL stage. Therefore, the value of this field is the same as the value of the LoadStartTime field.

LoadStartTime
  Broker Load: The time at which the LOAD stage starts.
  INSERT: The field has the same meaning in an INSERT job as it does in a Broker Load job.

LoadFinishTime
  Broker Load: The time at which the load job finishes.
  INSERT: The field has the same meaning in an INSERT job as it does in a Broker Load job.

URL
  Broker Load: The URL that is used to access the unqualified data detected in the load job. You can use the curl or wget command to access the URL and obtain the unqualified data. If no unqualified data is detected, NULL is returned.
  INSERT: The field has the same meaning in an INSERT job as it does in a Broker Load job.

JobDetails
  Broker Load: Other information related to the load job.
    • Unfinished backends: The ID of the BE that has not completed data loading.
    • ScannedRows: The total number of rows that are loaded into CelerData and the number of rows that are filtered out.
    • TaskNumber: A load job can be split into one or more tasks that run concurrently. This field indicates the number of load tasks.
    • All backends: The ID of the BE that is executing data loading.
    • FileNumber: The number of source data files.
    • FileSize: The data volume of source data files. Unit: bytes.
  INSERT: The field has the same meaning in an INSERT job as it does in a Broker Load job.
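
The unqualified-data formula described for EtlInfo can be checked with plain SQL arithmetic. The first query below uses the EtlInfo values from the sample job in Examples; the second uses hypothetical values (50 abnormal rows, 950 normal rows, and a threshold of 0.1) chosen only to illustrate the comparison.

```sql
-- Sample job in Examples: unselected.rows=0, dpp.abnorm.ALL=0, dpp.norm.ALL=65546.
-- No rows were filtered out for quality reasons, so the ratio is 0.
SELECT 0 / (0 + 0 + 65546) AS filter_ratio;

-- Hypothetical job: unselected.rows=0, dpp.abnorm.ALL=50, dpp.norm.ALL=950.
-- The ratio is 50/1000 = 0.05, which does not exceed an assumed
-- max_filter_ratio of 0.1, so exceeds_threshold evaluates to false.
SELECT 50 / (0 + 50 + 950)       AS filter_ratio,
       50 / (0 + 50 + 950) > 0.1 AS exceeds_threshold;
```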

Usage notes

  • The information returned by the SHOW LOAD statement is valid for 3 days from the LoadFinishTime of a load job. After 3 days, the information can no longer be displayed. You can use the label_keep_max_second parameter to change the default validity period.

    ADMIN SET FRONTEND CONFIG ("label_keep_max_second" = "value");

  • If the value of the LoadStartTime field remains N/A for a long time, a large number of load jobs have piled up. We recommend that you reduce the frequency of creating load jobs.

  • Total time period consumed by a load job = LoadFinishTime - CreateTime.

  • Total time period a load job consumed in the LOAD stage = LoadFinishTime - LoadStartTime.
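
The notes above can be made concrete as follows. This is a sketch under two assumptions: that label_keep_max_second is expressed in seconds, as its name suggests (so 7 days is 7 * 24 * 3600 = 604800), and that your client supports the TIMESTAMPDIFF function. The timestamps are taken from the sample job in Examples.

```sql
-- Assumption: the value is in seconds. Keep load job information for 7 days.
ADMIN SET FRONTEND CONFIG ("label_keep_max_second" = "604800");

-- Total time consumed by the sample job: LoadFinishTime - CreateTime (6 seconds).
SELECT TIMESTAMPDIFF(SECOND,
                     CAST('2022-10-17 19:35:00' AS DATETIME),
                     CAST('2022-10-17 19:35:06' AS DATETIME)) AS total_seconds;

-- Time consumed in the LOAD stage: LoadFinishTime - LoadStartTime (2 seconds).
SELECT TIMESTAMPDIFF(SECOND,
                     CAST('2022-10-17 19:35:04' AS DATETIME),
                     CAST('2022-10-17 19:35:06' AS DATETIME)) AS load_seconds;
```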

Examples

Vertically display all load jobs in your current database.

SHOW LOAD\G;
*************************** 1. row ***************************
         JobId: 976331
         Label: duplicate_table_with_null
         State: FINISHED
      Progress: ETL:100%; LOAD:100%
          Type: BROKER
      Priority: NORMAL
       EtlInfo: unselected.rows=0; dpp.abnorm.ALL=0; dpp.norm.ALL=65546
      TaskInfo: resource:N/A; timeout(s):300; max_filter_ratio:0.0
      ErrorMsg: NULL
    CreateTime: 2022-10-17 19:35:00
  EtlStartTime: 2022-10-17 19:35:04
 EtlFinishTime: 2022-10-17 19:35:04
 LoadStartTime: 2022-10-17 19:35:04
LoadFinishTime: 2022-10-17 19:35:06
           URL: NULL
    JobDetails: {"Unfinished backends":{"b90a703c-6e5a-4fcb-a8e1-94eca5be0b8f":[]},"ScannedRows":65546,"TaskNumber":1,"All backends":{"b90a703c-6e5a-4fcb-a8e1-94eca5be0b8f":[10004]},"FileNumber":1,"FileSize":548622}