Table Ingestion Tuning

Table Ingestion Parameters

Data is ingested by periodically querying your table or view. A few parameters control how much data is ingested per query and how often queries run. To view or change their values, click Query Parameters on Job Options.

You will see the following three parameters, each with its current value displayed:

Query Cadence

This parameter controls how often, in minutes, we query your table. The interval is measured from the last time your table was queried. To see when that was, click the Job ID, which shows a chronological list of queries to your table.
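The relationship between the last query time and the cadence can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the scheduler itself; the function name `next_query_time` and its arguments are hypothetical.

```python
from datetime import datetime, timedelta


def next_query_time(last_query_time: datetime, cadence_minutes: int) -> datetime:
    """Return when the next query is due (hypothetical helper).

    The cadence is measured from the last time the table was queried,
    not from a fixed wall-clock schedule.
    """
    return last_query_time + timedelta(minutes=cadence_minutes)


# Example: last queried at 12:00 with a 30-minute cadence.
last = datetime(2023, 1, 1, 12, 0)
print(next_query_time(last, 30))
```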

Query Window Size

This parameter controls the size, in hours, of the query window. A query window is the time interval of your data to scan, where time is taken from the change_timestamp column you supplied when first configuring the job. The window always begins at the largest change_timestamp we have encountered while querying your table. The window ends the specified number of hours after that point; if the parameter is left at its default of 0, the window is unbounded and extends to the current time.

This is useful if you need to limit the amount of data scanned per query. If your table is large, we recommend partitioning it by the change_timestamp column. This parameter then limits the number of partitions scanned per query, which helps if cost is a concern.
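The window bounds described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual ingestion code: the function name `query_window` and its arguments are hypothetical, and a nonzero window is assumed to end the given number of hours after its start.

```python
from datetime import datetime, timedelta
from typing import Tuple


def query_window(
    max_seen_change_timestamp: datetime,
    window_size_hours: int,
    now: datetime,
) -> Tuple[datetime, datetime]:
    """Return the (start, end) of the next query window (hypothetical helper).

    The window starts at the largest change_timestamp seen so far.
    A window size of 0 (the default) means unbounded: the window runs
    up to the current time.
    """
    start = max_seen_change_timestamp
    if window_size_hours == 0:
        return start, now
    # Assumption: a nonzero size ends the window that many hours after start.
    return start, start + timedelta(hours=window_size_hours)


now = datetime(2023, 1, 2, 0, 0)
latest = datetime(2023, 1, 1, 8, 0)
print(query_window(latest, 0, now))  # unbounded window, ends at `now`
print(query_window(latest, 6, now))  # bounded window, 6 hours long
```

With a table partitioned by change_timestamp, a bounded window like the second one would touch only the partitions that overlap those six hours.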

Row Limit

This parameter sets the maximum number of rows to ingest per query. Note that if the query window covers fewer rows than the row limit, the query returns fewer rows than the limit.
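The interaction between the query window and the row limit can be illustrated with a small in-memory simulation. The function `select_rows` is hypothetical, and two details are assumptions made for the sketch rather than documented behavior: rows are taken in ascending change_timestamp order, and the window excludes its start and includes its end.

```python
from datetime import datetime
from typing import List


def select_rows(
    change_timestamps: List[datetime],
    window_start: datetime,
    window_end: datetime,
    row_limit: int,
) -> List[datetime]:
    """Return the rows one query would ingest (hypothetical simulation)."""
    # Assumption: start is exclusive, end is inclusive.
    in_window = sorted(
        ts for ts in change_timestamps if window_start < ts <= window_end
    )
    # Never return more than row_limit rows, even if the window holds more.
    return in_window[:row_limit]


# Five rows, one per hour from 01:00 to 05:00.
rows = [datetime(2023, 1, 1, hour) for hour in range(1, 6)]
start = datetime(2023, 1, 1, 0)
two_hour_end = datetime(2023, 1, 1, 2)

# The window covers only two rows, so a row limit of 10 is not reached.
print(len(select_rows(rows, start, two_hour_end, row_limit=10)))

# A row limit smaller than the window's contents caps the result.
print(len(select_rows(rows, start, datetime(2023, 1, 1, 5), row_limit=3)))
```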

Copyright © 2023 Arize AI, Inc