SQL Select Validator
- class ConstraintValidator
This validator checks different aspects of a SQL query. You are expected to subclass it and implement its abstract methods.
- abstract allowed_joins() Sequence[JoinCondition]
Returns all of the tables allowed to be connected to the query via a JOIN and the equi-join conditions that must be met for the join to be valid.
- can_use_function(function: str) bool
Returns whether or not a SQL function is allowed to be used anywhere in the query. By default, this checks the function against the list of safe functions that we have curated by hand.
- condition_column_allowed(fq_column: FqColumn) bool
Checks if a column is allowed to be used in an ORDER BY. By default, this calls select_column_allowed(), but if you override this method and want to preserve that behavior, you should call select_column_allowed() yourself.
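The delegation described above can be sketched as follows. This is a minimal, self-contained illustration and not the library's code: FqColumn and BaseValidator are stand-ins defined here, the table and column field names are assumptions, and the allowlists are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FqColumn:
    """Stand-in for the library's FqColumn; the field names are assumptions."""
    table: str
    column: str


class BaseValidator:
    """Stand-in that mimics only the documented default delegation."""

    def select_column_allowed(self, column: FqColumn) -> bool:
        raise NotImplementedError

    def condition_column_allowed(self, fq_column: FqColumn) -> bool:
        # Documented default: defer to select_column_allowed().
        return self.select_column_allowed(fq_column)


class RentalValidator(BaseValidator):
    # Hypothetical allowlists, for illustration only.
    SELECTABLE = {("film", "title"), ("film", "release_year")}
    CONDITION_ONLY = {("rental", "customer_id")}

    def select_column_allowed(self, column: FqColumn) -> bool:
        return (column.table, column.column) in self.SELECTABLE

    def condition_column_allowed(self, fq_column: FqColumn) -> bool:
        # Permit a few extra columns here, then preserve the default
        # behavior by calling select_column_allowed() explicitly.
        if (fq_column.table, fq_column.column) in self.CONDITION_ONLY:
            return True
        return self.select_column_allowed(fq_column)
```

With this override, rental.customer_id can be used where condition_column_allowed() is consulted but still cannot be selected, while every selectable column remains usable there as well.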
- abstract max_limit() int | None
Returns the maximum number of rows that can be returned by a query. If None, there is no limit.
This value is also used to inform 🧩 Reconstruction. If this function provides a limit, but the query does not, or the query provides a higher limit, the query will be reconstructed to include the correct limit.
- Returns:
The maximum number of rows that can be returned by a query, or None if unlimited.
- Return type:
int | None
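The limit rule described for max_limit() can be expressed as a small function. This is a sketch of the stated rule only, not the library's reconstruction code; effective_limit is a hypothetical helper name introduced for illustration.

```python
from typing import Optional


def effective_limit(query_limit: Optional[int], max_limit: Optional[int]) -> Optional[int]:
    """Return the limit a reconstructed query would carry under the stated rule."""
    if max_limit is None:
        # No validator limit: the query's own limit (if any) is kept.
        return query_limit
    if query_limit is None:
        # The query has no limit, so the validator's limit is applied.
        return max_limit
    # The query asked for a limit: keep it unless it exceeds the maximum.
    return min(query_limit, max_limit)
```

For example, with a maximum of 20, a query with no LIMIT or with LIMIT 100 ends up limited to 20, while a query with LIMIT 5 keeps its own limit.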
- abstract parameterized_constraints() Sequence[ParameterizedConstraint]
Returns a sequence of constraints that must exist in either the WHERE clause of the query or in a JOIN condition. It doesn’t matter where the constraint is, as long as it exists and is required (i.e. not part of an optional condition).
- abstract requester_identities() Sequence[ParameterizedConstraint]
Returns the possible identities of the requester, as represented in the database. This is used to instruct the LLM how to constrain the query that it generates. Only one of these identities needs to match for the query to be compliant.
The reason that we return a sequence, and not a single identity, is that sometimes an LLM will specify the constraint as part of a JOIN condition, and not a WHERE condition. In that case, the column in the JOIN condition may not match the column you expect.
For example, consider selecting films for a customer, constrained by the customer id. The LLM may give you a query like this:
SELECT f.title FROM film f JOIN inventory i ON f.film_id=i.film_id JOIN rental r ON i.inventory_id=r.inventory_id JOIN customer c ON r.customer_id=c.customer_id WHERE c.customer_id=:customer_id
Or you may receive a query like this:
SELECT f.title FROM film f JOIN inventory i ON f.film_id=i.film_id JOIN rental r ON i.inventory_id=r.inventory_id AND r.customer_id=:customer_id
Both rental.customer_id and customer.customer_id are valid requester identities, so you need to specify both of them by returning a heimdallm.bifrosts.sql.common.ParameterizedConstraint for each of them.
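A sketch of returning both identities for the film example follows. To keep it runnable without the library, ParameterizedConstraint here is a stand-in dataclass whose column and placeholder fields are assumptions, and has_requester_identity is a hypothetical helper that only illustrates the "one match is enough" rule; check the real class's constructor before copying this.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ParameterizedConstraint:
    """Stand-in for the library class; the field names are assumptions."""
    column: str
    placeholder: str


def requester_identities() -> Sequence[ParameterizedConstraint]:
    # One entry per column the LLM might use to pin the query to the requester.
    return [
        ParameterizedConstraint(column="customer.customer_id", placeholder="customer_id"),
        ParameterizedConstraint(column="rental.customer_id", placeholder="customer_id"),
    ]


def has_requester_identity(
    found: Sequence[ParameterizedConstraint],
    identities: Sequence[ParameterizedConstraint],
) -> bool:
    # Only one of the identities needs to appear among the query's constraints.
    return any(constraint in identities for constraint in found)
```

Under these assumptions, the first query (constrained on customer.customer_id in WHERE) and the second (constrained on rental.customer_id in JOIN) would both be accepted.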
- abstract select_column_allowed(column: FqColumn) bool
Check that a fully-qualified column is allowed to be selected in the SELECT clause. Use this to restrict the columns and tables that can be selected.
This value is also used to inform 🧩 Reconstruction. Columns that do not pass this check will be removed from the query.
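One simple way to implement this check is a per-table allowlist. The sketch below is illustrative only: FqColumn is a stand-in with assumed field names, ALLOWED_COLUMNS is a hypothetical allowlist, and keep_allowed merely imitates the described effect of dropping disallowed columns rather than reproducing the library's reconstruction logic.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class FqColumn:
    """Stand-in for the library's FqColumn; the field names are assumptions."""
    table: str
    column: str


# Hypothetical allowlist: table name -> columns that may be selected.
ALLOWED_COLUMNS: Dict[str, Set[str]] = {
    "film": {"title", "release_year"},
    "customer": {"first_name"},
}


def select_column_allowed(column: FqColumn) -> bool:
    # Unknown tables map to an empty set, so they are rejected entirely.
    return column.column in ALLOWED_COLUMNS.get(column.table, set())


def keep_allowed(columns: List[FqColumn]) -> List[FqColumn]:
    # Imitates the described reconstruction effect: failing columns are removed.
    return [c for c in columns if select_column_allowed(c)]
```

Defaulting to rejection for unlisted tables and columns means newly added schema objects stay unselectable until they are explicitly allowed.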