Correctness
Epsio guarantees eventual consistency, ensuring that every query result in an Epsio view table is always accurate up to a specific point in time.
This means the view reflects a consistent snapshot of the database: there was a specific moment when running the same query directly against the database would have returned exactly the same result.
Changes are applied to the view in a way that preserves transactional integrity, avoiding any intermediate or partial states.
This ensures the data in the view aligns with the logical state of the database at a particular point in its history.
For example, consider an employees table and a view defined to count the number of employees.
Let's say we add Alice and John to the employees table within the same transaction. Epsio guarantees that the view will never show an intermediate count (e.g., a count that includes Alice but not John). The view moves directly from the count before the transaction to the count after it, so readers never see a partial result.
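The Alice and John scenario can be modeled in a few lines of Python. This is only a toy sketch of the guarantee, not Epsio's implementation: the `CountView` class and its method names are hypothetical, and the table is assumed to start empty.

```python
class CountView:
    """Toy model of a count view that only changes at transaction boundaries."""

    def __init__(self):
        self.count = 0        # value visible to readers
        self._pending = 0     # net change of the transaction still in flight
        self.history = [0]    # every value a reader could have observed

    def on_insert(self):
        # Buffer the change; readers still see the old count.
        self._pending += 1

    def on_transaction_end(self):
        # Publish the whole transaction's effect in one step.
        self.count += self._pending
        self._pending = 0
        self.history.append(self.count)


view = CountView()
view.on_insert()             # Alice
view.on_insert()             # John
view.on_transaction_end()    # both rows become visible together
print(view.history)          # [0, 2]
```

The intermediate count of 1 never appears in `history`, because nothing is published until the transaction ends.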
Forwarding
The CDC Forwarder merges change data capture (CDC) events from multiple tables into a single, unified stream consumed by the Execution Engine. This stream also includes "Transaction End" messages, which mark transaction boundaries in the stream and allow Epsio views to remain consistent across tables.
This approach contrasts with other streaming engines that rely on a Debezium + Kafka setup, where each table is replicated into its own topic. In such setups, a view may be transactionally correct for a single table, but joining two tables can produce inconsistent results, as they are consumed independently.
Epsio's CDC Forwarder eliminates this issue by guaranteeing transactional correctness across all tables, ensuring accurate and reliable results in every scenario.
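To illustrate why a single stream with "Transaction End" messages helps, here is a minimal sketch of a consumer that groups changes by transaction regardless of which table they came from. The message classes, the `group_by_transaction` function, and the `orders` / `order_items` tables are all hypothetical and do not reflect Epsio's actual wire format.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Union


@dataclass(frozen=True)
class Change:
    table: str
    op: str       # "insert" or "delete"
    row: tuple


@dataclass(frozen=True)
class TransactionEnd:
    pass


Message = Union[Change, TransactionEnd]


def group_by_transaction(stream: Iterable[Message]) -> Iterator[List[Change]]:
    """Yield the changes of each finished transaction as one unit, across all tables."""
    pending: List[Change] = []
    for message in stream:
        if isinstance(message, TransactionEnd):
            yield pending
            pending = []
        else:
            pending.append(message)
    # Changes after the last marker belong to an unfinished transaction
    # and are deliberately not yielded.


stream = [
    Change("orders", "insert", (1, "alice")),
    Change("order_items", "insert", (1, "book")),
    TransactionEnd(),
    Change("orders", "insert", (2, "john")),   # transaction still open
]
batches = list(group_by_transaction(stream))
print([[c.table for c in batch] for batch in batches])   # [['orders', 'order_items']]
```

Because both tables travel in the same ordered stream, a join over `orders` and `order_items` is only ever updated with complete transactions. With one topic per table, the consumer would have no such boundary to wait for.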
Sinking
The view's sink guarantees that no intermediate data is shown in the result table. This is true for both the Population Phase and the Running Phase.
During the initial Population Phase, Epsio first processes all changes in the view. Once processing finishes, the Sink consolidates all changes and writes them to a temporary table in parallel transactions. When the writes complete, the view renames the temporary table to the result table's name in a single transaction, so the result table is never visible with partial results.
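The write-then-rename pattern can be sketched with Python's built-in `sqlite3` module. SQLite stands in for the real database purely so the example is runnable; the table names, the `_staging` suffix, and the `populate_and_swap` function are assumptions made for illustration.

```python
import sqlite3


def populate_and_swap(conn, rows, result_table="employee_counts"):
    """Fill a staging table, then expose it under the final name atomically."""
    staging = result_table + "_staging"   # hypothetical naming scheme
    conn.execute(f"CREATE TABLE {staging} (department TEXT PRIMARY KEY, n INTEGER)")
    conn.executemany(f"INSERT INTO {staging} VALUES (?, ?)", rows)

    # The rename is the only step that touches the final name, and it runs
    # in its own transaction, so readers see either no table or the full one.
    conn.execute("BEGIN")
    conn.execute(f"ALTER TABLE {staging} RENAME TO {result_table}")
    conn.execute("COMMIT")


# isolation_level=None puts sqlite3 in autocommit mode, so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
populate_and_swap(conn, [("eng", 2), ("sales", 1)])
print(conn.execute("SELECT * FROM employee_counts ORDER BY department").fetchall())
```

The inserts into the staging table can take as long as they need, or run in parallel, because nothing reads that table by its final name until the rename commits.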
If a result table already exists during the Population Phase, the Sink reads the current state of the result table and applies only the necessary changes (removing any extraneous rows and adding any missing rows). It does this in a single transaction to ensure correctness.
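A minimal sketch of this reconcile step, again using `sqlite3` as a stand-in database: it computes the set difference between the current and desired rows and applies both halves in one transaction. The `reconcile` function and the `employee_counts` schema are hypothetical.

```python
import sqlite3


def reconcile(conn, desired_rows, result_table="employee_counts"):
    """Make result_table equal desired_rows inside a single transaction."""
    conn.execute("BEGIN")
    try:
        current = set(conn.execute(f"SELECT department, n FROM {result_table}"))
        desired = set(desired_rows)
        to_delete = current - desired   # extraneous rows
        to_insert = desired - current   # missing rows
        conn.executemany(
            f"DELETE FROM {result_table} WHERE department = ? AND n = ?",
            list(to_delete),
        )
        conn.executemany(f"INSERT INTO {result_table} VALUES (?, ?)", list(to_insert))
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")        # leave the old state fully intact
        raise
    return to_delete, to_insert


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE employee_counts (department TEXT PRIMARY KEY, n INTEGER)")
conn.executemany("INSERT INTO employee_counts VALUES (?, ?)", [("eng", 2), ("ops", 5)])

removed, added = reconcile(conn, [("eng", 3), ("sales", 1)])
print(sorted(removed), sorted(added))
```

Rows that are already correct are left untouched, and a failure partway through rolls back to the previous state rather than leaving a mix of old and new rows.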
During the Running Phase, the Sink consolidates all changes per transaction (or batch of transactions). Once a transaction (or batch of transactions) finishes, the Sink will write all data to the result table in a single transaction, ensuring no intermediate results are seen.
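One way to picture the consolidation step is as summing signed changes per row, so that rows added and removed within the same batch cancel out before anything is written. The `(row, diff)` representation below is an assumption chosen for illustration, not Epsio's documented internal format.

```python
from collections import Counter


def consolidate(changes):
    """Collapse a batch of (row, diff) changes into net changes per row."""
    net = Counter()
    for row, diff in changes:
        net[row] += diff
    # Rows whose inserts and deletes cancel out never reach the result table.
    return {row: diff for row, diff in net.items() if diff != 0}


batch = [
    (("eng", 2), -1),   # old count row retracted
    (("eng", 3), +1),   # intermediate row added...
    (("eng", 3), -1),   # ...and retracted within the same batch
    (("eng", 4), +1),   # final row
]
print(consolidate(batch))   # {('eng', 2): -1, ('eng', 4): 1}
```

The net changes would then be written to the result table in a single transaction, so the intermediate `("eng", 3)` row is never visible to readers.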