In programs that use SQLite, there can be various actions to be performed by the database. These actions can be grouped together in what’s called a Transaction.
A transaction is a sequence of actions on data items
Transactions help prevent problems that could arise, such as data durability when a program crushes or in the event of an unexpected power failure or even during complex concurrent programming procedures etc. (These restrictions/guarantees is basically ACID, more info below)
Programs can start a transaction and execute operations as part of the transaction. But for the transactions changes to occur the transaction must be committed. To commit simply means to instruct the database to permanently update its state according to the operations contained within the transaction.
Transactions can be considered as logical units of work for a database system. If a transaction fails the database must remove it’s effects from the database and revert back to the state the database was in before the transaction occurred.
No only are transaction units of work that move the database state forward, transactions are also a database abstraction with the following guarantees (aka ACID):
Atomic: All operations within a transaction should succeed, if even a single operation fails all other operation will not be considered and the transaction fails to be committed.
Consistency: A transaction mutates a consistent database state to a consistent state and a transaction must be deterministic.
Isolation: All operations of each transaction happen ‘together’ instantaneously.
Durability: Effects of successful transactions must become a part of the database.
To get a greater view of Transactions let us see them at work, using Rust and the Rusqlite crate:
use rusqlite::{params, Connection, Result};
/// A helper function for connecting the database
fn connect_db() -> Result<Connection> {
let conn = Connection::open("/tmp/TEST_DB.db")?;
.execute(
conn"CREATE TABLE IF NOT EXISTS vals(
v INTEGER NOT NULL
)",
,
[]?;
)
Ok(conn)
}
/// A slow way to insert rows
fn slow_insert(conn: &Connection) -> Result<()> {
for count in 1..=1000 {
.execute("INSERT INTO vals (v) VALUES (?1)", params![count])?;
conn}
Ok(())
}
/// A fast way to insert rows
fn fast_insert(conn: &mut Connection) -> Result<()> {
let tx = conn.transaction()?;
for count in 0..1000 {
.execute("INSERT INTO vals (v) VALUES (?1)", params![count])?;
tx}
.commit()?;
txOk(())
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_slow_insert() {
let conn = connect_db().unwrap();
&conn).unwrap();
slow_insert(}
// #[test]
// fn test_fast_insert() {
// let mut conn = connect_db().unwrap();
// fast_insert(&mut conn).unwrap();
// }
}
In the above code we try out two ways to insert 1000 rows in an SQLite database. The code has three function:
fn connect_db() -> Result<Connection> {
: A
helper function for connecting the databasefn slow_insert(conn: &Connection) -> Result<()>
”
A slow way to insert rows.fn fast_insert(conn: &mut Connection) -> Result<()> {
:
A fast way to insert rowsThe code also has two test function test_slow_insert
and
test_fast_insert
the later is commented out because we only
want to test the slow one first by running:
$ cargo test
We see that it is quite slow, the test on my machine output:
running 1 test
test tests::test_slow_insert has been running for over 60 seconds
test tests::test_slow_insert ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 165.22s
Doc-tests st
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
Let us try out the fast version by commenting out the
test_slow_insert
unit test function and uncommenting the
test_fast_insert
unit test function. Then after we run
cargo test
we get
running 1 test
test tests::test_fast_insert ... ok
test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.18s
Doc-tests st
running 0 tests
test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s
This time the test completes instantly.
So why is the first one slow, specifically why is this slow:
/// A slow way to insert rows
fn slow_insert(conn: &Connection) -> Result<()> {
for count in 1..=1000 {
.execute("INSERT INTO vals (v) VALUES (?1)", params![count])?;
conn}
Ok(())
}
It is slow because it uses the connection’s execute
method which results in a new transaction being created to insert each
and every row. This might be acceptable if the database was held in memory
but in this case the database is being held on the filesystem on
spinning disk drive. Interactions with the filesystem is slow usually
several syscalls have to be called. For example for durability reasons
(a key requirement for ACID) databases often make use
of the fsync
system call. All this means creating a 1000
transactions and committing them is very slow, it is much better to
batch the database operations on a single transaction and only
committing them once like this:
/// A fast way to insert rows
fn fast_insert(conn: &mut Connection) -> Result<()> {
let tx = conn.transaction()?;
for count in 0..1000 {
.execute("INSERT INTO vals (v) VALUES (?1)", params![count])?;
tx}
.commit()?;
txOk(())
}