SQLite Transactions (Rust)

https://kwatafana.systems/

In programs that use SQLite, there can be various actions to be performed by the database. These actions can be grouped together in what’s called a Transaction.

A transaction is a sequence of actions on data items

Transactions help prevent problems that could arise, such as data durability when a program crushes or in the event of an unexpected power failure or even during complex concurrent programming procedures etc. (These restrictions/guarantees is basically ACID, more info below)

Programs can start a transaction and execute operations as part of the transaction. But for the transactions changes to occur the transaction must be committed. To commit simply means to instruct the database to permanently update its state according to the operations contained within the transaction.

Transactions can be considered as logical units of work for a database system. If a transaction fails the database must remove it’s effects from the database and revert back to the state the database was in before the transaction occurred.

No only are transaction units of work that move the database state forward, transactions are also a database abstraction with the following guarantees (aka ACID):

To get a greater view of Transactions let us see them at work, using Rust and the Rusqlite crate:

use rusqlite::{params, Connection, Result};
	    
	    /// A helper function for connecting the database
	    fn connect_db() -> Result<Connection> {
	        let conn = Connection::open("/tmp/TEST_DB.db")?;
	    
	        conn.execute(
	            "CREATE TABLE IF NOT EXISTS vals(
	                v  INTEGER NOT NULL
	            )",
	            [],
	        )?;
	    
	        Ok(conn)
	    }
	    
	    /// A slow way to insert rows
	    fn slow_insert(conn: &Connection) -> Result<()> {
	        for count in 1..=1000 {
	            conn.execute("INSERT INTO vals (v) VALUES (?1)", params![count])?;
	        }
	    
	        Ok(())
	    }
	    
	    /// A fast way to insert rows
	    fn fast_insert(conn: &mut Connection) -> Result<()> {
	        let tx = conn.transaction()?;
	    
	        for count in 0..1000 {
	            tx.execute("INSERT INTO vals (v) VALUES (?1)", params![count])?;
	        }
	        tx.commit()?;
	        Ok(())
	    }
	    
	    #[cfg(test)]
	    mod tests {
	        use super::*;
	    
	        #[test]
	        fn test_slow_insert() {
	            let conn = connect_db().unwrap();
	            slow_insert(&conn).unwrap();
	        }
	    
	        // #[test]
	        // fn test_fast_insert() {
	        //     let mut conn = connect_db().unwrap();
	    
	        //     fast_insert(&mut conn).unwrap();
	        // }
	    }

In the above code we try out two ways to insert 1000 rows in an SQLite database. The code has three function:

The code also has two test function test_slow_insert and test_fast_insert the later is commented out because we only want to test the slow one first by running:

$ cargo test

We see that it is quite slow, the test on my machine output:

running 1 test
	    test tests::test_slow_insert has been running for over 60 seconds
	    test tests::test_slow_insert ... ok
	    
	    test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 165.22s
	    
	       Doc-tests st
	       
	    running 0 tests
	       
	    test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

Let us try out the fast version by commenting out the test_slow_insert unit test function and uncommenting the test_fast_insert unit test function. Then after we run cargo test we get

running 1 test
	    test tests::test_fast_insert ... ok
	    
	    test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.18s
	    
	       Doc-tests st
	       
	    running 0 tests
	       
	    test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s   

This time the test completes instantly.

So why is the first one slow, specifically why is this slow:

/// A slow way to insert rows
	    fn slow_insert(conn: &Connection) -> Result<()> {
	        for count in 1..=1000 {
	            conn.execute("INSERT INTO vals (v) VALUES (?1)", params![count])?;
	        }
	    
	        Ok(())
	    }

It is slow because it uses the connection’s execute method which results in a new transaction being created to insert each and every row. This might be acceptable if the database was held in memory but in this case the database is being held on the filesystem on spinning disk drive. Interactions with the filesystem is slow usually several syscalls have to be called. For example for durability reasons (a key requirement for ACID) databases often make use of the fsync system call. All this means creating a 1000 transactions and committing them is very slow, it is much better to batch the database operations on a single transaction and only committing them once like this:

/// A fast way to insert rows
	    fn fast_insert(conn: &mut Connection) -> Result<()> {
	        let tx = conn.transaction()?;
	    
	        for count in 0..1000 {
	            tx.execute("INSERT INTO vals (v) VALUES (?1)", params![count])?;
	        }
	        tx.commit()?;
	        Ok(())
	    }

Kwatafana ⧉