Read and Write to OPFS (Origin Private File System) with duckdb-wasm and Compare Performance with IndexedDB
databasewasmOPFS (Origin Private File System) is a file system accessible only from the same origin. It has been supported by major browsers since March 2023. Since it doesn’t directly access the user’s file system, it can be used without permission. Chrome’s IndexedDB uses LevelDB as a backend, which is not suited for random access, but OPFS provides an API that enables random access.
GoogleのkvsライブラリLevelDBを使う - sambaiz-net
By default, data may be deleted when storage space is low, but you can request persistence with navigator.storage.persist(). Firefox requires user permission, while Chrome automatically determines based on interaction history.
duckdb-wasm also supports OPFS. DuckDB can directly query Parquet and CSV files, but it also has a columnar database file format similar to Parquet that supports row-level updates and transactions. By specifying opfs:// in db.open(), a database file is created in OPFS.
Retrieve Google Sheets data using SQL with DuckDB’s Go client - sambaiz-net
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>DuckDB WASM + OPFS</title>
</head>
<body>
<script type="module">
import * as duckdb from 'https://cdn.jsdelivr.net/npm/@duckdb/duckdb-wasm@latest/+esm';
const JSDELIVR_BUNDLES = duckdb.getJsDelivrBundles();
const bundle = await duckdb.selectBundle(JSDELIVR_BUNDLES);
const worker_url = URL.createObjectURL(
new Blob([`importScripts("${bundle.mainWorker}");`], { type: 'text/javascript' })
);
const worker = new Worker(worker_url);
const logger = new duckdb.ConsoleLogger();
const db = new duckdb.AsyncDuckDB(logger, worker);
await db.instantiate(bundle.mainModule, bundle.pthreadWorker);
await db.open({
path: 'opfs://mydb.db',
accessMode: duckdb.DuckDBAccessMode.READ_WRITE
});
const conn = await db.connect();
await conn.query(`
CREATE TABLE IF NOT EXISTS users (
id BIGINT PRIMARY KEY,
name VARCHAR,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
)
`);
await conn.query(`INSERT INTO users (id, name) VALUES (${Date.now()}, 'Alice')`);
await conn.query('CHECKPOINT'); // flush changes
const result = await conn.query('SELECT * FROM users');
console.log(result.toArray());
await conn.close();
</script>
</body>
</html>
Let’s compare performance when reading and writing with IndexedDB.
function openIDB() {
return new Promise((resolve, reject) => {
const req = indexedDB.open('benchmark', 1);
req.onupgradeneeded = () => {
const db = req.result;
if (!db.objectStoreNames.contains('users')) {
db.createObjectStore('users', { keyPath: 'id' });
}
};
req.onsuccess = () => resolve(req.result);
req.onerror = () => reject(req.error);
});
}
async function idbWrite(db, data) {
return new Promise((resolve, reject) => {
const tx = db.transaction('users', 'readwrite');
const store = tx.objectStore('users');
for (const item of data) {
store.put(item);
}
tx.oncomplete = () => resolve();
tx.onerror = () => reject(tx.error);
});
}
async function idbRead(db, ids) {
return new Promise((resolve, reject) => {
const tx = db.transaction('users', 'readonly');
const store = tx.objectStore('users');
const results = [];
for (const id of ids) {
const req = store.get(id);
req.onsuccess = () => results.push(req.result);
}
tx.oncomplete = () => resolve(results);
tx.onerror = () => reject(tx.error);
});
}
I measured execution time for writing 100,000 rows, retrieving by ID, and running aggregation queries in Chrome. DuckDB is faster for aggregation queries, while IndexedDB is faster for retrieval by ID. However, when fetching multiple rows, DuckDB can also achieve better performance than IndexedDB.
| Operation | IndexedDB | DuckDB + OPFS |
|---|---|---|
| Write 100000 rows | 7301ms | 4425ms (Batch) |
| Aggregation (COUNT, AVG, SUM) | 352ms | 31ms |
| - Read | 351ms | - |
| - Compute | 2ms | - |
| Read 1000 rows by ID (individual) | 189ms | 912ms |
| Read 1000 rows by ID (batch) | 22ms | 9ms (IN clause) |