Retrieve Google Sheets data using SQL with DuckDB's Go client

golang

DuckDB is an OSS that can quickly read and write CSV, JSON, and Parquet etc. using SQL.

$ brew install duckdb
$ duckdb
v1.1.3 19864453f7
Enter ".help" for usage hints.
Connected to a transient in-memory database.
Use ".open FILENAME" to reopen on a persistent database.

D SELECT * FROM read_csv_auto('test.csv');
┌───────┬─────────┬─────────┐
│   a   │    b    │    c    │
│ int64 │ varchar │ varchar │
├───────┼─────────┼─────────┤
10 │ aaa     │ bbb     │
20 │ aaa     │ ccc     │
30 │ bbb     │ ddd     │
40 │ bbb     │ eee     │
└───────┴─────────┴─────────┘

In addition to CLI, there are clients for Go, Swift, WASM, and others, so it can be used in a variety of environments.

package main

import (
	"database/sql"
	"errors"
	"fmt"
	"log"

	_ "github.com/marcboeker/go-duckdb"
)

func main() {
	db, err := sql.Open("duckdb", "")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	rows, err = db.Exec(`SELECT * FROM read_csv_auto('test.csv');`)
	if err != nil {
		log.Fatal(err)
	}

	for rows.Next() {
		var (
			a int64
			b string
			c string
		)
		err = row.Scan(&a, &b, &c)
		if errors.Is(err, sql.ErrNoRows) {
			log.Println("no rows")
		} else if err != nil {
			log.Fatal(err)
		}
	}
}

/*
10 aaa bbb
20 aaa ccc
30 bbb ddd
40 bbb eee
*/

You can also pass a URL, so you can access public sheets as follows.

sheetExportURL := "https://docs.google.com/spreadsheets/export?format=csv&id=..."
rows, err := db.Query(fmt.Sprintf(`SELECT * FROM read_csv_auto('%s');`, sheetExportURL))

DuckDB has a mechanism to extend functionality through extensions.

D SELECT extension_name, installed, description
  FROM duckdb_extensions();
┌──────────────────┬───────────┬────────────────────────────────────────────────────────────────────────────────────┐
│  extension_name  │ installed │                                    description                                     │
│     varchar      │  boolean  │                                      varchar                                       │
├──────────────────┼───────────┼────────────────────────────────────────────────────────────────────────────────────┤
│ arrow            │ false     │ A zero-copy data integration between Apache Arrow and DuckDB                       │
│ autocomplete     │ true      │ Adds support for autocomplete in the shell                                         │
│ aws              │ false     │ Provides features that depend on the AWS SDK                                       │
│ azure            │ false     │ Adds a filesystem abstraction for Azure blob storage to DuckDB                     │
│ delta            │ false     │ Adds support for Delta Lake                                                        │
│ excel            │ false     │ Adds support for Excel-like format strings                                         │
│ fts              │ false     │ Adds support for Full-Text Search Indexes                                          │
│ httpfs           │ false     │ Adds support for reading and writing files over a HTTP(S) connection               │
│ iceberg          │ false     │ Adds support for Apache Iceberg                                                    │
│ icu              │ true      │ Adds support for time zones and collations using the ICU library                   │
│ inet             │ false     │ Adds support for IP-related data types and functions                               │
│ jemalloc         │ false     │ Overwrites system allocator with JEMalloc                                          │
│ json             │ true      │ Adds support for JSON operations                                                   │
│ motherduck       │ false     │ Enables motherduck integration with the system                                     │
│ mysql_scanner    │ false     │ Adds support for connecting to a MySQL database                                    │
│ parquet          │ true      │ Adds support for reading and writing parquet files                                 │
│ postgres_scanner │ false     │ Adds support for connecting to a Postgres database                                 │
│ shell            │ true      │ Adds CLI-specific support and functionalities                                      │
│ spatial          │ false     │ Geospatial extension that adds support for working with spatial data and functions │
│ sqlite_scanner   │ false     │ Adds support for reading and writing SQLite database files                         │
│ substrait        │ false     │ Adds support for the Substrait integration                                         │
│ tpcds            │ false     │ Adds TPC-DS data generation and query support                                      │
│ tpch             │ false     │ Adds TPC-H data generation and query support                                       │
│ vss              │ false     │ Adds indexing support to accelerate Vector Similarity Search                       │
├──────────────────┴───────────┴────────────────────────────────────────────────────────────────────────────────────┤
24 rows                                                                                                 3 columns │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

You can access private sheets using the community’s GSheets extension. This extension either receives an access_token or performs oauth authentication during CREATE SECRET, stores it in the Secrets Manager, read_gsheet retrieves the token, and adds it to the Authorization header for the request.

D INSTALL gsheets FROM community;
D LOAD gsheets;
D CREATE PERSISTENT SECRET (TYPE gsheet);
D SELECT * FROM read_gsheet(...);

The extension and secret are stored.

$ ls ~/.duckdb/extensions/v1.1.3/osx_arm64/
gsheets.duckdb_extension
gsheets.duckdb_extension.info
httpfs.duckdb_extension
httpfs.duckdb_extension.info

$ ls ~/.duckdb/stored_secrets/__default_gsheet.duckdb_secret

Call from Go.

if _, err := db.Exec("LOAD gsheets;"); err != nil {
        log.Fatal(err)
}

sheetURL := "https://docs.google.com/spreadsheets/d/.../edit"
rows, err := db.Query(fmt.Sprintf(`SELECT * FROM read_gsheet('%s');`, sheetURL))
if err != nil {
        log.Fatal(err)
}

Reference

DuckDBでGoogle Sheetsのデータをクエリしてみる