2025-02-18 Creating Iceberg Tables in S3 Tables from EMR Serverless, inserting data, and querying from Athena
2025-01-25 Walk through Iceberg metadata contents by creating tables, modifying schema and write mode, and writing data in Spark
2024-09-02 Avoiding OOM in count-distinct operations on massive datasets using HyperLogLog++, a probabilistic cardinality estimation algorithm
2024-05-22 Install Livy on EMR on EKS and run Spark jobs from local Jupyter notebooks with Sparkmagic
2022-10-21 Develop Spark applications in Scala, deploy with GitHub Actions, and perform remote debugging on EMR
2021-04-16 Enable AWS Glue job bookmarks to resume processing from the records after those already processed
2017-08-24 Launch a Hive execution environment with the Cloudera Docker image and run queries against JSON logs