CFOCoder
Notes on data engineering, devops, and building things that work.
Latest posts
-
From Native Installation to a More Stable Hadoop + Hive Stack with Coolify
In the previous posts of this series, I installed Hadoop 3.3.6 natively on Ubuntu, configured YARN, ran MapReduce jobs, installed Apache Hive 3.1.3 on top of Hadoop, loaded external tables from HDFS,...
-
Querying Apache Hive from DBeaver: Starting HiveServer2 and Connecting a Desktop SQL Client
In the previous posts of this series, I installed Hadoop 3.3.6 natively on Ubuntu, configured YARN, ran MapReduce jobs, installed Apache Hive 3.1.3 on top of Hadoop, and finally loaded CSV files into...
-
From HDFS to SQL Queries: Loading CSV Files into Hive External Tables and Querying with SQL
When I completed the installation of Hadoop 3.3.6 and Apache Hive 3.1.3 on my Ubuntu machine, I had everything running smoothly. But then came a practical question that every data engineer faces: How...
-
Restic + MinIO for OpenClaw: What It Is, What It Solves, and the Quick Reference I Wanted Yesterday
Yesterday I spent part of the day optimizing my OpenClaw setup and cleaning up the way I protect its operational state.
-
Building a Modern Frontier Data Stack: Hadoop 3.4.3, Hive 4.2.0, and MinIO S3 Integration in 2026
A few days ago, I published a series of posts on installing Hadoop 3.3.6 natively on Ubuntu. At the time, I thought that was the state of the art. But things in the Big Data world move fast.
-
Apache Hive 3.1.3 on Ubuntu: Native Installation on Top of Hadoop 3.3.6
In Part 1 of this series, I installed Hadoop 3.3.6 natively on Ubuntu 24.04 and configured HDFS in pseudo-distributed mode. In Part 2, I configured YARN and ran the canonical WordCount job on War and...
-
Correcting Word Frequencies with Data Normalization: MapReduce Text Processing on War and Peace — Part 3
In Part 1 of this series, we installed Hadoop 3.3.6 natively on Ubuntu and configured HDFS for distributed storage. In Part 2, we configured YARN, wrote our first MapReduce program (WordCount), and...
-
Running Your First MapReduce Job on Hadoop: WordCount on War and Peace
In Part 1 of this series we installed Hadoop 3.3.6 natively on Ubuntu 24.04 and got HDFS running in pseudo-distributed mode. That gave us a working distributed file system, but Hadoop is much more...