deliberate.codes

TIL: Suppress ._* and .DS_Store files on network and USB volumes

2026-05-07T00:00:00+00:00

macOS creates AppleDouble (._*) and .DS_Store files on volumes that don’t natively support macOS metadata (extended attributes and resource forks). Two defaults keys tell Finder to skip writing them.

Network volumes#

defaults write com.apple.desktopservices DSDontWriteNetworkStores -bool TRUE
killall Finder

If restarting Finder alone isn’t enough, log out and back in.

To revert:

defaults delete com.apple.desktopservices DSDontWriteNetworkStores
killall Finder

USB and external drives#

defaults write com.apple.desktopservices DSDontWriteUSBStores -bool TRUE
killall Finder

Remove existing files#

These settings only prevent new files — existing ones stay until removed:

find /path/to/share -name '._*' -delete
find /path/to/share -name '.DS_Store' -delete
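
Since `-delete` is irreversible, it can be worth listing the matches first. A minimal sketch, assuming a POSIX shell and `find`; the function name `list_mac_metadata` is made up for illustration and is not a macOS tool:

```shell
# List AppleDouble (._*) and .DS_Store files under a directory without deleting them.
# list_mac_metadata is a hypothetical helper name.
list_mac_metadata() {
    find "$1" -type f \( -name '._*' -o -name '.DS_Store' \) -print
}
```

Run `list_mac_metadata /path/to/share`, review the output, then run the delete commands above.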

Caveat#

This suppresses most cases but is not a universal guarantee. macOS writes AppleDouble files whenever it needs to store metadata on a filesystem that doesn’t support native extended attributes. Some apps or copy operations can still produce them regardless of this setting.

Why I turned my old MacBook into a server for coding agents

2026-04-26T00:00:00+00:00

Async coding agents need a host. Mine is called Ferris, and it shipped speq-skill v0.3.0 to GitHub on its own while I was away from my desk.

This is the first post in a series on getting coding agents to work for me asynchronously. The article walks through installing Debian on an old Mac, getting Claude Code and Codex signed in over XRDP, and the three use cases I run on Ferris today.

Building Ferris#

Ferris is my old 15-inch Retina MacBook Pro. It was my favorite laptop and partner in crime for over five years until we parted ways for a newer MacBook Pro. Since then it has been sitting in my drawer without a purpose. The specs are (roughly): Intel Core i7 at 2.50 GHz, 16 GB RAM, NVIDIA graphics with 2 GB VRAM.

When I read about all the people running OpenClaw or other agents on tiny VMs in the cloud, I thought: “Ferris still has ample power for Claude Code or Codex”. I run LLMs in the cloud via Anthropic or OpenAI, so the heavy thinking happens in the cloud anyway. Ferris only has to handle file operations and, of course, compiling the software it generates.

I started by installing Debian 13 “trixie”. The process was straightforward:

Download a live install image. I went with Live Xfce amd64.
Still on macOS, flash an old USB stick using balenaEtcher.
Reboot the Mac and hold Option (⌥) to select the Debian Live Installer.
Install with defaults.

Once Linux is installed, you want it to stay awake even when the display lid is closed. Set the following in /etc/systemd/logind.conf:

[Login]
HandleLidSwitch=ignore
HandleLidSwitchExternalPower=ignore
HandleLidSwitchDocked=ignore

Installing Debian was generic, and disabling lid sleep was straightforward. The logind setting takes effect after a reboot (or a restart of systemd-logind). Next step on my journey: how to sign in to my Claude and OpenAI subscriptions via a browser on a remote machine?

About tokens and window managers#

My plan was to keep using my Claude 5x and ChatGPT Plus subscriptions on Ferris. Raw API tokens add up fast; the subscriptions cap that cost at a flat monthly rate. To use either of them on Ferris, I needed a way to log in via a browser. That ruled out a headless setup and brought me to XFCE, a lightweight desktop environment for Linux. The Debian Live XFCE image ships with Firefox already installed, which covers everything I need.

The next question was: how to connect to XFCE remotely? After discussing this with Claude, I settled on connecting to XFCE on Ferris via RDP. The RDP server for Linux is called xrdp, and I summarized how to configure it in the TIL Configure XRDP and XFCE on Debian 13.

On my main Mac I use Windows App by Microsoft to connect to Ferris’ XFCE. Windows App is Microsoft’s RDP client; on macOS it replaced the older Microsoft Remote Desktop app and connects to any RDP server.

What Ferris actually does for me#

Three use cases keep Ferris busy: coding agent runs in YOLO mode, remote control from anywhere, and conversations with my second brain from a phone.

Building software in YOLO mode#

My primary use case for Ferris is building software with Claude in YOLO mode. YOLO mode (Claude Code’s --dangerously-skip-permissions flag) means the coding agent does not stop and ask for permission before running tools. This is the unlock: I prompt Ferris with a set of instructions and the tasks are usually done by the time I check back.

The flag turns off the per-tool permission prompts so the agent can run unattended:

> claude --dangerously-skip-permissions
> # Or as alias
> alias claude-yolo='claude --dangerously-skip-permissions'

My routine workflow is spec-driven development with speq-skill; I have a series on the topic here on the blog. YOLO mode took this to the next level. With permission prompts off, Ferris can work through an entire plan while I am away from the computer.

Remote-controlling sessions#

Anthropic’s Remote Control feature lets you continue any Claude Code session from the web interface, the Claude app on your phone, or the desktop app. The docs are at Continue local sessions from any device.

I run /remote-control some-name inside Claude Code to enable remote access. The next puzzle piece was keeping alive the terminal session in which Claude Code was started. tmux solves that (see also my TIL tmux: terminal multiplexer):

# SSH into Ferris
> ssh marco@ferris.local

# Start a new tmux session
marco@ferris> tmux new

# Start Claude in YOLO mode
tmux[0]> claude-yolo

# Trigger Remote Control within the session
claude> /remote-control my-session

Chatting with my second brain on the phone#

For coding I prefer an actual keyboard and a bigger screen. But there is one use case where the phone wins: chatting with my second brain from anywhere.

I took Karpathy’s LLM Wiki gist and built myself an LLM wiki I call second brain. It is a collection of Markdown files about my content and posts, both for this blog and for LinkedIn. Every time I have a question, a thought, or an idea for an article or a post, I use my phone and Claude to iterate on it via a remote session running on Ferris.

Learnings and outlook#

My biggest learning from this experiment is the freedom and productivity I gain once the agent runs on a remote machine and no longer stops for permission approvals. For me this is a big step towards a setup in which agents work autonomously on my behalf. It helps a lot to have the agent constrained to a designated host that is replaceable in case the agent destroys it. On that note, it also raises a lot of questions about security and guardrails. For example: what do I do with my GitHub credentials, and how can I limit the agents so that they cannot do any harm?

As a next step I want to explore “always on” agents in more detail. I want to give OpenClaw a try, and also check out alternatives such as NanoClaw.

TIL: Configure XRDP and XFCE on Debian 13

2026-03-19T00:00:00+00:00

A minimal recipe for a working XRDP + XFCE desktop on Debian 13 that holds up for any user, not just the one you happened to set up first.

1. Install the packages#

sudo apt update
sudo apt install xrdp xfce4 xfce4-goodies

2. Use the stock `startwm.sh`#

Custom startwm.sh scripts are the most common reason fresh users fail to get a desktop while the admin account works. Keep /etc/xrdp/startwm.sh at the package default:

#!/bin/sh
if test -r /etc/profile; then
    . /etc/profile
fi
if test -r ~/.profile; then
    . ~/.profile
fi
test -x /etc/X11/Xsession && exec /etc/X11/Xsession
exec /bin/sh /etc/X11/Xsession

sudo chmod 755 /etc/xrdp/startwm.sh

3. Set the system X session manager to XFCE#

XRDP hands off to whatever x-session-manager resolves to. Point it at XFCE explicitly:

sudo update-alternatives --config x-session-manager
# pick xfce4-session

4. Skip per-user `.xsession` files#

Don’t drop .xsession into new users’ home directories. The system default Xsession path picks the right session manager on its own, and a hand-edited .xsession shadows it.

5. Start XRDP#

sudo systemctl enable --now xrdp

Connect from any RDP client to port 3389. If a fresh account still can’t get a desktop while the original works, the desktop session is failing rather than the RDP transport. Check journalctl -u xrdp-sesman for the exit code, then switch that user to VNC if you don’t want to keep debugging the X handoff.

Collaborative agentic engineering: how teams build software with AI agents

2026-03-04T00:00:00+00:00

Multiple engineers, each with their own coding agent, one shared codebase. How do you avoid chaos?

Your team already knows how to coordinate on code: Git, feature branches, pull requests. This is established best practice, and every software team I know works this way.

Agentic engineering doesn’t change this model. What it changes is what needs to travel alongside the code:

Intent must be formalized and shared. Coding agents need structured context to make good decisions. That context has to be part of the shared codebase in the repository.
Implementation gets faster, so coordination must keep up. When features ship in hours instead of weeks, the feedback loop between parallel efforts tightens.

Git coordinates code. But code only captures what was built. It doesn’t capture what was intended, what was decided against, or what constraints other features depend on. When Tim merges a restructured billing module on Monday and Hannah starts building on that module on Tuesday, her agent has no idea why Tim changed the interface, what constraints he introduced, or what behavior he expects. The code is there, but the reasoning behind it is gone.

Spec-driven development fills this gap. It gives coding agents structured intent: requirements, edge cases, and expected behavior, formalized as Markdown files in the Git repository. With specs anchored next to the code, the coordination patterns teams already use apply automatically. Branch them, review them, merge them.

This works whether features build on each other or develop in parallel. When an engineer extends a colleague’s recently merged work, their agent finds the relevant specs and understands the intent behind the code. When two engineers work on separate features at the same time, each agent has the full spec library as context, and both sets of specs merge through the normal PR process.

I built speq-skill to enable this: an iterative, shared workflow for spec-driven development that supports parallel feature work, both within a team and as a solo developer. Here’s how the workflow looks in practice.

The workflow#

Consider this scenario. Tim is adding a payment retry system. Hannah is building an invoice export feature. Both work on separate feature branches. The diagram below shows the full lifecycle.

Tim plans his feature. He creates a branch and plans with his agent. The agent searches the existing spec library on main, finds relevant specs, and uses them as context. It conducts a clarifying interview: What exactly should happen when a payment fails? How many retries? What’s the backoff strategy? Together they produce spec deltas: markdown files describing the new behavior in BDD-style scenarios. The spec deltas land in specs/_plans/ on Tim’s branch, the staging area.

Tim’s agent implements. It reads the spec deltas and builds the feature, writes integration tests that validate each scenario, runs them, and verifies everything works. The staging area stays active throughout, pushed to the branch alongside the code. Nothing touches main.

Hannah does the same on her branch. She plans her invoice export, produces her own spec deltas, and implements independently. Her agent loads the spec library from main plus her own staging area. The two branches develop in parallel.

Tim records and opens a PR. When implementation is complete and verified, Tim records his specs. This moves the deltas from the staging area into the permanent spec library, still on his feature branch. He opens a pull request that includes code, tests, and the recorded specs. The reviewer sees both what Tim built and what he intended. Humans evaluate architecture and design. The AI checks correctness: does the implementation match the specs? Are the scenarios covered by tests? The spec and the code are reviewable together, in one place.

The PR merges. main now includes Tim’s feature, his tests, and his specs.

Hannah rebases. She pulls in Tim’s changes. If there are conflicts in the specs, they’re handled the same way as code conflicts. Her coding agent can resolve straightforward ones and ask clarifying questions when the intent is ambiguous. After rebasing, Hannah’s agent has access to Tim’s merged specs when it searches the library.

The cycle repeats with every feature.

The spec library grows with the team#

Every time a feature is merged, its requirements and scenarios update the permanent spec library on main. The next time any engineer plans a feature, their agent searches this library and loads the relevant requirements and scenarios into its context window.

Tim’s agent now knows about Hannah’s invoice export, even if Tim never looked at her code. If Hannah’s export spec says “the system SHALL exclude archived invoices older than 90 days,” Tim’s agent picks that up when planning a reporting feature that aggregates invoice data. Without that spec, Tim’s agent might have included archived invoices, creating an inconsistency nobody catches until a customer files a bug.

The shared spec library acts as a shared memory bank both for humans and coding agents.

Solo devs: git worktrees#

The same workflow works for a solo developer building multiple features in parallel. Git worktrees give you multiple checked-out branches simultaneously, each with its own staging area and agent session. Same discipline, same coordination through the spec library on main.
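
To make the worktree setup concrete, here is a minimal sketch. It builds a throwaway repository so it can run anywhere; the branch and directory names are illustrative and borrowed from the Tim and Hannah scenario, and nothing here is specific to speq-skill:

```shell
# Throwaway demo repository so the sketch runs anywhere; use your own clone in practice.
demo=$(mktemp -d)
git init -q "$demo/repo"
git -C "$demo/repo" symbolic-ref HEAD refs/heads/main
git -C "$demo/repo" -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "initial commit"

# One worktree per feature, each on its own branch starting from main.
# Branch and directory names are illustrative.
git -C "$demo/repo" worktree add -b feature/payment-retry "$demo/payment-retry" main
git -C "$demo/repo" worktree add -b feature/invoice-export "$demo/invoice-export" main

# Lists the main checkout plus the two feature worktrees.
git -C "$demo/repo" worktree list
```

Each directory is a full checkout of its branch, so one agent session can run per directory without the sessions overwriting each other's files.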

Get started#

speq-skill is open source and free to use. Get started here: github.com/marconae/speq-skill.

If you want to dig deeper into the ideas behind this workflow, start with Spec-driven development: an introduction and Writing specs for AI coding agents. For the tool that puts it all together: Introducing speq-skill.

Introducing speq-skill: spec-driven development with semantic search

2026-02-23T00:00:00+00:00

Everybody agrees: AI makes code cheap. The question is: how do you get the AI to produce working and tested software?

In the past months I have worked intensively with coding agents. My favorite agent so far is Claude Code, and since the release of Opus 4.6 it has become my go-to tool for writing software.

Here’s what I’ve learned:

The bottleneck is no longer writing code.
It’s telling the agent what to build. Precisely enough that the result actually works.

Today, most people deal with this in one of two ways:

Vibe coding — just prompt and iterate. It’s fast for prototypes, but it doesn’t scale.
Copy-paste prompting — collect snippets and prompt templates, paste them in before each task. Better than vibe coding, but it doesn’t build a lasting knowledge base.

What both approaches are missing is a system.

What’s missing is a system#

I’ve written about this on the blog: spec-driven development as a methodology, why vibe coding doesn’t scale, and how to write effective specs. For me, spec-driven development is the missing system for coding agents.

There are tools out there (OpenSpec, BMAD, SpecKit, etc.), but none of them gave me:

Integration tests as proof the software actually works
No lock-in to a single language like Python
Smart enough to only load relevant specs
A system that asks me instead of making assumptions
A system that also works in brownfield projects

Introducing speq-skill#

So I built speq-skill. It’s a free Claude Code plugin with a lightweight CLI and skills.

The core idea: your specs live permanently in your project. The CLI adds semantic search so the coding agent loads only the relevant specs instead of cluttering the context window. The skills add guardrails for code quality and enforce integration tests and TDD.

The workflow#

speq-skill follows a four-phase cycle: mission, plan, implement, record.

Mission#

Every project starts with a mission file. The agent interviews you about the project’s purpose, target users, tech stack, architecture, and constraints, then generates specs/mission.md. This is a one-time setup that gives every future session the context it needs.

Plan#

When you’re ready to build a feature, run /speq:plan. The agent runs a semantic search over the existing spec library for related features, asks you clarifying questions, and produces an implementation plan and spec deltas for your feature. The spec deltas are markdown files with requirements written as BDD-style scenarios using RFC 2119 keywords. Each plan lives in the staging area, specs/_plans/, until implementation is complete.

This is the phase where intent gets clarified. The agent is instructed to conduct a clarifying interview with you so that gaps or vague elements of your prompt are discussed.

Implement#

Run /speq:implement to instruct the coding agent to orchestrate the implementation of a plan. The coding agent will spawn sub-agents that not only generate the code but also make sure that the planned features are working. The verification is done with enforced integration tests and the rule to obtain factual evidence for working software instead of claiming success. During the implementation the agent follows quality guardrails for code style and testing.

Record#

After implementation, /speq:record merges the spec delta into the permanent library in specs/. Your specs accumulate over time, forming a growing knowledge base of what the software does and why.

The spec library#

Specs live in your project as plain markdown files, organized by domain and feature:

specs/
  mission.md
  _plans/          # staging area for in-progress work
  _recorded/       # completed plans, kept for reference
  auth/
    login/
      spec.md
    signup/
      spec.md
  billing/
    invoices/
      spec.md

Two directories have special roles. _plans/ is the staging area where plans with spec deltas live while a feature is being planned and implemented. _recorded/ is where completed plans are moved after the agent merges their deltas into the permanent library. You can review recorded plans to trace how your project evolved.

Each spec uses BDD-style Given/When/Then scenarios with RFC 2119 keywords (SHALL, SHOULD, MAY) to express requirements at the right level of precision. For example:

### Scenario: expired token

- Given a user with an expired authentication token
- When the user requests a protected resource
- Then the system SHALL return a 401 status code
- And the system SHALL include a `WWW-Authenticate` header

See writing specs for AI coding agents for the full guide on this format.

The library grows over time as plans get recorded. After a few features, you have a searchable knowledge base of what the software does and why.

The CLI#

The speq CLI is the backbone that the agent calls during planning and implementation. It provides two core capabilities:

Semantic search (speq search) — indexes the spec library with Snowflake Arctic Embed, a compact embeddings model (~23 MB). The agent queries for relevant specs instead of loading everything, keeping the context window focused on what matters for the current task.
Structure validation (speq feature validate, speq plan validate) — enforces that specs follow the required BDD/RFC 2119 format. The agent runs validation after writing specs to catch structural issues early.

The CLI also offers speq domain list and speq feature list for navigating the spec library.

Bundled skills#

speq-skill bundles three core skills alongside the workflow:

Code navigation via Serena, for semantic code exploration
External documentation via Context7, for pulling in up-to-date library docs
Code quality guardrails to enforce clean code and Test-Driven Development

Serena and Context7 are two of the most popular MCP servers in the Claude Code ecosystem. The skills teach the agent when and how to use them effectively, so you get the benefits of both tools without having to prompt for them manually.

Get started#

speq-skill is open source and free to use. Installation instructions and full documentation are on GitHub.

If you’re new to spec-driven development, the three-part blog series provides the foundation:

Writing specs for AI coding agents

2026-02-01T00:00:00+00:00

Vague specs produce vague code. When working with AI coding agents, the quality of your specifications directly determines the quality of the output. This post covers how to write specs that leave no room for interpretation.

If you’re new to spec-driven development, start with the introduction for context on what it is and which tools exist.

The problem with prose#

Vibe coding works like this: you describe what you want in conversational prose, the AI generates code, you iterate until it looks right. The problem is that natural language is ambiguous. “The system should handle invalid login attempts appropriately” means different things to different people. The AI will interpret it however its training suggests, which may not match your intent.

Spec-driven development solves the drift problem by making specifications the primary artifact.

But it raises a new question: how do you structure specs for both clarity and effectiveness?

By borrowing patterns from Behavior-Driven Development (BDD) and the RFC 2119 requirement keywords, you can write specs that are both human-readable and machine-parseable.

A structure borrowed from BDD#

BDD is a software development practice that emerged from Test-Driven Development. Where TDD focuses on testing implementation, BDD focuses on specifying behavior from the user’s perspective. Teams write specifications in natural language that both stakeholders and developers can understand, then automate those specifications as tests.

Gherkin is the structured language BDD uses. It provides keywords like Feature, Scenario, Given, When, Then, and And that make specifications both readable and executable. Tools like Cucumber parse Gherkin files and run them as automated tests.

The core pattern is Given-When-Then:

Given some initial context
When an action occurs
Then expect this outcome

For spec-driven development with AI agents, we adapt this to Markdown using Requirements, Scenarios, and WHEN-THEN-AND blocks. This preserves the clarity of Gherkin while staying in a format that works everywhere.

The anatomy of a spec file#

Create one spec file per feature. Each spec file contains:

Purpose: A summary of the feature and why it matters
Requirements: High-level capabilities the feature must have
Scenarios: Specific behaviors under each requirement

Here’s the structure:

# Feature Name Specification

## Purpose

One paragraph describing what this spec covers and why it matters.

## Requirements

### Requirement: Capability Name

Brief description of the requirement.

#### Scenario: Specific behavior

- **WHEN** a specific condition occurs
- **THEN** it SHALL do something specific
- **AND** it SHALL also do this other thing
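
Because the structure is regular, a few lines of shell can catch scenarios that are missing a trigger or an outcome. The sketch below is only an illustration of that idea: `check_scenarios` is a hypothetical helper, not part of any SDD tool, and it assumes scenarios are written exactly as in the template above.

```shell
# check_scenarios is a hypothetical helper, not part of any SDD tool.
# It reports every "#### Scenario:" block that lacks a **WHEN** or a **THEN** line
# and returns a non-zero status if it finds one.
check_scenarios() {
    awk '
        function report() {
            if (name != "" && !(has_when && has_then)) {
                print "incomplete: " name
                bad = 1
            }
        }
        /^#### Scenario:/ { report(); name = substr($0, 16); has_when = 0; has_then = 0; next }
        /^- \*\*WHEN\*\*/ { has_when = 1 }
        /^- \*\*THEN\*\*/ { has_then = 1 }
        END { report(); exit bad + 0 }
    ' "$1"
}
```

Calling `check_scenarios path/to/spec.md` prints nothing for a well-formed file and one `incomplete:` line per scenario otherwise.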

RFC 2119 keywords#

RFC 2119 defines requirement keywords used in Internet standards. These keywords eliminate ambiguity about whether something is mandatory, recommended, or optional:

Keyword	Alternative	Meaning
SHALL	MUST	Absolute requirement
SHALL NOT	MUST NOT	Absolute prohibition
SHOULD	RECOMMENDED	Strong recommendation, but valid exceptions may exist
SHOULD NOT	NOT RECOMMENDED	Strong discouragement, but valid exceptions may exist
MAY	OPTIONAL	Truly optional

I found that using the full set matters. Popular SDD frameworks like OpenSpec default to SHALL only. This works for core requirements, but real systems have nuance:

Performance optimizations that are nice to have but not critical
Security hardening that depends on deployment context
Convenience features that some users want and others don’t

Using only SHALL forces you to either make everything mandatory or omit optional behaviors entirely. The full RFC 2119 vocabulary lets you express degrees of importance.

Real examples#

Example: Database driver authentication#

This example is adapted from exarrow-rs, an ADBC driver for Exasol databases.

### Requirement: Authentication

The system SHALL implement Exasol's authentication mechanisms securely.

#### Scenario: Username and password authentication

- **WHEN** authenticating with username and password
- **THEN** it SHALL send credentials securely over the connection
- **AND** it SHALL support encrypted password transmission

#### Scenario: Authentication failure

- **WHEN** authentication fails
- **THEN** it SHALL return an error with the failure reason
- **AND** it SHALL close the connection
- **AND** it SHALL NOT retry automatically to avoid account lockout

Notice the SHALL NOT in the last scenario. This explicitly prohibits automatic retry, which prevents the agent from adding “helpful” retry logic that could lock out users.

Example: Calculator expression evaluation#

This example is adapted from crabculator, a terminal-based calculator.

### Requirement: Expression Evaluation

The system SHALL evaluate mathematical expressions and return results.

#### Scenario: Evaluate arithmetic expression

- **WHEN** a valid arithmetic expression is evaluated (e.g., `5 + 3 * 2`)
- **THEN** it SHALL return the computed numeric result (e.g., `11`)

#### Scenario: Evaluate expression with parentheses

- **WHEN** an expression with parentheses is evaluated (e.g., `(5 + 3) * 2`)
- **THEN** it SHALL respect operator precedence and grouping (e.g., `16`)

#### Scenario: Evaluate invalid expression

- **WHEN** an invalid expression is evaluated (e.g., `5 + + 3`, `5 / 0`)
- **THEN** it SHALL return an error with a descriptive message
- **AND** it SHOULD indicate the position of the error in the expression
- **AND** it MAY suggest corrections for common mistakes

This scenario mixes all three requirement levels:

Returning results and errors is mandatory (SHALL)
Indicating error position is recommended (SHOULD)
Suggesting corrections is optional (MAY)

Writing effective scenarios#

Be specific about triggers#

Bad:

- **WHEN** there's an error
- **THEN** it SHALL handle it appropriately

Good:

- **WHEN** the database returns error code 42000 (syntax error)
- **THEN** it SHALL wrap it in a QuerySyntaxError
- **AND** it SHALL include the original SQL in the error message
- **AND** it SHALL NOT include credentials in error output

Cover edge cases explicitly#

If you don’t specify behavior for edge cases, the agent will guess:

#### Scenario: Empty result set

- **WHEN** a SELECT query returns zero rows
- **THEN** it SHALL return an empty RecordBatch
- **AND** it SHALL include the schema in the empty batch
- **AND** it SHALL NOT return null or throw an exception

#### Scenario: Null values in results

- **WHEN** result data contains NULL values
- **THEN** it SHALL preserve nulls in the Arrow representation
- **AND** it SHALL NOT substitute default values for nulls

State what should NOT happen#

Prohibitions are as important as requirements:

#### Scenario: Error logging

- **WHEN** logging connection errors
- **THEN** it SHALL log the error type and message
- **AND** it SHALL NOT log passwords
- **AND** it SHALL NOT log full connection strings
- **AND** it SHALL NOT expose stack traces in production mode

Putting it together#

Effective specs for AI coding agents:

Use structured format: Requirements → Scenarios → WHEN-THEN-AND
Apply RFC 2119 keywords: SHALL, SHOULD, MAY (and their negatives)
Be specific: Name error codes, specify thresholds, define exact behaviors
Cover edge cases: Empty results, null values, timeouts, failures
State prohibitions: What the system SHALL NOT do is as important as what it SHALL do
Keep specs alive: Specs should evolve with the code, not be discarded after implementation

The investment in writing precise specs pays off in reduced back-and-forth, fewer misinterpretations, and code that actually matches your intent.

Benchmarking exarrow-rs: Rust vs Python for Exasol

2026-01-25T00:00:00+00:00

I built exarrow-rs using spec-driven development.

Now, the question is: is it any good?

Time to find out. I benchmarked exarrow-rs against PyExasol, the official Python driver.

The results#

They say the proof is in the pudding:

Operation	exarrow-rs	PyExasol	Difference
Parquet Import (1GB)	7.5s	13.0s	+42% faster
Parquet Import (100MB)	1.08s	1.39s	+30% faster
Polars Streaming (100MB)	1.2s	1.7s	+37% faster
CSV Import (100MB)	0.78s	0.87s	+10% faster

The Parquet import stands out: at 1GB scale, exarrow-rs finishes 42% faster compared to PyExasol.
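
For readers who want to recompute the Difference column, here is a small sketch. It assumes the column means the reduction in elapsed time relative to PyExasol, which is my reading and not stated in the post; `percent_less_time` is a hypothetical helper. The published timings are rounded, so not every row may reproduce exactly from the table alone.

```shell
# Percent less wall-clock time: (slow - fast) / slow * 100, rounded to a whole number.
# percent_less_time is a hypothetical helper; both arguments are seconds.
percent_less_time() {
    awk -v fast="$1" -v slow="$2" 'BEGIN { printf "%.0f\n", (slow - fast) / slow * 100 }'
}

percent_less_time 7.5 13.0    # Parquet import, 1GB
percent_less_time 0.78 0.87   # CSV import, 100MB
```

With the table's timings this prints 42 for the 1GB Parquet row and 10 for the CSV row.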

How I ran the benchmarks#

Setup#

Machine: MacBook Pro (M5)
Database: Exasol Docker (2025.2)
Data sizes: 100MB and 1GB
Iterations: 5 per benchmark (plus 1 warmup)

Test data#

Both drivers imported identical files into the same table schema:

CREATE TABLE benchmark.benchmark_data (
    id BIGINT,
    name VARCHAR(100),
    email VARCHAR(200),
    age INTEGER,
    salary DECIMAL(12,2),
    created_at TIMESTAMP,
    is_active BOOLEAN,
    description VARCHAR(1000)
)

The 100MB dataset contains 419K rows. The 1GB dataset contains 4.3M rows.

Operations tested#

CSV Import: Read CSV file, insert into Exasol
Parquet Import: Read Parquet file, insert into Exasol
SELECT to Polars: Query data from Exasol, stream into a Polars DataFrame

Why the difference?#

Three factors explain exarrow-rs’s performance:

Native Arrow format. exarrow-rs implements ADBC (Arrow Database Connectivity). Data stays in Arrow’s columnar format throughout. No row-to-column conversions, no Python object overhead.

Rust’s memory model. No garbage collection pauses. Predictable allocations. When processing millions of rows, these details matter.

Direct Polars integration. Since both exarrow-rs and Polars use Arrow internally, data transfers are zero-copy. PyExasol can’t achieve this because it must bridge Python’s object model.

What these benchmarks don’t show#

Network latency. My tests used a local Docker container. Over a network, the driver’s efficiency matters less.

Concurrency. I tested single-threaded imports. Production workloads often parallelize across multiple connections.

Running the benchmarks yourself#

The benchmark code is available in the repository:

# Clone the repo
git clone https://github.com/marconae/exarrow-rs
cd exarrow-rs

# Start Exasol Docker
./scripts/setup_docker.sh

# Generate test data
cargo run --release --features benchmark --bin generate_data -- --size 1gb

# Run benchmarks
./scripts/run_benchmarks.sh

You can adjust iterations, data sizes, and which operations to run:

# Run only Parquet benchmarks with 10 iterations
FORMATS="parquet" ITERATIONS=10 ./scripts/run_benchmarks.sh

Summary#

exarrow-rs delivers measurable performance gains over PyExasol:

Parquet imports: 30-42% faster
Polars streaming: 37% faster
CSV imports: 6-10% faster

The benchmark shows that exarrow-rs is a viable alternative to PyExasol for data engineering workloads in Rust.

TIL: Claude Code's task management tools

2026-01-25T00:00:00+00:00

Claude Code includes internal task management tools to manage and track multistep work.

TaskCreate#

Creates a task in Claude’s internal task list. Each task tracks progress through pending, in_progress, and completed states.

TaskCreate(
  subject: "Add user authentication",
  description: "Implement JWT-based auth with refresh tokens",
  activeForm: "Adding user authentication"
)

Key fields:

subject: imperative form (“Add feature”)
description: detailed requirements and acceptance criteria
activeForm: present continuous shown in spinner while working (“Adding feature”)

TaskUpdate#

Updates task status or properties. Mark tasks in_progress before starting work, completed when done.

TaskUpdate(taskId: "1", status: "in_progress")

TaskUpdate(taskId: "1", status: "completed")

Dependencies control execution order:

addBlockedBy: tasks that must complete first
addBlocks: tasks waiting on this one

TaskUpdate(taskId: "2", addBlockedBy: ["1"])
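
To make the blocking semantics concrete, here is a minimal sketch of how a task list could decide which tasks are ready to start. It is an illustration only, not Claude Code's implementation: the `Task` struct, the `Status` enum, and `ready_tasks` are hypothetical names, and the only assumption is that a pending task becomes available once every task it is blocked by is completed.

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Pending,
    InProgress,
    Completed,
}

// Hypothetical in-memory task record; the real tool stores more fields.
#[derive(Debug)]
struct Task {
    status: Status,
    blocked_by: Vec<String>,
}

/// Returns the ids of pending tasks whose blockers have all completed,
/// sorted so the output is stable.
fn ready_tasks(tasks: &HashMap<String, Task>) -> Vec<String> {
    let completed: HashSet<&str> = tasks
        .iter()
        .filter(|(_, t)| t.status == Status::Completed)
        .map(|(id, _)| id.as_str())
        .collect();

    let mut ready: Vec<String> = tasks
        .iter()
        .filter(|(_, t)| t.status == Status::Pending)
        .filter(|(_, t)| t.blocked_by.iter().all(|b| completed.contains(b.as_str())))
        .map(|(id, _)| id.clone())
        .collect();
    ready.sort();
    ready
}

fn main() {
    let mut tasks = HashMap::new();
    tasks.insert("1".to_string(), Task { status: Status::Pending, blocked_by: vec![] });
    tasks.insert("2".to_string(), Task { status: Status::Pending, blocked_by: vec!["1".to_string()] });

    println!("{:?}", ready_tasks(&tasks)); // task 2 is still blocked by task 1

    tasks.get_mut("1").unwrap().status = Status::InProgress;
    println!("{:?}", ready_tasks(&tasks)); // nothing is ready while task 1 runs

    tasks.get_mut("1").unwrap().status = Status::Completed;
    println!("{:?}", ready_tasks(&tasks)); // task 2 is unblocked
}
```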

TaskList and TaskGet#

TaskList returns all tasks with summary info: id, subject, status, owner, and blockedBy. Use it to see available work and overall progress.

TaskGet fetches full details for a specific task including the complete description and dependency graph.

Task (sub-agents)#

The Task tool spawns specialized sub-agents that work autonomously and return results. Each agent type has specific capabilities and tool access.

Task(
  subagent_type: "Explore",
  description: "Find auth implementation",
  prompt: "Locate all authentication-related code"
)

Available agent types:

Explore: fast codebase search and exploration
Plan: implementation strategy design
Bash: command execution
general-purpose: multi-step research tasks

Sub-agents run in their own context window, isolated from the parent conversation. They return a single result when complete. This isolation means they don’t consume the parent’s context budget, but they also can’t see ongoing conversation state unless you include it in the prompt.

You can run multiple agents in parallel by issuing multiple Task calls in one message. Use run_in_background: true to continue working while an agent runs.

How to invoke#

Claude Code uses these tools automatically when appropriate. To encourage their use, prompt for multi-step workflows explicitly:

“Break this down into tasks and track progress”
“Create a task list for implementing this feature”
“Use sub-agents to explore the codebase in parallel”

Conclusion#

Task tools bring structure to complex work. They make progress visible, enable parallel execution via sub-agents, and help Claude maintain focus across multi-step implementations.

TIL: agent-browser: headless browser CLI for AI agents

2026-01-18T00:00:00+00:00

agent-browser is a Rust-based CLI that gives AI agents browser control through simple commands. It uses accessibility trees for semantic element selection instead of brittle CSS selectors.

Install and set up:

npm install -g agent-browser
agent-browser install  # downloads Chromium

Basic workflow:

agent-browser open example.com
agent-browser snapshot              # get accessibility tree with refs
agent-browser click @e2             # click element by reference
agent-browser fill @e3 "email@test.com"
agent-browser screenshot page.png
agent-browser close

Find elements semantically:

agent-browser find role button click --name "Submit"
agent-browser find label "Email" fill "user@test.com"

The --profile flag persists login sessions across runs. The --session flag isolates browser instances.

A cure for the vibe coding hangover

2026-01-14T00:00:00+00:00

Agentic coding tools like Claude Code or OpenAI Codex are fascinating. An AI that reads your code, understands your project structure, and makes changes directly. Features that would take days appear in hours.

You control the coding agents by prompting. That’s what’s widely known as vibe coding. During my first sessions with Claude Code I did exactly that: I went from prompt to prompt. But after a while I lost track of where I was going, and so did the coding agent. I call it “the vibe coding hangover”.

The vibe coding hangover#

As my projects grew, the agent lost track of components it had built. It duplicated functionality, contradicted earlier design decisions, and solved the same problem three different ways.

I was going from prompt to prompt with no clear start and no clear end. No artifact captured what I wanted. Every conversation started from near-zero context. The AI was writing code, but nobody was writing down the requirements.

Vibe coding is building without blueprints. It works for small things, but it falls apart in larger projects or codebases.

So what to do about the hangover? Spec-driven development: write down what you want to build before the AI writes code. Use those specifications to guide and constrain implementation. I have written a blog post on spec-driven development that explains the methodology and compares three open-source projects in the space.

With this blog post, I want to share six takeaways from using SDD with Claude Code:

Lessons Learned#

Lesson 1: vibe coding doesn’t scale#

It’s fine for prototypes, experiments, and small scripts. But without formalizing intent, you accumulate technical debt with every prompt. The AI fills gaps with assumptions and does not know about features or constraints introduced by earlier prompts.

After a few vibe coding prompts, I had a working prototype and no clear understanding of what it was supposed to do. The code worked, but the why behind decisions disappeared. Refactoring meant re-explaining everything from scratch.

Lesson 2: AI agents need constraints#

The same capability that makes agents powerful—their ability to make decisions and fill in details—is also their weakness. Without constraints, output drifts from your intent. How should the coding agent know what to build if the intent is not communicated?

Think of specs as guardrails. Not bureaucratic overhead, but a way to prevent the agent from “helping” in directions you didn’t want. The more explicit the spec, the better the output.

Lesson 3: a long-living spec library is essential#

A well-maintained spec library becomes the memory that AI agents lack. It answers “what did we build and why?”

Without evolving specs, the agent loses the overview. It can read your code, but code doesn’t capture intent. Why was this trade-off made? What edge cases were considered? Specs hold that context.

Lesson 4: start with a project mission#

Create a project mission, e.g., a mission.md file, within your project. The mission lays the foundation for the project and guides the agent’s decisions.

Mine typically includes: project purpose, target personas, tech stack, development tools, how the agent should test, project structure, high-level architecture, and security/performance constraints.

The best part: the agent will help you create it. Add that list of topics to your prompt and ask it to build the mission file with you. You’ll have a solid foundation in minutes.

With a proper mission file, you stop repeating yourself. The agent knows the tech stack. It knows important constraints and patterns for your project. Your feature specs can focus on what to build.

I instruct the agent to read the mission file before starting to iterate on a new or existing feature spec.

Lesson 5: clarifying intent is the real work#

The most valuable part of SDD isn’t the format or the tooling. It’s the forcing function: you must articulate what you want before implementation begins.

The intent doesn’t need to be technical. Describe behavior and requirements as users would experience the software. Be aware that the agent will fill any gaps itself, and its choices may differ from what you actually want. To avoid this, I now add “Ask me clarifying questions” to every prompt that iterates on a spec. This surfaces areas where my own thinking is vague. The questions the AI asks often reveal requirements I hadn’t considered.

Lesson 6: structure makes you faster#

Here’s what surprised me most: adding structure made me faster, not slower.

Planning before coding feels like overhead when you have an AI that can generate code in seconds. But the time I “lost” iterating on specs, I gained back tenfold by avoiding rework, confusion, and dead ends.

Conclusion#

The cure for the vibe coding hangover is more discipline. Spec-driven development provides that discipline:

It captures the intent in a structured and systematic way so that both humans and coding agents can not only understand but also reference it.
A long-living spec library prevents drift and preserves decisions and requirements throughout the coding sessions.
A project mission sets the foundation and summarizes the guardrails for the project.

Spec-driven development: an introduction

2026-01-08T00:00:00+00:00

AI coding agents are powerful, but they drift without constraints. Spec-driven development (SDD) solves this by making specifications—not code—the primary artifact.

After hitting the limits of unstructured prompting, I spent time with three SDD frameworks to find what works. Here’s what I learned.

What is spec-driven development?#

Spec-driven development splits the workflow into two distinct phases: planning and implementation.

In the planning phase, human and coding agent collaborate to capture intent. You discuss what you’re building, why it matters, and how it should behave. The output isn’t code—it’s a collection of markdown files that form a memory bank. The memory bank consists of a project mission, a set of accepted feature specs, and a set of rules governing how the agent should behave.

In the implementation phase, the coding agent works from this memory bank to generate source code. The specs constrain what gets built. The rules constrain how it gets built. The agent has everything it needs without hallucinating requirements or drifting from intent.

Maturity levels#

Birgitta Böckeler’s exploration identifies three maturity levels for SDD:

Spec-first: Specifications precede code generation for each task. The spec guides the AI, then gets discarded or archived.
Spec-anchored: Specs persist after implementation, evolving alongside features. They become living documentation the agent references in future sessions.
Spec-as-source: The most ambitious level—specs replace code as the primary maintainable artifact. Humans edit specs, never code. The AI regenerates implementation from specs.

Most tools today operate at levels one or two. Level three remains largely aspirational.

What tools exist?#

The tooling landscape is young and moving fast. Here’s what I found after exploring three open-source SDD frameworks with Claude Code.

GitHub Spec-Kit#

github/spec-kit

Spec-Kit is GitHub’s open-source toolkit for SDD. It implements a phased workflow where human-written specifications define the “what” and “why” before code generation begins.

What it provides:

CLI for managing specs and phases
Templates for structuring specifications
Prompts designed for AI coding agents
Integration with Copilot, Claude, and Gemini

How it works: Spec-Kit organizes development into explicit phases. You write specs first, get them reviewed, then hand off to the AI for implementation. The CLI guides you through each phase.

Standout feature: With 60k+ stars, it has the largest community. The structured phases enforce discipline without being overly prescriptive.

Fission-AI OpenSpec#

Fission-AI/OpenSpec

OpenSpec is a lightweight SDD framework with a similar philosophy to Spec-Kit but a different execution model. It emphasizes a permanent memory bank of accepted specifications.

What it provides:

Structured specification workflow
Change proposals with explicit task breakdowns
Dedicated folders for specs, proposals, and tasks
Deterministic, auditable outputs

How it works: You create change proposals that reference existing specs. Each proposal breaks down into tasks. When implementation completes, accepted specs get merged into the permanent memory bank. This creates a growing knowledge base the agent can reference.

Standout feature: The permanent memory bank. As your project grows, the agent always has access to accepted specifications. This works well for brownfield projects where context accumulates over time.

AgentOS#

buildermethods/agent-os

AgentOS takes a broader approach, defining patterns and best practices for AI agent workflows. Specs act as documentation and guardrails rather than the sole source of truth.

What it provides:

Rich workflow patterns for AI agents
Built-in support for front-end development
Visual specification support (mockups, diagrams)
Audit trail of implemented changes

How it works: Unlike OpenSpec, AgentOS doesn’t maintain a permanent spec library. Instead, it keeps implemented changes in folders, making the development process auditable after the fact. You can include visuals as part of your intent specification.

Standout feature: AgentOS comes with a strong set of pre-defined rules and best practices. It also offers visual specification support. If you’re building UI-heavy applications, being able to include mockups alongside text specs is valuable.

Comparison table#

Aspect	Spec-Kit	OpenSpec	AgentOS
Philosophy	Phased workflow	Permanent memory bank	Audit trail
Spec persistence	Per-project	Accumulating library	Change-based
Visual specs	No	No	Yes
Community size	60k+ stars	16k+ stars	3k+ stars
License	MIT	MIT	MIT

Which should you choose?#

I started with AgentOS and made good progress, but after a few sessions I found myself missing a permanent memory bank. This led me to OpenSpec, which immediately resonated: it’s lightweight, requires no additional API keys, and breaks change proposals into persistent spec files—each describing a single capability of your software.

But I also found that the tool matters less than the discipline.

These tools differ in implementation details, but they share what matters: forcing you to write down intent before implementation begins.

The space is young. Tools will evolve, merge, or disappear. What won’t change is the underlying principle: AI agents need structured constraints to produce consistent results.

TIL: Save and load Docker images as zipped tarballs

2026-01-05T00:00:00+00:00

To move a Docker image from one host to another without a registry, save it to a tarball and compress it.

(1) Save the image as a tarball and compress it with gzip:

docker save -o image.tar image-name
gzip -k image.tar

(2) Load the tarball on the other machine:

gunzip -k image.tar.gz
docker load -i image.tar

TIL: Using Playwright MCP with Claude Code for browser automation

2025-12-22T00:00:00+00:00

Simon Willison’s TIL introduced me to using Playwright MCP with Claude Code. The setup is simple:

claude mcp add playwright npx '@playwright/mcp@latest'

This gives Claude Code 25+ browser automation tools: navigation, clicking, typing, screenshots, and more. The killer feature is that the browser window is visible, so you can manually authenticate while Claude watches and continues working with your session cookies.

Prompting#

When prompting, explicitly mention “playwright mcp” to ensure Claude uses these tools instead of attempting bash-based browser automation.

Use playwright mcp to verify the UI changes.

Auto-approve all Playwright tools#

By default, Claude Code asks permission for each Playwright tool invocation. To auto-approve all of them, add this to your project’s .claude/settings.local.json:

{
  "permissions": {
    "allow": [
      "mcp__playwright__*"
    ]
  }
}

The * wildcard matches all Playwright MCP tools (browser_navigate, browser_click, browser_snapshot, etc.), making the workflow much smoother for browser-heavy tasks.

Key tools#

Tool	Purpose
`browser_navigate`	Load a URL
`browser_snapshot`	Capture accessibility tree
`browser_click`	Click elements
`browser_type`	Type into inputs

Claude Code cheat sheet

2025-12-15T00:00:00+00:00

For me, one of the most important events in 2025 was the quiet release of Claude Code.

Anthropic didn’t even give it a proper launch. They mentioned it as a secondary item in their post announcing Claude 3.7 Sonnet:

“Claude Code is an active collaborator that can search and read code, edit files, write and run tests, commit and push code to GitHub, and use command line tools—keeping you in the loop at every step.”

In short: a coding agent that writes code, executes it, inspects the results, and iterates until it reaches a desired state. Claude Code offers many ways to customize its behavior. This guide covers six key concepts: CLAUDE.md, slash commands, rules, configuration, sub-agents, and skills.

CLAUDE.md#

CLAUDE.md is a special file that Claude automatically loads into context at the start of every conversation. Use it to document project conventions, build commands, code style, and repository etiquette.

Locations#

Claude Code reads CLAUDE.md files from multiple locations, in order of precedence:

Location	Scope	Git
`./CLAUDE.md`	Project root	Commit and share with team
`./CLAUDE.local.md`	Project root	Gitignore for personal config
`../CLAUDE.md`	Parent directories	Useful for monorepos
`~/.claude/CLAUDE.md`	User home	Personal defaults across all projects

Press # during a session to add instructions directly to your CLAUDE.md.

Example#

# Build Commands

- npm run build: Build the project
- npm run test: Run tests

# Code Style

- Use ES modules (import/export), not CommonJS
- Destructure imports when possible

# Workflow

- Run typecheck after making code changes
- Prefer single tests over full suite for performance

Slash commands#

Slash Commands capture repeated prompts as reusable templates stored in markdown files. They appear in the autocomplete menu when you type /.

Locations#

Project commands: .claude/commands/ — available to the current project
Personal commands: ~/.claude/commands/ — available across all projects

Example#

Create .claude/commands/fix-issue.md:

---
description: Analyze and fix a GitHub issue
allowed-tools: Read, Write, Bash
---

Analyze and fix GitHub issue: $ARGUMENTS

Steps:

1. Use `gh issue view $ARGUMENTS` to get details
2. Search the codebase for relevant files
3. Implement the fix
4. Write tests to verify
5. Run linting and type checking

Usage: /fix-issue 1234

The $ARGUMENTS keyword passes everything after the command name into the prompt.
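
The substitution can be pictured as plain text replacement. The following is a minimal sketch under that assumption, not Claude Code's actual implementation; `expand_command` is a hypothetical helper name.

```rust
/// Replaces every `$ARGUMENTS` placeholder in a command template
/// with the text the user typed after the command name.
fn expand_command(template: &str, arguments: &str) -> String {
    template.replace("$ARGUMENTS", arguments.trim())
}

fn main() {
    let template = "Analyze and fix GitHub issue: $ARGUMENTS\n\
                    1. Use `gh issue view $ARGUMENTS` to get details";

    // Simulates typing `/fix-issue 1234`.
    println!("{}", expand_command(template, "1234"));
}
```

Every occurrence of the placeholder is replaced, so the issue number appears both in the heading line and in the `gh issue view` step.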

Rules#

Rules let you organize project instructions into multiple focused files instead of one large CLAUDE.md. All markdown files in .claude/rules/ are automatically loaded into context at startup.

Locations#

Project rules: .claude/rules/ — shared with team via git
User rules: ~/.claude/rules/ — personal rules across all projects

User-level rules load first; project rules take higher priority.

Structure#

.claude/rules/
├── code-style.md      # Formatting conventions
├── testing.md         # Test requirements
├── security.md        # Security checklist
└── frontend/
    ├── react.md       # React patterns
    └── styling.md     # CSS conventions

Conditional Rules#

Scope rules to specific files using YAML frontmatter with the paths field:

---
paths: src/api/**/*.ts
---

# API Development Rules

- All endpoints must validate input
- Use standard error response format
- Include OpenAPI documentation comments

This rule only activates when Claude works on files matching src/api/**/*.ts. Rules without a paths field apply to all files.
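
To illustrate how such a path pattern selects files, here is a small self-contained glob matcher. It is a simplified sketch, not the matcher Claude Code uses: it assumes `*` matches any run of characters inside one path segment and `**` matches zero or more whole directories, and the function names are hypothetical.

```rust
/// Matches one path segment against a pattern segment,
/// where `*` stands for any run of characters.
fn segment_matches(pattern: &str, segment: &str) -> bool {
    match pattern.split_once('*') {
        None => pattern == segment,
        Some((prefix, rest)) => match segment.strip_prefix(prefix) {
            None => false,
            // Try every possible length for the text consumed by `*`.
            Some(remaining) => (0..=remaining.len())
                .filter(|&i| remaining.is_char_boundary(i))
                .any(|i| segment_matches(rest, &remaining[i..])),
        },
    }
}

/// Matches whole paths segment by segment; `**` stands for zero or more directories.
fn glob_matches(pattern: &[&str], path: &[&str]) -> bool {
    match (pattern.first(), path.first()) {
        (None, None) => true,
        (Some(&"**"), _) => {
            // Either `**` consumes nothing, or it swallows one more directory.
            glob_matches(&pattern[1..], path)
                || (!path.is_empty() && glob_matches(pattern, &path[1..]))
        }
        (Some(p), Some(s)) => segment_matches(p, s) && glob_matches(&pattern[1..], &path[1..]),
        _ => false,
    }
}

/// Decides whether a rule with the given `paths` pattern applies to a file.
fn rule_applies(paths: &str, file: &str) -> bool {
    let pattern: Vec<&str> = paths.split('/').collect();
    let path: Vec<&str> = file.split('/').collect();
    glob_matches(&pattern, &path)
}

fn main() {
    let paths = "src/api/**/*.ts";
    for file in ["src/api/users/list.ts", "src/api/index.ts", "src/web/app.ts"] {
        println!("{file}: {}", rule_applies(paths, file));
    }
}
```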

Example#

Create .claude/rules/testing.md:

---
paths: "**/*.test.ts"
---

# Testing Standards

- Use descriptive test names: "should [action] when [condition]"
- One assertion per test when possible
- Mock external dependencies
- Include edge cases
- Run tests before committing

Configuration#

Configuration controls permissions, environment variables, and behavioral settings via settings.json. This determines what Claude can do without asking for confirmation.

Locations#

File	Scope
`~/.claude/settings.json`	User-level defaults
`.claude/settings.json`	Project-level, shared via git
`.claude/settings.local.json`	Project-level, gitignored

Project settings override user settings.
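
The override behavior can be sketched as a layered merge. This is a simplification under a stated assumption: each layer is a flat key/value map and the last layer wins per key. How Claude Code combines nested values such as permission lists is not covered here, and the keys and values below are hypothetical.

```rust
use std::collections::HashMap;

/// Merges setting layers in order; a key in a later layer
/// replaces the same key from an earlier one.
fn merge_settings(layers: &[HashMap<&str, &str>]) -> HashMap<String, String> {
    let mut merged = HashMap::new();
    for layer in layers {
        for (key, value) in layer {
            merged.insert(key.to_string(), value.to_string());
        }
    }
    merged
}

fn main() {
    // Hypothetical keys and values, for illustration only.
    let user = HashMap::from([("model", "sonnet"), ("theme", "dark")]);
    let project = HashMap::from([("model", "opus")]);

    // User-level defaults first, project-level settings second.
    let effective = merge_settings(&[user, project]);
    println!("model = {}", effective["model"]); // the project value wins
    println!("theme = {}", effective["theme"]); // inherited from the user layer
}
```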

Example#

{
  "permissions": {
    "allow": [
      "Bash(npm run lint)",
      "Bash(npm run test:*)",
      "Read(./src/**)",
      "Write(./src/**)"
    ],
    "deny": [
      "Bash(rm -rf *)",
      "Bash(git push:*)",
      "Read(./.env)",
      "Read(./secrets/**)"
    ]
  }
}

Permissions use glob patterns. Bash(npm run test:*) matches npm run test:unit, npm run test:integration, etc. Deny rules take precedence over allow rules.
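
The deny-before-allow order can be sketched as follows. This is a deliberately simplified model, not Claude Code's matcher: rules are written as bare command patterns without the `Bash(...)` wrapper, a trailing `*` is treated as a prefix wildcard, and the `Ask` fallback for unmatched requests is an assumption based on the confirmation prompt described above. All names are hypothetical.

```rust
#[derive(Debug, PartialEq)]
enum Decision {
    Allow,
    Deny,
    Ask,
}

/// Simplified matcher: a trailing `*` matches any suffix;
/// otherwise the rule must match the request exactly.
fn rule_matches(rule: &str, request: &str) -> bool {
    match rule.strip_suffix('*') {
        Some(prefix) => request.starts_with(prefix),
        None => rule == request,
    }
}

/// Deny rules are checked first, so they win over any matching allow rule.
fn decide(allow: &[&str], deny: &[&str], request: &str) -> Decision {
    if deny.iter().any(|rule| rule_matches(rule, request)) {
        Decision::Deny
    } else if allow.iter().any(|rule| rule_matches(rule, request)) {
        Decision::Allow
    } else {
        Decision::Ask
    }
}

fn main() {
    // Hypothetical rule sets for illustration.
    let allow = ["git *", "npm run test:*"];
    let deny = ["git push*"];

    println!("{:?}", decide(&allow, &deny, "npm run test:unit")); // Allow
    println!("{:?}", decide(&allow, &deny, "git push origin main")); // Deny, although "git *" allows it
    println!("{:?}", decide(&allow, &deny, "cargo build")); // Ask
}
```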

Sub-agents#

Sub-agents are specialized AI assistants with their own context windows and tool permissions. They handle specific tasks without polluting your main conversation.

Locations#

Project agents: .claude/agents/
Personal agents: ~/.claude/agents/

Configuration#

Sub-agent files use YAML frontmatter to define behavior:

---
name: code-reviewer
description: Reviews code for bugs and best practices. Use proactively after implementing features.
tools: Read, Grep, Glob
model: sonnet
---

You are a code reviewer specializing in TypeScript and React.

When invoked:

1. Analyze the code structure and patterns
2. Check for potential bugs and edge cases
3. Verify error handling
4. Review naming conventions
5. Provide actionable feedback

Usage#

Automatic: Claude delegates based on task context when descriptions include phrases like “use proactively”
Explicit: Use the code-reviewer to check my recent changes
Command: /agents to create and manage sub-agents interactively

Built-in sub-agents include Explore (read-only codebase exploration) and a general-purpose agent for multi-step tasks.

Skills#

Skills are directories containing a SKILL.md file plus supporting resources. Unlike sub-agents, Skills use progressive disclosure: Claude loads only the metadata at startup, then reads the full skill when relevant.
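
Progressive disclosure can be pictured as a two-step read: parse only the frontmatter up front, and touch the body later. The sketch below assumes a SKILL.md that starts with a `---`-delimited frontmatter block containing `name` and `description` lines. It is not Claude Code's loader; `SkillMeta`, `split_frontmatter`, and `parse_meta` are hypothetical names, and the parsing is a naive line split rather than real YAML.

```rust
/// Metadata kept in memory for every skill; the body is read only on demand.
#[derive(Debug, PartialEq)]
struct SkillMeta {
    name: String,
    description: String,
}

/// Splits a SKILL.md document into its frontmatter and the remaining body.
fn split_frontmatter(doc: &str) -> Option<(&str, &str)> {
    let rest = doc.strip_prefix("---\n")?;
    rest.split_once("\n---\n")
}

/// Reads only `name` and `description` from the frontmatter (no full YAML parsing).
fn parse_meta(doc: &str) -> Option<SkillMeta> {
    let (front, _body) = split_frontmatter(doc)?;
    let mut name = None;
    let mut description = None;
    for line in front.lines() {
        if let Some((key, value)) = line.split_once(':') {
            match key.trim() {
                "name" => name = Some(value.trim().to_string()),
                "description" => description = Some(value.trim().to_string()),
                _ => {}
            }
        }
    }
    Some(SkillMeta { name: name?, description: description? })
}

fn main() {
    let doc = "---\nname: api-integration\ndescription: Guides for integrating with our internal APIs\n---\n\n# API Integration Skill\n";

    // Startup: keep only the lightweight metadata.
    let meta = parse_meta(doc).expect("frontmatter with name and description");
    println!("{}: {}", meta.name, meta.description);

    // Later, when the skill becomes relevant: read the full instructions.
    let (_front, body) = split_frontmatter(doc).expect("frontmatter delimiters");
    println!("{}", body.trim());
}
```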

Structure#

my-skill/
├── SKILL.md          # Required: name, description, instructions
├── reference.md      # Optional: additional context
├── templates/        # Optional: supporting files
└── scripts/          # Optional: executable code

Example SKILL.md#

---
name: api-integration
description: Guides for integrating with our internal APIs
---

# API Integration Skill

When building API integrations:

1. Read `reference.md` for endpoint specifications
2. Use templates in `templates/` for request/response patterns
3. Run `scripts/validate.py` to check payloads

## Authentication

All requests require Bearer tokens. See `reference.md` for token generation.

## Rate limits

- 100 requests/minute for standard endpoints
- 10 requests/minute for batch operations

Key Differences from Sub-agents#

Aspect	Sub-agents	Skills
Invocation	Delegated tasks with separate context	Auto-loaded based on relevance
Structure	Single markdown file	Directory with supporting files
Context	Own context window	Main conversation context
Use case	Isolated specialized tasks	Domain knowledge and workflows

Summary#

Concept	Purpose	Location
CLAUDE.md	Project context always loaded	Repo root, parent dirs, home
Slash Commands	Reusable prompt templates	`.claude/commands/`
Rules	Modular instructions with path targeting	`.claude/rules/`
Configuration	Permission guardrails	`settings.json`
Sub-agents	Specialized task handlers	`.claude/agents/`
Skills	Progressive domain knowledge	Skill directories with `SKILL.md`

Start with CLAUDE.md for basic project context. Add Rules when CLAUDE.md grows unwieldy. Use Slash Commands for repeated workflows. Configure permissions in settings.json. Use Sub-agents for isolated tasks. Build Skills for complex domain knowledge.

exarrow-rs: an ADBC driver for Exasol in Rust

2025-12-05T00:00:00+00:00

After building tinypw, I wanted a more complex project. And I was curious how to connect to Exasol from Rust. The result: exarrow-rs, an ADBC-compatible database driver that uses Apache Arrow’s columnar format.

Building it taught me more than expected: bridging async and sync APIs, navigating Arrow’s surprisingly deep type system, integrating with the ADBC ecosystem, and using spec-driven development with Claude Code to keep the implementation consistent.

Here’s what I built and what I learned.

Note: exarrow-rs started as a side-project prototype and is now maintained by Exasol Labs.

What it does#

exarrow-rs bridges Exasol databases and the Arrow ecosystem. Instead of row-by-row data transfer (slow for analytical queries), it uses Arrow’s columnar format to move data efficiently. The driver implements ADBC (Arrow Database Connectivity). Think ODBC/JDBC, but designed around Arrow from the ground up.

use exarrow_rs::Driver;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    let user = "sys";
    let password = "exasol";
    let host = "localhost";
    let port = 8563;

    let driver = Driver::new();
    let conn_str = format!("exasol://{user}:{password}@{host}:{port}");
    let database = driver.open(&conn_str)?;
    let connection = database.connect().await?;

    let results = connection.query("SELECT * FROM schema.sales").await?;
    // results is an Arrow RecordBatch - ready for analytics
    Ok(())
}

The interesting bits#

Fully async on Tokio. The driver communicates with Exasol over WebSockets using their native WebSocket API. This required bridging async I/O with ADBC’s synchronous interface.

Type-safe parameter binding. Rust’s type system ensures query parameters match expected types at compile time:

let stmt = connection.prepare("SELECT * FROM users WHERE id = ?").await?;
stmt.bind(0, &user_id)?;
let results = stmt.execute().await?;

Comprehensive type mapping. SQL types map to Arrow types, including edge cases:

Exasol Type	Arrow Type
`BOOLEAN`	`Boolean`
`VARCHAR`	`Utf8`
`DECIMAL(p,s)`	`Decimal128`/`Decimal256`
`TIMESTAMP`	`Timestamp(Microsecond)`
`GEOMETRY`	`Binary` (WKB)
`INTERVAL`	`Interval(MonthDayNano)`

C FFI layer. Build with --features ffi to get a shared library compatible with the ADBC driver manager. The adbc_core and adbc_ffi crates handle the C bindings, so you can load the driver from Python, Go, or any language with ADBC support.

One caveat#

The driver uses Exasol’s WebSocket API, which returns JSON responses. exarrow-rs converts these JSON responses into Arrow batches. It’s an extra conversion step, but the columnar output still integrates cleanly with Arrow-based tools.

What I learned#

Async + synchronous APIs are tricky. ADBC expects a blocking-style interface, but efficient database communication wants to be async. Bridging these two worlds while keeping the API ergonomic was an interesting challenge.

Arrow’s type system is extensive. The Arrow specification covers edge cases I’d never considered. Mapping SQL’s DECIMAL precision to Arrow’s Decimal128 vs Decimal256? Intervals with month, day, and nanosecond components? Each mapping taught me something about both ecosystems.
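
As an illustration of the decimal question: Arrow's Decimal128 holds up to 38 digits of precision and Decimal256 up to 76, so one plausible rule is to pick the narrowest type that fits. The sketch below shows that rule only; whether exarrow-rs applies exactly this logic is an assumption, and `DecimalWidth` and `decimal_width` are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
enum DecimalWidth {
    Decimal128,
    Decimal256,
}

/// Picks the narrowest Arrow decimal type that can hold `precision` digits.
/// Returns `None` when the precision is outside what Arrow supports.
fn decimal_width(precision: u8) -> Option<DecimalWidth> {
    match precision {
        1..=38 => Some(DecimalWidth::Decimal128),
        39..=76 => Some(DecimalWidth::Decimal256),
        _ => None,
    }
}

fn main() {
    println!("{:?}", decimal_width(18)); // fits in 128 bits
    println!("{:?}", decimal_width(40)); // needs 256 bits
    println!("{:?}", decimal_width(0)); // not a valid precision
}
```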

The ADBC ecosystem is well-designed. The adbc_core and adbc_ffi crates handle the C interop layer, so exposing the driver to other languages required implementing traits rather than writing unsafe FFI code.

Spec-driven development pays off. I built exarrow-rs using spec-driven development with Claude Code. Writing specifications before implementation kept the AI agent focused and reduced drift. Each feature had a clear spec, and the resulting code stayed consistent across the codebase.

What is Exasol?#

Exasol is a high-performance, in-memory analytical database designed for data warehousing and BI workloads. It’s an enterprise product, but offers a free personal edition for development and testing.

Summary#

If you want to find out more about exarrow-rs, check out the source code and README on github.com/exasol-labs/exarrow-rs.

TIL: Using Context7 MCP for up-to-date library documentation

2025-12-05T00:00:00+00:00

LLMs have stale training data. When working with Next.js 15, React 19, or any library that evolved after the model’s cutoff date, you get hallucinated APIs and deprecated patterns.

Context7 solves this by pulling version-specific documentation directly into your prompt.

Setup with Claude Code#

claude mcp add context7 -- npx -y @upstash/context7-mcp@latest

For higher rate limits, get a free API key at context7.com/dashboard and add it:

claude mcp add context7 \
  -e CONTEXT7_API_KEY=your-api-key \
  -- npx -y @upstash/context7-mcp@latest

Usage#

Add “use context7” to your prompt:

Use context7. How do I set up middleware in Next.js 15?

Context7 will fetch current documentation before the model responds.

How it works#

Two tools power it:

Tool	Purpose
`resolve-library-id`	Matches library names to Context7 IDs (e.g., “next.js” → “/vercel/next.js/v15.0.0”)
`get-library-docs`	Fetches documentation chunks and code examples (default 5000 tokens)

The documentation comes directly from official sources, ranked by trust score and coverage.

Engine lock-in: the lakehouse anti-pattern

2025-11-15T00:00:00+00:00

Consider this scenario: Multi-terabyte Databricks lakehouse. Dozens of dashboards. High consumption costs. The team wants to evaluate alternative query engines for their BI workloads. Maybe a specialized OLAP engine. Anything to bring costs down.

Digging into the reports shows: Native queries everywhere. Custom calculations and business logic written directly in the SQL dialect of the current engine, hard-coded into the BI layer.

Every one of those queries is written in the dialect of a single engine. And that’s the trap. The moment you embed engine-specific SQL in your front-end layer, you’ve welded your BI tool to that engine. Want to route queries to a faster, cheaper alternative? You can’t. Not without rewriting every report.

These teams adopted a lakehouse for flexibility. Instead, they locked themselves to a single engine through their front-end layer.

I call it the “Native Query Anti-Pattern.”

The lakehouse promise#

Data lakehouses deliver many advantages, but one stands above the rest: an open data architecture that lets multiple query engines operate directly on the same shared data layer.

This openness brings flexibility and competition. You choose different engines for different jobs. One handles transformations, moving data from bronze to silver to gold. Another powers your consumption layer. You’re not locked in.

On Databricks, you might use Spark for transformation pipelines. Spark excels at heavy-duty data processing at scale. But Spark is not ideal for read-only analytical workloads: dozens or hundreds of concurrent users hitting dashboards simultaneously.

Why? Spark is a multi-purpose engine. It uses a hybrid execution model (data-centric code generation, optional vectorization, and Volcano-mode fallback) to handle everything from OLAP queries to semi-structured data processing. This flexibility comes at a cost. For analytical queries, engines built specifically for OLAP, like Databricks Photon, Exasol’s Lakehouse Turbo, Snowflake, or Trino, consistently outperform vanilla Spark.

Analytical queries in the consumption layer drive most of the cost in a lakehouse deployment. This is where engine choice matters most. Under consumption-based pricing, the math is simple:

Slower queries = longer runtime = higher costs.
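
A back-of-the-envelope calculation makes the relationship tangible. The sketch below uses a deliberately naive model: compute is billed only while queries run, queries run one after another, and all numbers are hypothetical. Real pricing models differ, so treat it as an illustration of the proportionality, not as a cost estimate.

```rust
/// Cost of a workload under consumption-based pricing:
/// total runtime in hours times the hourly rate.
fn workload_cost(queries: u32, avg_runtime_secs: f64, rate_per_hour: f64) -> f64 {
    let total_hours = f64::from(queries) * avg_runtime_secs / 3600.0;
    total_hours * rate_per_hour
}

fn main() {
    // Hypothetical numbers: 10,000 dashboard queries per day, billed at 4.00 per hour.
    let slow = workload_cost(10_000, 6.0, 4.0);
    let fast = workload_cost(10_000, 3.0, 4.0);

    println!("slow engine: {slow:.2} per day");
    println!("fast engine: {fast:.2} per day");
}
```

In this model, halving the average query runtime halves the bill, which is why engine choice in the consumption layer matters so much.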

The Native Query Anti-Pattern#

Here’s where teams give away their freedom.

When you implement custom calculations or business logic in your BI tool using native queries, you hardcode against a specific engine. These queries use engine-specific SQL dialects, functions, and syntax. They won’t run anywhere else.

The result: you can’t switch query engines without rewriting every native query in your reports. You’ve traded the lakehouse’s core advantage, engine independence, for short-term convenience.

You are locked in to both the front-end tool and the query engine.

The fix: headless BI#

Push business logic and data integration into your lakehouse. Build a clean Silver layer. And build a clean Gold layer or data mart where the data is already modeled for consumption.

When your data mart is clean, native queries become unnecessary. Your BI tool is able to push down analytical queries automatically, adapting to whatever SQL dialect the underlying engine speaks. No custom SQL, no engine-specific syntax.

This unlocks true flexibility:

Switch query engines: Route consumption queries to the fastest, cheapest engine for the job. The BI tool adapts automatically.
Switch BI tools: Your reports aren’t tied to a specific front-end. Move from one front-end tool to another without rewriting queries.
Cost optimization: Freedom to benchmark engines and optimize without migration headaches.

This is headless BI: decouple your data logic from your visualization layer. Keep the BI layer thin. Let the data platform do the heavy lifting.

TIL: Rust's dbg! macro preserves expression value

2025-11-08T00:00:00+00:00

The dbg! macro in Rust not only prints debug output but also returns the value, making it perfect for inline debugging:

let result = dbg!(expensive_calculation());

This prints [src/main.rs:1] expensive_calculation() = 42 to stderr (recent toolchains also add the column number after the line number) and assigns 42 to result. No need to split the expression across separate lines for debugging.
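A slightly larger sketch of the same idea. The body of expensive_calculation and the helper doubled_plus_one are made up for illustration. Because dbg! evaluates to its argument, it can also wrap a sub-expression in the middle of a larger one:

```rust
// Stand-in for the function used above; the body is an assumption.
fn expensive_calculation() -> i32 {
    6 * 7
}

// dbg! evaluates to its argument, so it can sit inside a larger expression.
fn doubled_plus_one(x: i32) -> i32 {
    dbg!(x * 2) + 1
}

fn main() {
    // Logs the expression and its value to stderr, then assigns 42 to `result`.
    let result = dbg!(expensive_calculation());
    assert_eq!(result, 42);

    // Logs `x * 2 = 84` to stderr; the function still returns 85.
    assert_eq!(doubled_plus_one(result), 85);
}
```

One thing to keep in mind: dbg! takes its argument by value, so for types that are not Copy it is usually easier to pass a reference, as in dbg!(&value).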

tinypw: a tiny password generator in Rust

2025-10-04T00:00:00+00:00

I’m learning Rust. After weeks of reading the book and watching tutorials, I needed something hands-on. So I built tinypw, a minimal CLI tool to generate random passwords. In this post, I’ll share my experience and the lessons learned.

Why a password generator?#

I constantly need throwaway passwords. Testing signup flows, creating demo accounts, filling out forms. Since I always have a terminal open, a CLI tool makes more sense than switching to a browser or password manager.

The requirements were simple:

Generate random passwords of configurable length
Support different character sets (alphanumeric, symbols, etc.)
Show password strength estimation
Fast startup, no dependencies at runtime

Rust seemed like the perfect choice for this project.

What it does#

Run tinypw and get a password. Specify length with -l, character set with -c. Each generated password includes a strength indicator so you know if it’s suitable for your use case.

# Generate a 16-character password
tinypw -l 16

# Alphanumeric only
tinypw -l 12 -c alphanumeric

# Include symbols for maximum entropy
tinypw -l 24 -c all

The CLI is fast—Rust binaries start instantly compared to Python or Node scripts that need interpreter startup.
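For a rough idea of how a strength indicator can work, here is a minimal sketch based on entropy: a password of a given length drawn uniformly from an alphabet of N symbols carries length × log2(N) bits. The function names, labels, and thresholds below are assumptions for illustration and are not taken from tinypw’s source.

```rust
/// Estimated entropy in bits for a password of `length` characters drawn
/// uniformly at random from an alphabet of `charset_size` symbols.
fn entropy_bits(length: u32, charset_size: u32) -> f64 {
    // An alphabet with fewer than two symbols carries no information.
    if charset_size < 2 {
        return 0.0;
    }
    f64::from(length) * f64::from(charset_size).log2()
}

/// Hypothetical labels and thresholds, chosen only for this example.
fn strength_label(bits: f64) -> &'static str {
    if bits < 40.0 {
        "weak"
    } else if bits < 80.0 {
        "fair"
    } else {
        "strong"
    }
}

fn main() {
    // 26 lowercase + 26 uppercase + 10 digits = 62 symbols.
    let bits = entropy_bits(16, 62);
    println!("{bits:.1} bits -> {}", strength_label(bits));
}
```

The estimate only holds for randomly generated passwords. It says nothing about human-chosen ones, which is fine for a generator.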

Learning Rust along the way#

Building tinypw taught me more than any tutorial could:

The borrow checker: Rust’s compiler is a strict but well-meaning code reviewer. It caught several lifetime issues during compilation.
Error handling with Result: No exceptions, no nulls. Either you handle the error or the code won’t compile.
Cargo ecosystem: The tooling is excellent. cargo build, cargo test, cargo fmt—everything just works.
clap for CLI parsing: The clap crate makes argument parsing declarative and type-safe.

The compiler messages deserve special mention. When something fails, Rust tells you exactly why and often suggests the fix. It’s like pair programming with someone who has read the entire language spec.

Try it#

tinypw is open-source: github.com/marconae/tinypw.

If you’re learning Rust, I recommend building something small and practical. The language clicks faster when you’re solving a real problem.

SQLingual: one-click SQL dialect translation

2025-09-26T00:00:00+00:00

Migrating between databases means rewriting SQL. Every platform has its own date functions, string handling, and syntax quirks. What runs on Databricks fails on Exasol. What works in BigQuery breaks in Snowflake.

I built SQLingual (pronounced “es-qu-lingual”) to make switching engines easier.

In this post I’ll explain how it works and what I have learned along the way.

What it does#

SQLingual translates SQL queries between 30 different database dialects. Paste your source query, select target dialect, click transpile. The translation happens instantly in your browser.

The example above shows a TPC-H query—the standard benchmark for analytical databases—converted from Databricks SQL to Exasol SQL. Notice how table aliases, date literals, and join syntax adapt automatically.

How it works#

SQLingual is powered by sqlglot, an open-source SQL parser and transpiler maintained by Tobiko Data. sqlglot parses SQL into an abstract syntax tree, then generates syntactically correct output for the target dialect.

This isn’t regex find-and-replace. sqlglot understands SQL semantics. It handles:

Function mapping: DATE_TRUNC becomes TRUNC or DATE_PART depending on the target
Type conversion: Data types translate to their equivalents
Syntax differences: CTEs, window functions, and joins adapt to each dialect’s conventions
Identifier quoting: Backticks, double quotes, or brackets—whatever the target expects
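To make the "tree first, text second" idea concrete, here is a toy sketch in Rust. It is not sqlglot and does not mirror its API. The Dialect and Expr types and both functions are invented for illustration, and there is no parser: the tree is built by hand. It only shows why generating from a tree lets the same query come out with different identifier quoting per target.

```rust
// Toy illustration only: none of these types mirror sqlglot's actual API.
#[derive(Clone, Copy)]
enum Dialect {
    MySql,
    Postgres,
    TSql,
}

// A tiny expression tree, built by hand instead of by a parser.
enum Expr {
    Column(String),
    Number(i64),
    Add(Box<Expr>, Box<Expr>),
}

// Identifier quoting differs per dialect: backticks, double quotes, brackets.
// Escaping of quote characters inside the name is deliberately left out.
fn quote_identifier(name: &str, dialect: Dialect) -> String {
    match dialect {
        Dialect::MySql => format!("`{name}`"),
        Dialect::Postgres => format!("\"{name}\""),
        Dialect::TSql => format!("[{name}]"),
    }
}

// Walk the tree and emit SQL text for the chosen target dialect.
fn render(expr: &Expr, dialect: Dialect) -> String {
    match expr {
        Expr::Column(name) => quote_identifier(name, dialect),
        Expr::Number(n) => n.to_string(),
        Expr::Add(lhs, rhs) => {
            format!("{} + {}", render(lhs, dialect), render(rhs, dialect))
        }
    }
}

fn main() {
    let expr = Expr::Add(
        Box::new(Expr::Column("net price".into())),
        Box::new(Expr::Number(1)),
    );
    for dialect in [Dialect::MySql, Dialect::Postgres, Dialect::TSql] {
        println!("{}", render(&expr, dialect));
    }
}
```

A real transpiler adds the parsing step in front and many more node types, but the shape is the same: one tree, one generator per dialect.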

The app itself is built with Streamlit, which made it trivial to create an interactive UI without writing frontend code.

Supported dialects#

SQLingual supports translation between these databases:

Cloud warehouses: BigQuery, Snowflake, Redshift, Databricks, Synapse
Traditional databases: PostgreSQL, MySQL, Oracle, SQL Server, SQLite
Analytical engines: DuckDB, ClickHouse, Presto, Trino, Spark
And more: Exasol, Teradata, Hive, StarRocks, Doris, Drill

sqlglot is very actively maintained, so I expect it to support more dialects in the future.

When to use it#

SQLingual and sqlglot help with the following:

Database migrations: Moving from one platform to another
Multi-cloud architectures: Queries that need to run on different engines
Learning: Understanding how SQL differs across platforms
Quick translations: One-off conversions without setting up tooling

Try it#

SQLingual is a side project of mine. It’s free and open source:

App: sqlingual.streamlit.app
Source: github.com/marconae/sqlingual

TIL: tmux essentials for persistent terminal sessions

2025-09-12T00:00:00+00:00

tmux keeps your terminal sessions alive when you disconnect.

Install#

# macOS
brew install tmux

# Ubuntu/Debian
sudo apt install tmux

Core concepts#

Session: A collection of windows, persists after disconnect
Window: A full-screen tab within a session
Pane: A split within a window

Essential commands#

Key bindings inside tmux start with the prefix Ctrl+b, followed by a key. Session management commands such as tmux new and tmux ls are run from the shell.

Action	Command
New session	`tmux new -s myproject`
Detach	`Ctrl+b d`
List sessions	`tmux ls`
Attach	`tmux attach -t myproject`
Kill session	`tmux kill-session -t myproject`

Window management#

Action	Command
New window	`Ctrl+b c`
Next window	`Ctrl+b n`
Previous window	`Ctrl+b p`
Rename window	`Ctrl+b ,`

Pane splitting#

Action	Command
Split vertical	`Ctrl+b %`
Split horizontal	`Ctrl+b "`
Navigate panes	`Ctrl+b arrow`
Close pane	`Ctrl+d` or `exit`

Typical workflow#

# Start a named session
tmux new -s dev

# Run your process (e.g., a build server)
npm run dev

# Detach with Ctrl+b d
# Log out, go home, whatever

# Later, reattach
tmux attach -t dev
# Your process is still running

My Ghostty terminal setup

2025-08-18T00:00:00+00:00

Ghostty is a GPU-accelerated terminal written in Zig. It’s fast, native, and stays out of your way.

Here’s my configuration:

theme = 0x96f
background-blur-radius = 20

font-size = 13
font-family = MesloLGS Nerd Font Mono

link-url = true
mouse-hide-while-typing = true
window-decoration = true

shell-integration = zsh

What each setting does#

theme = 0x96f — A hex color theme. Ghostty ships with dozens of built-in themes. Run ghostty +list-themes to browse them.

background-blur-radius = 20 — Adds a subtle blur behind the terminal window. Pairs well with transparency, though I keep opacity at 100%.

font-family = MesloLGS Nerd Font Mono — A patched font with icons for Powerlevel10k and other terminal tools. Install via brew install --cask font-meslo-lg-nerd-font.

link-url = true — Makes URLs clickable. Cmd+click opens them in your browser.

mouse-hide-while-typing = true — The cursor disappears when you start typing. Small detail, big difference.

shell-integration = zsh — Enables Ghostty’s shell integration features: clickable prompts, semantic zones, and better scrollback.

Installation#

brew install --cask ghostty

Configuration lives at ~/.config/ghostty/config. No JSON, no YAML, just key-value pairs.

Why does quarterly planning usually fail?

2025-08-12T00:00:00+00:00

Many companies define their roadmaps for software projects on the basis of quarterly plans. Quarterly planning promises a structured path to achieving their goals, but how often does it truly deliver on that promise? This post dives into the mathematics of planning to understand why waterfall-style planning is often unsuccessful.

Consider the following scenario: Your team has 12 tasks to complete over the next quarter. You’re confident: there’s a 90% chance each task will be completed on time and at the desired quality. Sounds promising, right?

But here’s where it gets interesting. If we assume the success of each task is independent, the probability of all tasks succeeding is not 90% but dramatically lower. Compound probability explains why:

$0.9^{12} \approx 0.28 = 28\%$

The chance of all 12 tasks being completed successfully is 90% to the power of 12, or roughly 28%. This stark drop from 90% to 28% reveals a critical oversight in many planning processes:

The failure to account for the compound effect of risk across multiple tasks.

Why shorter cycles win#

Shorter planning cycles solve this problem of uncertainty. Instead of betting on a full quarter, commit to fewer tasks in a smaller timeframe.

Let’s say we commit to 3–4 tasks over two to four weeks. With four tasks, the same 90% confidence per task now looks like this:

$0.9^4 \approx 0.66 = 66\%$

Still not perfect, but 66% beats 28%! More importantly, you find out what went wrong early.
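The comparison is easy to reproduce. Here is a minimal sketch, assuming independent tasks that all share the same success probability; the function name all_succeed is made up for this example:

```rust
/// Probability that all `tasks` independent tasks succeed,
/// each with success probability `p`.
fn all_succeed(p: f64, tasks: u32) -> f64 {
    p.powi(tasks as i32)
}

fn main() {
    // One task, a short cycle of four tasks, and a full quarter of twelve.
    for tasks in [1, 4, 12] {
        println!("{tasks:>2} tasks: {:.1}%", all_succeed(0.9, tasks) * 100.0);
    }
}
```

Changing the 0.9 to your team’s own hit rate shows how quickly the odds fall as the batch grows. Real tasks are rarely fully independent, so treat the numbers as an illustration, not a forecast.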

Fail fast and adapt. A failed task surfaces quickly. You can re-scope, re-prioritize, or re-shape while there’s still runway. Quarterly plans bury problems until the final status report.

Compound learning, not risk. Each cycle sharpens your estimates. You learn which tasks take longer than expected, which dependencies introduce friction, and how much your team can realistically ship.

Rethinking the roadmap#

None of this means you should stop thinking beyond a cycle of two or four weeks. You still need a direction. Stakeholders still need visibility into what’s coming.

The fix isn’t abandoning long-term planning. It’s changing what you plan. Replace the fixed quarterly commitment with a prioritized backlog. That backlog is your roadmap. It shows where you’re headed without locking you into a fixed timeline that the odds are stacked against.

To burn through that backlog, pick an agile methodology that fits your team:

Scrum works well for teams that benefit from structured ceremonies and fixed-length sprints
Kanban suits teams with unpredictable work intake, a need for continuous flow, or many individual contributors
Shape Up offers six-week cycles with two-week cooldowns, giving more room for deep work while still forcing regular reassessment

The methodology matters less than the principle: commit to small batches, deliver frequently, fail fast, and adjust based on what you learn.
