AI Tinkerers San Salvador: Hand Tracking, Spec-Driven Development, and a Linux Kernel Patch

Building real-time applications with AI requires disciplined engineering, not just clever prompting. That was the core thesis of the June 2026 AI Tinkerers San Salvador meetup, sponsored by WebRTC.ventures and AgilityFeat.

In this post, we share key takeaways from the event. We started with a real-world application, an offline-first AI app for sign language translation, then dove into the mechanics of LLMs, the power of Spec-Driven Development, and a case study on using AI to contribute to the Linux Kernel. Read on to see how disciplined engineering, not “vibe”, is the true path to amplifying your technical output.

The Core Philosophy: AI-Amplified Engineering

As developers, we are currently living through a noisy industry transition. If you scroll through your feed, the dominant narrative is often a direct, polarizing clash: AI vs. Human Engineers. We hear about companies restructuring to replace talent with models, and we see marketing promising to build complex apps with a single sentence.

But is that really what is happening? Is AI a replacement, or is it something else entirely?

As the WebRTC Developer Advocate for WebRTC.ventures and host of the event, I wanted to address this head-on. The core thesis of our meetup was clear: AI is an amplifier of human capability, not a shortcut for engineering understanding.

If you feed a chaotic development workflow into an AI, it will simply amplify that chaos. But if you guide it with disciplined engineering principles, it amplifies your output to incredible heights. To prove this, we moved away from abstract slides and focused entirely on concrete, real-world implementations.

Act I: The Practical Proof – Offline Hand-Tracking with Mitanshi Asnani

To ground our thesis in reality, we started the meetup by handing the screen over to our guest speaker, Mitanshi Asnani, who joined us live from India right at her sunrise.

Mitanshi is working on a beautiful, high-impact project aiming to bridge communication barriers for the 1.5 billion people worldwide living with disabilities. Specifically, she is building an assistive system for deaf and mute individuals so they can have private, trusted conversations, such as with a doctor, without relying on human interpreters, who may not always be available and who would otherwise be privy to confidential medical data.

The flow for hand-tracking recognition in an offline-first AI application for sign language translation.

Instead of chasing a massive, computationally expensive machine learning model that required training on thousands of custom images, Mitanshi took a remarkably elegant engineering approach:

Lightweight Foundations: She utilized Google’s MediaPipe library locally on the client to track 21 hand landmarks in real-time.
Geometric Binary Mapping: On top of those landmarks, she built a custom, rules-based recognition layer. She mapped the state of a hand to a 5-digit binary pattern, one digit per finger (1 for open, 0 for closed). For example, five open fingers (11111) map to “Hello”, while leaving only two fingers open (01100) maps to “Thank you”.
Offline-First Resilience: Because network connectivity can be highly unstable, her geometry-based mapping runs entirely offline in the browser. Users don’t need active mobile data to communicate.
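The geometric mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not Mitanshi's actual code: the landmark indices follow MediaPipe's hand model, but the rule for deciding whether a finger is open (tip above its lower joint for an upright hand, plus a horizontal check for the thumb), the thumb-first digit order, and the names `finger_pattern`, `translate`, and `PHRASES` are assumptions made for the example.

```python
from typing import Optional, Sequence, Tuple

# (x, y) in normalized image coordinates; y grows downward.
Point = Tuple[float, float]

# MediaPipe hand landmark indices as (tip, lower joint), thumb to pinky.
FINGER_JOINTS = [(4, 3), (8, 6), (12, 10), (16, 14), (20, 18)]
PINKY_BASE = 17  # pinky MCP, used as a reference point for the thumb

# Hypothetical phrase table holding the two patterns mentioned in the talk.
PHRASES = {"11111": "Hello", "01100": "Thank you"}


def finger_pattern(landmarks: Sequence[Point]) -> str:
    """Return a 5-character string, thumb first, where '1' means open."""
    if len(landmarks) != 21:
        raise ValueError("expected 21 hand landmarks")
    bits = []
    for position, (tip, joint) in enumerate(FINGER_JOINTS):
        if position == 0:
            # Assumed thumb rule: open when the tip sits farther from the
            # pinky base horizontally than the thumb joint does.
            ref_x = landmarks[PINKY_BASE][0]
            is_open = abs(landmarks[tip][0] - ref_x) > abs(landmarks[joint][0] - ref_x)
        else:
            # Assumed rule for an upright hand: open when the tip is above
            # the lower joint (smaller y in image coordinates).
            is_open = landmarks[tip][1] < landmarks[joint][1]
        bits.append("1" if is_open else "0")
    return "".join(bits)


def translate(landmarks: Sequence[Point]) -> Optional[str]:
    """Look up the phrase for the current hand pose, or None if unknown."""
    return PHRASES.get(finger_pattern(landmarks))
```

Because the lookup is a plain dictionary over geometry computed on the client, nothing in this sketch needs a network connection, which matches the offline-first goal.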

During her demo, Mitanshi shared two key engineering lessons that highlighted the reality of building these systems:

The Temporal Threshold: To prevent micro-movements from triggering false positive translations, she introduced a threshold of 15 frames (~0.5 seconds). The user must hold the gesture consistently before it is translated.
Multithreading for Performance: Her initial prototype suffered from video freezing because the Text-to-Speech synthesis was running on the browser’s main execution thread. By offloading speech synthesis to a background thread (multithreading), she kept the live camera tracking perfectly fluid.
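The temporal threshold from the first lesson is essentially a debounce counter. The sketch below assumes one recognized pattern (or `None`) arrives per video frame; the class name `GestureDebouncer` and its interface are hypothetical, and only the 15-frame default comes from the talk.

```python
from typing import Optional


class GestureDebouncer:
    """Report a gesture only after it has been held for N consecutive frames."""

    def __init__(self, hold_frames: int = 15) -> None:
        self.hold_frames = hold_frames
        self._current: Optional[str] = None
        self._count = 0

    def update(self, pattern: Optional[str]) -> Optional[str]:
        """Feed one frame's pattern; return it once when the hold completes."""
        if pattern != self._current:
            # The gesture changed, so restart the count for the new pattern.
            self._current = pattern
            self._count = 1
        else:
            self._count += 1
        # Fire exactly once, on the frame that reaches the threshold.
        if pattern is not None and self._count == self.hold_frames:
            return pattern
        return None
```

At roughly 30 frames per second, the default of 15 frames corresponds to the half-second hold mentioned above. A brief flicker to another pattern resets the counter, so micro-movements never reach the threshold.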

Mitanshi’s project was the perfect opening proof: AI didn’t replace her engineering; her smart architectural choices and optimizations are what made the AI useful.
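The second lesson, keeping slow speech synthesis off the thread that processes camera frames, follows a common producer and consumer pattern. The post does not describe how this was done in the browser, so the Python sketch below is only an analogy: the `SpeechWorker` class and the `speak` callback are hypothetical stand-ins for whatever text-to-speech call the application uses.

```python
import queue
import threading
from typing import Callable, Optional


class SpeechWorker:
    """Run a slow speak() callback on a background thread."""

    def __init__(self, speak: Callable[[str], None]) -> None:
        self._speak = speak
        self._queue: "queue.Queue[Optional[str]]" = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def say(self, phrase: str) -> None:
        """Called from the frame loop; enqueues the phrase and returns at once."""
        self._queue.put(phrase)

    def close(self) -> None:
        """Stop the worker after pending phrases are spoken, then wait for it."""
        self._queue.put(None)
        self._thread.join()

    def _run(self) -> None:
        while True:
            phrase = self._queue.get()
            if phrase is None:  # sentinel pushed by close()
                break
            self._speak(phrase)
```

The frame loop only pays the cost of a queue insertion, so tracking stays fluid even when a phrase takes a second or more to speak.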

Act II: Demystifying the “Magic” of LLMs

With Mitanshi’s demo setting a high standard, I took the stage to address the core theory: How do Large Language Models actually work under the hood?

The truth is, there is no conscious “reasoning” happening. If we open the box, what we find is pure, applied mathematics and statistics.

An LLM is essentially a highly advanced text completion engine. It takes an input (a prompt), runs it through a tokenizer that breaks the text into numerical representations, and passes those numbers through a mathematical function defined by the neural network's weights.

The flow of LLM inference for understanding how LLMs actually work under the hood.

The output is a set of statistical probabilities indicating which token (a word, part of a word, or a character) is most likely to come next, based on the data the model was trained on.

To prove this to the meetup audience, we did a quick live experiment. I asked the room to act as an LLM and complete this prompt:

“La capital de El Salvador es…” (The capital of El Salvador is…)

Naturally, the room called out “San Salvador.” That is the most statistically probable answer in a geographical dataset.

However, if the context provided to our “human model” comes from local commuter and transit datasets, the mathematically most probable token to complete that sentence might actually be “congestionada” (congested)!

Both are grammatically valid completions, but their accuracy depends entirely on the context and training weights. Because “statistically probable” does not automatically mean correct, secure, or optimized, the human engineer’s judgment remains the ultimate filter.
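The room experiment can be mimicked with a toy softmax, the function that turns a model's raw scores (logits) into next-token probabilities. The scores below are invented purely for illustration and do not come from any real model; the names `GEOGRAPHY_CONTEXT`, `TRAFFIC_CONTEXT`, and `most_likely` are hypothetical.

```python
import math
from typing import Dict


def softmax(logits: Dict[str, float]) -> Dict[str, float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {token: math.exp(score - peak) for token, score in logits.items()}
    total = sum(exps.values())
    return {token: value / total for token, value in exps.items()}


def most_likely(logits: Dict[str, float]) -> str:
    """Pick the candidate with the highest probability (greedy decoding)."""
    probs = softmax(logits)
    return max(probs, key=probs.get)


# Made-up scores for completing "La capital de El Salvador es..."
GEOGRAPHY_CONTEXT = {"San Salvador": 6.0, "congestionada": 1.0, "hermosa": 2.0}
TRAFFIC_CONTEXT = {"San Salvador": 2.0, "congestionada": 5.5, "hermosa": 1.5}
```

Calling `most_likely` on each dictionary returns a different completion for the same prompt. The math is identical in both cases; only the scores, which stand in for context and training data, have changed.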

Act III: The Method – Spec-Driven Development

If human judgment is the ultimate filter, how do we stop ourselves from falling into the “trial-and-error copy-paste loop” with AI coding assistants? You know the loop: you ask for code, copy it, get an error, paste the error back, and get a new piece of code that breaks something else.

At the meetup, I introduced a cleaner approach: Spec-Driven Development (SDD) using Kiro, an AI-native IDE developed by Amazon.

Instead of letting the AI write code right away, SDD forces a structured three-step contract:

Requirements Gathering (EARS Framework): We use the Easy Approach to Requirements Syntax (EARS) to define exact criteria before writing code. Structure: When [trigger] occurs, the [system] shall [action].
Technical Design Validation: The AI generates an architectural specification, complete with system boundaries and diagrams. We, the humans, review and refine this design before any implementation begins.
Atomic Task Breakdown: The design is broken into small, independent, testable tasks that the AI can execute sequentially.
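To make the first step concrete, an event-driven EARS requirement ("When <trigger>, the <system> shall <response>") can be treated as structured data rather than free text. The sketch below is not how Kiro stores specs; the `EventRequirement` class and the sample requirement, borrowed from the hand-tracking demo, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventRequirement:
    """One event-driven requirement in EARS form."""

    trigger: str
    system: str
    action: str

    def render(self) -> str:
        """Produce the requirement as a single EARS sentence."""
        return f"When {self.trigger}, the {self.system} shall {self.action}."


# Hypothetical requirement for the sign language translator.
HOLD_REQUIREMENT = EventRequirement(
    trigger="a gesture is held for 15 consecutive frames",
    system="translator",
    action="speak the mapped phrase",
)
```

Writing requirements this way forces each one to name a trigger, a responsible system, and an observable action, which is what makes the later task breakdown testable.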

By focusing on what to build through rigid specifications, we prevent the AI from hallucinating wild architectures. The human acts as the architect; the AI acts as the apprentice executing the mechanical typing.

For more on this topic, you may be interested in my recent appearance on the Scaling Tech Podcast: Spec-Driven Creativity: AI Collaboration in Music and Software. I also blogged on the AgilityFeat site on How to Write Production-Ready Code with AI Using Spec-Driven Development: Beyond Vibe Coding.

Act IV: The Climax – Landing a Patch in the Linux Kernel

To prove that this workflow isn’t just for toy projects, I shared a personal victory with the audience: how I used this exact AI-amplified workflow to make my first-ever contribution to the Linux Kernel, one of the largest and most complex open-source projects in the world.

It all started with a frustrating gaming session. I bought a third-party Nintendo Switch Hori controller, but when I connected it to my Linux machine via Bluetooth, it wasn’t recognized.

I ran dmesg in the terminal to inspect the low-level system logs and found the controller was being ignored by the hid-nintendo driver due to its raw hardware signature. I suspected adapting the existing driver for this third-party “clone” would be straightforward. The problem? I hadn’t written low-level C code in years.

Using Kiro as my co-pilot, I didn’t blindly ask it to “make this controller work”. Instead, I used it to systematically explore the system:

It helped me write a tiny, isolated prototype driver to learn how Linux’s Human Interface Device (HID) layer works.
It explained low-level memory allocation, pointer operations, and how to format the specific 12-byte subcommands required to set the player light indicator on the controller.
It helped me dissect the existing, mainstream driver in the Linux codebase.

Together, we identified the four exact locations where changes were needed: adding the Hori hardware ID to the supported devices list, updating the initialization sequence, disabling unsupported features (like rumble feedback), and adjusting joystick calibration.

Once tested locally, the AI helped me navigate the strict submission process, guiding me through the use of the B4 tool to format and submit a clean patch to the Linux kernel mailing list. The patch was accepted and is scheduled for release in Linux Kernel v7.2!

The AI didn’t fix my controller. I had to understand the memory constraints, guide the prompts, and test draft code against the real device. But the AI amplified my speed, acting as an instant reference manual for a highly complex, unfamiliar system.

Closing the Loop: The Power of Community

The presentation wrapped up, but the energy in the room didn’t stop there: the Q&A session quickly turned into a collaborative brainstorming workshop. Participants explored how sign language models could be adapted to different regional dialects and discussed improving frame rates by offloading heavy hand-tracking computations to remote GPU servers.

Others proposed innovative pipelines to capture sign language from video streams for automated transcript generation, and we provided initial steps for an attendee to troubleshoot Linux driver issues with their Sony XM5 headphones.

This is exactly why we host AI Tinkerers. Building high-performance real-time applications, whether it’s low-latency video streaming or secure AI integrations, requires deep engineering rigor. That is the work we love doing every day at WebRTC.ventures.

Thank you to everyone who joined us, asked tough questions, and showed that the future of engineering is safe in human hands, even if those hands are holding an AI-powered keyboard!
