Vibe Coding, Vibe Checking, and Vibe Blogging

For the past decade and a half, I’ve been exploring the intersection of technology, education, and design as a professor of cognitive science and design at UC San Diego. Some of you might have read my recent piece for O’Reilly Radar where I detailed my journey adding AI chat capabilities to Python Tutor, the free visualization tool that’s helped millions of programming students understand how code executes. That experience got me thinking about my evolving relationship with generative AI as both a tool and a collaborator.
I’ve been intrigued by this emerging practice called “vibe coding,” a term coined by Andrej Karpathy that’s been making waves in tech circles. Simon Willison describes it perfectly: “When I talk about vibe coding I mean building software with an LLM without reviewing the code it writes.” The concept is both liberating and slightly terrifying—you describe what you need, the AI generates the code, and you simply run it without scrutinizing each line, trusting the overall “vibe” of what’s been created.
My relationship with this approach has evolved considerably. In my early days of using AI coding assistants, I was that person who meticulously reviewed every single line, often rewriting significant portions. But as these tools have improved, I’ve found myself gradually letting go of the steering wheel in certain contexts. Yet I couldn’t fully embrace the pure “vibe coding” philosophy; the professor in me needed some quality assurance. This led me to develop what I’ve come to call “vibe checks”—strategic verification points that provide confidence without reverting to line-by-line code reviews. It’s a middle path that’s worked surprisingly well for my personal projects, and today I want to share some insights from that journey.
Vibe Coding in Practice: Converting 250 HTML Files to Markdown
I’ve found myself increasingly turning to vibe coding for those one-off scripts that solve specific problems in my workflow. These are typically tasks where explaining my intent is actually easier than writing the code myself, especially for data processing or file manipulation jobs where I can easily verify the results.
Let me walk you through a recent example that perfectly illustrates this approach. For a class I teach, I had students submit responses to a survey using a proprietary web app that provided an HTML export option. This left me with 250 HTML files containing valuable student feedback, but it was buried in a mess of unnecessary markup and styling code. What I really wanted was clean Markdown versions that preserved just the text content, section headers, and—critically—any hyperlinks students had included in their responses.
Rather than writing this conversion script myself, I turned to Claude with a straightforward request: “Write me a Python script that converts these HTML files to Markdown, preserving text, basic formatting, and hyperlinks.” Claude suggested using the BeautifulSoup library (a solid choice) and generated a complete script that would process all files in a directory, creating a corresponding Markdown file for each HTML source.
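For the curious, here's a minimal sketch of the kind of script this request produces, assuming BeautifulSoup and a flat directory of .html files. The code Claude actually generated was longer and handled more cases, so treat the function name and file-handling choices below as my illustrative assumptions, not the original output:

```python
from pathlib import Path
from bs4 import BeautifulSoup

def html_to_markdown(html: str) -> str:
    """Convert one HTML document to rough Markdown: headers, paragraphs, links."""
    soup = BeautifulSoup(html, "html.parser")
    # Rewrite hyperlinks in place as Markdown so they survive text extraction.
    for a in soup.find_all("a", href=True):
        a.replace_with(f"[{a.get_text(strip=True)}]({a['href']})")
    blocks = []
    for el in soup.find_all(["h1", "h2", "h3", "p"]):
        text = el.get_text(" ", strip=True)
        if text:
            prefix = "#" * int(el.name[1]) + " " if el.name.startswith("h") else ""
            blocks.append(prefix + text)
    return "\n\n".join(blocks)

# Process every .html file in the directory, writing a sibling .md file.
for html_path in Path(".").glob("*.html"):
    md_path = html_path.with_suffix(".md")
    md_path.write_text(html_to_markdown(html_path.read_text(encoding="utf-8")),
                       encoding="utf-8")
```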
(In retrospect, I realized I probably could have used Pandoc for this conversion task. But in the spirit of vibe coding, I just went with Claude’s suggestion without overthinking it. Part of the appeal of vibe coding is bypassing that research phase where you compare different approaches—you just describe what you want and roll with what you get.)
True to the vibe coding philosophy, I didn’t review the generated code line by line. I simply saved it as a Python file, ran it on my directory of 250 HTML files, and waited to see what happened. This “run and see” approach is what makes vibe coding both liberating and slightly nerve-wracking—you’re trusting the AI’s interpretation of your needs without verifying the implementation details.
Trust and Risk in Vibe Coding: Running Unreviewed Code
The moment I hit “run” on that vibe-coded script, I realized something that might make many developers cringe: I was executing completely unreviewed code on my actual computer with real data. In traditional software development, this would be considered reckless at best. But the dynamics of trust feel different with modern AI tools like Claude 3.7 Sonnet, which has built up a reputation for generating reasonably safe and functional code.
My rationalization was partly based on the script’s limited scope. It was just reading HTML files and creating new Markdown files alongside them—not deleting, modifying existing files, or sending data over the network. Of course, that’s assuming the code did exactly what I asked and nothing more! I had no guarantees that it didn’t include some unexpected behavior since I hadn’t looked at a single line.
This highlights a trust relationship that’s evolving between developers and AI coding tools. I’m much more willing to vibe code with Claude or ChatGPT than I would be with an unknown AI tool from some obscure website. These established tools have reputations to maintain, and their parent companies have strong incentives to prevent their systems from generating malicious code.
That said, I’d love to see operating systems develop a “restricted execution mode” specifically designed for vibe coding scenarios. Imagine being able to specify: “Run this Python script, but only allow it to CREATE new files in this specific directory, prevent it from overwriting existing files, and block internet access.” This lightweight sandboxing would provide peace of mind without sacrificing convenience. (I mention only restricting writes rather than reads because Python scripts typically need to read various system files from across the filesystem, making read restrictions impractical.)
Why not just use VMs, containers, or cloud services? Because for personal-scale projects, the convenience of working directly on my own machine is hard to beat. Setting up Docker or uploading 250 HTML files to some cloud service introduces friction that defeats the purpose of quick, convenient vibe coding. What I want is to maintain that convenience while adding just enough safety guardrails.
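Until an OS-level feature like that exists, you can approximate a small piece of it in userland. Below is a rough sketch of a wrapper that blocks writes outside a chosen directory and refuses to overwrite existing files. To be clear, this is my own illustration, not a real sandbox or any shipping OS feature, and every name in it is hypothetical. A deliberately malicious script could trivially bypass it (via os.open, subprocesses, etc.); it only guards against honest mistakes:

```python
# guarded_run.py -- a rough userland approximation of the "create-only"
# guardrail described above. NOT real sandboxing: it cannot stop malicious
# code, only accidental overwrites by honest scripts.
#   usage: python guarded_run.py untrusted_script.py /path/to/output_dir
import builtins
import os
import sys

ALLOWED_DIR = os.path.abspath(sys.argv[2])
_real_open = builtins.open

def guarded_open(file, mode="r", *args, **kwargs):
    path = os.path.abspath(os.fspath(file))
    if any(flag in mode for flag in ("w", "a", "x", "+")):
        if not path.startswith(ALLOWED_DIR + os.sep):
            raise PermissionError(f"write outside {ALLOWED_DIR} blocked: {path}")
        if os.path.exists(path):
            raise PermissionError(f"overwrite of existing file blocked: {path}")
    return _real_open(file, mode, *args, **kwargs)

builtins.open = guarded_open  # reads stay unrestricted, as argued above

# Run the untrusted script under the patched open().
with _real_open(sys.argv[1], encoding="utf-8") as f:
    source = f.read()
exec(compile(source, sys.argv[1], "exec"), {"__name__": "__main__"})
```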
Vibe Checks: Simple Scripts to Verify AI-Generated Code
OK, now come the “vibe checks.” As I mentioned earlier, the nice thing about these personal data processing tasks is that I can often get a sense of whether the script did what I intended just by examining the output. For my HTML-to-Markdown conversion, I could open up several of the resulting Markdown files and see if they contained the survey responses I expected. This manual spot-checking works reasonably well for 250 files, but what about 2,500 or 25,000? At that scale, I’d need something more systematic.
This is where vibe checks come into play. A vibe check is essentially a simpler script that verifies a basic property of the output from your vibe-coded script. The key here is that it should be much simpler than the original task, making it easier to verify its correctness.
For my HTML-to-Markdown conversion project, I realized I could use a straightforward principle: Markdown files should be smaller than their HTML counterparts since we’re stripping away all the tags. But if a Markdown file is dramatically smaller—say, less than 40% of the original HTML size—that might indicate incomplete processing or content loss.
So I went back to Claude and vibe coded a check script (a sketch of which appears after this list). This script simply:
- Found all corresponding HTML/Markdown file pairs
- Calculated the size ratio for each pair
- Flagged any Markdown file smaller than 40% of its HTML source
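Here's a minimal sketch along those lines, assuming the .md files sit next to their .html sources in one directory. The 40% threshold comes from my setup above; the exact code Claude wrote differed:

```python
from pathlib import Path

RATIO_THRESHOLD = 0.40  # Markdown below 40% of its HTML source looks suspicious

for html_path in sorted(Path(".").glob("*.html")):
    md_path = html_path.with_suffix(".md")
    if not md_path.exists():
        print(f"MISSING: no Markdown output for {html_path.name}")
        continue
    ratio = md_path.stat().st_size / html_path.stat().st_size
    if ratio < RATIO_THRESHOLD:
        print(f"SUSPICIOUS: {md_path.name} is only {ratio:.0%} "
              f"the size of {html_path.name}")
```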
And lo and behold, the vibe check caught several files where the conversion was incomplete! The original script had failed to properly extract content from certain HTML structures. I took these problematic files, went back to Claude, and had it refine the original conversion script to handle these edge cases.
After a few iterations of this feedback loop—convert, check, identify issues, refine—I eventually reached a point where there were no more suspiciously small Markdown files (well, there were still a few below 40%, but manual inspection confirmed these were correct conversions of HTML files with unusually high markup-to-content ratios).
Now you might reasonably ask: “If you’re vibe coding the vibe check script too, how do you know that script is correct?” Would you need a vibe check for your vibe check? And then a vibe check for that check? Well, thankfully, this recursive nightmare has a practical solution. The vibe check script is typically an order of magnitude simpler than the original task—in my case, just comparing file sizes rather than parsing complex HTML. This simplicity made it feasible for me to manually review and verify the vibe check code, even while avoiding reviewing the more complex original script.
Of course, my file size ratio check isn’t perfect. It can’t tell me if the content was converted with the proper formatting or if all hyperlinks were preserved correctly. But it gave me a reasonable confidence that no major content was missing, which was my primary concern.
Vibe Coding + Vibe Checking: A Pragmatic Middle Ground
The take-home message here is simple but powerful: when you’re vibe coding, always build in vibe checks. Ask yourself: “What simpler script could verify the correctness of my main vibe-coded solution?” Even an imperfect verification mechanism dramatically increases your confidence in results from code you never actually reviewed.
This approach strikes a nice balance between the speed and creative flow of pure vibe coding and the reliability of more rigorous software development methodologies. Think of vibe checks as lightweight tests—not the comprehensive test suites you’d write for production code, but enough verification to catch obvious failures without disrupting your momentum.
What excites me about the future is the potential for AI coding tools to suggest appropriate vibe checks automatically. Imagine if Claude or similar tools could not only generate your requested script but also proactively offer: “Here’s a simple verification script you might want to run afterward to ensure everything worked as expected.” I suspect if I had specifically asked for this, Claude could have suggested the file size comparison check, but having this built into the system’s default behavior would be incredibly valuable. I can envision specialized AI coding assistants that operate in a semi-autonomous mode—writing code, generating appropriate checks, running those checks, and involving you only when human verification is truly needed.
Combine this with the kind of sandboxed execution environment I mentioned earlier, and you’d have a vibe coding experience that’s both freeing and trustworthy—powerful enough for real work but with guardrails that prevent catastrophic mistakes.
And now for the meta twist: This entire blog post was itself the product of “vibe blogging.” At the start of our collaboration, I uploaded my previous O’Reilly article, “Using Generative AI to Build Generative AI,” as a reference document. This gave Claude the opportunity to analyze my writing style, tone, and typical structure—much like how a human collaborator might read my previous work before helping me write something new.
Instead of writing the entire post in one go, I broke it down into sections and provided Claude with an outline for each section one at a time. For every section, I included key points I wanted to cover and sometimes specific phrasings or concepts to include. Claude then expanded these outlines into fully formed sections written in my voice. After each section was drafted, I reviewed it—my own version of a “vibe check”—providing feedback and requesting revisions until it matched what I wanted to say and how I wanted to say it.
This iterative, section-by-section approach mirrors the vibe coding methodology I’ve discussed throughout this post. I didn’t need to write every sentence myself, but I maintained control over the direction, messaging, and final approval. The AI handled the execution details based on my high-level guidance, and I performed verification checks at strategic points rather than micromanaging every word.
What’s particularly interesting is how this process demonstrates the same principles of trust, verification, and iteration that I advocated for in vibe coding. I trusted Claude to generate content in my style based on my outlines, but I verified each section before moving to the next. When something didn’t quite match my intent or tone, we iterated until it did. This balanced approach—leveraging AI capabilities while maintaining human oversight—seems to be the sweet spot for collaborative creation, whether you’re generating code or content.
Epilogue: Behind the Scenes with Claude
[Claude speaking]
Looking back at our vibe blogging experiment, I should acknowledge that Philip noted the final product doesn’t fully capture his authentic voice, despite having his O’Reilly article as a reference. But in keeping with the vibe philosophy itself, he chose not to invest excessive time in endless refinements—accepting good-enough rather than perfect.
Working section-by-section without seeing the full structure upfront created challenges, similar to painting parts of a mural without seeing the complete design. I initially fell into the trap of copying his outline verbatim rather than transforming it properly.
This collaboration highlights both the utility and limitations of AI-assisted content creation. I can approximate writing styles and expand outlines but still lack the lived experience that gives human writing its authentic voice. The best results came when Philip provided clear direction and feedback.
The meta-example perfectly illustrates the core thesis: Generative AI works best when paired with human guidance, finding the right balance between automation and oversight. “Vibe blogging” has value for drafts and outlines, but like “vibe coding,” some form of human verification remains essential to ensure the final product truly represents what you want to say.
[Philip speaking so that humans get the final word…for now]
OK, this is the only part that I wrote by hand: My parting thought when reading over this post is that I’m not proud of the writing quality (sorry Claude!), but if it weren’t for an AI tool like Claude, I would not have written it in the first place due to lack of time and energy. I had enough energy today to outline some rough ideas, then let Claude do the “vibe blogging” for me, but not enough to fully write, edit, and fret over the wording of a full 2,500-word blog post all by myself. Thus, just like with vibe coding, one of the great joys of “vibe-ing” is that it greatly lowers the activation energy of getting started on creative personal-scale prototypes and tinkering-style projects. To me, that’s pretty inspiring.