LLM Watch: Vibe Coding 101

Vibe Coding 404: How Not to Give Your Secrets Away

Pascal Biese — Wed, 09 Apr 2025 16:20:24 GMT

So, you've started “vibe coding” – building an app or website with the help of AI tools (Replit, Lovable, Cursor, etc.) – and everything is going great. You’re piling up features, the AI is handling the heavy lifting, and you’re feeling like nothing can stop you. Security might be the last thing on your mind. After all, you're just prototyping, right?

But here’s the deal: even quick projects can run into big trouble if you accidentally expose sensitive data or overlook basic security steps. Imagine waking up to find your database emptied by a stranger, or an unexpected $5,000 bill because someone “borrowed” your API key! The good news is you don’t need to be a security expert to avoid most of these nightmares. A few simple habits will keep your project safe and keep you confidently building.

“Security is my vibe!”
- Dylan (35), aspiring vibe coder, after learning the hard way.

This beginner-friendly guide will walk you through data security (keeping your keys, secrets, and user data safe) and a bit of code security (writing code that doesn’t open the door to attackers). We’ll keep it conversational and practical – no fancy tech, just real talk on why it matters and how to stay safe. Let’s dive in!

Why Security Matters for Vibe Coders

You might be thinking, "I'm just a solo builder hacking something together. Do I really need to worry about security?" The answer is yes, and here’s why:

Protect Your Wallet: Many AI-based services (like OpenAI’s API) charge money per use. If your secret API key leaks, someone could use it to rack up charges on your account. There are real stories of developers getting hit with huge bills because attackers found their keys. (One attacker, for example, reported finding over 1,000 OpenAI API keys by scanning public Replit projects.)
Protect Your Data (and Your Users’ Trust): If you accidentally leave a database or storage bucket open, bad actors can steal or delete data. In one case, 900+ websites using Firebase (a popular online database) misconfigured their security and exposed 125 million records, including emails, passwords, and billing info, to the public. Imagine explaining to your users (or your boss) that personal data got leaked – not fun.
Stay Up and Running: Security flaws can get your app hijacked. An exposed webhook or an insecure piece of code can let someone else control parts of your app or knock it offline. If your prototype suddenly breaks because of an attack, that’s time lost and a major vibe check on your motivation.

In short, a few careless mistakes can derail your project or cost you big time. On the flip side, a little care with security means you can keep the good vibes going – your app stays safe, your bills stay sane, and you build with peace of mind. Now, let’s get into the specific things you should watch out for and how to handle them.

Data Security Essentials (Keep Your Secrets Safe)

“Data security” might sound heavy, but here we’re mostly talking about keeping secrets secret and not exposing things that shouldn’t be public. As a vibe coder, you deal with things like API keys, database URLs, or webhook URLs – these are the keys to your kingdom. Let’s go through the must-knows one by one.

API Keys and Secrets: Handle with Care

What they are: API keys, secret tokens, database credentials – think of these as the passwords that grant access to services. For example, an OpenAI API key lets whoever has it use your OpenAI account (and spend your money), and a database URL with a password could let someone read or write all your data. In short, they are high-value targets.

Why you should care: If an API key or secret token gets leaked, anyone can use it as if they were you. OpenAI explicitly warns developers: “Remember that your API key is a secret! Do not share it or expose it in any client-side code (browsers, apps)”. If a key is exposed, strangers can start running up your usage or fiddling with your data. For instance, one group of attackers scraped public code repositories and found hundreds of leaked OpenAI keys – then used those keys to give themselves free access to GPT-4, charging the victims’ accounts. Some unlucky devs have been hit with thousands of dollars in charges because of this kind of mistake.
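One common habit that helps here is to keep the key out of your code entirely and read it from an environment variable (or your platform's secrets manager) when the app starts. The snippet below is a minimal sketch of that idea, not an official SDK pattern: the variable name OPENAI_API_KEY and the helpers load_api_key and mask are assumptions chosen for illustration.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not set in the environment."""


def load_api_key(name: str = "OPENAI_API_KEY") -> str:
    """Read a secret from the environment instead of hardcoding it.

    The default variable name is an assumption for this example.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        raise MissingSecretError(
            f"{name} is not set. Put it in your environment or your "
            "platform's secrets manager, never in the source code."
        )
    return value


def mask(secret: str, visible: int = 4) -> str:
    """Return a version of the secret that is safe to print in logs."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Calling load_api_key() on the server side and only ever printing mask(key) keeps the real value out of your repository, your logs, and any code that ships to the browser.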

Don't Believe the Vibe: Best Practices for Coding with AI Agents

Pascal Biese — Wed, 02 Apr 2025 16:04:34 GMT

Left: Without Solid Engineering Practices. Right: With Solid Engineering Practices.

AI coding assistants have evolved into powerful “pair programmers,” accelerating development of software projects - if used carefully. This article provides a closer look at four leading AI-powered development tools – Cursor, Windsurf, Cline, and Roo Code – comparing their strengths, features, and ideal use cases.

We’ll also explore best practices for effective workflows, and tips on debugging, refactoring, documentation, collaboration, and CI/CD integration with AI assistance.

Overview of AI Coding Agents

Modern AI coding agents vary in form: some are standalone AI-enhanced IDEs, others are extensions that integrate into popular editors. All four tools in focus aim to boost productivity by understanding code context, generating code, refactoring, and even running tests or commands on your behalf.

Here’s a brief introduction to each:

Cursor – A standalone AI code editor (forked from VS Code) that tightly integrates a conversational agent into the coding workflow. Cursor offers multi-line intelligent autocompletion and an “agentic” mode to execute larger tasks semi-autonomously. It is known for strong codebase understanding and rapid improvements, albeit with a usage-based pricing model.

Windsurf – An AI-first IDE by Codeium, positioned as the “first agentic IDE.” It leverages Cascade technology for deep project-wide context awareness and multi-file coherent edits. Windsurf provides advanced features like Supercomplete (intent-based code completion), AI-assisted command execution, and persistent “Memories” for context. It offers a generous free tier (with optional Pro plan).

Cline – An open-source AI coding assistant extension for VS Code, designed to plan and execute development tasks collaboratively. Cline operates in dual modes: Plan Mode (where it gathers context, discusses architecture, and drafts solutions without changing code) and Act Mode (where it implements the agreed plan in code). Cline emphasizes a step-by-step approach with user oversight and privacy (data never leaves your environment by default).

Roo Code – Originally born from Cline (previously “Roo Cline”), Roo Code is a VS Code extension that pushes autonomous coding further. It mimics a junior developer’s workflow by cyclically planning, coding, running, and debugging with minimal intervention. Roo Code introduces multiple personas/modes (e.g. Code mode, Architect mode, QA mode) that tailor the AI’s behavior to different tasks, along with auto-approval options for certain actions. It remains free and open-source, requiring you to connect your own AI model backends (OpenAI API, local models, etc.), giving developers flexibility in choosing the AI model.

With how many regular updates there are for all of these tools, it’s hard to give a clear recommendation. While I would draw a line between the “AI-first” IDEs (Cursor and Windsurf) and Cline/Roo Code (which feel more like natural language command line tools), the differences between each of those are hard to quantify.

I personally find it to be more of a preference than anything else. If you don’t want to switch between tools and try out every new update, I would suggest simply picking one or two of them and sticking with whatever you feel works best for you. Next, we compare these tools in detail to highlight what each excels at and where each might fall short.

Detailed Comparison of Cursor, Windsurf, Cline, and Roo Code

Tool Origins and Ecosystem: Cursor and Windsurf are full-fledged AI-integrated code editors (forked from VS Code), while Cline and Roo Code are extensions that run within your IDE. This means if you prefer to stick with VS Code and its extensions, Cline/Roo Code might slot into your existing setup more easily. Cursor and Windsurf, on the other hand, offer a self-contained IDE experience. Windsurf was built by the team behind Codeium, inheriting Codeium’s AI autocomplete engine; Cursor is an independent product by AnySphere; Cline and Roo Code are community-driven open-source projects with strong user communities (Cline notably surpassed 1 million installs).

AI Capabilities and Autonomy: All four agents can generate and edit code based on natural language instructions, but their levels of autonomy and context-handling differ. Cursor introduced the concept of an “agentic” code editor – it can act on your behalf to perform tasks like creating new files or refactoring multiple modules once you approve a plan. Windsurf similarly brands itself as an agentic IDE, with AI Flows (agents + copilots) that maintain real-time awareness of your actions. Cline explicitly separates planning from execution, requiring a human go-ahead to move from architectural discussion to code changes – a design that enforces human-in-the-loop control for safety. Roo Code leans toward higher autonomy: you can configure it to auto-approve routine edits or command executions in a “hands-off” mode. In practice, Cursor and Windsurf feel more proactive in suggesting next steps continuously (Cursor’s interface even makes the AI chat occupy half the editor pane by design), whereas Cline emphasizes a deliberate two-phase workflow and Roo Code offers adaptive autonomy settings (manual vs. hybrid vs. auto) to suit your comfort level.
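To make the manual vs. hybrid vs. auto distinction concrete, here is a rough sketch of an approval gate. It does not reflect how Roo Code or any of the other tools implement this internally: the Autonomy enum, the Action class, and the set of "low-risk" action kinds are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Autonomy(Enum):
    MANUAL = "manual"  # every action needs a human yes
    HYBRID = "hybrid"  # low-risk actions run, risky ones need approval
    AUTO = "auto"      # everything runs without asking


@dataclass(frozen=True)
class Action:
    kind: str          # e.g. "read_file", "edit_file", "run_command"
    description: str


# Hypothetical choice of what counts as low-risk in hybrid mode.
LOW_RISK_KINDS = {"read_file", "search"}


def needs_approval(action: Action, level: Autonomy) -> bool:
    """Decide whether a human must confirm the action before it runs."""
    if level is Autonomy.AUTO:
        return False
    if level is Autonomy.MANUAL:
        return True
    # Hybrid: only actions outside the low-risk set require a go-ahead.
    return action.kind not in LOW_RISK_KINDS
```

The design point is that the riskier the action (editing files, running shell commands), the more you want a human checkpoint in front of it, whichever tool you end up using.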

Context Awareness: A key strength of these tools is understanding your codebase context to provide relevant suggestions. Windsurf arguably leads here with its proprietary Context Engine – it deeply indexes your entire project and keeps a “memory” of your code, enabling coherent multi-file edits and informed completions even on large production codebases. It also provides an Indexing Engine for semantic code search and references beyond the open files. Cursor also indexes your project (using embeddings for context) and will automatically include relevant file references when you query it, though some reviews suggest its multi-file support is more basic compared to Windsurf’s advanced approach. Cline and Roo Code both allow the AI to read multiple files and even entire folders on command (via instructions like @file or @folder to inject content into the conversation). Roo Code’s Context Mentions and persistent session state mean it can carry knowledge across multiple prompts in a coding session.
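As a rough illustration of what a file mention does conceptually, the sketch below scans a prompt for mentions and appends the referenced file contents before the prompt would be sent to a model. The "@file:path" syntax, the expand_mentions function, and the character limit are assumptions made for this example; the real tools have their own syntax and handling.

```python
import re
from pathlib import Path

# Hypothetical mention syntax for this sketch: "@file:relative/path.py"
MENTION = re.compile(r"@file:(\S+)")


def expand_mentions(prompt: str, root: Path, max_chars: int = 2000) -> str:
    """Append the content of each mentioned file to the prompt."""
    root = root.resolve()
    sections = []
    for rel_path in MENTION.findall(prompt):
        target = (root / rel_path).resolve()
        # Skip paths that escape the project root or do not exist.
        if not target.is_relative_to(root) or not target.is_file():
            continue
        text = target.read_text(encoding="utf-8", errors="replace")
        # Truncate so one large file cannot crowd out the rest of the context.
        sections.append(f"--- {rel_path} ---\n{text[:max_chars]}")
    return "\n\n".join([prompt, *sections])
```

Note the check against the project root: anything that injects file contents into a prompt should be careful not to pull in files from outside the project, such as secrets stored elsewhere on disk.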

All four tools strive to “know” your code – for example, Cursor will not only complete code but also auto-import symbols it suggests if they aren’t already imported. Windsurf’s Memories feature further allows explicit or automatic rules to persist (such as remembering project-specific conventions or API keys across sessions), which can be very useful in long-running ML experiments or complex API projects. Keep in mind that these things can change very quickly - so don’t stress too much over the details. If a feature is really successful for one of those tools, it usually gets adopted by the others sooner rather than later.

Pricing and Access: All four have free options, but with different models. Roo Code is entirely free/open-source (you just might pay for API usage if using a paid model like OpenAI). Cline is open-source but provides a cloud service for model access – it offers some free credits and then a subscription for heavier use (e.g., a $20/month plan for generous usage, and the ability to connect to enterprise model endpoints). Windsurf can be used fully free with a Codeium account; a Pro plan (~$10/month) unlocks larger models and more Cascade credits (for longer AI sessions). Cursor uses usage-based pricing – there’s a base subscription (around $20/month) which includes a certain amount of AI compute, and you can pay as you go beyond that for heavy usage. In enterprise settings, Cursor also offers volume licenses and on-prem privacy options (SOC 2 compliance, etc.), whereas Windsurf and Cline appeal to both individual devs and companies by allowing self-hosted or private model endpoints.

Recommended Workflows and Best Practices

Integrating AI coding agents into your development workflow requires some strategy to get the best results. Below are best practices for using these tools effectively, especially when working on larger projects. These guidelines will help you leverage each tool while maintaining good software engineering discipline.

From Code Assistants to Agents: Introduction to AI Coding

Pascal Biese — Tue, 01 Apr 2025 20:01:42 GMT

Over the last two years, we've witnessed remarkable advancements in AI-powered coding tools—from simple autocomplete features to sophisticated code generation capabilities. These tools have rapidly transformed from experimental curiosities to essential components of modern development workflows.

In this article, we'll examine a fundamental shift occurring in this space: the rise of agentic code assistance tools that go beyond basic code completion to offer autonomous planning, coding, debugging, and even deployment capabilities. This represents a significant advancement over first-generation AI coding tools that were primarily reactive, context-aware suggestion engines.

What we'll cover in this article:

The evolution from code completion to agentic assistance
Leading tools in the agentic code assistance ecosystem
Technical capabilities and architecture of modern coding agents
Performance limitations in professional development contexts
Best practices for maintaining code quality

Let's dive into how these tools are reshaping professional software development.

1. From Code Completion to Agentic Assistance

The first generation of AI coding tools like early versions of GitHub Copilot primarily focused on autocomplete-style functionality—suggesting the next line or block of code based on what you were typing. While revolutionary at the time, these tools were fundamentally reactive, responding only to immediate user input and context.

Today's agentic code assistance tools represent a significant paradigm shift. They exhibit greater autonomy and can engage in complex tasks like:

Planning code architecture before implementation
Debugging errors through multi-step reasoning
Refactoring existing code across multiple files
Generating test suites with comprehensive coverage
Deploying software with minimal human intervention

This evolution is driven by three key technical advancements:

1. More sophisticated language models: As foundation models are constantly getting better, so are the coding assistants that - at their core - rely on them.

2. Multi-step reasoning capabilities: Rather than generating single suggestions, modern agents can plan and execute complex sequences of actions, evaluating their success and adapting accordingly.

3. Deeper integration with development environments: Today's tools have access to more context—not just the current file but project structure, version control history, and even runtime information.

The shift from reactive tools to autonomous agents mirrors the progression we've seen in other AI applications, and it's fundamentally changing how developers approach their work.

2. Leading Tools in the Agentic Code Assistance Ecosystem

There’s a plethora of coding tools available in the current market - more than anyone would ever need. And with that many tools competing for the limited attention of developers, it’s getting hard to keep up. For the sake of the reader’s sanity, I will only list the ones that have remained popular for at least a few months now. Which one of these is regarded as “the best” can change very quickly. So my advice would be to choose one or two of them and try things out until you’ve gotten familiar with the workflows.

GitHub Copilot

Originally focused on code completion, GitHub Copilot has evolved to include more agentic features through Copilot Chat. Built on OpenAI's models and deeply integrated with the GitHub ecosystem, it can now assist with code explanation, test generation, and code review. Its key technical strength is its training on millions of repositories in GitHub's vast codebase.

Cursor and Windsurf

Both tools take an IDE-centric approach, with Cursor building on VS Code and Windsurf creating its own editor environment. What makes them technically distinct is their deep contextual understanding of codebases and their ability to modify code across multiple files while maintaining project coherence.

Cline and Roo Code

Cline lets you connect to a wide range of models including Claude, GPT-4, and Llama to deliver code assistance directly in your development environment (e.g., VS Code or Cursor). It focuses on a clean interface that simplifies prompting and interaction.

Roo Code (formerly Roo Cline) builds upon Cline's foundation while adding further features. This fork maintains the same clean interface but offers expanded capabilities including multi-model support and other experimental features.

Augment Code

Augment Code’s technical strength is its ability to understand and manipulate code across multiple files while maintaining consistency and coherence throughout the project. It’s currently still in Beta and might be unstable at times. But it’s one of the few services that offer a paid tier with unlimited consumption.

3. Technical Capabilities and Architecture

The most advanced agentic code assistance tools share several key architectural components that enable their functionality:

Multi-Agent Collaboration

Rather than relying on a single monolithic agent, these tools use multi-agent architectures where specialized agents collaborate to complete complex tasks. This approach mirrors human team dynamics, with different agents taking on specialized roles like:

Planning agents that break down problems into logical steps
Coding agents that implement specific functionality
Testing agents that generate test cases and assertions
Debugging agents that identify and fix issues
Documentation agents that explain code and generate comments

This multi-agent approach allows for parallel processing and specialization, making these tools more effective for complex projects.

Contextual Understanding

Modern agentic tools maintain and leverage much deeper context than earlier generations. Contextual understanding is achieved through sophisticated indexing systems that maintain representations of the codebase and its relationships. This leads to systems that can parse and understand entire project structures (at least in theory, but we’ll talk about that later), track dependencies between files and modules and understand project-specific conventions and patterns.

They can also be connected to external documentation and APIs, which helps with tasks and frameworks that weren’t present in the training data.

Iterative Refinement Loops

Perhaps the most important technical advancement is the ability to execute code, evaluate results, and refine solutions iteratively. This creates a feedback loop that mirrors human development patterns:

Generate initial code based on requirements
Execute the code in a sandboxed environment
Evaluate results against expected outcomes
Identify and fix issues
Repeat until success criteria are met

This capability transforms these tools from simple suggestion engines to autonomous problem-solvers that can work through complex issues methodically.
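The loop described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any particular tool works: stub_propose stands in for a model call, refine, run_candidate, and evaluate are hypothetical helper names, and exec() is used only for brevity. It is not a sandbox, and real tools isolate execution properly.

```python
from typing import Callable, Optional


def run_candidate(source: str, func_name: str) -> Callable:
    """Execute candidate source in a fresh namespace and return the function.

    Warning: exec() offers no isolation; it is used here for brevity only.
    """
    namespace: dict = {}
    exec(source, namespace)
    return namespace[func_name]


def evaluate(func: Callable, cases: list) -> list:
    """Return readable failure messages (an empty list means success)."""
    failures = []
    for args, expected in cases:
        try:
            actual = func(*args)
        except Exception as err:
            failures.append(f"{args}: raised {err!r}")
            continue
        if actual != expected:
            failures.append(f"{args}: expected {expected!r}, got {actual!r}")
    return failures


def refine(propose: Callable, func_name: str, cases: list,
           max_rounds: int = 3) -> Optional[str]:
    """Generate, execute, evaluate, and retry until the cases pass."""
    feedback: list = []
    for _ in range(max_rounds):
        source = propose(feedback)                   # generate (or fix)
        try:
            func = run_candidate(source, func_name)  # execute
            feedback = evaluate(func, cases)         # evaluate
        except Exception as err:
            feedback = [f"could not load candidate: {err!r}"]
        if not feedback:
            return source                            # success criteria met
    return None                                      # gave up after max_rounds


def stub_propose(feedback: list) -> str:
    """Stand-in for a model call: a buggy first draft, then a fix."""
    if not feedback:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"
```

Calling refine(stub_propose, "add", [((1, 2), 3)]) rejects the first draft because of the failing case and accepts the second one. The max_rounds limit matters in practice: without a stopping rule, an agent can burn time and API credits looping on a problem it cannot solve.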

4. Performance Limitations in Professional Contexts

Despite their impressive capabilities, agentic code assistance tools still face significant limitations in professional development environments:

Complex Logic and (Large) Context Understanding

While these tools excel at pattern recognition and code generation, they still struggle with deeply understanding complex business logic and project-specific requirements. AI agents may generate syntactically correct code that fails to capture the nuanced business logic required for production applications.

At a technical level, this limitation stems from the fundamental architecture of language models, which ultimately predict tokens based on patterns rather than truly "understanding" domain-specific concepts. This leads to scenarios where generated code looks reasonable but contains subtle logical errors.

Code Quality and Maintainability Issues

Without careful oversight, AI-generated code can introduce technical debt through:

Inefficient algorithms or implementations
Poor modularity and excessive coupling
Inconsistent naming conventions and coding styles
Redundant or unnecessary code
Over-engineering simple solutions

These issues arise because current models optimize for producing working code rather than highly maintainable code. They may also repeat anti-patterns found in their training data without recognizing them as problematic.

Security Vulnerabilities

An often overlooked but particularly concerning limitation: AI models trained on public repositories may inadvertently reproduce security vulnerabilities present in that training data. Common issues include improper input validation, SQL injection vulnerabilities, outdated or vulnerable dependencies and hardcoded credentials.

This creates significant risks for production code and necessitates rigorous security review of all AI-generated code.
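SQL injection is a good example of what such a review should catch. The sketch below contrasts a query built by string formatting with a parameterized one, using Python's built-in sqlite3 module. The users table and the function names are made up for this illustration.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database with two users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # Vulnerable: user input is pasted directly into the SQL text.
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    # Parameterized: the driver passes the value separately from the SQL.
    query = "SELECT id, name FROM users WHERE name = ?"
    return conn.execute(query, (name,)).fetchall()
```

With the input x' OR '1'='1, the unsafe version returns every row in the table, while the safe version treats the whole string as a (non-matching) name and returns nothing. If an assistant hands you code that looks like the first function, ask it to rewrite the query with placeholders.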

Training Data Limitations

Current agentic tools are limited by their training data, which may be outdated and may lack exposure to certain specialized domains.

As a result, these tools often perform best on mainstream use cases with commonly used technologies and may struggle with cutting-edge or highly specialized development tasks.

5. Best Practices for Maintaining Code Quality

Leveraging agentic code assistance effectively without sacrificing code quality requires, not unlike traditional programming, disciplined practices:

Clear and Specific Prompting

The quality of generated code depends heavily on the quality of instructions provided. Effective practices include:

Providing detailed specifications rather than vague requests
Including examples of expected output or behavior
Specifying relevant constraints and requirements
Referencing existing patterns within the codebase

Developers who know what to ask for and how to phrase it can significantly improve the quality and relevance of AI-generated code.

Aligning with Coding Standards

To maintain consistency across codebases, teams should configure AI tools to adhere to team-specific style guides and apply automatic formatters and linters after generation.

Some advanced tools allow training on organization-specific codebases, which helps align generated code with internal standards. Although - to the best of my knowledge - this is not something that a lot of companies are doing… yet!

Human Oversight and Comprehensive Testing

AI-generated code should never bypass human review. Many organizations implement a "trust but verify" approach, using AI to accelerate development while maintaining rigorous human oversight.

Independently of that, AI-generated code should go through a well-crafted testing regime:

Unit tests for individual functions and components
Integration tests for interactions between systems
End-to-end tests for complete workflows
Stress tests for performance under load
Security tests for vulnerability detection

Many teams leverage AI itself to generate comprehensive test suites alongside implementation code. But keep in mind that these tests also have to be checked by a human! There’s no point in testing your code if you’re testing the wrong things.
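As a small illustration of the first item on that list, here is what a unit test for a generated function might look like with Python's built-in unittest module. The function apply_discount is a hypothetical stand-in for something an assistant produced; the point is that the test cases encode what you actually expect, including boundaries and invalid input.

```python
import unittest


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical AI-generated function under test."""
    if price < 0:
        raise ValueError("price must be non-negative")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


class ApplyDiscountTests(unittest.TestCase):
    def test_typical_value(self):
        self.assertEqual(apply_discount(200.0, 25), 150.0)

    def test_boundaries(self):
        # 0% leaves the price unchanged, 100% makes it free.
        self.assertEqual(apply_discount(80.0, 0), 80.0)
        self.assertEqual(apply_discount(80.0, 100), 0.0)

    def test_rejects_invalid_input(self):
        with self.assertRaises(ValueError):
            apply_discount(-1.0, 10)
        with self.assertRaises(ValueError):
            apply_discount(50.0, 120)
```

Saved to a file, this can be run with python -m unittest. When the tests themselves were generated by an AI, the cases in test_boundaries and test_rejects_invalid_input are the ones worth reading most carefully, because that is where "testing the wrong things" tends to hide.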

Documentation and Encapsulation

Lastly, documentation not only helps humans make sense of your code, it also serves as an anchor for the AI whenever it starts to forget what your requirements were. A well-documented README with clear explanations of purpose and function can go a long way. Some tools have started to include this in their suggested best practices (e.g., Claude Code with CLAUDE.md).

The goal of this process is to ensure that the project remains clean and maintainable by steering the AI to produce code that is encapsulated in modular, reusable components, structured with clear separation of concerns, and consistently named according to project conventions.

Conclusion

Agentic code assistance represents a fundamental shift in how software is developed. Moving beyond simple suggestions, these tools now offer increasingly autonomous capabilities that span the entire development lifecycle. While they bring tremendous productivity benefits in the short term, they also introduce new challenges in quality control, security, and the evolving role of human developers.

For professional developers, the key to successfully leveraging these tools lies in understanding both their capabilities and limitations. Used thoughtfully - with proper oversight, testing, and integration into existing workflows - agentic code assistance can significantly accelerate development while maintaining or even improving code quality.

While we can expect them to handle increasingly complex tasks, this won’t just magically make all of these problems go away. The most successful organizations will be those that thoughtfully integrate these technologies into their workflows, combining AI capabilities with human expertise to deliver better software faster than ever before.