
How to build an AI assistant with OpenAI, Vercel AI SDK, and Ollama with Next.js


In today’s blog post, we’ll build an AI Assistant using three different AI models: Whisper and TTS from OpenAI and Llama 3.1 from Meta.

While exploring AI, I wanted to try different things and create an AI assistant that works by voice. This curiosity led me to combine OpenAI’s Whisper and TTS models with Meta’s Llama 3.1 to build a voice-activated assistant.

Here’s how these models will work together:

  • First, we’ll send our audio to the Whisper model, which will convert it from speech to text.
  • Next, we’ll pass that text to the Llama 3.1 model. Llama will understand the text and generate a response.
  • Finally, we’ll take Llama’s response and send it to the TTS model, turning the text back into speech. We’ll then stream that audio back to the client.
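The three steps above compose into a simple pipeline. As a minimal sketch, every name below is a placeholder of mine rather than a real SDK call; we'll wire up the actual Whisper, Llama 3.1, and TTS implementations later in this post:

```typescript
// Placeholder types for the three stages of the assistant.
type SpeechToText = (audio: Uint8Array) => Promise<string>;
type LanguageModel = (prompt: string) => Promise<string>;
type TextToSpeech = (text: string) => Promise<Uint8Array>;

// The whole assistant is just these three calls chained together.
export async function assistantPipeline(
    audio: Uint8Array,
    transcribe: SpeechToText, // step 1: Whisper (speech -> text)
    respond: LanguageModel,   // step 2: Llama 3.1 (text -> text)
    speak: TextToSpeech       // step 3: TTS (text -> speech)
): Promise<Uint8Array> {
    const transcript = await transcribe(audio);
    const answer = await respond(transcript);
    return speak(answer);
}
```

Each stage only needs the output of the previous one, which is why we can swap any model (for example, a different Ollama model) without touching the rest of the flow.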

Let’s dive in and start building this excellent AI Assistant!

Getting started

We will use different tools to build our assistant. To build our client side, we will use Next.js. However, you could choose whichever framework you prefer.

To use the OpenAI models, we will use their TypeScript / JavaScript SDK. This SDK requires the following environment variable: OPENAI_API_KEY.

To get this key, we need to log in to the OpenAI dashboard and find the API keys section. Here, we can generate a new key.

OpenAI dashboard inside the API keys section
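Once generated, the key needs to live where our server code can read it. A conventional spot (an assumption here, but the Next.js default) is a .env.local file at the project root, which Next.js loads automatically:

```bash
# .env.local — keep this file out of source control
OPENAI_API_KEY=sk-...
```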

Awesome. Now, to use our Llama 3.1 model, we will use Ollama and the Vercel AI SDK, utilizing a provider called ollama-ai-provider.

Ollama will allow us to download our preferred model (we could even use a different one, like Phi) and run it locally. The Vercel SDK will facilitate its use in our Next.js project.

To use Ollama, we just need to download it and choose our preferred model. For this blog post, we are going to select Llama 3.1. After installing Ollama, we can verify if it is working by opening our terminal and writing the following command:

ollama run llama3.1

Notice that I wrote “llama3.1” because that’s my chosen model, but you should use the one you downloaded.

Kicking things off

It's time to kick things off by setting up our Next.js app. Let's start with this command:

npx create-next-app@latest

After running the command, you’ll see a few prompts to set the app's details. Let's go step by step:

  • Name your app.
  • Enable app router.

The other steps are optional and entirely up to you. In my case, I also chose to use TypeScript and Tailwind CSS.

Now that’s done, let’s go into our project and install the dependencies that we need to run our models:

npm i ai ollama-ai-provider openai

Building our client logic

Now, our goal is to record our voice, send it to the backend, and then receive a voice response from it.

To record our audio, we need to use client-side functions, which means we need to use client components. In our case, we don’t want to turn our whole page into a client component and ship the entire tree in the client bundle; instead, we prefer to use Server Components and import our client components to progressively enhance our application.

So, let’s create a separate component that will handle the client-side logic.

Inside our app folder, let's create a components folder, where we will create our component:

app
 ↳components
  ↳audio-recorder.tsx

Let’s go ahead and initialize our component. I went ahead and added a button with some styles in it:

// app/components/audio-recorder.tsx
'use client'
export default function AudioRecorder() {
    function handleClick(){
      console.log('click')
    }

    return (
        <section>
		<button onClick={handleClick}
                    className={`bg-blue-500 text-white px-4 py-2 rounded shadow-md hover:bg-blue-400 focus:outline-none focus:ring-2 focus:ring-blue-500 focus:ring-offset-2 focus:ring-offset-white transition duration-300 ease-in-out absolute top-1/2 left-1/2 -translate-x-1/2 -translate-y-1/2`}>
                Record voice
            </button>
        </section>
    )
}

And then import it into our Page Server component:

// app/page.tsx
import AudioRecorder from '@/app/components/audio-recorder';

export default function Home() {
  return (
      <AudioRecorder />
  );
}

Now, if we run our app, we should see the following:

First look of the app, showing a centered blue button

Awesome! Right now, our button doesn’t do anything, but our goal is to record our audio and send it to the backend. For that, let us create a hook that will contain our logic:

app
 ↳hooks
  ↳useRecordVoice.ts

// app/hooks/useRecordVoice.ts
import { useEffect, useRef, useState } from 'react';

export function useRecordVoice() {
  return {}
}

We will use two browser APIs to record our voice: navigator and MediaRecorder. The navigator API gives us access to the user’s media devices, such as the microphone, and MediaRecorder helps us record audio from them. This is how they’re going to play out together:

// app/hooks/useRecordVoice.ts
import { useEffect, useRef, useState } from 'react';

export function useRecordVoice() {
    const [isRecording, setIsRecording] = useState(false);
    const [mediaRecorder, setMediaRecorder] = useState<MediaRecorder | null>(null);

     const startRecording = async () => {
        if(!navigator?.mediaDevices){
            console.error('Media devices not supported');
            return;
        }

        const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
        const mediaRecorder = new MediaRecorder(stream);
        setIsRecording(true)
        setMediaRecorder(mediaRecorder);
        mediaRecorder.start(0)
    }

    const stopRecording = () =>{
        if(mediaRecorder) {
            setIsRecording(false)
            mediaRecorder.stop();
        }
    }

  return {
    isRecording,
    startRecording,
    stopRecording,
  }
}

Let’s explain this code step by step. First, we create two new states. The first one is for keeping track of when we are recording, and the second one stores the instance of our MediaRecorder.

const [isRecording, setIsRecording] = useState(false);
const [mediaRecorder, setMediaRecorder] = useState<MediaRecorder | null>(null);

Then, we’ll create our first method, startRecording, which holds the logic to start recording our audio. We first check whether the user has media devices available, using the navigator API. If there are no media devices to record from, we just return; otherwise, we create a stream using the user’s audio device.

// check if they have media devices
if(!navigator?.mediaDevices){
 console.error('Media devices not supported');
 return;
}
// create stream using the audio media device
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });

Finally, we go ahead and create an instance of a MediaRecorder to record this audio:

// create an instance passing in the stream as a parameter
const mediaRecorder = new MediaRecorder(stream);
// flag that we are now recording
setIsRecording(true)
// store the instance in the state
setMediaRecorder(mediaRecorder);
// start recording immediately
mediaRecorder.start(0)

Then we need a method to stop our recording, which will be our stopRecording. Here, we simply stop the recording if a media recorder exists.

if (mediaRecorder) {
  setIsRecording(false)
  mediaRecorder.stop();
}

We are recording our audio, but we are not storing it anywhere. Let’s add a new ref to store our chunks of audio data, plus a useEffect to handle them.

const audioChunks = useRef<Blob[]>([]);

In our useEffect, we are going to do two main things: store those chunks in our ref, and, when the recorder stops, create a new Blob of type audio/mp3:

export function useRecordVoice() {   
   const audioChunks = useRef<Blob[]>([]);

   ...
   useEffect(() => {
        if (mediaRecorder) {
            // listen to when data is available and store it as chunks in our ref
            mediaRecorder.ondataavailable = (e) => {
                audioChunks.current.push(e.data);
            }

            mediaRecorder.onstop = () => {
                // Listen to when we stop recording audio 
                // Then, convert our data to a Blob of type audio/mp3 and reset the ref
                const audioBlob = new Blob(audioChunks.current, { type: 'audio/mp3' });
                audioChunks.current = [];
            }
        }

    }, [mediaRecorder]);
    ...
}

It is time to wire this hook with our AudioRecorder component:

'use client'
import { useRecordVoice } from '@/app/hooks/useRecordVoice';

export default function AudioRecorder() {
    const { isRecording, stopRecording, startRecording } = useRecordVoice();

    async function handleClick() {
        if (isRecording) {
           stopRecording();
         } else {
         await startRecording();
       }
     }

    return (
        <div>

            <button onClick={handleClick}
                    className={`bg-blue-500 text-white px-4 py-2 rounded shadow-md hover:bg-blue-400 focus:outline-none focus:ring-2 focus:ring-blue-500 focus:ring-offset-2 focus:ring-offset-white transition duration-300 ease-in-out absolute top-1/2 left-1/2 -translate-x-1/2 -translate-y-1/2`}>
                {isRecording ? "Stop Recording" : "Start Recording"}
            </button>

        </div>
    )
}

Let’s go to the other side of the coin, the backend!

Setting up our Server side

We want to run our models on the server to keep our API key secure and offload the heavy work from the client. Let’s create a new route and add a handler for it using route handlers from Next.js. In our app folder, let’s make an api folder with the following route in it:


app
 ↳api
  ↳chat
   ↳route.ts

Our route is called ‘chat’. In the route.ts file, we’ll set up our handler. Let’s start by setting up our OpenAI SDK.

// app/api/chat/route.ts
import { getOpenai } from '@/app/utils/get-openai';

const openai = getOpenai();

export async function POST(req: Request) {
  // our logic will go here
}

// app/utils/get-openai.ts
import OpenAI from 'openai';

const openai = new OpenAI({
    apiKey: process.env.OPENAI_API_KEY,
});

export function getOpenai() {
    return openai;
}

In this route, we’ll send the audio from the front end as a base64 string. Then, we’ll receive it and turn it into a Buffer object.

export async function POST(req: Request) {
    const { audio } = await req.json();
    const audioBuffer = Buffer.from(audio, 'base64');
 }

It’s time to use our first model. We want to turn this audio into text and use OpenAI’s Whisper Speech-To-Text model. Whisper needs an audio file to create the text. Since we have a Buffer instead of a file, we’ll use their ‘toFile’ method to convert our audio Buffer into an audio file like this:

import { toFile } from 'openai';
import { NextResponse } from 'next/server';

export async function POST(req: Request) {
    const { audio } = await req.json();
    const audioBuffer = Buffer.from(audio, 'base64');

    try {
        // FileLike object
        const audioFile = await toFile(audioBuffer, 'audio.mp3');

    } catch (err) {
        console.error(err);
        return NextResponse.json(
            {
                err: err,
                error: 'Error converting audio',
            },
            {
                status: 500,
            }
        );
    }
}

Notice that we specified “mp3”. This is one of the many file extensions that the Whisper model supports. You can see the full list of supported extensions here: https://platform.openai.com/docs/api-reference/audio/createTranscription#audio-createtranscription-file
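As a small guard, we could check the extension before calling the model. This helper is my own addition, not part of the OpenAI SDK, and the list mirrors the formats documented at the link above at the time of writing:

```typescript
// Extensions Whisper accepts, per the OpenAI API reference
// (check the docs for the current list before relying on this).
const WHISPER_EXTENSIONS = new Set([
    'flac', 'm4a', 'mp3', 'mp4', 'mpeg', 'mpga', 'ogg', 'wav', 'webm',
]);

// Returns true when the filename's extension is one Whisper supports.
export function isWhisperSupported(filename: string): boolean {
    const ext = filename.split('.').pop()?.toLowerCase() ?? '';
    return WHISPER_EXTENSIONS.has(ext);
}
```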

Now that our file is ready, let’s pass it to Whisper! Using our OpenAI instance, this is how we will invoke our model:

import { toFile } from 'openai';
const openai = getOpenai();

export async function POST(req: Request) {
        ...
        const audioFile = await toFile(audioBuffer, 'audio.mp3');

        const transcription = await openai.audio.transcriptions.create({
            // here we specify the model
            model: 'whisper-1',
            // our audio file
            file: audioFile,
        });
        ...
}

That’s it! Now, we can move on to the next step: using Llama 3.1 to interpret this text and give us an answer. We’ll use two methods for this. First, we’ll use ‘ollama’ from the ‘ollama-ai-provider’ package, which lets us talk to the model running in our local Ollama. Then, we’ll use ‘generateText’ from the Vercel AI SDK to generate the text. Side note: to make Ollama run locally, we need to run the following command in the terminal:

ollama serve

import { toFile } from 'openai';
// new imports
import { ollama } from 'ollama-ai-provider';
import { generateText } from 'ai';

const openai = getOpenai();

export async function POST(req: Request) {
        ...
        const audioFile = await toFile(audioBuffer, 'audio.mp3');

        const transcription = await openai.audio.transcriptions.create({
            model: 'whisper-1',
            file: audioFile,
        });

        const { text: response } = await generateText({
            // we specify our model running locally in the background
            model: ollama('llama3.1'),
            // we can set initial instructions to our model
            system: 'You know a lot about video games',
            // the text we want the model to interpret
            prompt: transcription.text,
        });
        ...
}

Finally, we have our last model: TTS from OpenAI. We want to reply to our user with audio, so this model will be really helpful. It will turn our text into speech:

import { toFile } from 'openai';
// new imports
import { ollama } from 'ollama-ai-provider';
import { generateText } from 'ai';

const openai = getOpenai();

export async function POST(req: Request) {
        ...
        const audioFile = await toFile(audioBuffer, 'audio.mp3');

        const transcription = await openai.audio.transcriptions.create({
            model: 'whisper-1',
            file: audioFile,
        });

        const { text: response } = await generateText({
            model: ollama('llama3.1'),
            system: 'You know a lot about video games',
            prompt: transcription.text,
        });

        const voiceResponse = await openai.audio.speech.create({
            // Specify here our tts model
            model: 'tts-1',
            // we pass in our response
            input: response,
            // We can choose a variety of different voices
            // I chose 'onyx' but you can pick from this list: https://platform.openai.com/docs/guides/text-to-speech/quickstart
            voice: 'onyx',
        });
        ...
}

The TTS model will turn our response into an audio file. We want to stream this audio back to the user like this:

import { toFile } from 'openai';
import { getOpenai } from '@/app/utils/get-openai';
import { ollama } from 'ollama-ai-provider';
import { NextResponse } from 'next/server';
import { generateText } from 'ai';

const openai = getOpenai();

export async function POST(req: Request) {
    const { audio } = await req.json();
    const audioBuffer = Buffer.from(audio, 'base64');

    try {
        const audioFile = await toFile(audioBuffer, 'audio.mp3');

        const transcription = await openai.audio.transcriptions.create({
            model: 'whisper-1',
            file: audioFile,
        });

        const { text: response } = await generateText({
            model: ollama('llama3.1'),
            system: 'You know a lot about video games',
            prompt: transcription.text,
        });

        const voiceResponse = await openai.audio.speech.create({
            model: 'tts-1',
            input: response,
            voice: 'onyx',
        });

        // stream back our audio
        return new Response(voiceResponse.body, {
            headers: {
                 // we specify the content type         
                'Content-Type': 'audio/mpeg',
                // we indicate that this is going to be streamed in chunks of data
                'Transfer-Encoding': 'chunked',
            },
        });
    } catch (err) {
        console.error(err);
        return NextResponse.json(
            {
                err: err,
                error: 'Error converting audio',
            },
            {
                status: 500,
            }
        );
    }
}

And that’s the whole backend code! Now, back to the frontend to finish wiring everything up.

Putting It All Together

In our useRecordVoice.ts hook, let's create a new method that will call our API endpoint. This method will also take the response and play the audio we stream from the backend back to the user.

// app/hooks/useRecordVoice.ts
...
export function useRecordVoice() {
    // new state to track when our server is loading the response for us
    const [loading, setLoading] = useState(false);

    async function getResponse(audioBlob: Blob) {
        // We transform our audio to base64 to send it to the endpoint
        const audioBase64 = await transformBlobToBase64(audioBlob);

        try {
            setLoading(true);
            // Calling our "chat" endpoint
            const res = await fetch('/api/chat', {
                method: 'POST',
                // Sending our base64 audio here
                body: JSON.stringify({ audio: audioBase64 }),
                headers: {
                    'Content-Type': 'application/json',
                },
            });

            if (!res.ok) {
                throw new Error('Error getting response');
            }
        } catch (err) {
            console.error(err);
        } finally {
            setLoading(false);
        }
    }

    useEffect(() => {
        if (mediaRecorder) {
            mediaRecorder.ondataavailable = (e) => {
                audioChunks.current.push(e.data);
            };

            mediaRecorder.onstop = () => {
                const audioBlob = new Blob(audioChunks.current, {
                    type: 'audio/mp3',
                });
                // we call our method here
                void getResponse(audioBlob);
                audioChunks.current = [];
            };
        }
    }, [mediaRecorder]);

...

// app/utils/transform-blob-to-base64.ts
export function transformBlobToBase64(blob: Blob): Promise<string> {
    return new Promise((resolve, reject) => {
        const reader = new FileReader();
        reader.onloadend = () => {
            resolve(reader?.result?.toString().split(',')[1] || '');
        }
        reader.onerror = reject;
        reader.readAsDataURL(blob);
    })
}
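Why the split(',')[1] in the helper above? readAsDataURL produces a data URL with a data:&lt;mime&gt;;base64, prefix in front of the payload, and the backend's Buffer.from(audio, 'base64') only wants the payload. A quick illustration:

```typescript
// FileReader.readAsDataURL yields "data:<mime>;base64,<payload>";
// our hook keeps only the payload after the comma.
const dataUrl = 'data:audio/mp3;base64,SGVsbG8=';
const payload = dataUrl.split(',')[1];

console.log(payload); // "SGVsbG8="
// On the server, Buffer.from(payload, 'base64') recovers the original bytes:
console.log(Buffer.from(payload, 'base64').toString()); // "Hello"
```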

Great! Now that we’re getting our streamed response, we need to handle it and play the audio back to the user. We’ll use the AudioContext API for this. This API allows us to store the audio, decode it and play it to the user once it’s ready:

...
async function getResponse(audioBlob: Blob) {
    const audioBase64 = await transformBlobToBase64(audioBlob);

    try {
        setLoading(true);
        const res = await fetch('/api/chat', {
            method: 'POST',
            body: JSON.stringify({ audio: audioBase64 }),
            headers: {
                'Content-Type': 'application/json',
            },
        });

        if (!res.ok) {
            throw new Error('Error getting response');
        }

        // Create an instance of AudioContext
        const audioContext = new AudioContext();

        // Create a reader to read the streaming response
        const reader = res.body?.getReader();
        if (!reader) {
            throw new Error('Error getting response');
        }

        // Create a buffer source to store the audio
        const source = audioContext.createBufferSource();

        // Array to hold the audio chunks received from the backend;
        // named responseChunks so it doesn't shadow our recorder's audioChunks ref
        let responseChunks: Uint8Array[] = [];

        // Flag to check if the audio streaming has finished
        let isDataStreamed = false;

        while (!isDataStreamed) {
            // Start reading the data
            const { value, done } = await reader.read();

            // If true, the stream has finished
            if (done) {
                isDataStreamed = true;
                break;
            }

            // Add each data chunk to our list of audio chunks
            if (value) {
                responseChunks.push(value);
            }
        }

        // Merge all buffer chunks into a single Uint8Array
        const audioBuffer = new Uint8Array(
            responseChunks.reduce(
                (acc, val) => acc.concat(Array.from(val)),
                [] as number[]
            )
        );

        // Decode the audio data and store it in our source buffer
        source.buffer = await audioContext.decodeAudioData(
            audioBuffer.buffer
        );

        // Connect the source to the audio output (speakers or headphones)
        source.connect(audioContext.destination);

        // Start playing the audio
        source.start(0);
    } catch (err) {
        console.error(err);
    } finally {
        setLoading(false);
    }
}

...

return {
    startRecording,
    stopRecording,
    isRecording,
    // Return the loading state
    loading,
};

And that's it! Now the user should hear the audio response on their device. To wrap things up, let's make our app a bit nicer by adding a little loading indicator:

// app/components/audio-recorder.tsx

'use client';
import { useRecordVoice } from '@/app/hooks/useRecordVoice';

export default function AudioRecorder() {
    const { isRecording, stopRecording, startRecording, loading } =
        useRecordVoice();
    async function handleClick() {
        if (isRecording) {
            stopRecording();
        } else {
            await startRecording();
        }
    }

    // New condition
    if (loading) {
        return <div>Loading...</div>;
    }

    return (
        <div>
            <button
                onClick={handleClick}
                className={`bg-blue-500 text-white px-4 py-2 rounded shadow-md hover:bg-blue-400 focus:outline-none focus:ring-2 focus:ring-blue-500 focus:ring-offset-2 focus:ring-offset-white transition duration-300 ease-in-out absolute top-1/2 left-1/2 -translate-x-1/2 -translate-y-1/2`}
            >
                {isRecording ? 'Stop Recording' : 'Start Recording'}
            </button>
        </div>
    );
}

Conclusion

In this blog post, we saw how combining multiple AI models can help us achieve our goals. We learned to run AI models like Llama 3.1 locally and use them in our Next.js app. We also discovered how to send audio to these models and stream back a response, playing the audio back to the user.

This is just one of many ways you can use AI—the possibilities are endless. AI models are amazing tools that let us create things that were once hard to achieve with such quality. Thanks for reading; now, it’s your turn to build something amazing with AI!

You can find the complete demo on GitHub: AI Assistant with Whisper TTS and Ollama using Next.js
