
Introduction to Zod for Data Validation


As web developers, we often work with data from external sources, such as APIs we don't control or user input submitted to our backends. We can't always rely on this data to take the form we expect, and we can run into unexpected errors when it deviates from expectations. With the Zod library, we can define what our data ought to look like and parse incoming data against those schemas. This lets us work with that data confidently, or fail fast with an error when it isn't correct.

Why use Zod?

TypeScript is great for letting us define the shape of our data in our code. It helps us write more correct code the first time around by warning us if we are doing something we shouldn't.

But TypeScript can't do everything for us. For example, we can define a variable as a string or a number, but we can't say "a string that starts with user_id_ and is 20 characters long" or "an integer between 1 and 5". There are limits to how much TypeScript can narrow down our data for us.

Also, TypeScript is a tool for us developers. When we compile our code, our types are erased and aren't available to the resulting JavaScript. At runtime, nothing checks that the data we actually receive matches the types we wrote, unless we're willing to write those checks by hand.
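To make that cost concrete, here is a rough sketch of the kind of hand-written check this would involve. The User type and the isUser function are hypothetical names invented for this illustration; they reuse the two rules mentioned above (a 20-character ID starting with user_id_ and an integer rating from 1 to 5).

```typescript
// Hypothetical type, used only for this illustration.
type User = {
	id: string;
	rating: number;
};

// A hand-written type guard: every rule is checked manually and has to be
// kept in sync with the type above by hand.
function isUser(value: unknown): value is User {
	if (typeof value !== 'object' || value === null) return false;
	const { id, rating } = value as { id?: unknown; rating?: unknown };
	return (
		typeof id === 'string' &&
		id.startsWith('user_id_') &&
		id.length === 20 &&
		typeof rating === 'number' &&
		Number.isInteger(rating) &&
		rating >= 1 &&
		rating <= 5
	);
}

const goodUser = isUser({ id: 'user_id_abcdefghijkl', rating: 4 }); // true
const badUser = isUser({ id: 'abc', rating: 9 }); // false
```

This works, but it tells us nothing about which rule failed, and the amount of code grows with every new field.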

This is where we can reach for a tool like Zod. With Zod, we can write data schemas. These schemas, in the simplest scenarios, look very much like TypeScript types. But we can do more with Zod than we can with TypeScript alone. Zod schemas let us create additional rules for data parsing and validation. A 20-character string that starts with user_id_? It's z.string().startsWith('user_id_').length(20). An integer between 1 and 5 inclusive? It's z.number().int().gte(1).lte(5). Zod's primitives give us many extra functions to be more specific about exactly what data we expect.

Unlike TypeScript types, Zod schemas aren't removed on compilation to JavaScript, so we still have access to them at runtime. If our app receives some data, we can verify that it matches the expected shape by passing it to our schema's parse function. We'll either get back our data in exactly the shape we defined in the schema, or Zod will throw an error telling us what was wrong.
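As a minimal sketch of what that looks like in practice, the two rules described earlier can be turned into schemas and run against some values. The names UserIdSchema and RatingSchema are made up for this example.

```typescript
import { z } from 'zod';

// The two rules described earlier, expressed as schemas.
const UserIdSchema = z.string().startsWith('user_id_').length(20);
const RatingSchema = z.number().int().gte(1).lte(5);

// parse returns the validated value when the input conforms.
const userId = UserIdSchema.parse('user_id_abcdefghijkl');
const rating = RatingSchema.parse(4);

// parse throws a ZodError when the input does not conform.
let ratingError: unknown;
try {
	RatingSchema.parse(7.5); // not an integer, and greater than 5
} catch (error) {
	ratingError = error;
}
const ratingRejected = ratingError instanceof z.ZodError;
```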

Zod schemas aren't a replacement for TypeScript; rather, they are an excellent complement. Once we've defined our Zod schema, it's simple to derive a TypeScript type from it and to use that type as we normally would. But when we really need to be sure our data conforms to the schema, we can always parse the data with our schema for that extra confidence.

Defining Data Schemas

Zod schemas are values that define our expectations for the shape of our data, validate those expectations, and transform the data if necessary to match our desired shape. It's easy to start with simple schemas and add complexity as required. Zod provides functions that represent data structures and related validation options, and these can be combined to create larger schemas. In many cases, you'll probably be building a schema for a data object with properties of some primitive type. For example, here's a schema that would validate a JavaScript object representing an order for a pizza:

import { z } from 'zod';

const PizzaOrderSchema = z.object({
	diameter: z.number().gte(12).lte(28), // a number between 12 and 28 inclusive
	crust: z.enum(['thin', 'thick', 'stuffed']), // one of these specific strings
	toppings: z.array(z.string()), // an array of any kind of string
	hasPineapple: z.boolean().optional(), // a boolean that might be undefined
	orderCreated: z.date() // a JavaScript Date object
})

Zod provides a number of primitives for defining schemas that line up with JavaScript primitives: string, number, bigint, boolean, date, symbol, undefined, and null. It also includes primitives void, any, unknown, and never for additional typing information. In addition to basic primitives, Zod can define object, array, and other native data structure schemas, as well as schemas for data structures not natively part of JavaScript like tuple and enum. The documentation contains considerable detail on the available data structures and how to use them.
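To give a feel for how these pieces combine, here is a small sketch that uses a tuple, an enum, and a nullable string together. The DeliverySchema object and its fields are hypothetical and exist only for this example. It also uses safeParse, which reports failure through a success flag instead of throwing.

```typescript
import { z } from 'zod';

// A tuple: a fixed-length array with a specific type at each position.
const CoordinatesSchema = z.tuple([z.number(), z.number()]);

// Hypothetical delivery record combining several kinds of schema.
const DeliverySchema = z.object({
	status: z.enum(['preparing', 'out_for_delivery', 'delivered']),
	location: CoordinatesSchema,
	driverNote: z.string().nullable(), // a string or null
});

const delivery = DeliverySchema.parse({
	status: 'delivered',
	location: [43.65, -79.38],
	driverNote: null,
});

// A tuple with the wrong number of items is rejected.
const badTuple = CoordinatesSchema.safeParse([43.65]);
const tupleAccepted = badTuple.success; // false
```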

Parsing and Validating Data with Schemas

With Zod schemas, you're not only telling your program what data should look like; you're also creating the tools to easily verify that the incoming data matches the schema definitions. This is where Zod really shines, as it greatly simplifies the process of validating data like user inputs or third-party API responses.

Let's say you're writing a website form to register new users. At a minimum, you'll need to make sure the new user's email address is a valid email address. For a password, we'll ask for something at least 8 characters long and including one letter, one number, and one special character. (Yes, this is not really the best way to write strong passwords; but for the sake of showing off how Zod works, we're going with it.) We'll also ask the user to confirm their password by typing it twice. First, let's create a Zod schema to model these inputs:

import { z } from 'zod';

const UserRegistrationSchema = z.object({
	email: z.string().email(),
	password: z.string().min(8),
	confirmPassword: z.string().min(8)
});

So far, this schema is pretty basic. It's only making sure that whatever the user types as an email is an email, and it's checking that the password is at least 8 characters long. But it is not checking if password and confirmPassword match, nor checking for the complexity requirements. Let's enhance our schema to fix that!

const UserRegistrationSchema = z.object({
	// ...
}).refine(data => data.password === data.confirmPassword);

By adding refine with a custom validation function, we have been able to verify that the passwords match. If they don't, parsing will give us an error to let us know that the data was invalid.

We can also chain refine functions to add checks for our password complexity rules:

const UserRegistrationSchema = z
	.object({
		// ...
	})
	.refine(data => data.password === data.confirmPassword)
	.refine(data => /[a-z]/i.test(data.password)) // must have letters
	.refine(data => /\d/i.test(data.password)) // must have numbers
	.refine(data => /\W/i.test(data.password)); // must have symbols

Here we've chained multiple refine functions. You could alternatively use superRefine, which gives you even more fine-grained control. Now that we've built out our schema and added refinements for extra validation, we can parse some user inputs. Let's see two test cases: one that's bound to fail, and one that will succeed.

// This will fail validation!
const userInput1 = {
	email: 'foo',
	password: 'bar',
	confirmPassword: 'baz'
};

// This will succeed validation!
const userInput2 = {
	email: 'user@example.com',
	password: 'Tr0ub4dor&3',
	confirmPassword: 'Tr0ub4dor&3'
};

There are two main ways we can use our schema to validate our data: parse and safeParse. The main difference is that parse will throw an error if validation fails, while safeParse will return an object with a success property of either true or false, and either a data property with your parsed data or an error property with the details of a ZodError explaining why the parsing failed.
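A short sketch of the difference, using a standalone email schema to keep things small. The checkEmail helper is a hypothetical function written for this example.

```typescript
import { z } from 'zod';

const EmailSchema = z.string().email();

// safeParse never throws; the success flag tells us which branch we're in.
function checkEmail(input: unknown): string {
	const result = EmailSchema.safeParse(input);
	if (result.success) {
		return `Valid: ${result.data}`;
	}
	return `Invalid: ${result.error.issues.length} issue(s)`;
}

const okMessage = checkEmail('user@example.com');
const failMessage = checkEmail('foo');

// parse, by contrast, throws on invalid input, so it needs a try/catch.
let parseThrew = false;
try {
	EmailSchema.parse('foo');
} catch {
	parseThrew = true;
}
```

Which one to reach for is largely a matter of taste: safeParse suits code that wants to branch on the result, while parse suits code that already has error handling further up the call stack.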

In the case of our example data, userInput2 will parse just fine and return the data for you to use. But userInput1 will create a ZodError listing all of the ways it has failed validation.

UserRegistrationSchema.parse(userInput1);
ZodError: [
  {
    "validation": "email",
    "code": "invalid_string",
    "message": "Invalid email",
    "path": [
      "email"
    ]
  },
  {
    "code": "too_small",
    "minimum": 8,
    "type": "string",
    "inclusive": true,
    "exact": false,
    "message": "String must contain at least 8 character(s)",
    "path": [
      "password"
    ]
  },
  {
    "code": "too_small",
    "minimum": 8,
    "type": "string",
    "inclusive": true,
    "exact": false,
    "message": "String must contain at least 8 character(s)",
    "path": [
      "confirmPassword"
    ]
  },
  {
    "code": "custom",
    "message": "Invalid input",
    "path": []
  },
  {
    "code": "custom",
    "message": "Invalid input",
    "path": []
  },
  {
    "code": "custom",
    "message": "Invalid input",
    "path": []
  }
]

We can use these error messages to tell the user how to fix their form inputs if validation fails. Each error in the list describes the validation failure and gives us a human-readable message to go with it.

You'll notice that the validation errors for the email and for the password length carry a lot of detail, but the three items at the end of the list don't tell us anything useful: each is just a custom error with the message Invalid input. The first comes from the refine that checks whether the passwords match, and the other two come from the refine functions that check for numbers and special characters. (The check for letters passed, since bar contains letters, so it produced no error.)

Let's modify our refine functions so that these errors are useful! We'll add our own error parameters to customize the message we get back and the path to the data that failed validation.

const UserRegistrationSchema = z
	.object({
		email: z.string().email(),
		password: z.string().min(8),
		confirmPassword: z.string().min(8),
	})
	.refine(data => data.password === data.confirmPassword, {
		message: 'Passwords do not match!',
		path: ['confirmPassword'],
	})
	.refine(data => /[a-z]/i.test(data.password), {
		message: 'Password missing letters!',
		path: ['password'],
	})
	.refine(data => /\d/i.test(data.password), {
		message: 'Password missing numbers!',
		path: ['password'],
	})
	.refine(data => /\W/i.test(data.password), {
		message: 'Password missing special characters!',
		path: ['password'],
	});

Now, our error messages from failures in refine are informative! You can figure out which form fields aren't validating from the path, and then display the messages next to form fields to let the user know how to remedy the error.

  {
    "code": "custom",
    "message": "Passwords do not match!",
    "path": [
      "confirmPassword"
    ]
  },
  {
    "code": "custom",
    "message": "Password missing numbers!",
    "path": [
      "password"
    ]
  },
  {
    "code": "custom",
    "message": "Password missing special characters!",
    "path": [
      "password"
    ]
  }

By giving our refine checks a custom path and message, we can make better use of the returned errors. In this case, we can highlight specific problem form fields for the user and give them the message about what is wrong.
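Here is one way this could look, as a sketch rather than a definitive implementation. It writes two of the rules as a single superRefine call (the alternative mentioned earlier) and then groups the resulting messages by form field. The names RegistrationSchema and groupErrorsByField are hypothetical, and the 'form' fallback key for errors without a path is an assumption made for this example.

```typescript
import { z } from 'zod';

// Two of the rules from above, written as one superRefine call.
const RegistrationSchema = z
	.object({
		email: z.string().email(),
		password: z.string().min(8),
		confirmPassword: z.string().min(8),
	})
	.superRefine((data, ctx) => {
		if (data.password !== data.confirmPassword) {
			ctx.addIssue({
				code: 'custom',
				message: 'Passwords do not match!',
				path: ['confirmPassword'],
			});
		}
		if (!/\d/.test(data.password)) {
			ctx.addIssue({
				code: 'custom',
				message: 'Password missing numbers!',
				path: ['password'],
			});
		}
	});

// Hypothetical helper: collect messages under the first path segment,
// which for a flat form is the field name.
function groupErrorsByField(
	issues: ReadonlyArray<{ path: ReadonlyArray<PropertyKey>; message: string }>
): Record<string, string[]> {
	const grouped: Record<string, string[]> = {};
	for (const issue of issues) {
		const field = issue.path.length > 0 ? String(issue.path[0]) : 'form';
		const messages = grouped[field] ?? [];
		messages.push(issue.message);
		grouped[field] = messages;
	}
	return grouped;
}

const result = RegistrationSchema.safeParse({
	email: 'user@example.com',
	password: 'longenough',
	confirmPassword: 'different',
});

const fieldErrors: Record<string, string[]> = result.success
	? {}
	: groupErrorsByField(result.error.issues);
```

With the input above, fieldErrors should hold one message under confirmPassword and one under password, ready to be rendered next to the matching inputs.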

Integrating with TypeScript

Integrating Zod with TypeScript is very easy. Using z.infer<typeof YourSchema> will allow you to avoid writing extra TypeScript types that merely reflect the intent of your Zod schemas. You can create a type from any Zod schema like so:

const ExampleSchema = z.object({
	foo: z.string(),
	bar: z.number()
});

type Example = z.infer<typeof ExampleSchema>;

// Equivalent to:
// type Example = {
//	foo: string;
//	bar: number;
// }

const example1: Example = { foo: 'test', bar: 53 }; // Type is valid!
const example2: Example = 'this will fail'; // Type error!

Using a TypeScript type derived from a Zod schema does not give you any extra level of data validation at the type level beyond what TypeScript is capable of. If you create a type from z.string().min(3).max(20), the TypeScript type will still just be string. And when compiled to JavaScript, even that will be gone! That's why you still need to use parse/safeParse on incoming data to validate it before proceeding as if it really does match your requirements.
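The gap between the inferred type and the runtime rules can be seen in a small sketch. UsernameSchema and readUsername are hypothetical names chosen for this example.

```typescript
import { z } from 'zod';

const UsernameSchema = z.string().min(3).max(20);
type Username = z.infer<typeof UsernameSchema>; // just `string`

// The type alone accepts any string, even one that breaks the length rule.
const tooShort: Username = 'ab';

// Hypothetical boundary function: only values that pass the schema get through.
function readUsername(input: unknown): Username | null {
	const result = UsernameSchema.safeParse(input);
	return result.success ? result.data : null;
}

const rejected = readUsername(tooShort); // null: fails min(3) at runtime
const accepted = readUsername('alice'); // 'alice'
```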

A common pattern with inferring types from Zod schemas is to use the same name for both. Because the schema is a variable, there's no name conflict if the type uses the same name. However, I find that this can lead to confusing situations when trying to import one or the other. My personal preference is to name the Zod schema with Schema at the end to make it clear which is which.
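Both conventions side by side, as a quick sketch. The Topping and Crust names are made up for this example.

```typescript
import { z } from 'zod';

// Convention 1: the schema and the type share one name. TypeScript allows
// this because values and types live in separate namespaces.
const Topping = z.enum(['cheese', 'mushroom']);
type Topping = z.infer<typeof Topping>;

// Convention 2: suffix the schema with "Schema" so the two are easy to tell apart.
const CrustSchema = z.enum(['thin', 'thick', 'stuffed']);
type Crust = z.infer<typeof CrustSchema>;

const topping: Topping = Topping.parse('cheese');
const crust: Crust = CrustSchema.parse('thin');
```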

Conclusion

Zod is an excellent tool for easily and confidently asserting that the data you're working with is exactly the sort of data you were expecting. It gives us the ability to assert at runtime that we've got what we wanted, and allows us to then craft strategies to handle what happens if that data is wrong. Combined with the ability to infer TypeScript types from Zod schemas, it lets us write and run more reliable code with greater confidence.
