JSON Schema doesn’t get as much attention as GraphQL or similar tools. I discovered its power while building a REST API with Fastify. What is it exactly? The JSON Schema website describes it as “the vocabulary that enables JSON data consistency, validity, and interoperability at scale”. Put more simply, it’s a specification for describing the shape of JSON data. This article highlights some of the benefits of defining a JSON Schema for a REST API.
JSON Schema Basics
Here’s an example of a simple schema representing a user:
{
  "$id": "https://example.com/schemas/user",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "firstName": {
      "type": "string"
    },
    "lastName": {
      "type": "string"
    },
    "email": {
      "type": "string",
      "format": "email"
    },
    "age": {
      "type": "integer"
    },
    "newsletterSubscriber": {
      "type": "boolean"
    },
    "favoriteGenres": {
      "type": "array",
      "items": {
        "type": "string"
      }
    }
  },
  "required": ["email"],
  "additionalProperties": false
}
If you’re familiar with JSON already, you can probably understand most of this at a glance. This schema describes a JSON object whose properties define a User in our system. Along with the object’s properties, we can define additional metadata about the object: which fields are required, and whether the object may carry additional properties beyond those listed under properties.
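To make those last two keywords concrete, here is a minimal sketch in plain JavaScript of what required and additionalProperties: false mean for a candidate object. The checkObjectKeywords helper is hypothetical and handles only these two keywords; a real validator such as Ajv covers the full specification.

```javascript
// Trimmed copy of the user schema (only the keywords this sketch checks).
const userSchema = {
  type: 'object',
  properties: {
    firstName: { type: 'string' },
    email: { type: 'string', format: 'email' },
    age: { type: 'integer' }
  },
  required: ['email'],
  additionalProperties: false
};

// Hypothetical helper: reports missing required keys and unknown keys.
// It does not check types or formats.
function checkObjectKeywords(schema, data) {
  const errors = [];
  for (const key of schema.required ?? []) {
    if (!Object.hasOwn(data, key)) {
      errors.push(`missing required property: ${key}`);
    }
  }
  if (schema.additionalProperties === false) {
    for (const key of Object.keys(data)) {
      if (!Object.hasOwn(schema.properties ?? {}, key)) {
        errors.push(`unexpected property: ${key}`);
      }
    }
  }
  return errors;
}

// Passes: email is present and every key is declared.
console.log(checkObjectKeywords(userSchema, { email: 'dane@example.com' }));
// Fails twice: email is missing and nickname is not declared.
console.log(checkObjectKeywords(userSchema, { firstName: 'Dane', nickname: 'D' }));
```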
Types
We covered a lot of types in our example schema. The root type of our JSON Schema is an object with various properties defined on it. The base types available in JSON Schema map to the JSON value types: object, array, string, number, integer, boolean, and null (integer is a refinement of number rather than a separate JSON type). Check the type reference page to learn more.
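JavaScript's typeof does not line up one-to-one with these type names (arrays and null both report "object", and there is no separate integer type), so the following sketch shows one way to name the JSON Schema types a parsed value satisfies. The jsonSchemaTypesOf helper is hypothetical and exists only for illustration.

```javascript
// Hypothetical helper: list the JSON Schema type names a parsed JSON value satisfies.
function jsonSchemaTypesOf(value) {
  if (value === null) return ['null'];
  if (Array.isArray(value)) return ['array'];
  switch (typeof value) {
    case 'string':
      return ['string'];
    case 'boolean':
      return ['boolean'];
    case 'number':
      // NaN and Infinity cannot appear in JSON at all.
      if (!Number.isFinite(value)) return [];
      // Every integer also satisfies "number".
      return Number.isInteger(value) ? ['integer', 'number'] : ['number'];
    case 'object':
      return ['object'];
    default:
      // undefined, functions, symbols, and bigints are not JSON values.
      return [];
  }
}

console.log(jsonSchemaTypesOf(42));
console.log(jsonSchemaTypesOf(4.5));
console.log(jsonSchemaTypesOf(['sci-fi', 'fantasy']));
```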
Formats
The email property in our example has an additional field named format next to its type. The format property lets us attach a semantic meaning to string values, which makes our schema definition more specific about the values allowed for a given field. For example, “hello” is a valid string but not a valid email, so it would fail validation against our email property.
Another common example is for date or timestamp values that get serialized. Validation implementations can use the format definition to make sure a value matches the expected type and format defined. There’s a section on the website that lists the various formats available for the string type.
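As a rough illustration of how a validator might apply format, the sketch below maps format names to predicate functions. The formatChecks table and matchesFormat helper are hypothetical, and the regular expressions are deliberately simplified; real implementations apply stricter rules.

```javascript
// Simplified, illustrative checks only. Not the rules any real validator uses.
const formatChecks = {
  email: (value) => /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(value),
  date: (value) => /^\d{4}-\d{2}-\d{2}$/.test(value)
};

// Hypothetical helper: format only constrains strings, and formats this
// sketch does not know about are treated as passing.
function matchesFormat(format, value) {
  const check = formatChecks[format];
  if (typeof value !== 'string' || !check) return true;
  return check(value);
}

console.log(matchesFormat('email', 'hello'));            // false
console.log(matchesFormat('email', 'dane@example.com')); // true
console.log(matchesFormat('date', '2024-01-15'));        // true
```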
Schema Structuring
JSON Schema supports referencing schemas from within a schema. This is a very important feature that helps us keep our schemas DRY. Looking back at our initial example, we might want to define a schema for a list of users. We gave our user schema an $id of https://example.com/schemas/user, and we can use that $id to reference it from another schema:
{
  "type": "array",
  "items": {
    "$ref": "https://example.com/schemas/user"
  }
}
In this example we have a simple schema that is just an array whose items definition references our user schema. This schema is equivalent to one where we defined our initial schema inline inside of "items": { }. The JSON Schema website has a page dedicated to structuring schemas.
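The equivalence between a reference and an inlined schema can be sketched in plain JavaScript. The registry map and inlineRefs helper below are hypothetical; the sketch only handles references that exactly match a registered $id and ignores JSON pointers, relative references, and cyclic schemas.

```javascript
// Shortened user schema, registered under its $id.
const userSchema = {
  $id: 'https://example.com/schemas/user',
  type: 'object',
  properties: { email: { type: 'string', format: 'email' } },
  required: ['email']
};

const userListSchema = {
  type: 'array',
  items: { $ref: 'https://example.com/schemas/user' }
};

// Hypothetical registry: $id -> schema.
const registry = new Map([[userSchema.$id, userSchema]]);

// Hypothetical helper: return a copy of the schema with every $ref
// replaced by the schema it points to.
function inlineRefs(schema, schemas) {
  if (Array.isArray(schema)) return schema.map((item) => inlineRefs(item, schemas));
  if (schema === null || typeof schema !== 'object') return schema;
  if (typeof schema.$ref === 'string') {
    const target = schemas.get(schema.$ref);
    if (!target) throw new Error(`unknown $ref: ${schema.$ref}`);
    return inlineRefs(target, schemas);
  }
  return Object.fromEntries(
    Object.entries(schema).map(([key, value]) => [key, inlineRefs(value, schemas)])
  );
}

// The items entry now contains the full user schema.
console.log(JSON.stringify(inlineRefs(userListSchema, registry), null, 2));
```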
JSON Schema Benefits
Validation
One of the main benefits of defining a schema for your API is being able to validate both inputs and outputs. Inputs include things like the request body, URL parameters, and search parameters. Outputs are your response JSON data and headers. Several libraries are available to handle schema validation. A popular choice, and the one used by Fastify, is Ajv.
Security
Validating inputs has some security advantages. It can prevent bad or malicious data from being accepted by your API. For instance, you can specify that a certain field must be an integer, or that a string must match a certain regex pattern. This narrows the room for attacks such as SQL injection and cross-site scripting (XSS), although it complements rather than replaces defenses like parameterized queries and output escaping.
Defining a schema for your response types can help to prevent leaking sensitive data from your database. Your web server can be configured to omit from responses any data that is not defined in the schema.
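The idea of schema-driven response filtering can be sketched as follows. The publicUserSchema, dbRow, and pickSchemaProperties names are hypothetical, and this shows only the concept for a flat object, not how any particular server implements it.

```javascript
// Hypothetical response schema: passwordHash is intentionally not listed.
const publicUserSchema = {
  type: 'object',
  properties: {
    firstName: { type: 'string' },
    email: { type: 'string' }
  }
};

// Hypothetical helper: keep only the properties the schema declares.
function pickSchemaProperties(schema, record) {
  const allowedKeys = Object.keys(schema.properties ?? {});
  return Object.fromEntries(
    allowedKeys
      .filter((key) => Object.hasOwn(record, key))
      .map((key) => [key, record[key]])
  );
}

// A made-up database row that includes a field clients should never see.
const dbRow = {
  firstName: 'Dane',
  email: 'dane@example.com',
  passwordHash: 'not-for-clients'
};

console.log(JSON.stringify(pickSchemaProperties(publicUserSchema, dbRow)));
// prints {"firstName":"Dane","email":"dane@example.com"}
```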
Performance
By validating data at the schema level, you can reject invalid requests early, before they reach more resource-intensive parts of your application. This can help protect against Denial of Service (DoS) attacks.
fast-json-stringify is a library that creates optimized stringify functions from JSON schemas, which can help improve response times and throughput for JSON APIs.
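Why a schema can speed up serialization can be sketched like this: when the shape is known in advance, the per-key work is prepared once, and each call only concatenates values. The compileStringify function below is a hypothetical toy that handles flat objects with string, number, integer, and boolean properties. It illustrates the general idea only and makes no claim about fast-json-stringify's internals.

```javascript
const greetingSchema = {
  type: 'object',
  properties: {
    name: { type: 'string' },
    excitement: { type: 'integer' },
    subscribed: { type: 'boolean' }
  }
};

// Hypothetical toy compiler: schema in, specialized stringify function out.
function compileStringify(schema) {
  // Done once per schema: precompute each key's JSON prefix and value writer.
  const writers = Object.entries(schema.properties).map(([key, prop]) => {
    const prefix = JSON.stringify(key) + ':';
    // Strings need quoting and escaping; numbers and booleans print as-is.
    const writeValue = prop.type === 'string' ? JSON.stringify : String;
    return (record) =>
      record[key] === undefined ? null : prefix + writeValue(record[key]);
  });
  // Done per call: concatenate the declared properties that are present.
  return (record) =>
    '{' +
    writers
      .map((write) => write(record))
      .filter((part) => part !== null)
      .join(',') +
    '}';
}

const stringifyGreeting = compileStringify(greetingSchema);
console.log(stringifyGreeting({ name: 'Dane', excitement: 10, other: 'ignored' }));
// prints {"name":"Dane","excitement":10}
```

Note that properties missing from the schema never reach the output, which is the same behavior that helps avoid leaking undeclared fields.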
Documentation
JSON Schema also greatly aids in API documentation. The OpenAPI specification describes request and response bodies using a dialect of JSON Schema, and tools such as Swagger UI use those definitions to generate human-readable API documentation automatically. This documentation gives developers clear, precise information about your API’s endpoints, request parameters, and response formats. It helps maintain consistent communication within your development team and also makes your API more accessible to outside developers.
Type-safety
I plan to cover this in more detail in an upcoming post but there are tools available that can help achieve type-safety both on your server and client-side by pairing JSON Schema with TypeScript. In Fastify for example, you can infer types in your request handlers based on your JSON Schema specifications.
Schema Examples
I’ve taken some example schemas from the Fastify website to walk through how they would work in practice.
### queryStringJsonSchema
const queryStringJsonSchema = {
  type: 'object',
  properties: {
    name: { type: 'string' },
    excitement: { type: 'integer' }
  },
  additionalProperties: false
}
We would use this schema to define, validate, and parse the query string of an incoming request in our API.
Given a query string like ?name=Dane&excitement=10&other=additional, we can expect to receive an object that looks like this:
{
  name: "Dane",
  excitement: 10
}
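Since additional properties are not allowed, the other property, which wasn’t defined in our schema, gets removed. Notice too that excitement arrives as the integer 10 rather than the string "10", because the value is converted to the type declared in the schema.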
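The query string behavior described here can be approximated in plain JavaScript. The parseQuery helper below is a hypothetical sketch that handles only string and integer properties; it stands in for the work a framework's validator does and is not Fastify's or Ajv's actual code.

```javascript
const queryStringJsonSchema = {
  type: 'object',
  properties: {
    name: { type: 'string' },
    excitement: { type: 'integer' }
  },
  additionalProperties: false
};

// Hypothetical helper: parse, coerce, and strip a query string using the schema.
function parseQuery(schema, queryString) {
  const result = {};
  // URLSearchParams ignores a leading "?".
  for (const [key, raw] of new URLSearchParams(queryString)) {
    const declared = Object.hasOwn(schema.properties, key);
    if (!declared) {
      // Undeclared keys are dropped when additionalProperties is false.
      if (schema.additionalProperties !== false) result[key] = raw;
      continue;
    }
    if (schema.properties[key].type === 'integer') {
      const parsedNumber = Number(raw);
      if (raw.trim() === '' || !Number.isInteger(parsedNumber)) {
        throw new TypeError(`${key} must be an integer`);
      }
      result[key] = parsedNumber;
    } else {
      result[key] = raw;
    }
  }
  return result;
}

const query = parseQuery(queryStringJsonSchema, '?name=Dane&excitement=10&other=additional');
console.log(JSON.stringify(query));
// prints {"name":"Dane","excitement":10}
```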
### paramsJsonSchema
Imagine we had a route in our API defined as /users/:userId/posts/:slug, with this schema for its parameters:
const paramsJsonSchema = {
  type: 'object',
  properties: {
    userId: { type: 'number' },
    slug: { type: 'string' }
  },
  additionalProperties: false,
  required: ["userId", "slug"]
}
Given this URL: /users/1/posts/hello-world, we can expect to get an object in our handler that looks like this:
{
  userId: 1,
  slug: "hello-world"
}
We can be sure about this since the schema doesn’t allow for additional properties and both properties are required. If either field were missing or did not match its type, our API could automatically return an appropriate error response, typically a 400 Bad Request.
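The flow from URL to validated params can be sketched end to end. The extractParams and checkParams helpers below are hypothetical, and the error messages are made up for illustration; a framework like Fastify does this work for you with its own router and validator.

```javascript
const paramsJsonSchema = {
  type: 'object',
  properties: {
    userId: { type: 'number' },
    slug: { type: 'string' }
  },
  additionalProperties: false,
  required: ['userId', 'slug']
};

// Hypothetical helper: pull raw string params out of a path using a ":name" pattern.
function extractParams(pattern, path) {
  const patternParts = pattern.split('/');
  const pathParts = path.split('/');
  if (patternParts.length !== pathParts.length) return null;
  const params = {};
  for (let i = 0; i < patternParts.length; i++) {
    if (patternParts[i].startsWith(':')) {
      params[patternParts[i].slice(1)] = pathParts[i];
    } else if (patternParts[i] !== pathParts[i]) {
      return null; // a literal segment did not match
    }
  }
  return params;
}

// Hypothetical helper: check required keys and coerce numbers,
// returning either the parsed params or a 400-style error object.
function checkParams(schema, raw) {
  for (const key of schema.required) {
    if (!raw[key]) return { statusCode: 400, error: `missing param: ${key}` };
  }
  const params = {};
  for (const [key, prop] of Object.entries(schema.properties)) {
    if (!Object.hasOwn(raw, key)) continue;
    if (prop.type === 'number') {
      const parsedNumber = Number(raw[key]);
      if (!Number.isFinite(parsedNumber)) {
        return { statusCode: 400, error: `param ${key} must be a number` };
      }
      params[key] = parsedNumber;
    } else {
      params[key] = raw[key];
    }
  }
  return { statusCode: 200, params };
}

const route = '/users/:userId/posts/:slug';
console.log(checkParams(paramsJsonSchema, extractParams(route, '/users/1/posts/hello-world')));
console.log(checkParams(paramsJsonSchema, extractParams(route, '/users/abc/posts/hello-world')));
```

The first call yields the params object shown above, while the second is rejected with a 400 before any handler logic would run.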
To highlight again what we get here: we can provide fine-grained schema definitions for all the inputs and outputs of our API. Aside from serving as documentation and specification, the schema powers validation, parsing, and sanitizing of values. I’ve found this to be a very simple and powerful tool in my toolbox.
Summary
In this post, we've explored the power and functionality of JSON Schema, a tool that often doesn't get the spotlight it deserves. We've seen how it provides a robust structure for JSON data, ensuring consistency, validity, and interoperability on a large scale. Through our user schema example, we've delved into key features like types, formats, and the ability to structure schemas using references, keeping our code DRY. We've also discussed the substantial benefits of using JSON Schema, such as validation, enhanced security, improved performance, and the potential for type-safety. We've touched on useful libraries like Ajv for validation and fast-json-stringify for performance optimization.
In a future post we will explore how we can utilize JSON Schema to achieve end-to-end type-safety in our applications.