The Dangers of ORMs and How to Avoid Them

Background

We recently engaged with a client reporting performance issues with their product, a legacy ASP.NET application using Entity Framework to connect to a Microsoft SQL Server database. As the title of this post suggests, misuse of this Object-Relational Mapping (ORM) tool led to these performance issues. Since our blog primarily focuses on JavaScript/TypeScript content, I will use TypeORM to share examples that illustrate the issues we explored and their fixes, as they are common to most ORM tools.

Our Example Data

If you're unfamiliar with TypeORM I recommend reading their docs or our other posts on the subject. To give us a variety of data situations, we will leverage a classic e-commerce example using products, product variants, customers, orders, and order line items. Our data will be related as follows:

[Diagram: entity relationships between products, product variants, customers, orders, and order line items]

The relationship between customers and products is one we care about, but it is optional as it could be derived from customer->order->order line item->product. The TypeORM entity code would look as follows:

import { Entity, PrimaryGeneratedColumn, Column, OneToMany, ManyToOne, JoinColumn } from 'typeorm';

@Entity()
export class Product {
  @PrimaryGeneratedColumn()
  id: number;

  @Column()
  name: string;

  @Column({ type: 'text' }) // Use 'text' type for potentially long descriptions 
  description: string;

  @OneToMany(() => ProductVariant, (variant) => variant.product)
  variants: ProductVariant[];

  @OneToMany(() => OrderLineItem, (item) => item.product)
  lineItems: OrderLineItem[];
}

@Entity()
export class ProductVariant {
  @PrimaryGeneratedColumn()
  id: number;

  @Column()
  name: string;

  @Column()
  price: number;

  @ManyToOne(() => Product, (product) => product.variants)
  @JoinColumn() // Specifies the foreign key column
  product: Product;
}

@Entity()
export class Customer {
  @PrimaryGeneratedColumn()
  id: number;

  @Column()
  name: string;

  @OneToMany(() => Order, (order) => order.customer)
  orders: Order[];
}

@Entity()
export class Order {
  @PrimaryGeneratedColumn()
  id: number;

  @Column()
  orderDate: Date;

  @Column()
  amount: number;

  @ManyToOne(() => Customer, (customer) => customer.orders)
  @JoinColumn()
  customer: Customer;

  @OneToMany(() => OrderLineItem, (item) => item.order)
  items: OrderLineItem[];
}

@Entity()
export class OrderLineItem {
  @PrimaryGeneratedColumn()
  id: number;

  @Column()
  quantity: number;

  @Column()
  productId: number;

  @ManyToOne(() => Product, (product) => product.lineItems)
  @JoinColumn() // Foreign key column: productId
  product: Product;

  @Column()
  orderId: number;

  @ManyToOne(() => Order, (order) => order.items)
  @JoinColumn() // Foreign key column: orderId
  order: Order;
}

With this example set, let's explore some common misuse patterns and how to fix them.

N + 1 Queries

One of ORMs' superpowers is quick access to associated records. For example, we want to get a product and its variants. We have 2 options for writing this operation. The first is to join the table at query time, such as:

import { getRepository } from 'typeorm';

async function getProductWithVariants(productId: number) {
  const productRepository = getRepository(Product);
  const product = await productRepository.findOne({ 
    where: { id: productId },
    relations: ['variants']  // Eagerly load variants via join
  });

  if (product) {
    return product; // The product object now includes its variants
  } else {
    throw new Error('Product not found');
  }
}

Which resolves to a SQL statement like (mileage varies with the ORM you use):

SELECT *
FROM Product p
JOIN ProductVariant pv ON pv.productId = p.id
WHERE p.id = :productId;

The other option is to query the product variants separately, such as:

import { getRepository } from 'typeorm';

async function getProductWithVariants(productId: number) {
  const productRepository = getRepository(Product);
  const productVariantRepository = getRepository(ProductVariant);

  const product = await productRepository.findOne({ where: { id: productId } });

  if (product) {
    const variants = await productVariantRepository.find({
      where: { product: { id: productId } } 
    });

    return { ...product, variants };
  } else {
    throw new Error('Product not found');
  }
}

This example executes 2 queries. Suppose the join option performs 1 round trip to the database at ~200ms, whereas the two-query option performs 2 round trips at ~100ms each. Depending on the location of your database relative to your server, the round-trip time will influence which option to use. In this example, the decision to join or not is relatively unimportant, and you can add caching to make this even faster. But let's imagine a different query we want to run. Say we want to fetch a set of orders for a customer, all their order line items, and the products associated with them. With our ORM, we may be tempted to write the following:

import { getRepository } from 'typeorm';

async function fetchCustomerOrdersWithDetails(customerId: number) {
  const orderRepository = getRepository(Order);
  const orderLineItemRepository = getRepository(OrderLineItem);
  const productRepository = getRepository(Product);

  const orders = await orderRepository.find({ 
    where: { customer: { id: customerId } } 
  });

  for (const order of orders) {
    // 1 query per order: N additional queries
    order.items = await orderLineItemRepository.find({
      where: { order: { id: order.id } }
    });

    for (const item of order.items) {
      // 1 query per line item: M additional queries
      item.product = await productRepository.findOne({
        where: { id: item.productId }
      });
    }
  }

  return orders; // All orders with items and products populated
}

When written out like this and with our new knowledge of query times, we can see that we'll have the following performance: 100ms * 1 query for orders + 100ms * order items count + 100ms * order items' products count. In the best case, this only takes 100ms, but in the worst case, we're talking about seconds to process all the data. That's an O(1 + N + M) operation! We can eagerly fetch with joins or collection queries based on our entity keys, and our query performance becomes closer to 100ms for orders + 100ms for order line items join + 100ms for product join. In the worst case, we're looking at 300ms or O(1)!

Normally, N+1 queries like this aren't so obvious as they're split into multiple helper functions. In other languages' ORMs, some accessors look like property lookups, e.g. order.orderItems. This can be achieved with TypeORM using their lazy loading feature (more below), but we don't recommend this behavior by default. Also, you need to be wary of whether your ORM can be utilized through a template/view that may be looping over entities and fetching related records. In general, if you see a record being looped over and are experiencing slowness, you should verify if you've prefetched the data to avoid N+1 operations, as they can be a major bottleneck for performance. Our above example can be optimized by doing the following:

async function fetchCustomerOrdersWithDetails(customerId: number) {
  const orderRepository = getRepository(Order);
  const orderLineItemRepository = getRepository(OrderLineItem);
  const productRepository = getRepository(Product);

  const orders = await orderRepository.find({ 
    where: { customer: { id: customerId } } 
  });
  const orderItems = await orderLineItemRepository.find({
    where: orders.map(order => ({ order: { id: order.id } }))
  });
  const products = await productRepository.find({
    where: orderItems.map(orderItem => ({ id: orderItem.productId }))
  });

  const orderItemsByOrder = groupBy(orderItems, 'orderId');
  const productsById = indexBy(products, 'id');

  for (const order of orders) {
    order.items = orderItemsByOrder[order.id];

    for (const item of order.items) {
      item.product = productsById[item.productId];
    }
  }

  return orders;
}

function indexBy(arr, key) {
  return arr.reduce((acc, item) => {
    acc[item[key]] = item;
    return acc;
  }, {});
}

function groupBy(arr, key) {
  return arr.reduce((acc, item) => {
    if (!acc[item[key]]) {
      acc[item[key]] = [];
    }
    acc[item[key]].push(item);
    return acc;
  }, {});
}

Here, we prefetch all the needed order items and products and then index them in a hash table/dictionary to look them up during our loops. This keeps our code with the same readability but improves the performance, as in-memory lookups are constant time and nearly instantaneous. For performance comparison, we'd need to compare it to doing the full operation in the database, but this removes the egregious N+1 operations.

Eager v Lazy Loading

Another ORM feature to be aware of is eager loading and lazy loading. This implementation varies greatly in different languages and ORMs, so please reference your tool's documentation to confirm its behavior. TypeORM's eager v lazy loading works as follows:

  • If eager loading is enabled for a field when you fetch a record, the relationship will automatically preload in memory.
  • If lazy loading is enabled, the relationship is not available by default, and you need to request it via a key accessor that executes a promise.
  • If neither is enabled, the relationship is simply not loaded unless you explicitly request it, as described above when we handled N+1 queries.

Often you don't want these relationships preloaded because you won't use the data, so eager loading should only be enabled for a set of relations that are always loaded together. In our example, products will likely always need their product variants loaded, so this is a safe eager load; but eagerly loading order line items on orders wouldn't always be used and can be expensive.
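As a sketch of how this looks in TypeORM (reusing the Product entity from above), eager loading is enabled with the eager relation option, while lazy loading is declared by typing the relation as a Promise:

```typescript
import { Entity, PrimaryGeneratedColumn, Column, OneToMany } from 'typeorm';

@Entity()
export class Product {
  @PrimaryGeneratedColumn()
  id: number;

  @Column()
  name: string;

  // Eager: variants load automatically whenever a product is fetched
  @OneToMany(() => ProductVariant, (variant) => variant.product, { eager: true })
  variants: ProductVariant[];

  // Lazy: typing the relation as a Promise tells TypeORM to load it on access
  @OneToMany(() => OrderLineItem, (item) => item.product)
  lineItems: Promise<OrderLineItem[]>;
}
```

With the lazy version, await product.lineItems triggers a separate query the first time it's accessed, which is exactly how hidden N+1 patterns sneak into loops.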

You also need to be aware of the nuances of your ORM. In our client's original problem, we had an operation that looked like product.productVariants.insert({ … }). If you read more on Entity Framework's handling of eager v lazy loading, you'll learn that in this example, the product variants for the product are loaded into memory first, and only then does the insert into the database happen. Loading the product variants into memory is unnecessary and gets especially expensive for a product with hundreds (if not thousands) of variants. This was the biggest offender in our client's code base, so flipping the query to include the ID in the insert operation and bypassing the relationship saved us seconds in performance.
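In TypeORM terms, that fix might be sketched as follows (the function name is illustrative): insert the variant with just the product's foreign key instead of going through the loaded relation.

```typescript
import { getRepository } from 'typeorm';

// Hypothetical sketch: add a variant without touching product.variants.
async function addVariant(productId: number, name: string, price: number) {
  const variantRepository = getRepository(ProductVariant);

  // Nothing about the product (or its existing variants) is loaded into
  // memory; the insert references the product by id alone.
  await variantRepository.insert({
    name,
    price,
    product: { id: productId } as Product,
  });
}
```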

Database Field Performance Issues

Another issue in the project was loading records with certain data types in fields, specifically the text type. The text type can be used to store arbitrarily long strings, like blog post content or JSON blobs in the absence of a JSON type. Most databases store large text values off-row, which requires extra lookups to fetch that data. This can turn a typical lookup that would take ~100ms under normal conditions into one that takes ~200ms. If you combine this problem with some of the N+1 and eager loading problems we've mentioned, this can lead to seconds, if not minutes, of query slowdown. For these columns, you should consider not including them by default as part of your ORM queries.

TypeORM allows for this via their hidden columns feature. In this example, for our product description, we could change the definition to be:

@Entity()
export class Product {

  // … omitted for brevity

  @Column({ type: 'text', select: false })
  description: string;

  // … omitted for brevity
}

This would allow us to query products without descriptions quickly. If we needed to include the product description in an operation, we'd have to use the addSelect method on a query builder to include that data in our result like:

const product = await productRepository
  .createQueryBuilder('product')
  .addSelect('product.description')
  .where('product.id = :productId', { productId })
  .getOne();

This is an optimization you should be wary of making in existing systems but one worth considering to improve performance, especially for data reports. Alternatively, you could optimize a query using the select method to limit the fields returned to those you need.
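For example, a sketch of using select to fetch only the columns a listing page needs (assuming the entities and repositories from earlier):

```typescript
// Only id and name come back; heavy columns like description are skipped.
const productNames = await getRepository(Product)
  .createQueryBuilder('product')
  .select(['product.id', 'product.name'])
  .getMany();
```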

Database v. In-Memory Fetching

Going back to one of our earlier examples, we wrote the following:

const orderItems = await orderItemRepository.find({
  where: orders.map(order => ({ order: { id: order.id } }))
});
const orderItemsByOrder = groupBy(orderItems, 'orderId');

This involves loading our data into memory and then using system memory to perform a group-by operation. Our database could also have returned this result grouped. We performed the operation this way because it made reusing the loaded order items easier. It takes some work away from our database and puts the effort on our servers. Depending on your database and other system constraints, this is a trade-off, so do some benchmarking to confirm your best option here.
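If we only needed aggregates rather than the full line items, a sketch of pushing the grouping into the database with TypeORM's query builder might look like this (assuming the repositories and the orders array from earlier):

```typescript
// The database groups and counts; only one small row per order comes back.
const itemCounts = await getRepository(OrderLineItem)
  .createQueryBuilder('li')
  .select('li.orderId', 'orderId')
  .addSelect('COUNT(*)', 'itemCount')
  .where('li.orderId IN (:...orderIds)', { orderIds: orders.map((o) => o.id) })
  .groupBy('li.orderId')
  .getRawMany();
```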

This example is not too bad, but let's say we wanted to get all the orders for a customer with an item that cost more than $20. Your first inclination might be to use your ORM to fetch all the orders and their items and then filter that operation in memory using JavaScript's filter method. In this case, you're loading data you don't need into memory. This is a time to leverage the database more. We could write this query as follows in SQL:

SELECT o.*
FROM "Order" o
JOIN OrderLineItem li ON li.orderId = o.id
WHERE o.customerId = :customerId
AND li.amount > 20;

This just loads the orders that had the data that matched our condition. We could write this as:

const orderRepository = getRepository(Order); 
const orders = await orderRepository 
  .createQueryBuilder('o') // Alias for Order 
  .innerJoinAndSelect('o.items', 'li') // Join with OrderLineItem, alias 'li' 
  .where('o.customerId = :customerId', { customerId }) // Filter by customer ID 
  .andWhere('li.amount > 20') // Filter by item cost 
  .getMany();

We constrained this to a single customer, but for a set of customers, this could be significantly faster than loading all the data into memory. If your server has memory limitations, this is a good concern to be aware of when optimizing for performance. We noticed a few instances in our client's implementation where filter operations were applied in functions that appeared to run in the database but actually ran in memory, preloading more data than needed on a memory-constrained server. Refer to your ORM's manual to avoid this type of performance hit.

Lack of Indexes

The final issue we encountered was a lack of indexes on key lookups. Some ORMs do not support defining indexes in code, so indexes get applied to databases manually. These tend not to be documented, so environments can drift out of sync. To avoid this challenge, we prefer ORMs that support declaring indexes in code, like TypeORM. In our last example, we filtered on the cost of an order line item, but that field has no index. This leads to a full table scan of the rows matching the customer, and the query can get very expensive if a customer has thousands of orders. Adding the index can make our query fast, but it comes at a cost: each new index makes writes to the database slower and increases storage requirements. You should only add indexes to fields that you query against regularly. Again, be sure you can notate these indexes in code so they can be replicated across environments easily.
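A sketch of declaring that index in TypeORM code (assuming an amount column on OrderLineItem, as used in the query above):

```typescript
import { Column, Entity, Index, PrimaryGeneratedColumn } from 'typeorm';

@Entity()
export class OrderLineItem {
  @PrimaryGeneratedColumn()
  id: number;

  @Column()
  orderId: number;

  @Index() // Indexed because we filter on it regularly
  @Column()
  amount: number;
}
```

Because the index lives next to the entity definition, every environment that runs your migrations or schema synchronization gets the same index.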

In our client's system, the previous developer did not include indexes in the code, so we retroactively added the database indexes to the codebase. We recommend using your database's recommended tool for inspection to determine what indexes are in place and keep these systems in sync at all times.
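For example, in SQL Server (our client's database), you can list the indexes that actually exist with a query along these lines:

```sql
SELECT t.name AS tableName, i.name AS indexName, i.type_desc AS indexType
FROM sys.indexes AS i
JOIN sys.tables AS t ON i.object_id = t.object_id
WHERE i.name IS NOT NULL
ORDER BY t.name, i.name;
```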

Conclusion

ORMs can be an amazing tool to help you and your teams build applications quickly. However, they have performance gotchas that you should be able to identify while developing or during code reviews. These are some of the most common pitfalls I could think of, along with best practices for avoiding them. When you hear the horror stories about ORMs, these are some of the challenges typically discussed. I'm sure there are more, though. What are some that you know? Let us know!

This Dot is a consultancy dedicated to guiding companies through their modernization and digital transformation journeys. Specializing in replatforming, modernizing, and launching new initiatives, we stand out by taking true ownership of your engineering projects.

We love helping teams with projects that have missed their deadlines or helping keep your strategic digital initiatives on course. Check out our case studies and our clients that trust us with their engineering.