What is a Monorepo and What Are the Advantages for Using It in Your Project?

Published Sep 23, 2022

Updated Feb 13, 2023

6 min read

Monorepos are very popular in the tech industry, and many large companies like Microsoft and Google use them. But what exactly is a monorepo, and would it work for your project?

In this article, we are going to explore what the monorepo architecture is, and learn about some of the commonly used tools for monorepos.

What is a monorepo?
A look into the starter.dev monorepo
Conclusion

What is a monorepo?

A monorepo is a single version-controlled repository that contains multiple independent projects. Some monorepos might contain just 2 or 3 projects, while others contain hundreds of projects using different technologies. This differs from the polyrepo structure, where the projects are spread out among multiple repositories. It is also important to note that a monorepo is not the same as a monolith. A monolith has multiple sub-projects inside one giant project, while a monorepo has multiple independent projects inside one repository.

Unified build process and CI/CD process

In a monorepo structure, all of the CI/CD (continuous integration/continuous delivery) configuration lives in the same repository. This gives teams more flexibility: you can deploy your services together, or you can choose to deploy each project separately.
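
To make the "deploy separately" option concrete, here is a minimal sketch of how a pipeline could work out which projects a change touches. The affectedProjects helper and the sample file paths are hypothetical and only for illustration; real CI tools ship their own change detection.

```typescript
// Hypothetical helper (not part of any CI tool): maps changed file paths to
// the top-level project folders that contain them.
function affectedProjects(changedFiles: string[], projectRoots: string[]): string[] {
  const affected = new Set<string>();
  for (const file of changedFiles) {
    for (const root of projectRoots) {
      // A file belongs to a project when its path starts with "<root>/".
      if (file.startsWith(root + "/")) {
        affected.add(root);
      }
    }
  }
  // Sort so the result order does not depend on the order of the input.
  return Array.from(affected).sort();
}

// Sample data: the file paths are made up for this example.
const sampleRoots = ["next-react-query-tailwind", "angular-apollo-tailwind"];
const sampleChanges = ["next-react-query-tailwind/pages/index.tsx", "README.md"];

// Only the first project needs a rebuild; README.md sits outside every project.
console.log(affectedProjects(sampleChanges, sampleRoots));
```

A pipeline could use a list like this to rebuild and deploy only the projects that changed, while a change to shared files could still trigger a full deployment.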

Increased cross team communication

In a polyrepo structure, you will have multiple teams working on different projects spread across many repositories. Oftentimes, these teams will not be aware of the status of repositories outside of their main one. With a monorepo structure, teams are more aware of the changes being made to the projects in the repo and might spot issues in other projects.

When all of your projects are under a single repository, it makes it easier to establish a consistent code style and set of guidelines across all projects. Some of the guidelines can include naming conventions, best practices for code reviews, branching policies, etc.

In situations where there are breaking changes in the main branch, a monorepo can help identify where the breaking change came from and that team would be responsible for fixing it. A monorepo can also help teams discuss different versioning strategies for the projects.

Steeper learning curve and issues of inclusivity for newer developers

Inclusivity of all developers on a team is really important for the success and outcome of a software project. Having many diverse perspectives and experience levels leads to a stronger finished product. But a monorepo structure can be intimidating to novice developers. It might be their first time seeing a list of independent projects within the same repository, and they might not know how best to contribute in a meaningful way, especially in an open source project. It can also be difficult for newer developers to understand the git history under a monorepo structure. If you are using a tool like git blame, it will have to sift through a lot of unrelated commits just to generate that information for you. For these reasons, it is important for the team to help newer developers through the monorepo structure and create issues where all levels can meaningfully contribute.

A look into the starter.dev monorepo

Let's take a look at a couple of This Dot Labs open source monorepos and understand why the decision was made to use a monorepo.

Please note: We will not be exploring all of the possible tools used for monorepos. If you are interested in exploring more monorepo tools, then please check out this curated list.

What is starter.dev?

starter.dev makes it easy to set up a new project with zero configuration involved. Once you install the starter.dev package (npm i @this-dot/create-starter), you can run the npx @this-dot/create-starter command, and it will provide you with a list of starter kits to choose from. Each starter kit includes testing, linting, code examples, and a basic configuration setup.

starter.dev is an open source project that consists of a documentation website repository and a GitHub showcases repository dedicated to all of the different starter kits.

Why did the team choose a monorepo structure?

When it came time to plan out the project, the team had to decide whether to maintain each kit in isolation or to create a folder for each kit under the same repository. The team chose a monorepo structure because it makes the CLI (command-line interface) easier to build while still allowing the kits to be built in isolation.

What are Yarn Workspaces?

The documentation website repository uses Yarn Workspaces, which allow you to set up multiple packages and install all of their dependencies at once with the yarn install command.

In the root directory, we have starters and packages directories. If we look at the starters directory, we will see all of the starter kits. Each starter kit has its own package.json, basic configuration, README, and code examples.

Inside the root package.json file, you will notice the "workspaces": ["packages/*"] key.

{
  "name": "starter.dev",
  "version": "0.1.0",
  "private": true,
  "scripts": {
    "lint": "yarn workspace website lint && yarn workspace @this-dot/create-starter lint",
    "build:cli": "yarn workspace @this-dot/create-starter build",
    "build:website": "yarn workspace website build",
    "start:cli": "yarn workspace @this-dot/create-starter start",
    "start:website": "yarn workspace website preview",
    "dev:website": "yarn workspace website dev",
    "watch:cli": "yarn workspace @this-dot/create-starter watch"
  },
  "workspaces": [
    "packages/*"
  ]
}

That tells Yarn to treat every package inside the packages folder as a workspace, so a single yarn install from the root installs their dependencies and links the packages together.
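
To see what a pattern like "packages/*" selects, here is a simplified sketch. The matchesWorkspace and workspaceDirs helpers are hypothetical, and the folder names are assumptions made for this example. Yarn's real pattern handling supports more than this.

```typescript
// Simplified, hypothetical matcher: it only understands patterns ending in
// "/*" (such as "packages/*") and exact paths. It is not Yarn's implementation.
function matchesWorkspace(pattern: string, dir: string): boolean {
  if (pattern.endsWith("/*")) {
    const prefix = pattern.slice(0, -1); // "packages/*" -> "packages/"
    if (!dir.startsWith(prefix)) {
      return false;
    }
    const rest = dir.slice(prefix.length);
    // Match direct children only: "packages/cli" yes, "packages/cli/src" no.
    return rest.length > 0 && !rest.includes("/");
  }
  return pattern === dir;
}

// Returns the directories that at least one workspace pattern selects.
function workspaceDirs(patterns: string[], dirs: string[]): string[] {
  return dirs.filter((dir) => patterns.some((p) => matchesWorkspace(p, dir)));
}

// The folder names below are assumptions made for this example.
const exampleDirs = ["packages/cli", "packages/website", "starters/some-kit"];

// Selects the two folders under packages/ and skips the one under starters/.
console.log(workspaceDirs(["packages/*"], exampleDirs));
```

With only "packages/*" listed, folders under starters are not workspaces, which fits the goal of letting each starter kit be built in isolation.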

Unified CI/CD pipeline with Amplify

The starter.dev GitHub showcases repository has a unified CI/CD pipeline and uses Amplify. Inside the root directory, there is an Amplify YAML file in which each application has its own set of build commands, so each project in the monorepo has its own deployment process.

version: 1
applications:
  - appRoot: next-react-query-tailwind
    frontend:
      phases:
        preBuild:
          commands:
            - nvm install --lts=gallium
            - yarn install
        build:
          commands:
            - yarn run build
      artifacts:
        baseDirectory: .next
        files:
          - "**/*"
      cache:
        paths:
          - node_modules/**/*

  - appRoot: angular-apollo-tailwind
    frontend:
      phases:
        preBuild:
          commands:
            - nvm install --lts=gallium
            - yarn install
            - yarn generate
        build:
          commands:
            - node -v
            - yarn run build
      artifacts:
        baseDirectory: dist/starter-dev-angular
        files:
          - "**/*"
      cache:
        paths:
          - node_modules/**/*
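
One way to read this file is as a list of applications, each with its own ordered build steps. The sketch below models only the fields visible in the YAML above. The type names and the commandsFor helper are assumptions made for illustration, not an official Amplify schema or API.

```typescript
// These types mirror only the fields shown in the YAML above. They are an
// illustration, not an official schema.
interface AppConfig {
  appRoot: string;
  frontend: {
    phases: {
      preBuild: { commands: string[] };
      build: { commands: string[] };
    };
    artifacts: { baseDirectory: string; files: string[] };
    cache: { paths: string[] };
  };
}

// Hypothetical helper: lists the commands one application runs, in phase order.
function commandsFor(apps: AppConfig[], appRoot: string): string[] {
  const app = apps.find((a) => a.appRoot === appRoot);
  if (!app) {
    throw new Error(`No application configured for appRoot "${appRoot}"`);
  }
  const { preBuild, build } = app.frontend.phases;
  return [...preBuild.commands, ...build.commands];
}

// The second application from the file above, transcribed as data.
const exampleApps: AppConfig[] = [
  {
    appRoot: "angular-apollo-tailwind",
    frontend: {
      phases: {
        preBuild: { commands: ["nvm install --lts=gallium", "yarn install", "yarn generate"] },
        build: { commands: ["node -v", "yarn run build"] },
      },
      artifacts: { baseDirectory: "dist/starter-dev-angular", files: ["**/*"] },
      cache: { paths: ["node_modules/**/*"] },
    },
  },
];

// Prints the five commands for this app: preBuild steps first, then build steps.
console.log(commandsFor(exampleApps, "angular-apollo-tailwind"));
```

Because every application carries its own appRoot, phases, and artifacts, one kit's build can change without affecting the others.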

How does the team manage issues?

If you look at the issues tab, you will notice that all of the developers are following this same naming convention for issues: [starter kit] title for issue.

This works well for a variety of reasons:

Any developer interested in contributing to the project can see at a quick glance which issues fit their skill and interest level.
It will be easier to track the status of the issues on a project board and see which areas need more attention.
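
A consistent prefix also makes issues easy to process with a script. Here is a minimal sketch that splits a title of the form [starter kit] title for issue and groups titles by kit. The helper names and the sample titles are hypothetical.

```typescript
interface ParsedIssue {
  kit: string;
  title: string;
}

// Parses "[starter kit] title for issue". Returns null when the prefix is missing.
function parseIssueTitle(raw: string): ParsedIssue | null {
  const match = /^\[([^\]]+)\]\s*(.+)$/.exec(raw.trim());
  if (!match) {
    return null;
  }
  return { kit: match[1].trim(), title: match[2].trim() };
}

// Groups issue titles by starter kit. Titles without a prefix go under "uncategorized".
function groupByKit(titles: string[]): Record<string, string[]> {
  const groups: Record<string, string[]> = {};
  for (const raw of titles) {
    const parsed = parseIssueTitle(raw);
    const kit = parsed ? parsed.kit : "uncategorized";
    const title = parsed ? parsed.title : raw;
    if (!groups[kit]) {
      groups[kit] = [];
    }
    groups[kit].push(title);
  }
  return groups;
}

// Sample titles invented for this example.
const exampleTitles = [
  "[next-react-query-tailwind] Add loading state",
  "[angular-apollo-tailwind] Update README",
  "Fix typo in docs",
];
console.log(groupByKit(exampleTitles));
```

A grouping like this could feed a project board column for each kit, which ties back to the second point in the list above.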

Conclusion

We have explored the monorepo structure and talked about a few of its features. We have also compared it to other architectures like monoliths and polyrepos to see how it differs.

Then we took an in-depth look at starter.dev and how it uses Yarn Workspaces to help organize the project.

I hope you enjoyed this article, and best of luck on your programming journey.

About the author(s)

Jessica Wilkins
Jessica Wilkins is a classical musician turned Software Engineer. Prior to joining the tech industry, she spent her time running her own sheet music company (JDW Sheet Music) as well as performing and teaching in Los Angeles, CA. She enjoys working with React and TypeScript. She is also a prolific technical writer for freeCodeCamp.
@codergirl1991 @jdwilkin4

The Dangers of ORMs and How to Avoid Them

Background We recently engaged with a client reporting performance issues with their product. It was a legacy ASP .NET application using Entity Framework to connect to a Microsoft SQL Server database. As the title of this post and this information suggest, the misuse of this Object-Relational Mapping (ORM) tool led to these performance issues. Since our blog primarily focuses on JavaScript/TypeScript content, I will leverage TypeORM to share examples that communicate the issues we explored and their fixes, as they are common in most ORM tools. Our Example Data If you're unfamiliar with TypeORM I recommend reading their docs or our other posts on the subject. To give us a variety of data situations, we will leverage a classic e-commerce example using products, product variants, customers, orders, and order line items. Our data will be related as follows: The relationship between customers and products is one we care about, but it is optional as it could be derived from customer->order->order line item->product. The TypeORM entity code would look as follows: ` With this example set, let's explore some common misuse patterns and how to fix them. N + 1 Queries One of ORMs' superpowers is quick access to associated records. For example, we want to get a product and its variants. We have 2 options for writing this operation. The first is to join the table at query time, such as: ` Which resolves to a SQL statement like (mileage varies with the ORM you use): ` The other option is to query the product variants separately, such as: ` This example executes 2 queries. The join operation performs 1 round trip to the database at 200ms, whereas the two operations option performs 2 round trips at 100ms. Depending on the location of your database to your server, the round trip time will impact your decision here on which to use. In this example, the decision to join or not is relatively unimportant and you can implement caching to make this even faster. 
But let's imagine a different situation/query that we want to run. Let's say we want to fetch a set of orders for a customer and all their order items and the products associated with them. With our ORM, we may want to write the following: ` When written out like this and with our new knowledge of query times, we can see that we'll have the following performance: 100ms * 1 query for orders + 100ms * order items count + 100ms * order items' products count. In the best case, this only takes 100ms, but in the worst case, we're talking about seconds to process all the data. That's an O(1 + N + M) operation! We can eagerly fetch with joins or collection queries based on our entity keys, and our query performance becomes closer to 100ms for orders + 100ms for order line items join + 100ms for product join. In the worst case, we're looking at 300ms or O(1)! Normally, N+1 queries like this aren't so obvious as they're split into multiple helper functions. In other languages' ORMs, some accessors look like property lookups, e.g. order.orderItems. This can be achieved with TypeORM using their lazy loading feature (more below), but we don't recommend this behavior by default. Also, you need to be wary of whether your ORM can be utilized through a template/view that may be looping over entities and fetching related records. In general, if you see a record being looped over and are experiencing slowness, you should verify if you've prefetched the data to avoid N+1 operations, as they can be a major bottleneck for performance. Our above example can be optimized by doing the following: ` Here, we prefetch all the needed order items and products and then index them in a hash table/dictionary to look them up during our loops. This keeps our code with the same readability but improves the performance, as in-memory lookups are constant time and nearly instantaneous. 
For performance comparison, we'd need to compare it to doing the full operation in the database, but this removes the egregious N+1 operations. Eager v Lazy Loading Another ORM feature to be aware of is eager loading and lazy loading. This implementation varies greatly in different languages and ORMs, so please reference your tool's documentation to confirm its behavior. TypeORM's eager v lazy loading works as follows: - If eager loading is enabled for a field when you fetch a record, the relationship will automatically preload in memory. - If lazy loading is enabled, the relationship is not available by default, and you need to request it via a key accessor that executes a promise. - If neither is enabled, it defaults to the behavior ascribed above when we explained handling N+1 queries. Sometimes, you don't want these relationships to be preloaded as you don't use the data. This behavior should not be used unless you have a set of relations that are always loaded together. In our example, products likely will always need product variants loaded, so this is a safe eager load, but eager loading order items on orders wouldn't always be used and can be expensive. You also need to be aware of the nuances of your ORM. With our original problem, we had an operation that looked like product.productVariants.insert({ … }). If you read more on Entity Framework's handling of eager v lazy loading, you'll learn that in this example, the product variants for the product are loaded into memory first and then the insert into the database happens. Loading the product variants into memory is unnecessary. A product with 100s (if not 1000s) of variants can get especially expensive. This was the biggest offender in our client's code base, so flipping the query to include the ID in the insert operation and bypassing the relationship saved us _seconds_ in performance. Database Field Performance Issues Another issue in the project was loading records with certain data types in fields. 
Specifically, the text type. The text type can be used to store arbitrarily long strings like blog post content or JSON blobs in the absence of a JSON type. Many databases store large text fields off-row, which requires an extra lookup operation to fetch that data. This can make a typical database lookup that would take 100ms under normal conditions take 200ms. If you combine this problem with some of the N+1 and eager loading problems we've mentioned, this can lead to seconds, if not minutes, of query slowdown.

For these, you should consider not including the column by default as part of your ORM. TypeORM allows for this via its hidden columns feature. In this example, we could change the definition of our product description column to set select: false. This would allow us to query products without descriptions quickly. If we needed to include the product description in an operation, we'd have to call the addSelect function on our query to include that data in our result. This is an optimization you should be wary of making in existing systems but one worth considering to improve performance, especially for data reports. Alternatively, you could optimize a query using the select method to limit the fields returned to those you need.

Database v. In-Memory Fetching

Going back to one of our earlier examples, we prefetched order items and indexed them by order in application code. This involves loading our data in memory and then using system memory to perform a group-by operation. Our database could have also returned this result grouped. We opted to perform the operation this way because it made fetching the order item IDs easier. This takes some performance challenges away from our database and puts the performance effort on our servers. Depending on your database and other system constraints, this is a trade-off, so do some benchmarking to confirm your best options here.
This example is not too bad, but let's say we wanted to get all the orders for a customer with an item that cost more than $20. Your first inclination might be to use your ORM to fetch all the orders and their items and then filter the result in memory using JavaScript's filter method. In this case, you're loading data you don't need into memory. This is a time to leverage the database more. We could write this as a SQL query that joins orders to their order items and filters on the item cost, which loads only the orders that match our condition. The same query can be expressed through our ORM. We constrained this to a single customer, but if it were for a set of customers, it could be significantly faster than loading all the data into memory. If our server has memory limitations, this is a good concern to be aware of when optimizing for performance. We noticed a few instances in our client's implementation where filter operations were applied in functions that appeared to run in the database but were actually running in memory, so more data was being preloaded than needed on a memory-constrained server. Refer to your ORM manual to avoid this type of performance hit.

Lack of Indexes

The final issue we encountered was a lack of indexes on key lookups. Some ORMs do not support defining indexes in code, so indexes are applied to the database manually. These tend not to be documented, so different environments can drift out of sync. To avoid this challenge, we prefer ORMs that support indexes in code, like TypeORM. In our last example, we filtered on the cost of an order item, but the cost field does not have an index. This leads to a full table scan of our data collection filtered by the customer. The query cost can be very expensive if a customer has thousands of orders. Adding the index can make our query much faster, but it comes at a cost. Each new index makes writing to the database slower and can significantly increase our database storage needs.
You should only add indexes to fields that you are querying against regularly. Again, be sure you can declare these indexes in code so they can be replicated across environments easily. In our client's system, the previous developer did not include indexes in the code, so we retroactively added the database indexes to the codebase. We recommend using your database's recommended inspection tool to determine what indexes are in place and keeping these systems in sync at all times.

Conclusion

ORMs can be an amazing tool to help you and your teams build applications quickly. However, they have performance gotchas that you should be aware of and be able to identify while developing or during code reviews. These are some of the most common examples I could think of. When you hear the horror stories about ORMs, these are some of the challenges typically discussed. I'm sure there are more, though. What are some that you know? Let us know!...

Jul 3, 2024

8 mins

Architectural Design, Design Patterns, Architecture

How to Handle Uploaded Images and Avoid Image Distortion

When you are working with images in your application, you might run into issues where the image's aspect ratio is different from the container's specified width and height. This could lead to images looking stretched and distorted. In this article, we will take a look at how to solve this problem by using the object-fit CSS property.

A Look Into the Issue Using the "Let's Chat With" App

Let's Chat With is an open source application that facilitates networking between attendees for virtual and in-person conferences. When users sign up for the app, they can join a conference and create a new profile with their name, image, and bio. When the team at This Dot Labs was testing the application, they noticed that some of the profile images were coming out distorted. The original uploaded source image did not have an aspect ratio of 1:1. A 1:1 aspect ratio refers to an image's width and height being the same. Since the image was not a square, it was not fitting well within the fixed width and height set on the profile image. In order to fix this problem, the team decided to use the object-fit CSS property.

What is the object-fit CSS property?

The object-fit property is used to determine how an image or video should resize in order to fit inside its container. There are 5 main values you can use with the object-fit property.

- object-fit: contain; - resizes the content to fit inside the container without cropping it
- object-fit: cover; - ensures that the content covers the whole container and will crop if necessary
- object-fit: fill; - fills the container with the content by stretching it and ignoring the aspect ratio. This could lead to image distortion.
- object-fit: none; - does not resize the content, which could lead to the content spilling out of the container
- object-fit: scale-down; - scales larger content down to fit inside the container

When the object-fit: cover; property was applied to the profile image in Let's Chat With, the image was no longer distorted.
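To make the difference between contain and cover concrete, the TypeScript sketch below works through the arithmetic for a non-square image in a square container. It only mirrors the sizing idea behind the two values and is not how the Let's Chat With app or any browser implements it; the `Size` type, the function names, and the sample dimensions are assumptions made for illustration.

```typescript
interface Size {
  width: number;
  height: number;
}

type FitMode = "contain" | "cover";

// Scale factor for the two modes that preserve the aspect ratio.
function fitScale(image: Size, box: Size, mode: FitMode): number {
  const scaleX = box.width / image.width;
  const scaleY = box.height / image.height;
  // contain: the smaller factor keeps the whole image inside the box.
  // cover: the larger factor fills the box, and the overflow gets cropped.
  return mode === "contain"
    ? Math.min(scaleX, scaleY)
    : Math.max(scaleX, scaleY);
}

// Size of the scaled image before any cropping by the container.
function renderedSize(image: Size, box: Size, mode: FitMode): Size {
  const scale = fitScale(image, box, mode);
  return { width: image.width * scale, height: image.height * scale };
}

// Hypothetical 2:1 upload placed in a hypothetical square avatar container.
const photo: Size = { width: 400, height: 200 };
const avatarBox: Size = { width: 100, height: 100 };

const contained = renderedSize(photo, avatarBox, "contain"); // 100 x 50
const covered = renderedSize(photo, avatarBox, "cover"); // 200 x 100
```

In the contain case the image leaves empty space above and below it, while in the cover case it fills the square and 100 pixels of width fall outside the container. Neither mode changes the 2:1 ratio, which is why the image no longer looks stretched.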
When Should You Consider Using the object-fit Property?

There will be times when you will not be able to upload different sized images to fit different containers. You might be in a situation like Let's Chat With, where the user is uploading images to your application. In that case, you will need to apply a solution to ensure that the content appropriately resizes within the container without becoming distorted.

Conclusion

In this article, we learned how to fix distorted uploaded images using the object-fit property. We examined the bug inside the Let's Chat With application and how that bug was solved using object-fit: cover;. We also talked about when you should consider using the object-fit property. If you want to check out the Let's Chat With app, you can sign up here. If you are interested in contributing to the app, you can check out the GitHub repository....

Jul 28, 2023

3 mins

JavaScript

Understanding Sourcemaps: From Development to Production

What Are Sourcemaps?

Modern web development involves transforming your source code before deploying it. We minify JavaScript to reduce file sizes, bundle multiple files together, transpile TypeScript to JavaScript, and convert modern syntax into browser-compatible code. These optimizations are essential for performance, but they create a significant problem: the code running in production does not look like the original code you wrote.

Here's a simple example. Your original code might be a readable function with descriptive variable names and comments. After minification, it becomes a single line of terse code with single-letter identifiers. Now imagine trying to debug an error in that minified code. Which line threw the exception? What was the value of variable d?

This is where sourcemaps come in. A sourcemap is a JSON file that contains a mapping between your transformed code and your original source files. When you open browser DevTools, the browser reads these mappings and reconstructs your original code, allowing you to debug with variable names, comments, and proper formatting intact.

How Sourcemaps Work

When you build your application with tools like Webpack, Vite, or Rollup, they can generate sourcemap files alongside your production bundles. A minified file references its sourcemap using a special //# sourceMappingURL= comment at the end. The sourcemap file itself contains a JSON structure with several key fields, such as version, sources, names, and mappings.

The mappings field uses an encoding format called VLQ (Variable Length Quantity) to map each position in the minified code back to its original location. The browser's DevTools use this information to show you the original code while you're debugging.

Types of Sourcemaps

Build tools support several variations of sourcemaps, each with different trade-offs:

Inline sourcemaps: The entire mapping is embedded directly in your JavaScript file as a base64 encoded data URL. This increases file size significantly but simplifies deployment during development.

External sourcemaps: A separate .map file that's referenced by the JavaScript bundle.
This is the most common approach, as it keeps your production bundles lean since sourcemaps are only downloaded when DevTools is open.

Hidden sourcemaps: External sourcemap files without any reference in the JavaScript bundle. These are useful when you want sourcemaps available for error tracking services like Sentry, but don't want to expose them to end users.

Why Sourcemaps

During development, sourcemaps are absolutely critical. They save you from having to guess where errors occur, making debugging much easier. Most modern build tools enable sourcemaps by default in development mode.

Sourcemaps in Production

Should you ship sourcemaps to production? It depends. While security by making your code more difficult to read is not real security, there's a legitimate argument that exposing your source code makes it easier for attackers to understand your application's internals. Sourcemaps can reveal internal API endpoints and routing logic, business logic and algorithmic implementations, and code comments that might contain developer notes or TODO items. Anyone with basic developer tools can reconstruct your entire codebase when sourcemaps are publicly accessible. While the Apple leak contained no credentials or secrets, it did expose their component architecture and implementation patterns. Additionally, code comments can inadvertently contain internal URLs, developer names, or company-specific information that could potentially be exploited by attackers.

But that’s not the whole picture. Services like Sentry can provide much more actionable error reports when they have access to sourcemaps, so you can understand exactly where errors happened. If a customer reports an issue, being able to see the actual error with proper context makes diagnosis significantly faster. If your security depends on keeping your frontend code secret, you have bigger problems. Any determined attacker can reverse engineer minified JavaScript. It just takes more time.
Sourcemaps are only downloaded when DevTools is open, so shipping them to production doesn't affect load times or performance for end users.

How to manage sourcemaps in production

You don't have to choose between no sourcemaps and publicly accessible ones. For example, you can restrict access to sourcemaps with server configuration, making .map files accessible only from specific IP addresses. Additionally, tools like Sentry allow you to upload sourcemaps during your build process without making them publicly accessible. You then configure your build to generate sourcemaps without the reference comment, or use hidden sourcemaps. Sentry gets the mapping information it needs, but end users can't access the files.

Learning from Apple's Incident

Apple's sourcemap incident is a valuable reminder that even the largest tech companies can make deployment oversights. But it also highlights something important: the presence of sourcemaps wasn't actually a security vulnerability. Developers got an interesting look at how Apple structures its Svelte codebase, but no secrets were exposed. You get the same protection by following good security practices: never include sensitive data in client code. The lesson is that you must be intentional about your deployment configuration. If you're going to include sourcemaps in production, make that decision deliberately after considering the trade-offs. And if you decide against using public sourcemaps, verify that your build process actually removes them. In this case, the public repo was quickly removed after Apple filed a DMCA takedown. (https://github.com/github/dmca/blob/master/2025/11/2025-11-05-apple.md)

Making the Right Choice

So what should you do with sourcemaps in your projects?

For development: Always enable them. Use fast options, such as eval-source-map in Webpack or the default configuration in Vite. The debugging benefits far outweigh any downsides.

For production: Consider your specific situation.
But most importantly, make sure your sourcemaps don't accidentally expose secrets. Review your build output, check for hardcoded credentials, and ensure sensitive configurations stay on the backend where they belong.

Conclusion

Sourcemaps are powerful development tools that bridge the gap between the optimized code your users download and the readable code you write. They're essential for debugging and make error tracking more effective. The question of whether to include them in production doesn't have a single answer. Whatever you decide, make it a deliberate choice. Review your build configuration. Verify that sourcemaps are handled the way you expect. And remember that proper frontend security doesn't come from hiding your code.

Useful Resources

* Source map specification - https://tc39.es/ecma426/
* What are sourcemaps - https://web.dev/articles/source-maps
* VLQ implementation - https://github.com/Rich-Harris/vlq
* Sentry sourcemaps - https://docs.sentry.io/platforms/javascript/sourcemaps/
* Apple DMCA takedown - https://github.com/github/dmca/blob/master/2025/11/2025-11-05-apple.md...

Nov 21, 2025

5 mins

JavaScript

Let's innovate together!

We're ready to be your trusted technical partners in your digital innovation journey.

Whether it's modernization or custom software solutions, our team of experts can guide you through best practices and how to build scalable, performant software that lasts.


Jessica Wilkins

You might also like

Leveling Up Your Work Through Architecture Design and Time Estimation

The Dangers of ORMs and How to Avoid Them

How to Handle Uploaded Images and Avoid Image Distortion

Understanding Sourcemaps: From Development to Production
