

Build IT Better DevOps - Monitoring Roundup

On This Dot's Build IT Better show, I talk to people who make popular tools that help developers make great software. In my most recent series, we looked at application monitoring tools. Marcus Olssen from Grafana and Ben Vinegar from Sentry showed us how the tools they work on can help developers keep their applications running smoothly.

Grafana

Grafana is an open source platform for aggregating and visualizing almost any kind of data from a near limitless number of sources via its rich plugin library, and it anchors a family of open source observability and monitoring tools for collecting and visualizing application metrics. Grafana Labs, the company behind the project, maintains the plugin library as well as educational resources for the Grafana ecosystem, and offers paid products and services for companies looking for help managing their own Grafana tooling.

Flexibility

Grafana is a platform for application monitoring and analytics that offers an enormous amount of flexibility for collecting and analyzing application data. Instead of providing a hyper-focused application monitoring solution, Grafana provides unparalleled flexibility for collecting almost any kind of data. It offers built-in integrations for the most popular SQL and NoSQL databases, as well as the monitoring tools at the heart of its own ecosystem, Prometheus, Loki, and Tempo (and a handful of other popular sources). Community-developed plugins can add support for most other platforms. This flexibility gives Grafana applications well outside the traditional monitoring use cases; some people even use Grafana to track their home energy usage and health data. You can analyze almost any kind of data in Grafana.

*A Grafana dashboard with custom metrics*

Datasource Compatibility

While its flexibility lets Grafana reach across industries to find users and use cases, it still excels at traditional application monitoring. Developers can use Prometheus to pull data out of their own applications. Most popular host operating systems and application development frameworks offer community-developed Prometheus integrations that provide useful system and application data, like resource usage and response time, along with the ability to publish your own custom application data. Loki is a tool for aggregating and querying system and application logs, and Tempo aggregates distributed trace data from tools like Jaeger, OpenTelemetry, and Zipkin. If you use all four tools together, you can visually trace transactions all the way through your application, even as a request moves between different components of your microservice architecture.
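
As a rough sketch of what publishing custom application data for Prometheus can look like in a Node.js service, the snippet below uses the community prom-client package; the metric name, endpoint path, and port are arbitrary examples rather than anything Grafana or Prometheus requires.

```ts
// Minimal sketch, assuming the community "prom-client" package; the metric
// name, help text, endpoint path, and port are all arbitrary examples.
import http from "node:http";
import { Counter, collectDefaultMetrics, register } from "prom-client";

// Standard process/runtime metrics (CPU, memory, event loop lag, etc.).
collectDefaultMetrics();

// A hypothetical business metric published by the application itself.
const ordersProcessed = new Counter({
  name: "orders_processed_total",
  help: "Number of orders processed by this service",
});

// Somewhere in your application code:
ordersProcessed.inc();

// Expose /metrics so a Prometheus scrape job can pull the data.
http
  .createServer(async (req, res) => {
    if (req.url === "/metrics") {
      res.setHeader("Content-Type", register.contentType);
      res.end(await register.metrics());
    } else {
      res.statusCode = 404;
      res.end();
    }
  })
  .listen(9464);
```

A Prometheus scrape job pointed at that endpoint would collect the counter on every scrape, and the resulting series becomes one more data source to chart in Grafana.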

Visualization and Analysis

All of this flexible data collection technology would be useless without Grafana's equally flexible visualization platform. Once you've integrated your data sources, you can use Grafana to explore and visualize the data you've collected, building dashboards that present an array of visualizations of your data. As a DevOps engineer, one of my favorite things about Grafana is its dashboard library, which contains community-developed dashboards for a number of popular (and not so popular) application frameworks, backend tools, and systems. Instead of building your own dashboards from scratch for monitoring Rails apps and PostgreSQL databases, you can simply add and modify community dashboards, saving time and surfacing insights you may not have considered on your own. Finally, we have to mention the Explore tool. It's easy to overlook next to everything that's possible with dashboards, but it lets users view, query, and analyze data streams on the fly without creating permanent dashboards.

*Grafana Nginx Dashboard - available from the dashboard library*

This big-tent collection of features makes Grafana a great platform for observing any amount of any kind of data. That flexibility does come at the cost of needing to know a lot about a number of different tools, like Prometheus and Loki, each of which carries a non-trivial learning curve of its own. And as with any community-developed content, plugins and dashboards from the library don't always work as expected out of the box and will often need to be modified to line up with your DevOps procedures and environments.

Sentry

Sentry, like Grafana, is a tool for monitoring application health and performance. Unlike Grafana, however, Sentry is laser-focused on providing curated experiences, with deep first-party integrations for popular application development tools, plus additional features for tracking user errors and code changes, which it uses as the framing narrative for all of the data the platform surfaces for developers. Integrations are available for most popular frontend JavaScript frameworks (React, Angular, Vue, etc.) and for backend applications in Python, Ruby, Go, and more. Sentry gives developers a huge amount of visibility without the overhead of more complex, DevOps-driven platforms like Grafana.

Developer Focused

Sentry's primary goal is to help you understand what's wrong with every part of your application. It does this by giving you a view into the errors your users are experiencing in real time. Sentry collects data on all of the exceptions thrown by applications that have a Sentry integration. As you investigate individual issues, Sentry provides a curated collection of datapoints to cross-reference with the specific error. Some of this is very traditional data, such as the user's browser agent, OS, geographical location, and the URL they were visiting, but Sentry also connects the error back to the code. Not only can you see the stack trace and the exact lines of code where the error manifested, but Sentry uses its deep integration to provide what it calls "Breadcrumbs": pieces of data about the activity that led up to the error. Depending on what type of application you're troubleshooting, these might be log output, events fired from UI elements, or your own custom breadcrumb events, giving you a better idea of the actions the user took before the error occurred.

*Sentry's Issue (aka Error) View*

*A sample of Sentry's Breadcrumbs*
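
To make the breadcrumbs shown above a little more concrete, here is a minimal sketch using Sentry's browser SDK; the DSN is the placeholder format from Sentry's documentation, and the breadcrumb category and messages are hypothetical.

```ts
// Minimal sketch using Sentry's browser SDK; the DSN is the placeholder
// format from Sentry's docs, and the breadcrumb values are hypothetical.
import * as Sentry from "@sentry/browser";

Sentry.init({
  dsn: "https://examplePublicKey@o0.ingest.sentry.io/0",
});

// Record a custom breadcrumb describing what the user just did.
Sentry.addBreadcrumb({
  category: "checkout",
  message: "User clicked 'Place order'",
  level: "info",
});

try {
  // ...application code that might throw...
  throw new Error("Payment provider timed out");
} catch (err) {
  // The breadcrumb above appears in this event's trail automatically.
  Sentry.captureException(err);
}
```

When the captured exception reaches Sentry, the custom breadcrumb shows up in the event's trail alongside the breadcrumbs the SDK collects on its own.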

Integrations

In addition to helping you identify the root cause of your errors, Sentry aggregates errors to make it easier to understand which ones have the highest impact on your application. You can easily identify errors that are happening frequently or on critical paths. If you've enabled an integration with a source control platform like GitHub, Sentry will even suggest which code commits introduced the problem. Together, these features help you tackle application health like a DevOps expert, without needing to be one.
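
For those commit suggestions to be useful, Sentry needs to know which release each event came from. A minimal sketch of tagging events with a release might look like the following; the release string and environment values are arbitrary examples.

```ts
// Minimal sketch, assuming a Node backend; the release string and
// environment values are arbitrary examples.
import * as Sentry from "@sentry/node";

Sentry.init({
  dsn: process.env.SENTRY_DSN,
  // A release identifier (often package name + version) lets Sentry tie
  // events to the commits that shipped in that release.
  release: "my-api@1.4.2",
  environment: process.env.NODE_ENV,
});
```

With releases reported and the source control integration enabled, Sentry can line errors up against the commits that shipped in each release.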

Application Performance

Debugging and error surfacing aren't the only places where Sentry shines. I'm really excited to talk about Sentry's performance and application tracing platform. Using its deep framework and platform integrations, you can collect a lot of performance data from your applications and correlate it with user behavior. Similar to the debugging experience, Sentry starts you from a broad view of your performance picture, shows you the slowest pages and endpoints of your application, and provides another curated experience for investigating and resolving performance problems. The most interesting aspect of the performance investigation tools are transactions, or traces. When you choose a slow page to investigate, alongside the individual performance metrics for that page you'll find its transactions. Transactions break the performance of your pages into waterfall graphs, like the ones you may already know from browser dev tools. Sentry adds some really nice tricks on top, though, because it's integrated into all the parts of your application. If you analyze a transaction that starts in your JavaScript app and see a fetch request that's taking a long time, and the API it calls is also part of your Sentry-integrated stack, you can click into that fetch request within the Sentry UI, switch context to the API application, and see a waterfall graph of what the API did to handle that request. That lets you traverse your whole application to identify the exact source of performance problems. These transactions also benefit from the same Breadcrumb and code-change data provided in the error analysis tools.
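
On the JavaScript side, turning this on is largely a configuration concern. A minimal sketch is shown below; the sample rate is an arbitrary choice, and depending on the SDK version you may also need to enable a browser tracing integration for page-load and navigation transactions.

```ts
// Minimal sketch; the DSN is a placeholder and the sample rate is an
// arbitrary choice. Depending on the SDK version, a browser tracing
// integration may also need to be enabled.
import * as Sentry from "@sentry/browser";

Sentry.init({
  dsn: "https://examplePublicKey@o0.ingest.sentry.io/0",
  // Send a fraction of transactions so performance data stays affordable.
  tracesSampleRate: 0.2,
});
```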

Conclusions

Sentry and Grafana are both strong tools to add to your DevOps toolbelt. While both provide great features for observing application health and analyzing data, they fill two fairly different niches. Sentry provides curated developer experiences and deep integrations that help developers dive headfirst into error and performance monitoring for their applications without needing to be experts. For experts and data scientists, however, Grafana provides an incredibly powerful and flexible platform for analyzing not only application metrics and health, but really any data you can manage to get into a dashboard. Some organizations may even benefit from using both tools for different use cases.

