3/5/22

Microservices - the good and the not-so-good

A microservice can be considered a self-contained unit of functionality that is usually small to medium in size. Since the beginning of last year, I have been working on a large web application that is powered by microservices. This was my first exposure to microservices, and I have come to realize that they have their own advantages and disadvantages.

The good:

  • A microservice is a well-defined unit of functionality. It does only one thing and does it well.
  • A microservice is usually small in size and is easy to develop, test, and maintain.
  • A microservice is an independent unit of functionality, so any tech stack can be used to develop it.
  • If you decide to use the same tech stack for new microservices, you can copy/paste an existing microservice and reuse the configuration and deployment scripts and maybe even some code.
  • Each microservice can have its own security requirements.
  • Each microservice can be developed by a small dedicated team of engineers.
  • It is easy to make changes and deploy a microservice without fear of impacting other applications or services. Unlike with monolithic applications, a great deal of regression or smoke testing is not required.

The not-so-good:

  • Troubleshooting an error in an application powered by microservices is not straightforward. Each call can potentially traverse several microservices before the required results are returned. It becomes necessary to pass certain request-related data (like a Request ID) from the application to all the services in the chain, so that every error is logged with the same Request ID and the logs can be used to identify and troubleshoot errors.
  • From a development point of view, each microservice is a separate project, and a small dedicated team of engineers usually manages a few microservices, so it becomes hard to get up to speed and become familiar with many different projects. This becomes even more difficult if different microservices use different tech stacks.
  • All interactions with microservices happen via HTTP, which is a totally different paradigm compared to regular application development. Testing and debugging are done via tools like Postman. It takes some getting used to.
  • Since microservices are well-defined units of functionality, they make calls to other microservices for additional information. As a result, there is a lot of dependency between microservices. This can get frustrating, especially during the development stage, since some teams keep changing their interfaces constantly. This can also result in a lot of code rewriting.
  • Since microservices depend on other microservices, the capability of a given microservice is limited by those dependencies. For example, a microservice may be able to process 50 requests per minute (RPM), but if one of its dependencies can only handle 20 RPM, then this microservice is effectively limited to that rate.
  • Since different microservices could potentially be developed and maintained by different teams in different time zones, a great deal of collaboration and understanding is required for completing a project. This can be very frustrating at times and can lead to a lot of friction.
  • You will need to use tools like Splunk (there are several others) for logging and querying information about various microservices. This represents a learning curve, especially if you need to write complex queries for displaying information on dashboards. Splunk, for example, has its own search language (SPL) and often relies on regular expressions to extract fields from log data for meaningful insights. This is again a totally different paradigm compared to typical monolithic applications.
  • You will need to document the functionality provided by each microservice via documentation tools like Swagger or Stoplight (there are others as well). This requires a good deal of work and attention to detail, since your documentation is the source of truth for the service users. Any changes need to be reflected in the documentation promptly as well.
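
The Request ID point above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the project: the header name `X-Request-ID`, the two services (`order_service`, `inventory_service`), and the in-memory `log` list are all hypothetical stand-ins for real HTTP calls and a real logging tool.

```python
import uuid

# Assumed header name; teams pick their own convention.
REQUEST_ID_HEADER = "X-Request-ID"


def inventory_service(headers, log):
    """Hypothetical downstream service that fails and logs the error."""
    request_id = headers[REQUEST_ID_HEADER]
    log.append({"service": "inventory", "request_id": request_id,
                "error": "item not found"})
    return {"status": 404}


def order_service(headers, log):
    """Hypothetical upstream service that forwards the Request ID."""
    # Reuse the caller's ID if present; otherwise start a new one.
    request_id = headers.get(REQUEST_ID_HEADER) or str(uuid.uuid4())
    log.append({"service": "order", "request_id": request_id,
                "event": "calling inventory"})
    # Pass the same ID along so downstream logs can be correlated.
    response = inventory_service({REQUEST_ID_HEADER: request_id}, log)
    if response["status"] != 200:
        log.append({"service": "order", "request_id": request_id,
                    "error": "inventory lookup failed"})
    return response


def entries_for_request(log, request_id):
    """Collect every log entry that belongs to one request."""
    return [entry for entry in log if entry["request_id"] == request_id]
```

Calling `order_service({"X-Request-ID": "req-123"}, log)` and then `entries_for_request(log, "req-123")` returns the entries from both services for that one request, which is roughly what a search by Request ID in a log tool would give you.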

2/22/22

Planning for Error Logging

The purpose of error logging is to log information about errors in a way which makes it easy to diagnose and troubleshoot issues when they arise. While error logging is necessary for all applications, it is indispensable when it comes to large distributed applications. 

In the past, I have written some error logging code, but I had never worked on an extensive error logging requirement, let alone from scratch. Last month, as I started work on error logging for a large batch project, I quickly realized that a lack of planning and insight can lead to logs that are hard to track, analyze and troubleshoot, and could have disastrous consequences for the outcome of the project.

Here are some questions and thoughts to keep in mind while planning and coding for error logging:

  • Questions
    • Why do we want to log errors?
    • What do we want to do with the logs?
    • Are we going to just search and view them in a tool like Splunk?
    • Are we going to build dashboards on top of the logs? If yes, what types of statistics do we want to show on the dashboards?
    • What are the different types of errors that need to be logged to account for the various stats that we are interested in?
    • What information needs to be logged with each error?
    • If we are going to build dashboards, we need to be able to query the logs. So what format makes sense for querying?
  • Thoughts
    • Ensure that the format and structure of the data are concise and meaningful.
    • Ensure that the messages and formats are consistent across the board.
    • Try to use common libraries and classes to centralize the access to logging code for consistency and ease of maintenance.
    • Create specific exception classes for each scenario, so that we can track, analyze and troubleshoot errors faster.
    • Write extensive unit tests with excellent coverage, including tests on specific error messages, because it is very difficult to manually verify every scenario whenever the error logging code changes.
    • If you are dealing with microservices, make sure to log all information required to track errors that span several services. RequestId is an example of one such piece of information that can be used to connect a request across various services and help troubleshoot issues.
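
To make a few of these thoughts concrete, here is a minimal Python sketch, assuming JSON is the chosen log format. None of these names come from the actual project: the exception classes (`BatchError`, `RecordValidationError`, `DownstreamTimeoutError`), the `error_type` values, and the helpers `log_error` and `build_logger` are hypothetical. It shows specific exception classes, a single shared logging helper, and a consistent, queryable format that carries the RequestId.

```python
import io
import json
import logging


class BatchError(Exception):
    """Hypothetical base class; every error carries a request ID."""
    error_type = "BATCH_ERROR"

    def __init__(self, message, request_id):
        super().__init__(message)
        self.request_id = request_id


class RecordValidationError(BatchError):
    error_type = "RECORD_VALIDATION"


class DownstreamTimeoutError(BatchError):
    error_type = "DOWNSTREAM_TIMEOUT"


def log_error(logger, exc):
    """Single entry point so every error is logged in the same shape."""
    payload = {
        "level": "ERROR",
        "error_type": exc.error_type,
        "request_id": exc.request_id,
        "message": str(exc),
    }
    # One JSON object per line keeps the log easy to query.
    logger.error(json.dumps(payload, sort_keys=True))
    return payload


def build_logger(stream):
    """Create a logger that writes bare JSON lines to the given stream."""
    logger = logging.getLogger("batch_errors_sketch")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    logger.setLevel(logging.ERROR)
    logger.propagate = False
    return logger
```

Because each error type has its own class and `error_type` value, a dashboard query can count errors by type, and a unit test can pass an `io.StringIO()` to `build_logger` and assert on the exact message that was written.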