Sunday, April 7, 2024

Introduction to Docker: what is Docker and what problems it solves

Software development is not only about writing code. From my experience I can say that writing the code itself is not that difficult (get some data from a database, show it in the UI, process it, save it back to the DB, call some external services, etc - such tasks are quite common). But that is not all. We also need to ensure that non-functional requirements are met, i.e. that the app:

  • handles errors/exceptions correctly
  • works securely (authentication/authorization/policies, etc.)
  • has good performance
  • scales well under increased load
  • allows you to quickly find the cause of an error/problem in the production environment
  • supports updates (preferably without downtime)
  • is not expensive to maintain and support during its life cycle
  • is fault-tolerant to hardware failures
  • etc.

And when we put all this together, the story becomes not so easy. Let's look at an example. Assume that we have a system which consists of several components running on the same single server:

How can we ensure scalability and fault tolerance of this system? The first thing that comes to mind is to move application components from the single physical server to virtual machines, run them on multiple physical servers and provide orchestration between them (set up a cluster):

If one VM or server goes down, the rest will continue to work. We won't go into details here about how to organize the orchestration of such a cluster (routing of HTTP requests with a network load balancer, replication of database servers, distributed caching, logging, etc.) - it is out of scope of the current article. Here we just need to understand the problem, so let's continue.

As we just saw, with such a cluster the scalability and fault tolerance of the system improved. However, if we look inside the virtual machines, we'll find that the overall situation with components and dependencies didn't change much: different components use different dependencies, and it is possible that one application needs a certain version of some library, while another needs a different version of the same library, i.e. we have to keep several different versions of the same library in one system.

Also, how do we update such a cluster? Upload updates to each VM and update all components one by one? Yes, it is possible to do it that way, but what will happen when the number of components grows and the number of environments where these components need to run also increases? In this case the dependencies of different applications accumulate and we get the so-called "matrix of hell":

The maintenance cost of such a system grows with every new component and environment added.

How can we improve that? If we were able to package a component/service of our distributed system along with all the necessary dependencies, configuration, environment variables, etc. into "something" that would allow us to run this component/service in any environment on any OS (on a development machine, a standalone server, in a cluster, on a production stand), then we could simply transfer this "something" between environments:

Here containers and Docker come onto the scene. When we talk about containers, the first thing we may imagine is a huge barge carrying cargo containers:

In the context of software development this image is a pretty good analogy. As we will see below, a Docker image is a kind of cargo container which holds a component and all its dependencies inside. That's why the Docker logo looks like a whale carrying cargo containers on its back:

The term "container" came from UNIX-based operating systems. Originally term "jail" was used, but "container" has become the preferred term since 2005 with the release of Sun Solaris 10 and Sun Containers. Container is isolated runtime environment for an app that prevents that app from accessing resources outside its container (allowing access only to those resources that are explicitly allowed).

However, manual creation and configuration of containers is a quite complex and error-prone process. Docker solves this problem. In the context of Docker, containers are child processes of the Docker background service (the Docker daemon). Any software running with Docker runs inside a container.
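
For example, with Docker installed, the following command pulls the small public alpine image from DockerHub (if it is not available locally), starts a container from it, runs a single command inside and removes the container when it exits:

docker run --rm alpine echo "Hello from container"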

Containers are launched from images. As mentioned above, a Docker image is a good analogue of a cargo container. Images are stored in repositories, which in turn are organized into registries. The most well-known public image registry is DockerHub. It is also possible to run your own private image registry within a company.

Docker consists of several parts:

  • CLI tool
  • background service (daemon)
  • set of remote services (DockerHub, JFrog, etc.)
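
The first two parts can be seen by running the docker version command, which prints separate "Client" (the CLI tool) and "Server" (the daemon) sections:

docker version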

Together they simplify the management of containers and allow you to build your own container management infrastructure:

Docker is open source. Although it came from the Linux world, it also runs on Windows (on top of Hyper-V or WSL2) and macOS. Note, however, that while it is quite easy to run Linux containers (containers with a Linux runtime) in Docker on Windows (as well as Windows containers, of course), since under the hood WSL2 is a lightweight virtual machine with a real Linux kernel, running a Windows container in Docker on Linux is not that easy:

There are solutions for that, but they are not that straightforward (e.g. you may run a Windows Server Core OS inside VirtualBox, which in turn runs inside a Docker container on Linux, or use the Wine shell). Also, licensing issues have to be solved since Windows is not free.

Note that a container is not the same as a virtual machine:

Virtual machines:

  • launch their own OS in which the installed software runs
  • require more resources (an average PC can only run a few VMs)
  • start slower
  • support snapshots, which is good, but snapshots have their own problems: large size, issues with diff tracking and versioning
  • from one set of VMX/VMDK files only one VM can be launched.

Containers, on the other hand:

  • run on the same host OS kernel
  • require fewer resources (on an average PC you can run many containers at the same time)
  • start within a few seconds
  • changes are added as an additional layer in a union file system: it is possible to track changes and view history
  • many containers can be started from one image (see the example below).
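
E.g. here is the last point in action: two independent containers started from the same public nginx image (container names and host ports here are arbitrary):

docker run -d --name web1 -p 8080:80 nginx
docker run -d --name web2 -p 8081:80 nginx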

Now with this knowledge we may solve the matrix of hell mentioned above using Docker containers:

But there is a new question: how to manage this matrix? 🙂 Here we come to container orchestration technologies like Kubernetes, Docker Swarm, etc. This topic is out of scope of the current article (I plan to write about it later as well).

And at the end, here is an example of how Docker may help developers in everyday work. As a developer you may need to run different versions of some database engine simultaneously in order to test functionality on these versions. Docker is a perfect tool for that. E.g. if you run PostgreSQL 16 on your host OS and want to test code on the older PostgreSQL 10, you need only 2 commands for that:

docker pull postgres:10
docker run -d -p 5432:5432 --name postgres10 -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=postgres postgres:10

Here I used host port 5432 because I don't have any Postgres version running on my host (to be honest, with Docker I don't want to install any DB engines on my host anymore 🙂), i.e. this port is not busy. Otherwise just use a different host port and map it to the internal port 5432 used by Postgres inside the container (e.g. "-p 6432:5432"). After that you may connect to the DB engine and work with it as usual:
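
E.g. you can open a psql session right inside the running container (using the container name from the command above), or connect from the host with any client to localhost:5432:

docker exec -it postgres10 psql -U postgres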


That's all I wanted to write about Docker here. I hope this information will help you to understand the technology, motivate you to learn it further and use it in your work.

Wednesday, March 20, 2024

Inject class to Python module dynamically

As you probably know, in Python after you import some module you may call functions defined in this module. But in more complex scenarios (e.g. when the imported module is provided automatically by the infrastructure (i.e. you don't control it) or when classes are generated at runtime using Python metaclasses) you may face a situation when some class is missing in the imported module. In this case Python will show the following error:

AttributeError: module 'SomeExternalModule' has no attribute 'SomeAPI'

(in this example we assume that the module name is SomeExternalModule and the class name is SomeAPI). Of course, if you need the functionality defined in the missing class, you have to resolve this issue properly and ensure that the missing class exists in the imported module. But if it is not critical and you just want to pass through these calls, you may use a trick of injecting the class into the module dynamically. E.g. a class with a single static method (a method without the "self" first argument) can be added to the module like that:

def Log(msg):
    print(msg)

SomeExternalModule.SomeAPI = type("SomeAPI", (object, ), {
    "Log": Log
})

Here we injected the class SomeAPI into the module SomeExternalModule. The class contains one method called Log which just prints a message to the console. In addition to static methods we may also add instance methods using a similar technique: just add the "self" first argument to the method, as shown in the sketch below.
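
A minimal sketch (the Process method name here is hypothetical):

def Process(self, msg):
    print("processing: " + msg)

SomeExternalModule.SomeAPI = type("SomeAPI", (object, ), {
    "Log": Log,          # static-style method from above
    "Process": Process   # instance method (has "self" first argument)
})

api = SomeExternalModule.SomeAPI()
api.Process("test")      # prints "processing: test"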

After that, if you run code which relies on SomeExternalModule.SomeAPI() calls, the error should be gone.

Sunday, March 10, 2024

Cinnamon: first thing you may want to install when switched from Windows to Linux

If you worked in Windows and then switched to Linux, it may be painful at the beginning since the user experience is a bit different (and here we are talking about the graphical UX in Linux, not the command line). E.g. this is how RHEL8 (Red Hat Enterprise Linux) with the default GNOME desktop environment looks:

Yes, there are windows too, but no minimize/maximize icons nor a taskbar. The good news is that there is a more Windows-like desktop environment available called Cinnamon. Here you may find instructions on how to install it on RHEL (you may also find a more complete list here). After installation and reboot you will be able to select Cinnamon from the list of available desktop environments on the login screen:

And the system will look more familiar to those who have worked with Windows:

There will be minimize/maximize icons, a taskbar and other familiar things. Hopefully with them the transition from Windows to Linux will go smoother.

Tuesday, February 6, 2024

Fix problem with Git client for Linux which asks for credentials on every push with installed SSH key

Recently I faced the problem that the Git client for Linux (CentOS) always asked for user credentials on every push even though an SSH key was installed. In general, an SSH key is installed exactly to avoid that. So what went wrong?

Let's briefly check the whole process. First of all we need to generate an SSH key pair. On Linux it can be done with the ssh-keygen tool. If you don't want to enter a passphrase on every push, just press Enter on each step. By default it will save the public/private key files (id_rsa.pub and id_rsa) into the ~/.ssh folder (where ~ means the local user folder, usually under /home/...). After that, copy the content of the public key file id_rsa.pub, go to GitHub > your profile Settings > SSH and GPG keys > SSH keys and paste the content there:
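
The key generation step itself may look like this (the -C argument here is just a comment label for the key; replace the email with your own):

ssh-keygen -t rsa -b 4096 -C "your_email@example.com"
cat ~/.ssh/id_rsa.pub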

The installation of the SSH key is now complete. But if you clone some repository and try to push changes there (assuming that you have write permission in this repository), git may still ask for username/password credentials on every push. As it turned out, it depends on how the repository was cloned. There are several ways to clone repositories: HTTPS, SSH and GitHub CLI (the HTTPS tab goes first in the UI).

The mentioned problem with credentials appears when the repository is cloned via HTTPS. The solution here is to clone the repository with SSH instead:
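
E.g. (the repository path here is hypothetical; take the real SSH URL from the "SSH" tab of the "Code" button on GitHub):

git clone git@github.com:username/repository.git

If the repository is already cloned via HTTPS, there is no need to re-clone it: it is enough to switch the remote URL to SSH:

git remote set-url origin git@github.com:username/repository.git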

After that, git should not ask you for credentials anymore.

Tuesday, January 9, 2024

Verify JWT tokens with EdDSA encryption algorithm

In my previous posts of this series I showed how to generate an EdDSA private/public key pair and how to sign JWT tokens using the private EdDSA key. In this last post of the series I will show how to verify a signed JWT token with the public key.

Let's recall that an EdDSA key pair may look like this in JSON format:

{
	"kty": "OKP",
	"alg": "EdDSA",
	"crv": "Ed25519",
	"x": "...",
	"d": "..."
}

where "x" property is used for public key and "d" is for private key. Private key (d) was used for signing. For verification we need to use public key (x).
For token validation we will use JsonWebTokenHandler.ValidateTokenAsync() method from Microsoft.IdentityModel.JsonWebTokens. Here is the code which decodes token:

string token = ...;
var jwk = ...; // get EdDSA key pair
// build the public key from the "x" property of the key pair
var pubKey = new EdDsaSecurityKey(new Ed25519PublicKeyParameters(Base64UrlEncoder.DecodeBytes(jwk.X), 0));
pubKey.KeyId = jwk.KeyId;
var result = await new JsonWebTokenHandler().ValidateTokenAsync(token, new TokenValidationParameters()
{
  ValidIssuer = JwtHelper.GetServiceName(jwk),
  AudienceValidator = (audiences, securityToken, validationParameters) => true, // or whatever logic is needed for verifying the aud claim
  IssuerSigningKey = pubKey
});
if (!result.IsValid)
  throw result.Exception;
var json = JWT.Payload(token);

Here we use the EdDsaSecurityKey class from ScottBrady.IdentityModel.Tokens.
If the public key matches the private key which was used for signing, then result.IsValid will be true (otherwise the code above throws an exception). At the end we call JWT.Payload() from jose-jwt to get the JSON representation of the token (from which we may read the needed claims and other data).

With these techniques you may generate EdDSA keys, sign tokens and verify them. Hopefully the information in these posts will help you.

Monday, December 25, 2023

Sign JWT tokens with EdDSA encryption algorithm

In my previous post of this series I showed how to generate a key pair for the EdDSA encryption algorithm. Let's now go further and use these keys to sign a JWT token. If you remember from the previous post, the "d" property of the JSON object with the key pair belongs to the private key. We will use this private key for signing our JWT token.

For creating a JWT token we need to define claims. They are app/domain specific. We can add e.g. iss (issuer), exp (expiration) and other standard claims (standard claims are defined in RFC 7519). Also we may add custom claims as needed in the app:

List<Claim> claims = ...; // fill claims

Then we need to load the private key (usually from some secrets storage/vault):

var jwk = ...; // load private key

This jwk object may be the JSON object shown in my previous post (plus it should have a keyId string property for the key identifier, which may contain e.g. some GUID).

Then we need to create an EdDSA security key object and create the signed token. We can do that using the ScottBrady.IdentityModel nuget package (it uses Portable.BouncyCastle internally):

var edDsaSecurityKey = new EdDsaSecurityKey(new Ed25519PrivateKeyParameters(Base64UrlEncoder.DecodeBytes(jwk.d), 0));
edDsaSecurityKey.KeyId = jwk.keyId;
var securityTokenHandler = new JwtSecurityTokenHandler();
string token = securityTokenHandler.WriteToken(securityTokenHandler.CreateToken(new SecurityTokenDescriptor
{
    Subject = new ClaimsIdentity(claims),
    Issuer = ..., // define issuer (iss) claim as you need
    Expires = DateTime.UtcNow.AddMinutes(1), // set expiration (exp) claim as you need
    SigningCredentials = new SigningCredentials(edDsaSecurityKey, "EdDSA")
}));

This code will create a JWT token signed with the EdDSA private key. In the next post I will show how to verify this token using the public EdDSA key.

Update 2024-01-09: see also Verify JWT tokens with EdDSA encryption algorithm.

Tuesday, December 12, 2023

Fix Linq 2 NHibernate for MySQL

If you use NHibernate with MySQL and Linq 2 NHibernate (Linq2NH) to simplify fetching data, you may face a problem: queries created by Linq2NH quote identifiers with square brackets by default (e.g. SELECT ... FROM [users]). That is fine for SQL Server but won't work in MySQL, which uses backticks (SELECT ... FROM `users`).

For MySQL we need to instruct NHibernate to use backticks instead of square brackets. It can be done by setting an interceptor in the NH config:

public class NHConfiguration
{
    public static Configuration Build(string connStr)
    {
        var config = Fluently.Configure()
            .Database(
                MySQLConfiguration.Standard
                    .ConnectionString(connStr)
                    .AdoNetBatchSize(100)
                    .DoNot.ShowSql()
            )
            .Mappings(cfg =>
            {
                // add mappings
            })
            .ExposeConfiguration(x =>
            {
                // set interceptor which replaces square brackets with backticks for MySQL
                x.SetInterceptor(new ReplaceBracesWithBackticksInterceptor());
            });

        return config.BuildConfiguration();
    }
}

The interceptor class itself is simple: in its OnPrepareStatement method we just replace square brackets with backticks:

public class ReplaceBracesWithBackticksInterceptor : EmptyInterceptor
{
    public override NHibernate.SqlCommand.SqlString OnPrepareStatement(NHibernate.SqlCommand.SqlString sql)
    {
        return sql.Replace("[", "`").Replace("]", "`");
    }
}

After that, Linq2NH queries will work in MySQL.