The Modern Craft Studio

A collection of ideas and experiences in Software Development by Rafael Caricio.

Today I was working on a project that uses GStreamer and reads content using the SRT protocol on macOS. After installing GStreamer with Homebrew, the SRT elements could not be found on my system, even though I had installed all the GStreamer sub-packages from Homebrew.

$ brew install gstreamer gst-plugins-base \
    gst-plugins-good gst-plugins-bad \
    gst-plugins-ugly gst-libav \
    gst-rtsp-server gst-editing-services

Still, I could not find the srtsrc element, for example.

$ gst-launch-1.0 -v srtsrc uri="srt://127.0.0.1:7001" ! fakesink
WARNING: erroneous pipeline: no element "srtsrc"

For some reason that was not clear to me at the time, I did not have the plugin installed. This prompted me to look at the GStreamer source code to find out how the element is enabled. I know that GStreamer contains many plugins that depend on different third-party libraries. The SRT elements reside in the gst-plugins-bad bundle. It then became clear to me that the SRT elements are only compiled if libsrt is available on the host system at compilation time.

OK, now I know what might be causing the SRT plugin to be missing from my Homebrew installation of GStreamer. To confirm that, I checked the Homebrew formula for its dependencies.

libsrt is not listed as a dependency of the gst-plugins-bad bundle

As I guessed, libsrt is not a dependency of that formula. This means that the Meson configuration we saw earlier leaves the SRT plugin out of the compilation process.

The fix

We need to modify the gst-plugins-bad formula locally and then install from source.

First, we uninstall the gst-plugins-bad formula.

$ brew rm gst-plugins-bad

Then we can edit our local formula to add the libsrt dependency. This is not strictly required, as we could just install libsrt manually and then recompile from source. We add it here anyway, so that the dependency is still there the next time the package is updated. The following command opens your default editor, as defined in the EDITOR environment variable.

$ brew edit gst-plugins-bad 

Add the line below to the dependency list:

depends_on "srt"

We now install the package from source.

$ brew reinstall --build-from-source gst-plugins-bad

That's it. You should now have the SRT plugin installed, and all its elements will be available on your system. We can double-check by inspecting one of the elements, such as srtsrc.

$ gst-inspect-1.0 srtsrc

You should see something like:

Shell showing the result of the gst-inspect-1.0 command above

Yay! That is it. Now I can continue my work and have all the SRT elements available on my macOS GStreamer installation from Homebrew.

Pipeline showing a test stream using SRT protocol
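
The same check can be scripted, for example to verify a machine before running a pipeline. The snippet below is only a sketch: the helper name has_gst_element is made up for this post, and it assumes gst-inspect-1.0 is on the PATH and supports the --exists option, which makes the tool exit with status 0 only when the element or plugin is found.

```python
import shutil
import subprocess


def has_gst_element(name: str, tool: str = "gst-inspect-1.0") -> bool:
    """Return True if GStreamer can find the element or plugin `name`."""
    if shutil.which(tool) is None:
        # The GStreamer command-line tools are not on PATH at all.
        return False
    # With --exists the tool only reports success or failure through its
    # exit status instead of printing the full element description.
    result = subprocess.run(
        [tool, "--exists", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


if __name__ == "__main__":
    for element in ("srtsrc", "srtsink"):
        status = "available" if has_gst_element(element) else "missing"
        print(f"{element}: {status}")
```

Before the fix above, both elements would be reported as missing; after rebuilding gst-plugins-bad with libsrt present, they should show up as available.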


I'm open to hear from your experiences on things I write about. If you want to connect, you can find me at @rafaelcaricio@fosstodon.org or on other places on the internet. All posts from this site are also published to the Fediverse account @blog@caricio.com where you can receive ActivityPub updates about new posts.
The icons used by this site can be found at Hacker icons created by Freepik - Flaticon.

My partner prefers shopping online to physically going to shops. Living through a pandemic for the last few years has also made us go out less in general.

This weekend I decided to help my partner by setting up a Huginn instance, so I could configure several web scrapers to notify her when prices change or specific sizes become available. Some shops have a “subscription” concept, but most often the product is already gone by the time she sees the email. And that’s assuming the “subscription” feature even works.

Huginn lets us configure many “agents” that work together to complete tasks. We connect agents in a pipeline where events (JSON objects) flow from one agent to the next. There are many agents available already, for scraping websites, sending emails, checking the weather, etc. You can find more details in the official wiki.

I installed my Huginn instance using YuNoHost. YuNoHost is a super awesome project with the vision of simplifying the maintenance of self-hosted applications for the small web. If you want a personal VPS with low maintenance, I recommend considering YuNoHost as your Linux “distribution”. It takes care of many tasks and best practices for you, so you can start using your self-hosted services sooner.
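
To make the idea of events flowing between agents more concrete, here is a small conceptual sketch in Python. It is not Huginn's API (Huginn agents are configured through its web interface, not written like this). The three functions, the event fields and the sample data are all invented for illustration.

```python
import json
from typing import Callable, Dict, Iterable, Iterator

Event = Dict[str, object]  # an event is just a JSON-serialisable dictionary


def scraper_agent(pages: Iterable[Event]) -> Iterator[Event]:
    """Stand-in for a scraping agent: emits one event per product page."""
    for page in pages:
        yield {"product": page["product"], "price": page["price"]}


def change_detector_agent(
    events: Iterable[Event], last_seen: Dict[object, object]
) -> Iterator[Event]:
    """Forward only the events whose price differs from the last one seen."""
    for event in events:
        previous = last_seen.get(event["product"])
        last_seen[event["product"]] = event["price"]
        if previous is not None and previous != event["price"]:
            yield {**event, "previous_price": previous}


def notifier_agent(events: Iterable[Event], send: Callable[[str], None]) -> None:
    """Stand-in for an email agent: hands each event to `send` as JSON."""
    for event in events:
        send(json.dumps(event))


def demo() -> list:
    last_seen = {"boots": 120.0}  # price remembered from an earlier run
    pages = [
        {"product": "boots", "price": 95.0},
        {"product": "scarf", "price": 20.0},
    ]
    outbox: list = []
    notifier_agent(change_detector_agent(scraper_agent(pages), last_seen), outbox.append)
    return outbox


if __name__ == "__main__":
    print(demo())
```

In this sketch only the boots event reaches the notifier, because the scarf has no previous price to compare against.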

Installation of YuNoHost on Debian 11 (bullseye)

As of today, the Debian bullseye version is not supported by the latest stable YuNoHost. Currently, YuNoHost for Debian 11 is in “beta” stage.

we consider that it should be okay to upgrade to or install a fresh Yunohost 11.0+ running on Bullseye for a production server if you are a tech-savvy person not afraid to debug stuff if needed.

I’m a tech-savvy person, so I went ahead and installed YuNoHost 11.0.7 (testing). 🙂 Of course there is a risk that something will not work, and I hit one issue. The domain I use with my YuNoHost server is managed by AWS Route53, and the boto3 library was missing after a fresh install of YuNoHost 11.0.7.

ModuleNotFoundError: No module named 'boto3'

It was simple enough to fix: I just needed to install boto3, and things worked fine.

sudo apt install python3-boto3


I had never used Nix before last week. I had heard about NixOS and the Nix package manager, but I thought they were the same thing, or that you had to use NixOS to take advantage of Nix package management. What attracted me to Nix was the idea of isolated environments and non-global package management, as well as the ability to roll back when you install something you later regret. This sounds a lot like Docker layers, but for the whole OS.

I use macOS on both my work computer and my personal machine. I have been using Homebrew since I first used a Mac in 2011. I know about MacPorts, but I never used it, since people tell me it is not very straightforward and I personally don’t see many benefits over Homebrew. Even though Homebrew has been essential to my development workflow on macOS, I have been growing frustrated with it. When I upgrade one package or install one tool for something specific, my whole system installation starts to upgrade, which is inconvenient most of the time. An installation that would have taken 5 seconds becomes a system upgrade that takes 10 minutes or more. Another issue is when upgrades break my workflow. I am working on something that needs GStreamer 1.18 in one project and GStreamer 1.20 in another. I don’t know if it’s even possible to have two versions installed like that with Homebrew; what I have seen before are two packages with different names that provide the same software in different versions.

Some people, me included, would see an opportunity here to use Docker or a similar solution to create isolated development environments. That works great on Linux, since containers share the running kernel with the host, but on macOS that is not the case. macOS cannot run Docker natively; everything you run inside Docker is actually running in a VM. System resources are pre-allocated to the VM and separated from macOS, so you will never be using the full power of your computer. Another layer of complication when using Docker for isolated environments on macOS comes when you are, like me, using the M1 chip. You will need to compile a lot of things from scratch inside Docker containers to get arm64-native base systems for your workflows. All that to say that I don’t think Docker on macOS is the best choice for isolated development environments.

Until recently, I was sticking with Homebrew to maintain my development workflow. But a few days ago, I discovered that the Nix package manager can be used on macOS. This seems to have been the case for a long time, but for some reason I had never heard of anyone using Nix on macOS in my bubble of friends and acquaintances. I started reading about Nix and discovered the `nix-shell` tool. Not only could I keep my system installation clean of packages I don’t need, but I could also install packages in a temporary shell session without affecting my global package installation. This sounded great to me, as it would fix my frustrations with Homebrew and provide the environment isolation that I thought I could only get with Docker.

There are some excellent guides on how to migrate from Homebrew to Nix, so I think it is out of scope for me to go over the details again. I uninstalled Homebrew from my personal machine and followed the guides to install the Nix package manager.

After uninstalling Homebrew completely, I ran the following command:

sh <(curl -L https://nixos.org/nix/install)

Then I checked if everything was working by running the command:

nix-shell -p nix-info --run "nix-info -m"

Once this was working, I installed the Nix Darwin modules as described in their repository README:

nix-build https://github.com/LnL7/nix-darwin/archive/master.tar.gz -A installer
./result/bin/darwin-installer  

At this point, you can edit the file ~/.nixpkgs/darwin-configuration.nix to customize your system. That is also the file where you list the system-wide packages you want to have installed. Here is where the fun started for me. I like to use Neovim to edit configuration files, but I had lost Neovim when I uninstalled Homebrew. Now I can use nix-shell to create a shell that has Neovim, and use it to edit my global Nix packages configuration to include Neovim.

nix-shell -p neovim
nvim ~/.nixpkgs/darwin-configuration.nix

It worked perfectly. I included Neovim in the global system packages:

environment.systemPackages = [
    pkgs.neovim
  ];

Then ran the command to update my system:

darwin-rebuild switch

This way I have Neovim available globally.
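
A quick way to confirm that a command really comes from Nix is to resolve its symlinks and check where the binary lives. The following is a minimal sketch that assumes the default store location /nix/store; the function names are made up for this example.

```python
import os
import shutil
from typing import Optional


def is_nix_store_path(path: str) -> bool:
    """True if `path` lives inside the default Nix store directory."""
    return path.startswith("/nix/store/")


def resolve_command(command: str) -> Optional[str]:
    """Return the fully resolved path of `command`, or None if not on PATH."""
    found = shutil.which(command)
    # Binaries exposed on PATH by Nix are symlinks; follow them to the target.
    return os.path.realpath(found) if found else None


def installed_by_nix(command: str) -> bool:
    resolved = resolve_command(command)
    return resolved is not None and is_nix_store_path(resolved)


if __name__ == "__main__":
    for cmd in ("nvim", "git"):
        print(cmd, "->", resolve_command(cmd), "| from Nix:", installed_by_nix(cmd))
```

On macOS some commands, such as git, may still resolve to the system's own copies if the Nix version is not installed or comes later in the PATH.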

As of today, I have many other packages installed globally in my system. This is an illustrative list:

environment.systemPackages = [
    pkgs.gitFull
    pkgs.neovim
    pkgs.wget
    pkgs.curlFull
    pkgs.python310
    pkgs.hstr
    pkgs.gnupg
    pkgs.htop
    pkgs.jq
    pkgs.mosh
    pkgs.ripgrep
    pkgs.sshuttle
    pkgs.ffmpeg
  ];

So far, so good. But I would like now to use Nix shells to create an isolated development environment for some projects that I work on. That was one of the main reasons I wanted to try Nix in the first place.

Using Nix to bootstrap per-project development environments

Looking at the nix-shell documentation, I can create a file shell.nix in the root directory of the project I’m working on and then call:

nix-shell --run zsh

The --run zsh argument is there just because I prefer zsh to bash. I know there must be a better way of doing that, but this works for now. This command starts a new zsh session with the environment defined in shell.nix. Here is an example shell.nix file for the gst-plugins-rs project.

let
  pkgs = import <nixpkgs> { overlays = [ (import ~/.nixpkgs/overlays/a52dex.nix) ]; };
in
  pkgs.mkShell {
    buildInputs = [
      pkgs.gst_all_1.gstreamer
      pkgs.gst_all_1.gst-plugins-base
      pkgs.gst_all_1.gst-plugins-good
      pkgs.gst_all_1.gst-plugins-ugly
      pkgs.gst_all_1.gst-plugins-bad
      pkgs.gst_all_1.gst-devtools
      pkgs.darwin.apple_sdk.frameworks.Security
      pkgs.pkg-config
      pkgs.cairo
    ];
  }

This is all I need to have an isolated development environment to work on the gst-plugins-rs project on my M1 Mac.

target/debug/libgstvideofx.dylib: Mach-O 64-bit dynamically linked shared library arm64

This is compiling everything to arm64, which means it is running natively. Yay!
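
The line above looks like output from the file command. If you want to make the same check from a script, the architecture can be read directly from the Mach-O header. This is a minimal sketch that only understands thin, little-endian 64-bit Mach-O files (not universal/fat binaries). The function names are my own; the magic number and CPU type constants are the ones defined in Apple's Mach-O headers.

```python
import struct
from typing import Optional

MH_MAGIC_64 = 0xFEEDFACF  # magic number of a 64-bit Mach-O file
CPU_TYPES = {
    0x0100000C: "arm64",   # CPU_TYPE_ARM | CPU_ARCH_ABI64
    0x01000007: "x86_64",  # CPU_TYPE_X86 | CPU_ARCH_ABI64
}


def macho_arch(header: bytes) -> Optional[str]:
    """Return the CPU architecture for a thin 64-bit Mach-O header, else None."""
    if len(header) < 8:
        return None
    # The header starts with two 32-bit fields: magic and cputype.
    magic, cputype = struct.unpack("<II", header[:8])
    if magic != MH_MAGIC_64:
        return None  # not a thin little-endian 64-bit Mach-O file
    return CPU_TYPES.get(cputype, f"unknown (cputype {cputype:#x})")


def file_arch(path: str) -> Optional[str]:
    """Read the first bytes of `path` and report its Mach-O architecture."""
    with open(path, "rb") as f:
        return macho_arch(f.read(8))
```

For the library built above, file_arch("target/debug/libgstvideofx.dylib") should report arm64 on an M1 machine.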

Caveats and Workarounds

If you looked closely, the shell.nix file that I used here for the gst-plugins-rs project passes a custom “overlays” parameter when importing nixpkgs. This overrides one dependency of the pkgs.gst_all_1.gst-plugins-ugly package that originally could not be compiled for the arm64 architecture. Luckily, I found a Pull Request with a new package definition that fixes the issues when compiling a52dec for the M1 processor.

Conclusion

From now on, whenever I need to work on a project, I can write a shell.nix file with the system dependencies for that project and not worry about breaking some other project’s workflow.

Not everything is perfect, though. I had to spend considerable time learning a bit of the Nix language, which I think is essential if you are considering the Nix package manager. Anyway, for now I am happy with Nix and I will keep using it for the foreseeable future.

If you start using Nix on macOS, please consider donating to support the work on Nix for macOS. I’m not involved in that project, but I think it’s nice to support this work if you benefit from it and can afford it.



I quite frequently stumble upon people in the Python community who have been misled into thinking that using async Python code will make their APIs “run faster”. Async Python is a great feature and should be used with care. One point that I constantly find being overlooked is the mixing of sync and async code. The general rule is that we should never mix blocking code with async code. In this post I would like to present a simplified example where we can observe how async Python can hurt the performance of an API, and then see how we can fix it.

Our example application is a FastAPI service that needs to call two operations from an external API within the handling of an HTTP request.

Those are all the dependencies we will use for the example:

# file requirements.txt
fastapi[all]==0.65.1
uvicorn[standard]==0.13.4
requests==2.25.1
httpx==0.18.2

Let's look at the example API code:

# file app/application.py
from fastapi import FastAPI
import requests
import uuid
import logging

logging.basicConfig(format="%(asctime)s %(message)s")
log = logging.getLogger("myapp")
log.setLevel(logging.DEBUG)

app = FastAPI()

EXTERNAL_API_ENDPOINT = "http://localhost:8888"


@app.get("/healthcheck")
async def healthcheck():
    return {"status": "ok"}


#
# Async mixed with blocking
#

def internal_signing_op(op_num: int, request_id: str) -> None:
    # requests is a synchronous library: this call blocks until the
    # external API answers (the timeout is given in seconds).
    session = requests.Session()
    response = session.request("GET", EXTERNAL_API_ENDPOINT, timeout=2000)
    log.debug(f"{request_id} {op_num}: {response}")


def sign_op1(request_id: str) -> None:
    internal_signing_op(1, request_id)


def sign_op2(request_id: str) -> None:
    internal_signing_op(2, request_id)


@app.get("/async-blocking")
async def root():
    request_id = str(uuid.uuid4())

    log.debug(f"{request_id}: started processing")

    # Blocking calls inside an async handler: nothing else can run meanwhile.
    sign_op1(request_id)
    sign_op2(request_id)

    log.debug(f"{request_id}: finished!")
    return {"message": "hello world"}

Here we have a simple application that replicates the behavior I want to point out. We have mixed async code with the synchronous requests library. The code works fine, but there is one problem. To understand it, we need to recap how Uvicorn works. Uvicorn runs our application by spawning workers (OS sub-processes) that handle the requests coming into our server. Every worker is a fully featured CPython instance and has its own I/O loop that runs our FastAPI application.

Uvicorn diagram showing the main process and subprocess running different instances of our example application.

The main process holds a socket that is shared with the workers; HTTP requests arriving on it are handed to the workers, which do the actual processing. We can have as many workers as we want, usually one per CPU core. In our case, to make the behavior easier to analyze, we are going to run only a single worker. We execute our server with the following command:

uvicorn app.application:app --workers 1

I've set up a fake external API that we will use for this example. Just a simple server that takes a long time to execute some obscure operation (sleep(20) 😄 ).

# file external_api.py
import asyncio
from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def root():
    await asyncio.sleep(20)
    return {"message": "Hello World"}

We spin up the external API server using this command:

uvicorn external_api:app --port 8888 --workers 1

We set 1 worker here for no particular reason; the important part is that the external API runs on port 8888, which is the one we've hardcoded in our example application.

Full working tree of the example for reference:

.
├── app
│   ├── __init__.py
│   └── application.py
├── external_api.py
└── requirements.txt

1 directory, 4 files

Now we can call our application with mixed async and sync code and observe what is printed out. I used httpie to make the requests. I opened two consoles and made two separate HTTP requests to our application within the same 20-second window. This is the output:

❯ uvicorn app.application:app --workers 1 --log-level error
2021-07-07 20:08:57,962 9631c187-8f46-402a-b8ea-a15496643b81: started processing
2021-07-07 20:09:17,978 9631c187-8f46-402a-b8ea-a15496643b81 1: <Response [200]>
2021-07-07 20:09:37,987 9631c187-8f46-402a-b8ea-a15496643b81 2: <Response [200]>
2021-07-07 20:09:37,987 9631c187-8f46-402a-b8ea-a15496643b81: finished!
2021-07-07 20:09:37,988 694ee4be-a15a-49f6-ad60-7c140135a1f6: started processing
2021-07-07 20:09:57,997 694ee4be-a15a-49f6-ad60-7c140135a1f6 1: <Response [200]>
2021-07-07 20:10:18,004 694ee4be-a15a-49f6-ad60-7c140135a1f6 2: <Response [200]>
2021-07-07 20:10:18,004 694ee4be-a15a-49f6-ad60-7c140135a1f6: finished!

As we can observe in the output, even though I made both requests “in parallel” (within the same second), the server only started processing the second request (694ee4be-a15a-49f6-ad60-7c140135a1f6) after the first request (9631c187-8f46-402a-b8ea-a15496643b81) had fully finished, which took a full 40 seconds. During the whole 40 seconds, there was no task switching and the worker event loop was completely blocked. All requests to the API are stalled for the full 40 seconds, including requests to any other endpoints that might exist in other parts of the application. Even if the other requests don't call the external API, they cannot execute because the worker event loop is blocked. If we call the GET /healthcheck endpoint, it will not execute either.
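
The same effect can be reproduced without FastAPI or a network call. The sketch below is a scaled-down stand-in for the experiment above: time.sleep plays the role of the blocking requests call, asyncio.sleep plays the role of an async HTTP client, and the 0.2 second delay replaces the 20 second external API. All names are made up for this illustration.

```python
import asyncio
import time
from typing import Awaitable, Callable, Tuple

DELAY = 0.2  # stands in for the slow external API call


async def blocking_handler() -> None:
    time.sleep(DELAY)  # blocks the whole event loop, like a sync HTTP call


async def async_handler() -> None:
    await asyncio.sleep(DELAY)  # yields to the event loop while waiting


async def serve_two(handler: Callable[[], Awaitable[None]]) -> float:
    """Run two 'requests' at the same time and return the total wall time."""
    start = time.perf_counter()
    await asyncio.gather(handler(), handler())
    return time.perf_counter() - start


def compare() -> Tuple[float, float]:
    blocking = asyncio.run(serve_two(blocking_handler))
    non_blocking = asyncio.run(serve_two(async_handler))
    return blocking, non_blocking


if __name__ == "__main__":
    blocking, non_blocking = compare()
    print(f"blocking handlers:     {blocking:.2f}s")
    print(f"non-blocking handlers: {non_blocking:.2f}s")
```

With the blocking handler the two "requests" run one after the other, so the total is roughly twice the delay. With the non-blocking handler they overlap and the total stays close to a single delay.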

One way to hide this problem and keep our server accepting requests while workers are blocked is to increase the number of workers. But those new workers can also get blocked on sync calls, which leaves our API susceptible to DDoS attacks. The way to solve this problem is to not let our workers get blocked. Our API should be fully async. For that, we need to replace the requests library with a library that supports async.

Let's now implement a “v2” version of our example API, still calling the same fake external API that takes 20 seconds to process. Furthermore, we will again run Uvicorn with a single worker.

Here is the code with the updated implementation:

#
# Async end-to-end
#
import httpx  # normally placed with the other imports at the top of the file


async def v2_internal_signing_op(op_num: int, request_id: str) -> None:
    """Call the external API endpoint and log the response."""
    async with httpx.AsyncClient() as session:
        response = await session.request("GET", EXTERNAL_API_ENDPOINT, timeout=2000)
    log.debug(f"{request_id} {op_num}: {response}")


async def v2_sign_op1(request_id: str) -> None:
    await v2_internal_signing_op(1, request_id)


async def v2_sign_op2(request_id: str) -> None:
    await v2_internal_signing_op(2, request_id)


@app.get("/all-async")
async def v2_root():
    request_id = str(uuid.uuid4())

    log.debug(f"{request_id}: started processing")

    await v2_sign_op1(request_id)
    await v2_sign_op2(request_id)

    log.debug(f"{request_id}: finished!")
    return {"message": "hello world"}

Notice that I've replaced the requests library with the httpx library, which supports async HTTP calls and has an API that is very similar to the one requests provides. The code is functionally equivalent to our previous mixed implementation, but it is now async end to end. Let's run our API using the same command as before.

uvicorn app.application:app --workers 1

Then calling the API using httpie, but to the fully async endpoint:

http localhost:8000/all-async

The console output is:

2021-07-07 23:30:21,673 da97310b-1d20-4082-8f90-b2e163523b83: started processing
2021-07-07 23:30:23,768 291f556e-038d-4230-8b3b-8e8270383e62: started processing
2021-07-07 23:30:41,718 da97310b-1d20-4082-8f90-b2e163523b83 1: <Response [200 OK]>
2021-07-07 23:30:43,781 291f556e-038d-4230-8b3b-8e8270383e62 1: <Response [200 OK]>
2021-07-07 23:31:01,740 da97310b-1d20-4082-8f90-b2e163523b83 2: <Response [200 OK]>
2021-07-07 23:31:01,740 da97310b-1d20-4082-8f90-b2e163523b83: finished!
2021-07-07 23:31:03,801 291f556e-038d-4230-8b3b-8e8270383e62 2: <Response [200 OK]>
2021-07-07 23:31:03,801 291f556e-038d-4230-8b3b-8e8270383e62: finished!

We can observe in the output that both requests started processing immediately, while the operations within each request still run sequentially. The event loop of the Uvicorn worker is not blocked, which is why the second request could start processing even though the external API had not finished its operation for the first. Other requests, like GET /healthcheck, are not impacted by the slow external API. Overall, our application continues to serve other requests independently of the external API.
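
The two operations inside a single request still run one after the other because each one is awaited before the next starts. If they did not depend on each other (an assumption; the example API above does not say), they could be scheduled together with asyncio.gather. Here is a minimal sketch, with asyncio.sleep standing in for the external call and hypothetical function names:

```python
import asyncio
import time
from typing import Awaitable, Callable, List, Tuple

DELAY = 0.2  # stands in for the slow external API call


async def fake_signing_op(op_num: int) -> int:
    await asyncio.sleep(DELAY)
    return op_num


async def sequential() -> List[int]:
    # Same shape as the /all-async handler: await one, then the other.
    return [await fake_signing_op(1), await fake_signing_op(2)]


async def concurrent() -> List[int]:
    # Both operations wait at the same time; results keep their order.
    return list(await asyncio.gather(fake_signing_op(1), fake_signing_op(2)))


def timed(coro_fn: Callable[[], Awaitable[List[int]]]) -> Tuple[List[int], float]:
    start = time.perf_counter()
    result = asyncio.run(coro_fn())
    return result, time.perf_counter() - start


if __name__ == "__main__":
    for fn in (sequential, concurrent):
        result, elapsed = timed(fn)
        print(f"{fn.__name__}: {result} in {elapsed:.2f}s")
```

Whether this is appropriate depends on the operations. If the second call needs the result of the first, the sequential form is the correct one.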

When using async Python one must be careful about what libraries to use. Even though a library might be very popular in the Python community, it doesn't mean that the library will play well in an async application. Choosing the right libraries will make the execution of the application more concurrent by not blocking the I/O loop. The overall throughput of the application will be better as more unrelated requests can be processed by the same Uvicorn worker.

I've used async Python in some applications I maintain, and it was challenging to choose the right libraries. The team has to be on the lookout for places in the code where the event loop may block. Even using the built-in Python logging library or a “print” statement is going to block the I/O loop. Usually, those blocking calls are negligible, but it is important to understand that they are there. I highly recommend also reading the official documentation for other tips on developing async code in Python. Have you developed an async Python API? What was your experience?
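
One way to watch for blocked event loops is to measure how late a periodic task wakes up. The helper below is a rough sketch of that idea, not a production monitor; measure_loop_lag and the sample workloads are names invented for this post.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def measure_loop_lag(
    workload: Callable[[], Awaitable[None]], interval: float = 0.01
) -> float:
    """Run `workload()` and return the worst event loop stall seen, in seconds."""
    worst = 0.0
    done = False

    async def ticker() -> None:
        nonlocal worst
        while not done:
            before = time.perf_counter()
            await asyncio.sleep(interval)
            # Anything beyond `interval` is time the loop could not run us.
            worst = max(worst, time.perf_counter() - before - interval)

    task = asyncio.create_task(ticker())
    await asyncio.sleep(0)  # let the ticker start before the workload runs
    try:
        await workload()
    finally:
        done = True
        await task
    return worst


async def blocking_workload() -> None:
    time.sleep(0.2)  # blocks the loop


async def friendly_workload() -> None:
    await asyncio.sleep(0.2)  # cooperates with the loop


if __name__ == "__main__":
    print("blocking:", round(asyncio.run(measure_loop_lag(blocking_workload)), 3))
    print("friendly:", round(asyncio.run(measure_loop_lag(friendly_workload)), 3))
```

asyncio also ships with a debug mode, enabled for example with asyncio.run(main(), debug=True), which logs callbacks that take longer than loop.slow_callback_duration to run.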

