
TIL Fix missing certificates in Python 3.6 and Up on MacOS

Imagine my surprise when, while writing a simple Python script on my Mac, I suddenly got SSL errors on every urllib request over HTTPS. I checked the site certificate; it looked good. I even confirmed in the Apple help documentation that they included the CAs for this certificate (in this case the Amazon certificates). I was really baffled about what to do until I stumbled across this.

This package includes its own private copy of OpenSSL 1.1.1. The trust certificates in system and user keychains managed by the Keychain Access application and the security command line utility are not used as defaults by the Python ssl module. A sample command script is included in /Applications/Python 3.11 to install a curated bundle of default root certificates from the third-party certifi package (https://pypi.org/project/certifi/). Double-click on Install Certificates to run it.
Link

Apparently, starting in Python 3.6, the python.org Mac installers stopped relying on Apple's OpenSSL and started bundling their own copy, which doesn't use the certificates in the system keychain. The way this manifests is:

URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:581)>

The fix is to run the following from the terminal (replace 3.10 with whatever version you are running):

/Applications/Python\ 3.10/Install\ Certificates.command

This will install the certifi package, which bundles the Mozilla root certificates. This solved the problem and hopefully will help you in the future. Really weird choice by the Mac Python team here, since it basically breaks HTTPS out of the box.
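
If you'd rather not rely on the installer script (or want the fix to live in the code itself), a minimal sketch of a workaround, assuming certifi is pip-installed, is to build an SSL context from certifi's bundle and hand it to urllib:

import ssl
import urllib.request

import certifi  # pip install certifi

# Use certifi's Mozilla root bundle instead of the defaults shipped
# with the python.org build, which don't include the keychain CAs.
ctx = ssl.create_default_context(cafile=certifi.where())

with urllib.request.urlopen("https://www.python.org", context=ctx) as resp:
    print(resp.status)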


You can finally delete Docker Desktop

Podman Desktop is here and it works great. When Docker changed their license to the following, it was widely understood that its time as the default local developer tool was coming to an end.

Docker Desktop remains free for small businesses (fewer than 250 employees AND less than $10 million in annual revenue), personal use, education, and non-commercial open source projects.
Hope you never get acquired I guess

Podman, already in many respects the superior product, didn't initially have a one-to-one replacement for Docker Desktop, the commonly used local development engine. Now it does, and it works amazingly well: it works with your existing Dockerfiles, has all the Kubernetes functionality, and even lets you use multiple container engines (like Docker) at the same time.

I'm shocked at how good it is for a pre-1.0 release, but for anyone out there installing Docker Desktop at work, stop and use this instead. Download it here.


Passkeys as a tool for user retention

With the release of iOS 16 and MacOS Ventura, we are now in the age of passkeys. This is happening through WebAuthn, a specification written by the W3C and FIDO with the involvement of all of the major vendors such as Google, Mozilla, etc. The basic premise is familiar to anyone who has used SSH in their career: you log in through the distribution of public keys, keeping the private key on the device.

Like all security initiatives, someone took the problem of "users sometimes reuse passwords" and decided the correct solution is to involve biometric security when I log into my coffee shop website to put my order in. I disagree with the basic premise of the standard on a personal level. I think it probably would have been fine to implement an API where the service simply requested a random password and then stored it in the existing browser password sync. It would have solved the first problem of password reuse, allowed for export, and wouldn't have required a total change in how all authentication works. However I am not the kind of person who writes these whitepapers full of math, so I defer to the experts.

How WebAuthn works is that the server receives from the user a public key and a randomly generated ID. The corresponding private key is distributed and stored through the client vendor's sync process, meaning it is available on different devices as long as those devices exist within the same vendor ecosystem. This stuck out to me as a fascinating transition for passwords and one with long-term implications for user retention across the different platforms.

Imagine being Apple or Google and getting to tell users "if you abandon this platform, you are going to lose every login you have". This is a tremendous threat, and since all platforms will effectively wield it at the same time, not one that invites legal trouble. Let's get into the details of what WebAuthn is and how it works, then walk through how this provides tremendous value to platform holders as a way of locking in users.

WebAuthn Properties

WebAuthn is a web-based API that allows web servers, called Relying Parties, to communicate with authenticators on a user's device. I'll refer to these as RPs from now on. To get started, the server creates new credentials by calling navigator.credentials.create() on the client.

const credential = await navigator.credentials.create({
    publicKey: publicKeyCredentialCreationOptions
});

This object has a number of properties; a sketch of what they might look like in practice follows the list.

  • challenge: a buffer of random bytes generated by the server to prevent replay attacks.
  • rp: basically the website, needs to be a subset of the domain currently in the browser.
  • user: information about the user. It's suggested not to use PII, but even if you use name and displayName it doesn't appear that this is ever relayed back to the RP. Source
  • pubKeyCredParams: what public keys are acceptable to the server
  • authenticatorSelection: do you want anything to be allowed to be an authenticator or do you want a cross-platform authenticator like only a YubiKey
  • timeout: self-documenting
  • attestation: information from the authenticator that could be used to track users.
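
To make those properties concrete, here is a rough sketch of a publicKeyCredentialCreationOptions object. The values are illustrative only: serverChallenge and serverUserId stand in for whatever your server actually generates, and the algorithm and attachment choices are just examples.

// Illustrative values; the challenge and user.id must come from your server.
const publicKeyCredentialCreationOptions = {
    challenge: Uint8Array.from(serverChallenge, c => c.charCodeAt(0)),
    rp: {
        name: "Example Coffee Shop",
        id: "example.com",
    },
    user: {
        id: Uint8Array.from(serverUserId, c => c.charCodeAt(0)),
        name: "jdoe@example.com",
        displayName: "J. Doe",
    },
    pubKeyCredParams: [{ alg: -7, type: "public-key" }], // -7 is ES256
    authenticatorSelection: {
        authenticatorAttachment: "cross-platform", // e.g. only a YubiKey
    },
    timeout: 60000,
    attestation: "none", // the default tier, discussed below
};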

Attestation

What you are getting back as the service is the attestation statement, which is a somewhat vague concept. It's a signed data object that includes info about the public key and some other pieces of information. You can see the generic object below

Source

This part is kind of interesting. There are actually 4 tiers of "information you get back about the user".

  • none: You don't want anything and is the default
  • indirect: the client is allowed to decide how to obtain such a statement. The client may replace an authenticator-generated statement with one generated with an Anonymization CA.
  • direct: you want the statement
  • enterprise: you want the statement that may include uniquely identifying information. The authenticator needs to be configured to say "this rp ID is allowed to request this information", so presumably this will be for devices enrolled in some sort of MDM.

You get back an attestationObject at the end of all of this, which basically allows you to parse metadata about the registration event as well as the public key, along with the fmt of the attestation, which is effectively the type of authenticator you used. Finally you have the statement itself.

When a user wants to sign in, the process works pretty simply. We call navigator.credentials.get() on the client, which basically says go get me the credentials I specify in allowCredentials. You can say what the ID is and how to get the credentials (through USB, Bluetooth, NFC, etc.).
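
A minimal sketch of that call, assuming savedCredentialId is the credential ID your server stored at registration (the transports list is illustrative):

const assertion = await navigator.credentials.get({
    publicKey: {
        challenge: Uint8Array.from(serverChallenge, c => c.charCodeAt(0)),
        allowCredentials: [{
            id: savedCredentialId, // ArrayBuffer of the credential ID
            type: "public-key",
            transports: ["internal", "usb", "ble", "nfc"],
        }],
        timeout: 60000,
    },
});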

For further code samples and a great walkthrough I recommend: https://webauthn.guide/

The key to how this all works is that the private key is synced by the vendor across your different devices (allowing for portability), and the flow also allows for delegation to a nearby phone. So for instance if you are a Windows Chrome user and want to sign in using passkeys, you can, as long as you still have the Apple device that you originally created the passkey on.

Good diagram if you are using the Okta service, but still valuable either way

Passkey cross-device flow

  1. The vendor supplies a "Passkey from nearby devices" option when the user is logging in
  2. The web site displays a QR code
  3. The device which contains the passkey points its camera and starts an auth flow
  4. The two devices perform a Bluetooth handshake to ensure they are actually near each other and agree on a server to use as an exchange post
  5. The device with the passkey is used to perform the actual auth process.
  6. Now in theory the service should offer some sort of "would you like to make a new passkey on this device" prompt.

At no point did you transfer the private key anywhere. That entire sync process is controlled by the vendor, meaning your only option for portable authentication is going to be a roaming authenticator (aka a YubiKey).

It's important to note here that a lot of assumptions developers make about the requirement to store private keys locally aren't necessarily true. Authenticators have the option of not storing their private keys locally at all. You have the option of instead storing the private keys with the RP, encrypted under a symmetric key held by the authenticator. This elasticity in the spec comes up a lot, with many decisions deferred to the vendors.

Why Does This Matter?

WebAuthn has taken a strong "decentralized" design philosophy, which makes sense until you realize that the inability to export private keys isn't really true. Basically, by submitting an attestation, the vendor is saying "these devices' private keys cannot be stolen". You can see the initial conversation on GitHub here. It's making the problem someone else's issue.

By saying, in essence, that portability of private keys is not a concern of the spec and leaving it entirely in the hands of vendors, we have created one of the greatest opportunities for user lock-in in recent history. It is now on the individual services and the vendors to allow users to seamlessly sign in using different devices. The degree to which platform owners want to allow these devices to work with each other is entirely within their own control.

We can see that vendors understand this to some extent. Apple has announced that once passkey is supported in the OS, device-bound keys will no longer be supported. You will not have the option of not using iCloud (or Chrome Sync services). Administrators will likely not love the idea of critical keys being synced to devices possibly beyond their control (although the enterprise attestation provides some protection against this). So already in these early days we see a lot of work being done to ensure a good user experience but at the cost of increased vendor buy-in.

Scenario

You are a non-technical user who uses their iPhone in the normal way. When presented with a login, you let the default of "use passkeys" ride, without doing anything special. You lose your phone, but don't own any other Apple products. By default iCloud Keychain only works on iOS, iPadOS and macOS. In order to seamlessly log into any service that you registered through your phone with passkeys, you have to purchase another Apple product.

If you attempt to switch to Android, while it supports passkeys, it is between you and the RP how regaining access to your original account will work. Since allowing users to reset their passkeys by requesting a reset link over email eliminates a lot of the value of the whole scheme, I'm not exactly sure how this is going to be handled. Presumably there will need to be some other out-of-band login.

Also remember that from the RP's side, what you are getting is almost no information about the user. You aren't even (confidently) getting what kind of authenticator it is. This is great from a GDPR perspective, not having to hold email addresses and names and all sorts of other PII in your database (and it does greatly reduce many of the security concerns around databases). However if I am a web developer who goes "all-in" with this platform, it's hard to see how at some point I'm not going to fall back to "email this user a link to perform some action" and require the email address for account registration.

On the RP side they'll need to: verify this is the right user (hopefully they got some other ID in the registration flow), remove the passkey and then have the user enroll again. This isn't a terrible flow, but it is quite a bit more complicated than the story today of: log in again through SSO or by resetting a password via a reset link sent to email.

What about Chrome?

Chrome supports the standard, but not in a cross-platform way.

Passkeys are synchronized across devices that are part of the same ecosystem. For example, if a user creates a passkey on Android, it is available on all Android devices as long as the user is signed in to the same Google account. However, the same passkey is not available on iOS, macOS or Windows, even if you are using the same browser, like Chrome.
Source

Ultimately it is the hardware you are generating the key on that matters for vendor syncing, not the browser or OS.

The result of this is going to be pretty basic: as years go on, the inertia required to switch platforms will increase as more services are added as passkeys. There exists no way currently that I'm able to find that would allow you to either add a secondary device to the exchange process or bulk transfer out of Vendor A and into Vendor B. Instead any user who wants to switch vendors will need to go through the process of re-enrolling in each service with a new user ID, presumably hoping that email was captured as part of the sign-up flow so that the website or app can tie the two values together.

There is a proposed spec that would allow for dual enrollment, but only from the start. Meaning you would need to have your iOS authenticator, then attach your Chromebook authenticator to it from the start. There is no way to go back through and re-sync all logins with the new device and you would need constant access to both devices to complete the gesture element of the auth.

Yubico proposal

You can read that proposal here.

Yubico has an interesting idea here based on ARKG or Asynchronous Remote Key Generation. The basic idea is that you have a primary authenticator and a secondary authenticator that has no data transfer between the two. The proposed flow looks as follows

  • Backup device generates a private-public key pair and transfers the public key to the primary authenticator
  • This is used by the primary authenticator to derive new public keys on behalf of the backup device
  • Then the primary derives a new public key (plus the cred link) for each registered backup device and sends this on to the RP along with its own key.
  • If the primary disappears, the backup device can request the cred from the RP and use it to derive the corresponding private key. In order to retrieve the cred associated with a user, there needs to be some sort of identifier beyond the spec's user ID, which is a random value not surfaced to the user.

For more of the math behind the spec itself, check this paper out.

ARKG functionality. ARKG allows arbitrary public keys pk′ to be derived from an original pk, with the corresponding sk′ being calculated at a later time, requiring the private key sk for the key pair (sk, pk) and the credential cred.

Definition 3.1 (ARKG). The remote key generation and recovery scheme ARKG = (Setup, KGen, DerivePK, DeriveSK, Check) consists of the following algorithms:

  • Setup(1^λ) generates and outputs public parameters pp = ((G, g, q), MAC, KDF1, KDF2) of the scheme for the security parameter λ ∈ N.
  • KGen(pp), on input pp, computes and returns a private-public key pair (sk, pk).
  • DerivePK(pp, pk, aux) probabilistically returns a new public key pk′ together with the link cred between pk and pk′, for the inputs pp, pk and auxiliary data aux. The input aux is always required but may be empty.
  • DeriveSK(pp, sk, cred) computes and outputs either the new private key sk′, corresponding to the public key pk′ using cred, or ⊥ on error.
  • Check(pp, sk′, pk′), on input (sk′, pk′), returns 1 if (sk′, pk′) forms a valid private-public key pair, where sk′ is the corresponding private key to public key pk′, otherwise 0.

Correctness. An ARKG scheme is correct if, ∀λ ∈ N, pp ← Setup(1^λ), the probability Pr[Check(pp, sk′, pk′) = 1] = 1 whenever (sk, pk) ← KGen(pp), (pk′, cred) ← DerivePK(pp, pk, ·), and sk′ ← DeriveSK(pp, sk, cred).
Look at all those fun characters.

Challenges

WebAuthn presents a massive leap forward for security. There's no disputing that. Not only does it greatly reduce the amount of personal information flowing around the auth flow, it also breaks the reliance on email addresses or phone numbers as sources of truth. The backbone of the protocol is a well-understood handshake process used for years and extremely well-vetted.

However the spec still has a lot of usability challenges that need to be addressed especially as adoption speeds up.

Here are the ones I see in no particular order:

  • Users and administrators will need to understand and accept that credentials are backed up and synced across unknown devices employing varying levels of biometric security.
  • Core to the concept of WebAuthn is the idea of unlinkability. That means different keys must be used for every new account at the RP. Transferring or combining accounts is a problem for the RP which will require some planning on the part of service providers.
  • In order to use this feature, services like iCloud Sync will be turned on and the security of that vendor account is now the primary security of the entire collection of passwords. This is great for consumers, less great for other systems.
  • There currently exists no concept of delegation. Imagine I need to provide you with some subset of access which I can delegate, grant and revoke permissions, etc. There is an interesting paper on the idea of delegation which you can find here.
  • Consumers should be aware of the level of platform buy-in they're committing to. Users acclimated to Chromebooks and Chrome on Windows being mostly interchangeable should be made aware that this login is now tied to a class of hardware.
  • We need some sort of "vendor exchange" process. This will be a challenge since part of the spec is that you are including information about the authenticator (if the RP asks for it). So there's no reason to think a service which generated an account for you based on one type of authenticator will accept another one. Presumably since the default is no information on this, a sync would mostly work across different vendors.
  • The vendor sync needs to extend outside of OEMs. If I use iCloud passkeys for years and then enroll in 1Password, there's no reason why I shouldn't be able to  move everything to that platform. I understand not allowing them to be exposed to users (although I have some platform ownership questions there like 'isn't it my passkey'), but some sort of certified exchange is a requirement and one that should have been taken care of before the standard was launched.

Conclusion

This is a giant leap forward in security for average users. It is also, as currently implemented, one of the most effective platform lock-ins I've ever seen. Forget "green text" vs "blue text": as years go on and users rely more and more on passkeys for logins (which they should), switching platforms entirely will go from "a few days of work" to potentially needing to reach out and attempt to undo every single one of these logins and re-enroll. For folks who keep their original devices or a device in the ecosystem, this is mostly just time-consuming.

For users who don't, which will be a non-trivial percentage (why would a non-technical user keep their iPhone around and not sell it if they have a brand new Android), this is going to be an immense time commitment. This all assumes a high degree of usage of this standard, but I have trouble imagining web developers won't want to use this. It is simply a better, more secure system that shifts a lot of the security burden off of them.


Stop Optimizing for Tutorials

Can't believe that didn't work as advertised. 

It always starts the same way. A demo is shown where a previously complex problem set is reduced down to running one Magic Tool. It is often endorsed by one of the Big Tech Companies or maintained by them. Some members of the organization start calling for the immediate adoption of this technology.  Overnight this new tool becomes a shorthand for "solution to all of our problems".

By the time it gets to the C-level, this tool is surrounded by tons of free press. You'll see articles like "X technology is deprecated, it's time to get onboard with Magic Tool". Those of you new to working in tech probably assume some due-diligence was done to see if the Magic Tool was good or secure or actually a good fit for that particular business. You would be incorrect.

Soon an internal initiative is launched. Some executive decides to make it an important project and teams are being told they have to adopt the Magic Tool. This is typically where the cracks first start to show up, when teams who are relatively apathetic towards the Magic Tool start to use it. "This doesn't do all the things our current stuff does and accounting for the edge cases is a lot more difficult than we were told".

Sometimes the team attempting to roll out this new tech can solve for the edge cases. As time drags on though, the new Magic Tool starts to look as complicated and full of sharp edges as their current tooling. At the end of all this work and discussion, organizations are lucky to see a 2-5% gain in productivity and nothing like the quantum leap forward implied by the advocates for Magic Tool.

Why is this such a common pattern?

We are obsessed with the idea that writing software shouldn't be this hard. None of us really know what we're doing when we start. Your credentials for this job are you learned, at school or at home, how to make small things by yourself. Birdhouses of Java, maybe a few potato clocks of Python. Then you pass a series of interviews where people ask you random, mostly unrelated questions to your job. "How would you make a birdhouse that holds 100,000 birds?" Finally you start working and are asked to construct a fighter jet with 800 other people. The documentation you get when you start describes how to make an F-150 pickup.

So when we see a tutorial or demo which seems to just do the right thing all the time, it lights a fire in people. Yes, that's how this was always supposed to be. The issue wasn't me or my coworkers, the problem was our technology was wrong. The clouds part, the stars align, everything makes sense again. However almost always that's wrong. Large technology changes that touch many teams and systems are never going to be easy and they're never going to just work. The technology might still be superior to be clear, but easy is a different beast.

I'll present 3 examples of good technologies whose user experience for long-term users is worse because of a desire to keep on-boarding as simple as possible. Where the enthusiasm for adoption ultimately hurts the daily usage of these tools. Docker, Kubernetes and Node. All technologies wildly loved in the community whose features are compelling but who hide the true complexity of adoption behind fast and simple tutorials.

To be clear I'm not blaming the maintainers of this software. What I'm trying to suggest is we need to do a better job of frontloading complexity, letting people know the scope and depth of a thing before we attempt to sell it to the industry at large. Often we get into these hype bubbles where this technology is presented as "inevitable", only to have intense disappointment when the reality sinks in.

Your app shouldn't be root inside of a container

Nowhere, I think, is this more clear than in how we have failed the container community on fundamental security. Job after job, meetup after meetup, I encounter folks who are simply unaware that the default model for how we build containers is the wrong way to do it. Applications in containers should not be root by default.

But what does root in a container even mean? It's actually not as simple as it sounds. A lot of work has happened behind the scenes to preserve the simple user experience and hide this complexity from you. Let's talk about how Docker went at this problem and how Podman attempts to fix it.

However you need to internalize this. If you are running a stock configured container with a root user and someone escapes that container, you are in serious trouble. Kubernetes or no Kubernetes.

In the off-chance you aren't familiar with how containers work.

Docker

When Docker launched and started to become popular, a lot of eyebrows were raised at it running as root on the host. The explanation made sense though for two reasons: mounting directories without restriction inside of the container and binding to ports below 1024 on Linux. For those that don't know, any port below 1024 on Linux is a privileged port. Running the daemon as root is no longer a requirement of Docker, but there are still the same restrictions in terms of port and directory mounting, along with host networking and a bevy of other features.

This caused a lot of debate. It seemed crazy to effectively eliminate the security of Linux users for the sake of ease of deployment and testing, but community pressure forced the issue through. We were told this didn't really matter because "root in containers isn't really root". It didn't make a ton of sense at first, but we all went with it. Enter kernel namespaces.

Namespaces in Linux are, confusingly, nothing like namespaces in Kubernetes. They're an isolation feature, meaning a process running in one namespace cannot see or access a process running in another namespace. There are user namespaces, allowing users to have root in a namespace without that power transferring over; process ID namespaces, meaning there can be multiple PID 1s running isolated from one another; network namespaces, each with their own networking stack; mount namespaces, which allow for mounting and unmounting without changing the host; IPC namespaces for distinct and isolated POSIX message queues; and finally UTS namespaces for different host and domain names.
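
If you want to see "root that isn't really root" for yourself, util-linux's unshare command creates these namespaces directly; a quick sketch (behavior can vary by distro and kernel settings):

# Create new user, PID, and mount namespaces, mapping the current
# user to root inside the new user namespace.
unshare --user --map-root-user --pid --fork --mount-proc bash

# Inside that shell, `id` reports uid=0 (root), but that root is mapped
# back to your unprivileged user on the host, so writes to host-owned
# files still fail and only the namespace's own processes are visible.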

The sales pitch makes sense. You can have multiple roots, running a ton of processes that are isolated from each other in distinct networking stacks that change mounts and send messages inside of a namespace without impacting other namespaces. The amount of resources and the priority of those resources are controlled by cgroups, which is how cpu and memory limits work for containers.

Sounds great. What's the issue?

Namespaces and cgroups are not perfect security and certainly not something you can bet the farm on without a lot of additional context. For instance, running a container with --privileged eliminates much of the value of namespaces. This is common in CI/CD stacks when you need to run Docker in Docker. Docker also doesn't use all the namespace options, most importantly supporting but not mandating user namespaces to remap root to another user on the host.

To be fair, they've gone to great lengths to try and keep the Docker container experience as simple as possible while attempting to improve security. By default Docker drops some Linux capabilities and you can expand that list as needed. Here is a list of all of the Linux capabilities. Finally, here are the ones that Docker uses. You can see here a security team trying desperately to get ahead of the problem and doing a good job.

The problem is that none of this was communicated to people during the adoption process. It's not really mentioned during tutorials, not emphasized during demos and requires an understanding of Linux to really go at this problem space. If Docker had frontloaded this requirement into the tooling, pushing you to do the right thing from the beginning, we'd have an overall much more secure space. That would have included the following (a rough docker run sketch pulling several of these together appears after the list):

  • Warning users a lot when they run the --privileged flag
  • By default creating a new non-root user (or user namespace) for containers and creating one common directory for that user to write to.
  • Having --cap-drop=all be the default and having users add each cap they needed.
  • Running --security-opt=no-new-privileges as default
  • Adding resource definitions by default as outlined here: https://docs.docker.com/engine/reference/run/#runtime-constraints-on-resources
  • Helping people to write AppArmor or some security tooling as shown here.
  • Turning User Namespaces on by default if you are root. This allows a container root user to be mapped to a non uid-0 user outside the container and reduces the value of escaping the container.
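
Pulling several of those together, a hardened docker run invocation might look roughly like this sketch. The image name and uid/gid are placeholders; every flag shown is a standard Docker CLI option.

# Drop all capabilities, re-add only port binding, block privilege
# escalation, run as a non-root user, keep the root filesystem
# read-only with one writable tmpfs, and constrain resources.
docker run \
  --user 1000:1000 \
  --cap-drop=ALL \
  --cap-add=NET_BIND_SERVICE \
  --security-opt=no-new-privileges \
  --read-only --tmpfs /tmp \
  --memory=256m --cpus=1 \
  example/myapp:latest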

Instead we've all adopted containers as a fundamental building block of our technologies and mostly did it wrong. Podman solves a lot of these problems btw and should be what you use from now on. By starting fresh they skipped a lot of the issues with Docker and have a much more sustainable and well-designed approach to containers and security in general. Hats off to that team.

How do I tell if my containers are correctly made?

Docker made a great tool to help with this which you can find here. Called Docker Bench for Security, it allows you to scan containers, see if they're made securely and get recommendations. I'm a big fan and have learned a lot about the correct ways to use Docker from using it.
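
If memory serves, the project's README boils down to cloning the repo and running the script as root on the Docker host, roughly:

git clone https://github.com/docker/docker-bench-security.git
cd docker-bench-security
sudo sh docker-bench-security.sh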

Jeff aged 25 after attempting to brute-force Kubernetes adoption

Kubernetes

I was introduced to Kubernetes years ago as "the system Google uses". Now this turns out not to actually be true, Kubernetes branched off from the design of Google's internal system a long time ago, but it still represented a giant leap forward in terms of application design. Overnight organizations started to adopt it aggressively, often not truly understanding what it was or the problem it tried to solve. Like all tools, Kubernetes is good at some problem areas and terrible at others.

I personally love k8s and enjoy working with it. But many of the early pitches promised things that, in retrospect, were insane. K8s is neither easy to understand nor easy to implement, with many sharp edges and corner cases which you run into with surprising frequency. Minikube demos are not a realistic representation of what the final product looks like.

Here's the truth: k8s is a bad choice for most businesses.

The demos for Kubernetes certainly look amazing, right? You take your containers, you shove them into the cloud, it does the right thing, traffic goes to the correct place, it's all magical. Things scale up, they scale down, you aren't locked into one specific vendor or solution. There's an API for everything so you can manage every element from one single control plane. When you first start using k8s it seems magical, this overlay on top of your cloud provider that just takes care of things.

I like how routers, storage and LBs are just "other"

The Reality

The reality is people should have been a lot more clear when introducing k8s as to what problems it is trying to solve and which ones it isn't. K8s is designed for applications that are totally ephemeral: they don't rely on local storage, they receive requests, process them and can be terminated with very little thought. While there might (now) be options for storage, they're generally a bad idea to use due to how storage works with different AZs in cloud providers. You can do things like batch jobs and guaranteed runs inside of k8s, but that's a different design pattern from the typical application.

K8s is also not bulletproof. You'll often run into issues with DNS resolution at scale since the internal networking is highly dependent on it. It comes wide open by default, allowing you to make namespaces that do not isolate network traffic (a common mistake I hear about all the time). Implementing a service mesh is mandatory, as is some sort of tracing solution as you attempt to find which route is causing problems. Control plane problems are rare but catastrophic when they happen. Securing API endpoints and traffic in general requires diligence.
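
On that network isolation point specifically: creating a namespace does nothing for traffic by itself. You need a CNI plugin that enforces NetworkPolicy plus a policy along these lines before anything is actually walled off (the namespace name here is a placeholder):

# Default-deny all ingress and egress for pods in the "team-a" namespace.
# Only takes effect if your CNI plugin enforces NetworkPolicy.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress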

You also need to do a ton of research before you even start k8s. What CNI is best for you, how is external DNS going to work, how are you going to load balance traffic, where is SSL termination happening, how are you going to upgrade the nodes actually running these containers, monitoring and graphs and alerting and logs all need to be mapped out and planned. Are your containers secure? Do you understand your application dependencies between apps?

In order to run k8s at scale across a few hundred developers, a few people in your org need to become extremely familiar with how it works. They will be the technical troubleshooting resource as application teams attempt to work backwards to the problem space. These people will need to be available for questions all the time. Remember you are taking:

  • An entire cloud environment with IAM, VPCs, security policies, Linux servers, buckets, message queues, everything else
  • Adding effectively an entirely new environment on top of that. These two systems still interact and can cause problems for each other, but they have very little visibility from one into the other. This can cause infuriatingly hard-to-diagnose problems.

So in order to do this correctly, you need people who understand both problem spaces and can assist with sorting where each step is happening.

So when should I use Kubernetes?

  • You are spending a ton of money on containers that are being underused or adding additional capacity to try and keep uptime during failures. K8s is really amazing at resource management and handling node or pod failure.
  • You are deeply committed to microservices. K8s meshes well with typical microservices, allowing you to bake in monitoring and tracing along with clear relationships between microservices through service meshes.
  • Multicloud is a requirement. Be super careful about making this a requirement. It's often a giant mistake.
  • Vendor lock-in is a serious concern. Only applicable if you are a Fortune 500. Otherwise don't waste brain cycles on this.
  • Uptime is critical above all. The complexity pays off in uptime.
  • You are trying to find a bridge technology between your datacenter and the cloud

Otherwise use literally one of a million other options. Starting a new company? Use Lightsail. Too big for Lightsail? Try ECS. You can grow to an extremely large size on these much simpler technologies. If you have maxed out Lightsail, good news, you are now big enough to hire a team. If you decide to go to k8s understand and map out exactly what you want from it. Don't assume turning it on and pushing out some yaml is good enough.

Planning for k8s correctly

The best guide I've seen and one I've used a few times is the NSA k8s hardening guide. It walks through all the common issues and concerns with using k8s at scale, offering good common-sense recommendations at each step. If you are an organization considering using k8s or currently using it and are unsure whether you have it correctly set up, walking through this guide and discussing the points it brings up should at least allow you to approach the project with the correct mindset.

Node.js

Node by itself I have no issues with. I have no particular dog in that race of "should we be writing the backend and frontend in the same language". No, my issue with Node is really with NPM. After working with teams relying on NPM extensively to provide both private, internal packages and to build applications using external third-party dependencies, I'm convinced it is unsafe at any speed.

NPM: Easy over Safe

For a long time application development has followed a pretty stable path in terms of dependencies. You try to write as much as possible with the standard library of your given programming language, only pulling in third-party dependencies when it saved a tremendous amount of time or when they were taking care of something normal people shouldn't be getting involved in (like hashing passwords or creating database connections).

Typically you moved these dependencies into your project for the rest of the project's life, so during a code review process there was often a debate over "does it make sense to import x lines of code forever" in order to save time or add this new functionality. Often the debate hinged on maintenance burdens, where writing functionality using only the standard library made it relatively easy to upgrade between versions.

However dependencies, especially complicated or big dependencies, introduce a lot of possible long term issues. They themselves often have dependencies, or take advantage of more esoteric aspects of a language in order to solve for problems that your project often doesn't have (since the library has a wider use case than just you).

Here's an example. I was working on a PHP application with someone else and I needed to do some junk with S3. I (of course) added the official AWS SDK for PHP, thinking this was the correct way to do things. A more experienced developer stopped me, pointing out correctly that I was adding a ton of new code to the repo forever and maybe there was a better way to do this.

curl -s https://api.github.com/repos/aws/aws-sdk-php | jq '.size' | numfmt --to=iec --from-unit=1024
190M
That's a lot of MB for just using S3

As it turns out this was relatively easy to do without loading the entire SDK into our codebase forever and introducing a lot of new dependencies. However this back and forth consumes time and often requires folks to becomes much more familiar with the services their applications consume. Client libraries for SaaS often hide this complexity behind very simple methods that allow you to quickly drop in an API key and keep going.

Rise of Node

Node emerged onto the scene with a very different promise as compared to most server-side languages. What if, instead of front and backend teams working on different applications, forever waiting for new API endpoints to get added, both were maintained by the same group? Driven by these potential cost and effort savings, Node was introduced to the world and has (mostly) lived up to this core promise.

I remain impressed by the concurrency model for Node, which I feel I need to say in order to stop people from accusing me of attacking the platform simply because it is Javascript. I've worked with teams that have exclusively lived inside of Node for years, many of them have extremely positive things to say. There is certainly a part of me that is jealous of their ability to become deep experts on Javascript and using that skillset across a wide variety of tooling.

The real advantage of Node though was the speed by which teams could build complex functionality even with limited backend knowledge. The low barrier to entry to NPM meant anyone was able to very quickly take some classes and make them available to the entire internet. At first this seemed fine, with developers enjoying how easy it was to have reproducible builds across different workspaces and machines with NPM.

As time went on these endless lists of dependencies started to introduce real maintenance burdens well in excess of their value. By emphasizing speed over safety NPM (and by extension Node) has now become dangerous to recommend for anyone to use in handling customer data in virtually any capacity.

NPM Problems

On my first large Node application, I was shocked when I measured the ratio of third-party code to code in the actual application. We had almost a million lines of external code against our several thousand lines of application code. Graphing out these dependencies was like attempting to catch a money-laundering operation. Unlike, say, the PHP applications I had worked on in the past, it was impossible to even begin to vet this many external vendors. Since they're all required, we were open to security issues from any of them.

Let's walk through how I attempted to fix the problem.

  • Obviously started with the basics like npm audit, which works but is a purely reactionary element. It relies on someone discovering and reporting the issue and on developers maintaining the package. In a Python world where I am using maybe 6 big common dependencies, that makes sense. However in NPM land when I have thousands, many of which are maintained by one person, this isn't practical
  • Fine, can we vet each package before we add it? Even assuming I had the technical ability to assess each package against some sort of "known-good" parameter, the scale of the task would require a thousand people working full-time. Look at the dependency graph for just NPM itself which, remember, is often called by the root user in many tutorials.
Mother of god. Link
  • Alright well that's an impossible task. Maybe we can at least mitigate the problem by sticking with packages that are scoped. At least then we know it's from the right org and isn't the result of a typo. Turns out almost none of the most-downloaded packages from NPM are scoped. React, lodash, express, the list goes on and on. You can see the top NPM packages here.
  • Fine, is there a way I can tell if the NPM package at least requires 2FA to upload the thing? As far as I can tell, you cannot. Package maintainers can enforce 2FA, but I can't check to see if it's there.
  • Now NPM has actually taken some great steps here to ensure the package hasn't been tampered with through ECDSA which you can see here for an example: https://registry.npmjs.org/underscore/1.13.6 under the "signatures": [{    "keyid": "SHA256:{{SHA256_PUBLIC_KEY}}",    "sig": "a312b9c3cb4a1b693e8ebac5ee1ca9cc01f2661c14391917dcb111517f72370809..." section.
  • But I still have the same underlying issue. While I can verify that the person publishing the package signed it (which is great), I'm still importing thousands of them. Plus signing the package isn't required so while many of the big packages have it, many don't.

In Practice

What this means is Node apps start to "rot" at an extremely aggressive rate. While they're fast to write and get stood up, they require constant monitoring against security issues with dependency trees spanning thousands of packages. You also constantly run into nested dependency version conflicts (which NPM handles through nesting). In order to keep a Node application patched you need to be willing to override some package dependencies to get a secure version (however you don't know if the actual higher-level dependency will function with the new version or not).
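
For what it's worth, newer versions of npm (8.3 and up, if I remember right) do have an overrides field in package.json for exactly this "pin a vulnerable transitive dependency" situation. A minimal sketch with hypothetical package names:

{
  "name": "my-app",
  "dependencies": {
    "some-framework": "^4.2.0"
  },
  "overrides": {
    "vulnerable-transitive-lib": "2.3.1"
  }
}

After adding the override you still have to npm install and re-run your tests, since the parent package was never tested against the version you just forced on it.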

Example of a common pattern in Node application dependencies

Source

You are installing the same thing over and over with slightly different versions. You need to constantly monitor the entire dependency graph and check for declared issues, then hope against hope that everyone is checking everything for you on the internet and reporting like they should. Finally you need to hope that none of these maintainers get their stuff hijacked or simply give up.

Sustainable model of Node

I'm not sure what can be done now to fix this problem. I suspect a great option might be to force packages to "flatten" the dependency space, only allowing for top-level dependencies and not allowing for nested dependencies. However at this point the train has likely left the station.

If you are looking to write safe, long-term Node code, I think your only option is to be extremely aggressive about not allowing packages into your codebase and trying to use the Node Core Modules as outlined here as much as possible. I want to praise the NPM team for switching to a more sustainable signing system and hope it gains widespread adoption (which it seems on track to). Otherwise you basically need to hope and pray that nobody in the thousands of people you rely on decides to blow up your app today.

However in the desire to demonstrate how fast it is to start writing Node by using NPM, we have doomed Node by ensuring that long-term maintenance is nearly impossible. If you manually fix packages with known security issues and tests start failing, it can be a massive task to untangle the ball of yarn and figure out what specifically changed.

Conclusion

All of these technologies mostly deliver real, measurable value. They are often advancements over what came before. However organizations were often misled by tutorials and demos that didn't accurately portray the realities of running this technology, instead attempting to encourage adoption by presenting the most positive light possible. I don't blame enthusiasts for this, but I think as a field we should be more honest when showing new technology to peers.

As for the maintainers of these technologies, it ultimately hurts you and your reputation when you don't frontload the complexity into either the tooling or the documentation. I'm going to hit those sharp edges eventually, so why not prepare me for it from the beginning? If at some point I'm going to need to rewrite all of my containers, why not help me do it correctly from the start? If you know I'm going to need to constantly patch my NPM dependencies, maybe add a more robust chain of dependencies so I can audit what is coming in and from where.

Ultimately we help each other by being more honest. I want you to use my tool and get the maximum value out of it, even if the maximum value for you is not to use my tool.


The Book of CP-System

I adore classic arcade machines. They're interesting to look at, fun to play and designed for a different era of hardware and software design. While I've certainly spent some time poking around some of the FPGA code for classic arcade cores, I'm still pretty much a novice when it comes to the low-level technical specifications for these machines. The brand that I have the fondest memories of is the old Capcom machines.

Fabian Sanglard has written an amazing book on the internals of those early Capcom classics like Street Fighter 2. It goes into extreme detail on the legendary CPS-1 board: more information than I've ever seen before on how it worked, how arcades at the time worked and every gritty detail in between. It's a joy to read with amazing art and diagrams.

Come on, you are curious about how this all works. 

The book is available on a pay-what-you-want basis at his website. His system for generating the book is also pretty clever, with the source code available here.

There's also a great post about the PS2. Both were a lot of fun to read. It's always a blast to learn more about these machines that I spent hundreds of hours with as a kid. Link to Playstation 2 Architecture


TIL About a comparison table of Google internal tools to external products

I've never worked at Google, but I've been impressed with a lot of the data manipulation and storage tools inside of GCP like BigQuery and Cloud Spanner. If you ever worked at Google, this might be a great way to find a replacement for an internal tool you loved. For the rest of us, it's a fascinating look at the internal technology stack of Google.

You can find the comparison table of services here.


Battle of the Text Editors

I have always liked text editors. There is something promising about them, a blank canvas that you can quickly mess up with your random commented out lines. I went from BBEdit, my favorite Mac app of all time (seriously I love this app so much I bought a tshirt with their logo on it which you can get here) to Vim. I still use BBEdit all the time, mostly as a clipboard style storage location, notes or just outlining tasks. Vim + tmux has been my go-to now for 5+ years.

Now I know this is a contentious topic. If you like using a full IDE (Integrated Development Environment), know that I am not coming for you. It's not my preference but it is entirely valid and right. People interested in an ideological fight over this issue should remember you are asking people to spend 8 hours a day using this thing. There isn't a right or wrong answer to "what tool should you be using", it's more like picking a chair or a mattress. That said, I'm not qualified to really speak on what is the best IDE (although according to people I work with it's VS Code with plugins by a mile).

However with the rise of Rust there has been an explosion of new entries into the text editor space, with a lot of cool and novel ideas. I thought it would be fun to download some and try them out for a few hours each to get some actual work done. These won't be incredibly detailed reviews, but if there is interest I'm more than happy to go back and do a deeper dive into any individual one.

Should I use any of these to pay my bills?

That's up to you, but I will outline a few common recommendations below that I've used extensively for mission critical work along with why I like them. If you are just here looking for what should be your daily workhorse, it would be safer to pick from those. However a lot of the projects I'll discuss later are certainly safe to use, just might not have all the bells and whistles.

Here are some of the tools I've used for at least a few months in a row as my most-used applications along with any tips or add-ons I liked.

If you are ok with a learning curve

  • Vim. I understand Neovim is better, I still use standard Vim. I have not encountered any issues in 5+ years of daily use that would prompt me to change. However if you are just starting go with Neovim. At some point I'm sure I'll find some issue that causes me to switch.  

    Vim plugins: My former coworkers at Braintree maintain the best Vim dotfiles that I have used at every job since which you can grab here.

    Tutorial: I learned with Vim Golf which was actually pretty fun, but I've also heard a lot of good things about Vimified. Either should be fine, also Vim has a built-in tutorial which is more than enough to get you started. If you install Vim from Homebrew or from a linux package, just run vimtutor in your terminal, go through it and you'll have what you need to get started.
  • Tmux. You typically want to leave Vim running in something like Tmux just so you can keep multiple things running at the same time and switch between them. My workflow looks something like: a dev box I am SSH'd into, Vim open and then maybe docs or a man page. I use this for my Tmux config.

You just need something right now

What about Emacs?

I don't really know it well enough to speak to it. I'm open to running through a tutorial but I just never worked at a place where its use was widespread enough to justify sitting down and learning it. If there's a good tutorial though hit me up on Twitter and I'm glad to run through it. Link

I'm here for the new stuff, not this Lifehacker 2011 list.

Look, sometimes people stumble on these posts who shouldn't be reading the drivel I write and send me comments on Twitter. Disclaimer over, from here on we're here for the new and the weird.

Amp

Doesn't get more simple than that. Not loving the old copyright date.

Amp is a text editor with a focus on "batteries included". You open it with: amp [dir | file1 file2 ...] after installing it. On the first day using it, the big value is a great file finder. Hit space after opening it and Amp indexes all the files in the path, allowing you to very quickly search among them.

What takes a second to get used to is that this isn't a fuzzy file finder. Instead of typing full words, use parts of the path, separated by spaces. Here's what it looks like in practice:

I will say accuracy is spot on

This takes a second to get used to, but very quickly I felt like I was searching with more accuracy than ever. Weirdly, by default this didn't visually scale, so even if there were more results I didn't see them. You can change this in the config file documented here. I feel like 5 results is a crazy low default.

Typical text movement is the , and m keys for scrolling up and down. The h,j,k,l movement commands are there, along with w,b for word movement. Most of your interactions though will be through the jump mode, which is weird as hell.

What it seems to be doing is going through your file and changing the beginning of the word to be a two letter code that is unique. So you can very quickly jump to words. Now this would make sense if it prefixed the words with the characters, but on my installation it replaced the first two characters. This makes files pretty much impossible to work on.

Unfortunately, even though I really enjoyed the file search, this was the end of my experiment with Amp. It isn't clear whether the tool is still being updated and since so much of the mobility relies on Jump, this messing up the first two characters of every word makes my workflow impossible. I didn't see anything obviously wrong in my installation and I did install from source so I'm not sure what else could be done on my end to resolve this.

I'm really disappointed. I liked what I saw of Amp and I feel like there are some good ideas there. I love the focus on file management and searching. I'll keep an eye on the repo and revisit this if something new comes up.

Survived: .5 hours of my production work

Helix

Helix is maybe the first Rust application I've used where I had to look and see "wait this isn't Node right?" In case you think I'm being ridiculous:

Damn giving me that NPM energy

However upon starting Helix you instantly see why it's including every Rust crate. This is a robust text editor, aimed squarely at the Vims/Emacs of the world. The basic concept is one of multiple cursors, so you are attempting to remove one of the steps from Vim. Overall the goal is a batteries included text editor that handles syntax in a slightly different way.

Syntax Tree

One feature touted in Helix is an integration with Tree-sitter. The idea is to get a more concrete representation of the structure of your code. You've seen this before, but it's not in Vim and is still an experimental feature in Neovim. It includes built-in support for a lot of languages and in my testing works well, allowing you to search through the tree and not just the text. Basically you can now select nodes vs raw text.

Language Server

Similar to Neovim (and available on Vim 8 through a plugin), there is language server support. I typically lump this more into IDE features but it is a nice touch, especially for folks who have gotten used to it. I found myself using it quite a bit, which is unusual since I don't have it installed in my Vim setup. It turns out I might have been wrong about language servers all these years. You can see the languages supported here.

Enough of the features, how does it work?

I walked into using Helix expecting not to like it. I was completely wrong. Everything about it is a complete delight to use in a way that is so effortless that I wondered in the back of my mind "have I actually used this product before and forgotten?" You start it with hx on the command line and access all commands with :. On first startup, I was already up and running, editing work files quickly.

Minute to minute it does feel a lot like Vim, which for me is a huge perk but might not be for you. The basic flow is reversed though, so instead of action -> object it's object -> action. To delete a word in Vim is dw, delete word. In Helix it is wd. There is a fair amount of muscle memory for me around this, so it isn't an instant change. However the payoff is pretty massive.

Multiple Cursors

You can obviously manipulate text on a global scale with Vim. Search for a word then edit the second selection with cgn. You can then iterate through the text with the . tool. For more complex situations you can run macros over the entire file. While this functionality exists, Helix with the multiple cursors makes it easier to pick up on. Go to the beginning of the text, hit the %, then s and you have multiple cursors to manipulate text.

Trying to help all the time

Hitting g in a file brings up all the goto commands, which is really nice when you are starting.

You'll also notice the language server commands called out here with d. This allows you to jump to the definition as determined by LSP. By default this didn't work with Terraform, but it provided a good chance to try out adding one. There was a slight issue with hcl not being included in the GitHub list of supported languages but it was on the list of languages in their docs. GitHub link here.

Anyway it looks like it uses the standard Terraform language server, so nbd. For some reason the language docs don't link to the language server they mean, but I'm glad to go through and make a PR for that later. We install the Terraform language server, run :config-open and write out a new language entry. Except maybe we don't need to?

Here is where it gets a little bit confusing. My initial read of the documentation here suggested that I need to write a new config to get the language server to work. However that actually isn't the case. I just happened to run hx --health and bam, Terraform is set up. I confirmed yes it does in fact just work with the default config. The Helix team explains I don't need to do anything here but I must have missed it.

Anyway, if it's on the list just install the language server and then confirm with hx --health that it has picked it up.

Sure enough it just works

In terms of theme I actually like the purple, but that's extremely customizable. Find those docs here. However they also include most of the normal terminal themes in the box, meaning you can try out whatever moves you with :themes. Here are the ones that come with it:

Just type what you want and the theme instantly switches:

Large Files

Sometimes I'm a bad Vim user. I will admit that I, from time to time, will open giant log files and attempt to search them in Vim instead of doing what I should do and using a tool built for searching large bodies of text. Typically Vim will chug while doing this, which is totally understandable, but I was curious how Helix would hold up.

Sample file is: 207M Sep  4 21:19 giant_file.txt and opening it in Vim does cause noticeable slow-downs. Helix opens instantly, like zero visible change to me from a tiny file. Making an even larger file, 879M Sep  4 21:22 huge_file.txt, I decide to test again. BEFORE YOU PING ME ON TWITTER: I know this is because I have a lot of plugins. Helix also comes with a lot of stuff, so I don't consider this to be that insane of a test. Don't agree? Run your own tests.

With the 800M file Vim absolutely chugs, taking forever to show the first line of text. Helix is once again instant. Everything works really quickly and if I didn't know the file was huge, it wouldn't occur to me. I'm super impressed.

Survived: 20 hours of my production work but really could be endless

I didn't walk into this expecting to find a Vim replacement for me, but Helix is really compelling. It's fast, the batteries-included mentality shines since the editor comes ready to get to work, and everything is really discoverable in a way that just isn't true with Vim. An expert might know how to do everything Helix does in Vim, but there's no way a new user would be able to find that information as easily in Vim as you can in Helix.

Progress on the editor is steady and impressive, but even in the current state I didn't find a lot to dislike. Performance was extremely stable testing on MacOS Intel and M1 along with Linux ARM and x86. If you are interested in the terminal editor life but want something with a lot more of the modern features built in, Helix is a world class product. The team behind it should be super proud.

If there is interest I would be happy to do a longer in-depth thing about Helix. Just let me know on Twitter.

Kibi

How much text editor can you get into less than 1024 lines of code? More than you might think! Kibi is an interesting project, inspired by kilo. It's a pretty basic text editor, but with some nice features. Think of it more like nano as compared to Helix. It aspires to do a bit less but really excels at what it sets out to do.

Most of the main functionality is visible on the page, but you can set syntax highlighting manually by following the steps here. It is unfortunately missing a few features I would love, things like displaying the tree of a directory if you pass it a directory. Instead passing it a directory throws an IO error, which is understandable.

If you are just starting out and looking for a simple text editor, really easy to recommend this one. For people especially who work with one type of file a lot, it's very simple to add syntax highlighting. My hope would be at some point a community collection of common syntax highlighting is added along with some file path support. Unfortunately without those things, there isn't that much for me to test.

Some of the shortcuts didn't work for me either. CTRL-D duplicated but CTRL-R didn't seem to remove the line. The one feature I REALLY liked was CTRL-E, which allows you to execute a shell command and then inputs the value into the line with the cursor. This is really great for a lot of sysadmin tasks where you are switching between tmux panes, putting the output of a command into the file.

This little tool would really get to the next level with just a few adjustments. Maybe add CTRL+arrow keys for jumping to the next word, some common highlighting options. However it's not bad for a nano replacement, which is pretty high praise considering the legacy of that app.

Survived: 1 hour of my production work but not a totally fair comparison

Lapce

Now before you get into it with me, this is a BIT of a stretch. However because it supports modal editing I'm going to consider it. I was mostly curious what a GUI Rust app looks like as a text editor.

My first impression was....mixed.

I like the terminal at the bottom, but there didn't seem to be window resizing and it didn't support terraform (which is fine). However most baffling was after I selected this directory as my project I couldn't figure out how to get out of it. I know that's ironic as a Vim user complaining that I couldn't escape a project but I was a little baffled. Where is the file explorer?

Blue icon connects you to an SSH host

It's actually the file name drop-down centered at the top where you navigate to other projects. You use : to access all sorts of commands within the editor.

Once you see it, it makes sense but it was a little baffling at first. 

This would be one of the many UI things I ran into which I thought was unusual. The app doesn't follow most of the MacOS conventions, a strange choice I see a lot with new desktop applications using Rust frameworks. Settings isn't where it should be, under the App name -> Preferences.

Instead it is under a gear icon in the top right corner.

Settings has everything you need but again not very MacOS, not really following normal layout. It's all here though.

Once I opened a codebase that the editor supported, it seemed to work fine. To be honest it's roughly the same experience as whenever I open an IDE. This one feels very snappy and I think there's a lot of good promise here, but it doesn't feel particularly Vim-like.

The primary value of Lapce, from what I could tell, is its great SSH integration: just click the blue cloud icon, enter the SSH path of username@hostname and it works great. It also does really well with big projects, handling lots of text without slowing down. However since I don't really work in big codebases like that, the SSH integration is nice but I'm not really able to put it through its paces.

However if you do work with a lot of big files and find other IDEs sluggish, it's probably worth trying. Once you get a handle on the UI it all works pretty well.

Survived: 8 hours

I think if I was more of an IDE person and willing to put in the time Lapce could work fine for my workflow. I just didn't see anything that really spoke to me that would justify the time invested.

Conclusion

It's really an exciting time to be a programmer for a lot of reasons. We have more mechanical keyboards, more monitors and more text editors. But in all seriousness I love that people are out there making this stuff and I think all of them should be proud. It's really exciting to be able to offer people new options in this space instead of repeating the five or so dominant platforms.

If you are looking for something new to try right now, Helix is really impressive and I think hard to beat.

I strongly encourage people to download these, try them out, or PLEASE let me know if there is a good one I missed. I'm more than happy to do a review of another text editor at pretty much any time. You can find me on Twitter.


Grief and Programming

Grief and Programming
Once again I asked, “Father, I want to know what a dying person feels when no one will speak with him, nor be open enough to permit him to speak, about his dying.”
The old man was quiet, and we sat without speaking for nearly an hour. Since he did not bid me leave, I remained. Although I was content, I feared he would not share his wisdom, but he finally spoke. The words came slowly.
“My son, it is the horse on the dining-room table. It is a horse that visits every house and sits on every dining-room table—the tables of the rich and of the poor, of the simple and of the wise. This horse just sits there, but its presence makes you wish to leave without speaking of it. If you leave, you will always fear the presence of the horse. When it sits on your table, you will wish to speak of it, but you may not be able to.
The Horse on the Dining-Room Table by Richard Kalish

The Dog

I knew when we walked into this room that they didn't have any good news for me. There's a look people have when they are about to give you bad news and this line of vets staring at me had it. We were at Copenhagen University Animal Hospital, the best veterinary clinic in Denmark. I had brought Pixel, my corgi here, after a long series of unlikely events had each ended with the worst possible scenario. It started with a black lump on his skin.

We had moved Pixel from Chicago to Denmark, a process that involved a lot of trips to the vet, so when the lump appeared we noticed quickly. A trip to our local vet reassured us though; the vet took a look, prescribed antibiotics and told us to take him home and keep it clear. It didn't take long for us to realize something worse was happening. He kept scratching the thing open, dripping blood all over the house. My wife and I would wrestle with him in the bathroom every night after we put our newborn baby to bed, trying to clean the wound with the angled squirt bottle we had purchased to help my wife heal after childbirth. He ended up bandaged up and limping around, blood slowly seeping through the bandages.

After a few days the wound started to smell and we insisted the vet take another look. She glanced at it, proclaimed it extremely serious and had us go to the next tier of care. They took care of the lump but told me "we are sure it is cancer" after taking a look at a sample of what they removed. I did some research and was convinced that, yes while it might be cancer, the chances of it being dangerous were slim. It wasn't uncommon for dogs to develop these lumps and since it had been cleanly removed with good margins, we had a decent chance.

Except we didn't. This nice woman, the professor of animal oncology, looked at me while Pixel wandered around the room glancing suspiciously at the vet students. I marveled at how calm they seemed as she delivered the news. He wasn't going to get better. She didn't know how much time I had. It wasn't going to be a long time. "Think months, not years" she said, trying to reassure me that I at least had some time.

For a month after I was in a fog, just trying to cope with the impending loss of my friend. This vision I had that my daughter would know this dog and get to experience him was fading, replaced by the sad reality of a stainless steel table and watching a dog breathe his last breath, transforming into a fur rug before your eyes. It was something I had no illusions about, having just done it with our cat who was ill.

Certainly, this would be the worst of it. My best friend, this animal that had come with me from Chicago to Denmark and kept me company alone for the first month of my time in the country. This dog who I had purchased in a McDonald's parking lot from a woman who assured me he "loved cocktail wieners and Law and Order", this loss would be enough. I could survive this and recover.

I was wrong though. This was just the beginning of the suffering.

Over the next three months I would be hit with a series of blows unlike anything I've ever experienced. In quick succession I would lose all but one of my grandparents, my brother would admit he suffered from a serious addiction to a dangerous drug and my mother would get diagnosed with a hard-to-treat cancer. You can feel your world shrink as you endlessly play out the worst case scenario time after time.

I found myself afraid to answer the phone, worried that this call would be the one that gave me the final piece of bad news. At the same time, I became desperate for distractions. Anything I could throw myself into, be it reading or writing or playing old PS1 videogames and losing over and over. Finally I hit on something that worked and I wanted to share with the rest of you.

Programming is something that distracted me in a way that nothing else did. It is difficult enough to require your entire focus yet rewarding enough that you still get the small bursts of "well at least I got something done". I'll talk about what I ended up making and what I'm working on, some trial and error I went through and some ideas for folks desperate for something, anything, to distract them from the horrors.

Since this is me, the first thing I did was do a ton of research. I don't know why this is always my go-to but here we are. You read the blog of an obsessive, this is what you get.

What is Grief?

At a basic level, grief is simply the term that indicates one's reactions to loss. When one suffers a large loss, you experience grief. More specifically grief refers to the heavy toll that weighs on the bereaved. We often discuss it in the context of emotions, the outpouring of feelings that is the most pronounced sign of grief.

In popular culture grief is often presented as 5 stages which include denial, anger, bargaining, depression and acceptance. These are often portrayed as linear, with each person passing through one to get to the other. More modern literature presents a different model. Worden suggested in 2002 that we think of grief as being a collection of tasks, rather than stages. The tasks are as follows:

  1. Accept the reality of the loss
  2. Work through the pain of grief
  3. Adjust to an environment in which the deceased is missing
  4. To emotionally relocate the deceased and move on with life.

One thing we don't discuss a lot with grief is the oscillation between two different sets of coping mechanisms. This is something Stroebe and Schut (1999) discussed at length, that there is a loss oriented coping mechanism and a restoration oriented coping mechanism for loss.

What this suggests is that, contrary to some discourse on the topic, you should anticipate needing to switch between two different mental modes. There needs to be part of you that is anticipating the issues you are going to face, getting ready to be useful for the people you care about who are suffering.

Aren't these people still alive?

Yes, so why am I so mentally consumed with thinking about this? Well it turns out there is a term for this called "anticipatory grief". First introduced by Lindemann (1944), it's been a topic in the academic writings around grief ever since. At a basic level it is just grief that happens before but in relation to impending death. It is also sometimes referred to as "anticipatory mourning".

One thing that became clear talking to people who have been through loss and death, and who experienced similar feelings of obsession before the loss happened, is that it doesn't take the place of actual grieving and mourning. This shook me to my core. I was going to obsess over this for years, it was going to happen anyway, and then I was still going to go through the same process.

The other thing that made me nervous was how many people reported chronic grief reactions. These are the normal stages of grief, only they are prolonged in duration and do not lead to a reasonable outcome. You get stuck in the grieving process. This is common enough that it is being proposed as a new DSM category of Attachment Disorders. For those that are curious, here are the criteria:

  • Criterion A. Chronic and disruptive yearning, pining, and longing for the deceased
  • Criterion B. The person must have four of the following eight remaining symptoms at least several times a day or to a degree intense enough to be distressing and disruptive:
  1. Trouble accepting the death
  2. Inability to trust others
  3. Excessive bitterness or anger related to the death
  4. Uneasiness about moving on
  5. Numbness/detachment
  6. Feeling life is empty or meaningless without the deceased
  7. Envisioning a bleak future
  8. Feeling agitated
  • Criterion C. The above symptom disturbance must cause marked and persistent dysfunction in social, occupational, or other important domains
  • Criterion D. The above symptom disturbance must last at least six months

Clearly I needed to get ready to settle in for the long haul if I was going to be of any use to anyone.

After talking to a support group it was clear the process really has no beginning or end. They emphasized getting a lot of sleep and trying to keep it as regular as possible. Don't end up trapped in the house for days not moving. Set small goals and basically be cautious, don't make any giant changes or decisions.

We talked about what a lot of them did and much of the discussion focused around a spiritual element. I'm not spiritual in any meaningful way. While I can appreciate the architecture of a church or a place of worship, seeking a higher power has never been a large part of my life. The less religious among them talked about creative outlets. These seemed more reasonable to me. "You should consider poetry" one man said. I didn't want to tell him I don't think I've ever really read poetry, much less written it.

What even is "focus" and "distractions"

I'm not going to lie. I figured I would add this section as a fun "this is how distractions and focus work in the human mind" with some graphs and charts. After spending a few days reading through JSTOR, I had no such simple charts or graphs. The exact science behind how our minds work, what focus is and what a distraction is, is a complicated topic so far beyond my ability to do it justice that it is laughable.

So here is my DISCLAIMER. As far as I can tell, we do not know how information is processed, stored or recalled in the brain. We also aren't sure how feelings or thinking even works. To even begin to answer these questions, you would need to be an expert in a ton of disciplines which I'm not. I'm not qualified to teach you anything about any topic related to the brain.

Why don't we know more about the human brain? Basically neurons are extremely small and the brain is very complicated. Scientists have spent the last 40 years mapping the brains of smaller organisms, with some success. The way scientists seem to study this is: define the behavior, identify the neurons involved in that behavior, then perturb those neurons to see their role in changing the behavior. This is called "circuit dynamics". You can see a really good write-up here.

As you add complexity to the brain being studied, drawing these connections gets more and more complicated. Our tools have improved as well, allowing us to tag neurons with different genes so they show up as different colors when studied, to better understand their relationships in more complex creatures.

Retinal ganglion cells of a genetically modified mouse, imaged with the Brainbow technique to demonstrate the complexity of these systems.

Anyway brains are really complicated and a lot of the stuff we repeat back to each other about how they work is, at best, educated guesses and at worst deliberately misleading. The actual scientific research happening in the field is still very much in the "getting a map of the terrain" step.

Disclaimer understood get on with it

Our brains are constantly being flooded with information from every direction. Somehow our mind has a tool to enable us to focus on one piece of information and filter out the distractions. When we focus on something, the electrical activity of the neocortex changes. The neurons there stop signalling in-sync and start firing out of order.

Why? Seemingly to allow us to let the neurons respond to different information in different ways. This means I can focus on your voice and filter out the background noise of the coffee shop. It seems the cholinergic system (don't stress I also had never heard of it) plays a large role in allowing this sync/out of sync behavior to work.

It turns out the cholinergic system regulates a lot of stuff. It handles the sensory processing, attention, sleep, and a lot more. You can read a lot more about it here. This system seems to be what is broadcasting to the brain "this is the thing you need to be focused on".

So we know that there is a system, grounded in science, for how focus and distraction work. However that isn't the only criterion our minds use to judge whether a task was good or bad. We also rely on the dopamine floating around in our skulls to tell us "was that thing worth the effort we put into it". In order for the distraction to do what I wanted, it had to be engrossing enough to trigger my brain into focusing on it and also give me the dopamine to keep me invested.

The Nature of the Problem

There is a reality when you start to lose your family that is unlike anything I've ever experienced. You are being uprooted, the world you knew before is disappearing. The jokes change, the conversations change, you and the rest of your family have a different relationship now. Grief is a cruel teacher.

There is nothing gentle about feeling these entities in your life start to fade away. Quickly you discover that you lack the language for these conversations, long pauses on the phone while you struggle to offer anything up that isn't a glib stupid greeting card saying.

Other people lack the language too and it is hard not to hate them for it. I feel a sense of obligation to give the people expressing their sadness to me something, but I already know as I start talking that I don't have it. People are quick to offer up information or steps, which I think is natural, but the nature of illness is you can't really "work the problem". The Americans obsess over what hospital network she got into, the Danes obsess over how she is feeling.

You find yourself regretting all the certainty you had before. There was an idea of your life and that idea, as it turns out, had very little propping it up. For all the years invested in one view of the universe, this is far bigger. I'm not in it yet, but I can see its edges and interacting with people who are churning in its core scares me. My mind used to wander endlessly, but no more. Left to its own devices it will succumb to intense darkness.

Denial helps us to pace our feelings of grief. There is a grace in denial. It is nature's way of letting in only as much as we can handle.
Elisabeth Kubler-Ross

Outside of the crisis you try and put together the memories and pieces into some sort of story. Was I good to Pixel? Did I pay enough attention to the stories my grandparents told me, the lessons and the funny bits? What kind of child was I? You begin to try and put them together into some sort of narrative, some way of explaining it to other people, but it falls apart. I can't edit without feeling dishonest. Without editing it becomes this jumble of words and ideas without a beginning or end.

I became afraid of my daughter dying in a way I hadn't before. Sure I had a lot of fear going into the birth process and I was as guilty as anyone of checking on her breathing 10 times a night at first. The first night with her in the hospital, me lying there on a bed staring at the ceiling while our roommate sobbed through the night, her boyfriend and baby snoozing away peacefully, I felt this fear for the first time. But I dealt with it, ran the numbers, looked at the stats, did the research.

Now I feel cursed with the knowledge that someone has to be the other percent. Never did I really think I was blessed, but I had managed to go so long without a disaster that it felt maybe like I would continue to skate forever. Then they hit all at once, stacked on each other. Children can die from small falls, choking on reasonable pieces of food or any of a million other things. Dropping her off at daycare for the first time, I felt the lack of control and almost turned around.

You can't curse your children to carry your emotional weight though. It is yours, the price of being an adult. My parents never did and I'm not going to be the first. A lot of stuff talks about solving grief but that's not even how this really works. You persist through these times, waiting for the phone call you know is coming. That's it, there isn't anything else to do or say about it. You sit and you wait and you listen to people awkwardly try to reassure you in the same way you once did to others.

Why Programming?

Obviously I need something to distract me. Like a lot of people in the tech field, I tend to hobby-bounce. I had many drawers full of brief interests where I would obsess over something for a month or so, join the online community for it, read everything I could find about it and then drop it hard. When I moved from Chicago to Denmark though, all that stuff had to go. Condense your life down to a cargo container and very soon you need to make hard choices about what comes and what gets donated. This all means that I had to throw out any idea which involved the purchase of a lot of stuff.

Some other obvious stuff was out unfortunately. Denmark doesn't have a strong volunteering movement, which makes sense in the context of "if the job is worth doing you should probably pay someone to do it". My schedule with a small child wasn't amenable to a lot of hours spent outside of the house with the exception of work.

My list of requirements looked like this:

  • Not a big initial financial investment
  • Flexible schedule
  • Gave me the feeling of "making something"
  • Couldn't take up a lot of physical space

How to make it different from work

From the beginning I knew I had to make this different from what I do at work. I write code at work, and I didn't want this to turn into something where I end up working on stuff for work at all hours. So I laid out a few basic rules to help keep the two things separate from the get-go.

  • Use the absolute latest bleeding beta release of the programming language. Forget stability, get all the new features.
  • Don't write tests. I love TDD but I didn't want this to be "a work product".
  • Do not think about if you are going to release it into the public. If you want to later, go for it, but often when I'm working on a small fun project I get derailed thinking "well I need tests and CI and to use the correct framework and ensure I have a good and simple process for getting it running" etc etc.
  • Try to find projects where there would be a good feedback loop of seeing progress.
  • If something isn't interesting, just stop it immediately.

I started with a tvOS app. The whole OS exists to basically play video content, it is pretty easy to write and test, and plus it is fun. Very quickly I felt it working, this distraction where I felt subconsciously I was allowed to be distracted because "this is productive and educational". It removed the sense of guilt I had when I was playing videogames or reading a book, tortured by a sense of "you should be focusing more on the members of your family that are going through this nightmare".

The tutorial I followed was this one. Nothing but good things to say about the experience, but I found a lot of the layout stuff to not really do much for me. I've never been that interested in CSS or web frontend work, not that it isn't valuable work I just don't have an eye for it. On to the next thing then.

From there I switched to GUI apps based in large part on this post. I use CLI tools all the time, it might be nice to make simple Python GUIs of commonly used tools for myself. I started small and easy, since I really wanted to get a fast feeling of progress.

The only GUI framework I've ever used before is tkinter which is convenient because it happens to be an idiot-proof GUI design. Here's the basic Hello World example:

from tkinter import *
from tkinter import ttk
root = Tk()
frm = ttk.Frame(root, padding=10)
frm.grid()
ttk.Label(frm, text="Hello World!").grid(column=0, row=0)
ttk.Button(frm, text="Quit", command=root.destroy).grid(column=1, row=0)
root.mainloop()

Immediately you should see some helpful hints. First, it's a grid, so adding or changing the relative position of the label is easy. Second, we get a lot of flexibility around things that can sometimes be a problem with GUIs, like auto-resizing.

Adding window.columnconfigure(i, weight=1, minsize=75) and window.rowconfigure(i, weight=1, minsize=50) allows us to quickly add resizing and by changing the weights we can change which column or row grows at what rate.
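
Here's a minimal sketch of what that looks like bolted onto the Hello World example above (the weights and minimum sizes are just illustrative numbers, not anything the docs mandate):

import tkinter as tk
from tkinter import ttk

root = tk.Tk()
# Let the frame's cell grow as the window is resized
root.columnconfigure(0, weight=1)
root.rowconfigure(0, weight=1)

frm = ttk.Frame(root, padding=10)
frm.grid(sticky="nsew")  # stretch the frame to fill its cell

# Give each column a weight so the label and button spread apart on resize
for i in range(2):
    frm.columnconfigure(i, weight=1, minsize=75)
frm.rowconfigure(0, weight=1, minsize=50)

ttk.Label(frm, text="Hello World!").grid(column=0, row=0)
ttk.Button(frm, text="Quit", command=root.destroy).grid(column=1, row=0)
root.mainloop()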

For those unaware of how GUI apps work, it's mostly based on Events and Event Handlers. Something happens (a click or a key is pressed) and you execute something as a result. See the Quit button above for a basic example. Very quickly you can start making fun little GUI apps that do whatever you want. The steps look like this:

  1. Import tkinter and make a new window

import tkinter as tk

window = tk.Tk()
window.title("Your Sweet App")
window.resizable(width=False, height=False)

  2. Make a widget with a label and assign it to a Frame

start = tk.Frame(master=window)
output_value = tk.Entry(master=start, width=10)
input_label = tk.Label(master=start, text="whatever")

  3. Set up your layout, add a button to trigger an action and then define a function which you call on the Button with the command argument (a quick sketch of this follows).
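
To make that concrete, here's a minimal sketch of steps 2 and 3 wired together. The handle_click function and the Go button are my own illustrative additions, not from the tutorial:

import tkinter as tk

window = tk.Tk()
window.title("Your Sweet App")

# Step 2: a Frame holding an Entry and a Label
start = tk.Frame(master=window)
start.pack(padx=10, pady=10)
output_value = tk.Entry(master=start, width=10)
input_label = tk.Label(master=start, text="whatever")

# Step 3: a function that the Button's command argument points at
def handle_click():
    # The event handler: copy whatever was typed into the Label
    input_label["text"] = output_value.get() or "whatever"

button = tk.Button(master=start, text="Go", command=handle_click)

# The layout: everything on a small grid inside the frame
output_value.grid(row=0, column=0)
button.grid(row=0, column=1)
input_label.grid(row=1, column=0, columnspan=2)

window.mainloop()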

Have no idea what I'm talking about? No worries, I just did this tutorial and an hour later I was back up and running. You can make all sorts of cool stuff with Ttk and Tkinter.

Defining functions that basically run shell commands and output their value was surprisingly fun. I started with really basic ones like host.

  1. Check to make sure it exists with shutil.which link
  2. Run the command and format the output
  3. Display the output in the GUI (a rough sketch of these steps follows).
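
A rough sketch of those three steps, with run_host being my own name for the function (the GUI half is just the Button-and-Label wiring from the earlier example):

import shutil
import subprocess

def run_host(domain: str) -> str:
    # 1. Check the command actually exists on this machine
    if shutil.which("host") is None:
        return "host is not installed"
    # 2. Run the command and format the output
    result = subprocess.run(["host", domain], capture_output=True, text=True)
    return result.stdout.strip() or result.stderr.strip()

# 3. Display in the GUI: hang run_host() off a Button's command and push
#    the returned string into a Label or Text widget.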

After that I made a GUI to check DKIM, SPF and DMARC. Then a GUI for seeing all your AWS stuff and deleting it if you wanted to. A simple website uptime checker that made a lambda for each new website you added in AWS and then had the app check an SQS queue when it opened to confirm the status. These things weren't really useful but they were fun and different.
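
For the curious, the SPF/DMARC half of that is mostly just TXT record lookups. A minimal sketch, assuming the third-party dnspython package (the domain is a placeholder):

import dns.resolver  # third-party: pip install dnspython

def txt_records(name: str) -> list[str]:
    # Pull every TXT record for a name back as a plain string
    return [r.to_text().strip('"') for r in dns.resolver.resolve(name, "TXT")]

domain = "example.com"
spf = [r for r in txt_records(domain) if r.startswith("v=spf1")]
dmarc = [r for r in txt_records(f"_dmarc.{domain}") if r.startswith("v=DMARC1")]
print("SPF:", spf)
print("DMARC:", dmarc)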

The trick is to think of them like toys. They're not projects like you have at work, it's like Legos in Python. It's not serious and you can do whatever you want with them. I made a little one to check if any of the local shops had a bunch of hard to find electronics for sale. I didn't really want the electronics but it burned a lot of time.

Checking in

After this I just started including anything random I felt like including. Config files in TOML, storing state in SQLite, whatever seemed interesting. All of it is super easy in Python and with the GUI you get that really nice instant feedback. I'm not pulling containers or doing anything that requires a ton of setup. It is all super fast. This has become a nice nightly ritual. I get to sit there, open up the tools that I've come to know well, and very slowly change these various little GUIs. It's difficult enough to fully distract, to suck my whole mind into it.
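
As an example of how little ceremony that takes, here's a minimal sketch of a TOML config plus SQLite state. The file names and keys are made up for illustration, and tomllib is only in the standard library from Python 3.11 onward (older versions need the tomli package):

import sqlite3
import tomllib  # stdlib since Python 3.11; use the tomli package before that

# Read settings from a config file
with open("toy.toml", "rb") as f:
    config = tomllib.load(f)

# Keep a tiny bit of state between runs
db = sqlite3.connect(config.get("db_path", "toy.sqlite3"))
db.execute("CREATE TABLE IF NOT EXISTS runs (ts TEXT DEFAULT CURRENT_TIMESTAMP)")
db.execute("INSERT INTO runs DEFAULT VALUES")
db.commit()
print(db.execute("SELECT COUNT(*) FROM runs").fetchone()[0], "runs so far")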

I can't really tell you what will work for you, but hopefully this at least gave you some idea of something that might. I wish I had something more useful to say to those of you reading this who are suffering. I'm sorry. It seems to be the only honest sentiment I have heard.


Consider using Kubernetes ephemeral debug container

I'll admit I am late to learning these exist. About once a month, maybe more, I need to spin up some sort of debug container inside of Kubernetes. It's usually for something trivial like checking networking or making sure DNS isn't being weird, and it normally happens inside of our test stack. Up until this point I've been using a conventional deployment, but it turns out there is a better option: kubectl debug.

It allows you to attach containers to a running pod, meaning you don't need to bake the troubleshooting tools into every container (because you shouldn't be doing that). The Kubernetes docs go into a lot more detail.

In combination with this, I found some great premade debug containers. Lightrun makes these Koolkits in a variety of languages. I've really found them to be super useful and well-made. You do need Kubernetes v1.23 or above, but the actual running is super simple.

kubectl debug -it  --image=lightrun-platform/koolkits/koolkit-node --image-pull-policy=Never --target=

In addition to node they have Golang, Python and JVM. With Python it's possible to debug most networking problems with simple commands. For example, a DNS lookup is just:

import socket

# getaddrinfo needs a port as well as a host; any port will do for a lookup
socket.getaddrinfo("matduggan.com", 443)

I regularly use the socket library for tons of networking troubleshooting and ensuring ports and other resources are not blocked by a VPC or other security policy. You can find those docs here.
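
The kind of check I mean is usually just a TCP connect with a timeout. A minimal sketch (port_open is my own helper, not a socket API):

import socket

def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    # If a VPC rule or security group blocks the port, this times out or gets refused
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(port_open("matduggan.com", 443))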

For those in a datacenter consider using the excellent netmiko library.


Programming in the Apocalypse

A climate change what-if 

Over a few beers...

Growing up, I was obsessed with space. Like many tech workers, I consumed everything I could find about space and space travel. Young me was convinced we were on the verge of space travel, imagining I would live to see colonies on Mars and beyond. The first serious issue I encountered with my dreams of teleporters and warp travel was the disappointing results of the SETI program. Where was everyone?

In college I was introduced to the Great Filter. Basically a bucket of cold water was dumped on my obsession with the Drake equation as a kid.

When I heard about global warming, it felt like we had plenty of time. It was the 90s, anything was possible, we still had time to get ahead of this. Certainly this wouldn't be the thing that did our global civilization in. We had so much warning! Fusion reactors were 5-7 years away according to my Newsweek for kids in 1999. This all changed when I met someone who worked on wind farm design in Chicago who had spent years studying climate change.

She broke it down in a way I'll never forget. "People talk about global warming in terms of how it makes them feel. They'll discuss being nervous or unsure about it. We should talk about it like we talk about nuclear weapons. The ICBM is on the launch pad, it's fueled and going to take off. We probably can't stop it at this point, so we need to start planning for what life looks like afterwards."

The more I read about global warming, the more the scope of what we were facing sunk in. Everything was going to change, religion, politics, economics and even the kinds of food we eat. The one that stuck with me was saffron for some odd reason. The idea that my daughter might not get to ever try saffron was so strange. It exists, I can see it and buy it and describe what it tastes like. How could it and so many other things just disappear? Source

To be clear, I understand why people aren't asking the hard questions. There is no way we can take the steps to start addressing the planet burning up and the loss of so much plant and animal life without impacting the other parts of our lives. You cannot consume your way out of this problem, we have to accept immense hits to the global economy and the resulting poverty, suffering and impact. The scope of the change required is staggering. We've never done something like this before.

However let us take a step back. I don't want to focus on every element of this problem. What about just programming? What does this field, along with the infrastructure and community around it, look like in a world being transformed as dramatically as when electric light was introduced?

In the park a few nights ago (drinking in public is legal in Denmark), two fellow tech workers and I discussed just that. Our backgrounds are different enough that I thought the conversation would be interesting to the community at large and might spark some good discussion. It was a Dane, an American (me) and a Hungarian. We ended up debating for hours, which suggests to me that maybe folks online would end up doing the same.

The year is 2050

The date we attempted to target for these conversations is 2050. While we all acknowledged that it was possible some sort of magical new technology comes along which solves our problems, it felt extremely unlikely. With China and India not even starting the process of coal draw-down yet it doesn't appear we'll be able to avoid the scenario being outlined. The ICBM is going to launch, as it were.

It's quite a bit hotter

Language Choice

Language popularity based on GitHub as of Q4 2021 Source

We started with a basic question: "What languages are people using in 2050 to do their jobs?" Some came pretty quickly, with all of us agreeing that C, C# and PHP were likely to survive in 2050. There are too many systems out there that rely on those languages, and we all felt that wide-scale replacement of existing systems in 2050 was likely to slow or possibly stop. In a world dealing with food insecurity and constant power issues, the appetite to swap existing working systems will, we suspect, be low.

These graphs and others like them suggest a very different workflow from today. Due to heat and power issues, it is likely that disruptions to home and office internet will be a much more common occurrence. As flooding and sea level rise disrupt commuting, working asynchronously is going to be the new norm. So the priorities in 2050, we think, might look like this:

  • dependency management has to be easy and ideally not something you need to touch a lot, with caching similar to pip.
  • reproducibility without the use of bandwidth heavy technology like containers will be a high priority as we shift away from remote testing and go back to working on primarily local machines. You could also just be more careful about container size and reusing layers between services.
  • test coverage will be more important than ever as it becomes less likely that you will be able to simply ping someone to ask them a question about what they meant or just general questions about functionality. TDD will likely be the only way the person deploying the software will know whether it is ready to go.

Winners

Based on this I felt like some of the big winners in 2050 will be Rust, Clojure and Go. They mostly nail these criteria and are well-designed for not relying on containers or other similar technology for basic testing and feature implementation. They are also (reasonably) tolerant of spotty internet presuming the environment has been set up ahead of time. Typescript also seemed like a clear winner both based on the current enthusiasm and the additional capability it adds to JavaScript development.

There was a bit of disagreement here around whether languages like Java would thrive or decrease. We went back and forth, pointing to its stability and well-understood running properties. I felt like, in a world where servers wouldn't be endlessly getting faster and you couldn't just throw more resources at a problem, Clojure or something like it would be a better horse. We did agree that languages with lots of constantly updated packages like Node with NPM would be a real logistical struggle.

This all applies to new applications, which we all agreed probably won't be what most people are working on. Given a world flooded with software, it will consume a lot of the existing talent out there just to keep the existing stuff working. I suspect folks won't have "one or two" languages, but will work wherever the contracts are.

Financial Support

I think we're going to see the number of actively maintained programming languages drop. If we're all struggling to pay our rent and fill our cabinets with food, we aren't going to have a ton of time to donate to extremely demanding after-hours volunteer jobs, which is effectively what maintaining most programming languages is. Critical languages, or languages which give certain companies advantages, will likely have actual employees allocated for their maintenance, while others will languish.

While the above isn't the most scientific study, it does suggest that the companies who are "good" OSS citizens will be able to greatly steer the direction of which languages succeed or fail. While I'm not suggesting anyone is going to lock these languages down, their employees will be the ones actually doing the work and will be able to make substantive decisions about their future.

Infrastructure

In a 2016 paper in Global Environmental Change, Becker and four colleagues concluded that raising 221 of the world’s most active seaports by 2 meters (6.5 feet) would require 436 million cubic meters of construction materials, an amount large enough to create global shortages of some commodities. The estimated amount of cement — 49 million metric tons — alone would cost $60 billion in 2022 dollars. Another study that Becker co-authored in 2017 found that elevating the infrastructure of the 100 biggest U.S. seaports by 2 meters would cost $57 billion to $78 billion in 2012 dollars (equivalent to $69 billion to $103 billion in current dollars), and would require “704 million cubic meters of dredged fill … four times more than all material dredged by the Army Corps of Engineers in 2012.”
“We’re a rich country,” Becker said, “and we’re not going to have nearly enough resources to make all the required investments. So among ports there’s going to be winners and losers. I don’t know that we’re well-equipped for that.”

Source

Running a datacenter requires a lot of parts. You need replacement parts for a variety of machines from different manufacturers and eras, the ability to get new parts quickly and a lot of wires, connectors and adapters. As the world is rocked by a new natural disaster every day and seaports around the world become unusable, it will simply be impossible to run a datacenter without a considerable logistics arm. It will not be possible to call up a supplier and have a new part delivered in hours or even days.

Basic layout of what we mean when we say "a datacenter"

It isn't just the servers. The availability of data centers depends on reliability and maintainability of many sub-systems (power supply, HVAC, safety systems, communications, etc). We currently live in a world where many cloud services have SLAs with numbers like 99.999% availability along with heavy penalties for failing to reach the promised availability. For the datacenters themselves, we have standards like the TIA-942 link and GR-3160 link which determine what a correctly run datacenter looks like and how we define reliable.

COVID-related factory shutdowns in Vietnam and other key production hubs made it difficult to acquire fans, circuit breakers and IGBTs (semiconductors for power switching), according to Johnson. In some cases, the best option was to look outside of the usual network of suppliers and be open to paying higher prices.
“We’ve experienced some commodity and freight inflation,” said Johnson. “We made spot buys, so that we can meet the delivery commitments we’ve made to our customers, and we have implemented strategies to recapture those costs.”
But there are limits, said Johnson.
“The market is tight for everybody,” said Johnson. “There’s been instances where somebody had to have something, and maybe a competitor willing to jump the line and knock another customer out. We’ve been very honorable with what we’ve told our customers, and we’ve not played that game.” Source
The map of areas impacted by Global Warming and who makes parts is pretty much a one to one. 

We already know what kills datacenters now and the list doesn't look promising for the world of 2050.

A recent report (Ponemon Institute 2013) states the top root causes of data center failures: UPS system failure, Accidental/human error, cybercrime, weather related, water heat or CRAC failure, generator failure and IT equipment failure.
Expect to see a lot more electronic storage cabinets for parts

In order to keep a service up and operational you are going to need to keep more parts in stock, standardize models of machines and expect to deal with a lot more power outages. This will cause additional strain on the diesel generator backup systems, which have 14.09 failures per million hours (IEEE std. 493 Gold Book) and average repair time of 1257 hours now. In a world where parts take weeks or months to come, these repair numbers might totally change.

So I would expect the cloud to be the only realistic way to run anything, but I would also anticipate that just keeping the basic cloud services up and running to be a massive effort from whoever owns the datacenters. Networking is going to be a real struggle as floods, fires and power outages make the internet more brittle than it has been since its modern adoption. Prices will be higher as the cost for the providers will be higher.

Expectations

  • You will need to be multi-AZ and expect to cut over often.
  • Prices will be insane compared to now
  • Expect CPU power, memory, storage, etc to mostly stay static or drop over time as the focus becomes just keeping the lights on as it were
  • Downtime will become the new normal. We simply aren't going to be able to weather this level of global disruption with anything close to the level of average performance and uptime we have now.
  • Costs for the uptime we take for granted now won't be a little higher, they'll be exponentially higher even adjusted for inflation. I suspect contractual uptime will go from being "mostly an afterthought" to an extremely expensive conversation taking place at the highest levels between the customer and supplier.
  • Out of the box distributed services like Lambdas, Fargate, Lightsail, App Engine and the like will be the default since then the cloud provider is managing what servers are up and running.

For a long time we treated uptime as a free benefit of modern infrastructure. I suspect soon the conversation around uptime will be very different as it becomes much more difficult to reliably staff employees around the world to get online and fix a problem within minutes.

Mobile

By 2050, under an RCP 8.5 scenario, the number of people living in areas with a nonzero chance of lethal heat waves would rise from zero today to between 700 million and 1.2 billion (not factoring in air conditioner penetration). Urban areas in India and Pakistan may be the first places in the world to experience such lethal heatwaves (Exhibit 6). For the people living in these regions, the average annual likelihood of experiencing such a heat wave is projected to rise to 14 percent by 2050. The average share of effective annual outdoor working hours lost due to extreme heat in exposed regions globally could increase from 10 percent today to 10 to 15 percent by 2030 and 15 to 20 percent by 2050.

Link

We are going to live in a world where the current refugee crisis is completely eclipsed. Hundreds of millions of people are going to be roving around the world trying to escape extreme weather. For many of those people, we suspect mobile devices will be the only way they can interact with government, social services, family and friends. Mobile networks will be under extreme strain as even well-stocked telecoms struggle with floods of users attempting to escape the heat and weather disasters.

Mobile telecommunications, already essential to life in the smartphone era, is going to be as critical as power or water. Not just for users, but for the various ICT devices embedded into the world around us. 5G has already greatly increased our ability to serve many users, with some of that work happening in the transition from FDD to TDD (explanation). So in many ways we're already getting ready for a much higher level of dependence on mobile networks through the migration to 5G. The breaking of the service into "network slices" allows for effective abstraction away from the physical hardware, but obviously doesn't eliminate the physical element.

Mobile in 2050 Predictions

  • Phones are going to be in service for a lot longer. We all felt that both Android, with their aggressive backwards compatibility and Apple with the long support for software updates were already laying the groundwork for this.
  • Even among users who have the financial means to buy a new phone, it might be difficult for companies to manage their logistics chain. Expect long waits for new devices unless there is domestic production of the phone in question.
  • The trick to hardware launches and maintenance is going to be securing multiple independent supply chains and less about innovation
  • Mobile will be the only computing people have access to and they'll need to be more aware of data pricing. There's no reason to think telecoms won't take advantage of their position to increase data prices under the guise of maintenance for the towers.
Current graph of mobile data per gig. I'd expect that to increase as it becomes more critical to daily life. Source
Products like smartphones possess hundreds of components whose raw materials are transported from all over the world; the cumulative mileage traveled by all those parts would “probably reach to the moon,” Mims said. These supply chains are so complicated and opaque that smartphone manufacturers don’t even know the identity of all their suppliers — getting all of them to adapt to climate change would mark a colossal achievement. Yet each node is a point of vulnerability whose breakdown could send damaging ripples up and down the chain and beyond it. source

Developers will be expected to make applications that can go longer between updates, can tolerate network disruptions better and in general are less dependent on server-side components. Games and entertainment will be more important as people attempt to escape this incredibly difficult century of human life. Mobile platforms will likely have backdoors forced in by various governments, and any concept of truly encrypted communication seems unlikely.

It is my belief (not the belief of the group) that this will strongly encourage asynchronous communication. Email seems like the perfect tool to me for writing responses and sending attachments in a world with shaky internet. It also isn't encrypted and can be run on very old hardware for a long time. The other folks felt like this was a bit delusional and that all messaging would be happening inside of siloed applications. It would be easier to add offline sync to already popular applications.

Work

Hey, Kiana,
Just wanted to follow up on my note from a few days ago in case it got buried under all of those e-mails about the flood. I’m concerned about how the Eastern Seaboard being swallowed by the Atlantic Ocean is going to affect our Q4 numbers, so I’d like to get your eyes on the latest earnings figures ASAP. On the bright side, the Asheville branch is just a five-minute drive from the beach now, so the all-hands meeting should be a lot more fun this year. Silver linings!
Regards,
Kevin

Source

Offices for programmers will have been a thing of the past for a while by 2050. It isn't going to make sense to force people to all commute to a single location, not out of a love of the employees but as a way to decentralize the business's risk from fires, floods and famines. Programmers, like everyone, will be focused on ensuring their families have enough to eat and somewhere safe to sleep.

Open Source Survival?

Every job I've ever had relies on a large collection of Open Source software, whose survival depends on a small number of programmers having the available free time to maintain and add features to it. We all worried about who, in the face of reduced free time and increased economic pressure, was going to be able to put in the many hours of unpaid labor to maintain this absolutely essential software. This is true for Linux, programming languages, etc.

To be clear this isn't a problem just for 2050, it is a problem today. In a world where programmers need to work harder just to provide for their families and are facing the same intense pressure as everyone else, it seems unreasonable to imagine that the unpaid labor will keep flowing. My hope is that we figure out a way to easily financially support OSS developers, but I suspect it won't be a top priority.

Stability

We're all worried about what happens to programming as a realistic career when we hit an actual economic collapse, not just a recession. As stated above, it seems like a reasonable prediction that just keeping the massive amount of software in the world running will employ a fair amount of us. This isn't just a programming issue, but I would expect full-time employment roles to be few and far between. There will be a lot of bullshit contracting and temp gigs.

The group felt that automation would likely have gotten rid of most normal jobs by 2050 and we would be looking at a world where wage labor wasn't the norm anymore. I think that's optimistic and speaks to a lack of understanding of how flexible a human worker is. You can pull someone aside and effectively retrain them for a lot of tasks in less than 100 hours. Robots cannot do that and require a ton of expense to change their position or function even slightly.

Daily Work

Expect to see a lot more work that involves less minute to minute communication with other developers. Open up a PR, discuss the PR, show the test results and then merge in the branch without requiring teams to share a time zone. Full-time employees will serve mostly as reviewers, ensuring the work they are receiving is up to spec and meets standards. Corporate loyalty and perks will be gone, you'll just swap freelance gigs whenever you need to or someone offers more money. It wouldn't surprise me if serious security programs were more common as OSS frameworks and packages go longer and longer without maintenance. If nothing else, it should provide a steady stream of work.

AI?

I don't think it's gonna happen. Maybe after this version of civilization burns to the ground and we make another one.

Conclusion

Ideally this sparks some good conversations. I don't have any magic solutions here, but I hope you don't leave this completely without hope. I believe we will survive and that the human race can use this opportunity to fundamentally redefine our relationship to concepts like "work" and economic opportunity. But blind hope doesn't count for much.

For maintainers looking to future-proof their OSS or software platforms, start planning for ease of maintenance and deployment. Start discussing "what happens to our development teams if they can't get reliable internet for a day, a week, a month". It is great to deploy an application 100 times a day, but how long can it sit unattended? How does on-call work?

A whole lot is going to change in every element of life and our little part is no different. The market isn't going to save us, a startup isn't going to fix everything and capture all the carbon.