
How Mobile Networks Work

I've spent a fair amount of time around networking. I've worked for a small ISP, helped set up campus and office networks, and even done a fair amount of BGP work assisting with ISP failover and routing. However, in my current role I've been doing a lot of mobile network diagnostics and troubleshooting, which made me realize I don't actually know anything about how mobile networks operate. So I figured it was a good idea for me to learn more and write up what I find.

It's interesting that cellular internet has either become, or will without a doubt become, the default Internet for most humans alive, yet almost no developers I know have any idea how it works (myself included until recently). As I hope to demonstrate below, untold amounts of amazing work have been applied to this problem over decades, producing incredible results. As it turns out, the network engineers working on cellular were doing nuclear physics while I was hot-gluing stuff together.

I am not an expert. I will update this as I get better information, but use this as a reference for stuff to look up, not a bible. It is my hope, over many revisions, to turn this into an easier-to-read PDF that folks can download. However I want to get it out in front of people to help find mistakes.

TL/DR: There is a shocking, eye-watering amount of complexity when it comes to cellular data as compared to a home or datacenter network connection. I could spend the next six months of my life reading about this and feel like I barely scratched the surface. However I'm hoping that I have provided some basic-level information about how this magic all works.

Corrections/Requests: https://c.im/@matdevdug. I know I didn't get it all right, I promise I won't be offended.

Basics

A modern cellular network, at its core, is composed of three basic elements:

  • the RAN (radio access network)
  • CN (core network)
  • Services network

RAN

The RAN contains the base stations that communicate with phones over radio signals. When we picture a cell tower, we are picturing part of the RAN. Most of what a cellular network provides in terms of services actually lives in the CN: user authorization, enabling or disabling services for a subscriber, and all the background machinery for the transfer and hand-off of user traffic. For most users today, think SMS and phone calls.

Key Components of the RAN:

  1. Base Transceiver Station (BTS): The BTS is a radio transmitter/receiver that communicates with your phone over the air interface.
  2. Node B (Evolved Node B/eNodeB in 4G, gNodeB in 5G): The 3G name for the base station. A Node B typically serves multiple cells and, in 3G, is managed by the RNC; in 4G and 5G the eNodeB/gNodeB absorbs most of the RNC's functions.
  3. Radio Network Controller (RNC): The RNC is responsible for managing the radio links between phones and the Node Bs it controls (3G/UMTS).
  4. Base Station Subsystem (BSS): The BSS is a term used in older (2G/GSM) networks, referring to the combination of the BTS and the Base Station Controller (BSC).
Link: https://www.cisco.com/c/en/us/products/collateral/wireless/nb-06-radio-access-networks-cte-en.html

Startup

  1. Cell Search and Network Acquisition. The device powers on and begins searching for available cells by scanning the frequencies of surrounding base stations (e.g., eNodeB for LTE, gNodeB for 5G).
┌──────────────┐             ┌──────────────┐
│  Base Station│             │   Mobile     │
│              │             │   Device     │
│   Broadcast  │             │              │
│  ──────────> │ Search for  │ <──────────  │
│              │ Sync Signals│ Synchronizes │
│              │             │              │
└──────────────┘             └──────────────┘

- Device listens for synchronization signals.
- Identifies the best base station for connection.
  2. Random Access. After identifying the cell to connect to, the device sends a random access request to establish initial communication with the base station. This is often called RACH (the Random Access Channel procedure). If you want to read about it, I found an incredible amount of detail here: https://www.sharetechnote.com/html/RACH_LTE.html
┌──────────────┐             ┌──────────────┐
│  Base Station│             │   Mobile     │
│              │             │   Device     │
│  Random Access Response    │              │
│  <────────── │ ──────────> │ Random Access│
│              │             │ Request      │
└──────────────┘             └──────────────┘

- Device sends a Random Access Preamble.
- Base station responds with timing and resource allocation.
  3. Dedicated Radio Connection Setup (RRC Setup). The base station allocates resources for the device to establish a dedicated radio connection using the Radio Resource Control (RRC) protocol.
┌──────────────┐             ┌──────────────┐
│  Base Station│             │   Mobile     │
│              │             │   Device     │
│  RRC Setup   │             │              │
│  ──────────> │ Send RRC    │              │
│              │ Request     │ <──────────  │
│              │             │ RRC Response │
└──────────────┘             └──────────────┘

- Device requests RRC connection.
- Base station assigns resources and confirms.
  4. Device-to-Core Network Communication (Authentication, Security, etc.). Once the RRC connection is established, the device communicates with the core network (e.g., EPC in LTE, 5GC in 5G) for authentication, security setup, and session establishment.
┌──────────────┐               ┌──────────────┐
│  Base Station│               │   Mobile     │
│  ──────────> │ Forward       │              │
│              │ Authentication Data          │
│              │ <──────────   │Authentication│
│              │               │ Request      │
│              │               │              │
└──────────────┘               └──────────────┘

- Device exchanges authentication and security data with the core network.
- Secure communication is established.
  5. Data Transfer (Downlink and Uplink). After setup, the device starts sending (uplink) and receiving (downlink) data using the established radio connection.
┌──────────────┐             ┌──────────────┐
│  Base Station│             │   Mobile     │
│  ──────────> │ Data        │              │
│  Downlink    │             │  <─────────  │
│  <────────── │ Data Uplink │ ──────────>  │
│              │             │              │
└──────────────┘             └──────────────┘

- Data is transmitted between the base station and the device.
- Downlink (BS to Device) and Uplink (Device to BS) transmissions.
  6. Handover. If the device moves out of range of the current base station, a handover is initiated to transfer the connection to a new base station without interrupting the service. (A toy sketch of the whole startup sequence follows.)
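
To keep the order of operations straight, here is a minimal, purely illustrative Python sketch of the startup sequence above as a state machine. The state names and descriptions are my own shorthand, not anything from a real protocol stack.

from enum import Enum, auto

class UEState(Enum):
    OFF = auto()
    SEARCHING = auto()
    RACH = auto()
    RRC_CONNECTED = auto()
    REGISTERED = auto()
    DATA = auto()

def attach_sequence():
    """Walk a hypothetical device (UE) through the startup steps described above."""
    steps = [
        (UEState.SEARCHING,     "scan frequencies, lock onto sync signals from the best cell"),
        (UEState.RACH,          "send Random Access Preamble, receive timing + resource grant"),
        (UEState.RRC_CONNECTED, "exchange RRC Setup Request/Response with the base station"),
        (UEState.REGISTERED,    "authenticate with the core network (EPC/5GC), activate security"),
        (UEState.DATA,          "bearers and an IP address assigned, uplink/downlink data flows"),
    ]
    state = UEState.OFF
    for new_state, description in steps:
        print(f"{state.name:>13} -> {new_state.name:<13} {description}")
        state = new_state

attach_sequence()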

Signaling

As shown in the diagram above, there are a lot of references to something called "signaling". Signaling seems to be a shorthand for handling a lot of configuration and hand-off between tower and device and the core network. As far as I can tell they can be broken into 3 types.

  1. Access Stratum Signaling
    1. Set of protocols to manage the radio link between your phone and cellular network.
    2. Handles authentication and encryption
    3. Radio bearer establishment (setting up a dedicated channel for data transfer)
    4. Mobility management (handovers, etc)
    5. Quality of Service control.
  2. Non-Access Stratum (NAS) Signaling
    1. Set of protocols used to manage the interaction between your phone and the cellular network's core infrastructure.
    2. It handles tasks such as authentication, billing, and location services.
    3. Authentication with the Home Location Register (HLR)
    4. Roaming management
    5. Charging and billing
    6. IMSI Attach/ Detach procedure
  3. Lower Layer Signaling on the Air Interface
    1. This refers to the control signaling that occurs between your phone and the cellular network's base station at the physical or data link layer.
    2. It ensures reliable communication over the air interface, error detection and correction, and efficient use of resources (e.g., allocating radio bandwidth).
    3. Modulation and demodulation control
    4. Error detection and correction using CRCs (Cyclic Redundancy Checks)

High Level Overview of Signaling

  1. You turn on your phone (AS signaling starts).
  2. Your phone sends an Initial Direct Transfer (IDT) message to establish a radio connection with the base station (lower layer signaling takes over).
  3. The base station authenticates your phone using NAS signaling, contacting the HLR for authentication.
  4. Once authenticated, lower layer signaling continues to manage data transfer between your phone and the base station.

What is HLR?

The Home Location Register contains the subscriber data for a network: the IMSI, phone number, and service information. It is also what keeps track of where in the world the subscriber physically is.

Duplexing

You have a lot of devices and you have a few towers. You need to do many uplinks and downlinks to many devices.

It is important that in any cellular communications system you can send and receive in both directions at the same time. This enables conversations, with either end able to talk and listen as required. To transmit in both directions, a device (UE) and base station must agree on a duplex scheme. There are several, including Frequency Division Duplex (FDD), Time Division Duplex (TDD), semi-static TDD and dynamic TDD.

Duplexing Types:

  1. Frequency Division Duplex (FDD): Uses separate frequency bands for downlink and uplink signals.
    1. Downlink: The mobile device receives data from the base station on a specific frequency (F1).
    2. Uplink: The mobile device sends data to the base station on a different frequency (F2).
    3. Key Principle: Separate frequencies for uplink and downlink enable simultaneous transmission and reception.
┌──────────────┐              ┌──────────────┐
│  Base Station│              │   Mobile     │
│              │              │   Device     │
│  ──────────> │ F1 (Downlink)│ <──────────  │
│              │              │              │
│  <────────── │ F2 (Uplink)  │ ──────────>  │
└──────────────┘              └──────────────┘

Separate frequency bands (F1 and F2)
  2. Time Division Duplex (TDD): Alternates between downlink and uplink signals over the same frequency band.
    1. Downlink: The base station sends data to the mobile device in a time slot.
    2. Uplink: The mobile device sends data to the base station in a different time slot using the same frequency.
    3. Key Principle: The same frequency is used for both uplink and downlink, but at different times.
 ┌──────────────┐                     ┌──────────────┐
 │  Base Station│                     │  Mobile Phone│
 │ (eNodeB/gNB) │                     │              │
 └──────────────┘                     └──────────────┘

     ───────────►  Time Slot 1 (Downlink)
                 (Base station sends data)

     ◄───────────  Time Slot 2 (Uplink)
                 (Mobile sends data)
     
     ───────────►  Time Slot 3 (Downlink)
                 (Base station sends data)
                 
     ◄───────────  Time Slot 4 (Uplink)
                 (Mobile sends data)
 
    - The same frequency is used for both directions.
    - Communication alternates between downlink and uplink in predefined time slots.

3. Semi-static Time Division Duplex (Semi-static TDD), which relies on a fixed frame design:

    1. Downlink/Uplink: There are predetermined time slots for uplink and downlink, but they can be changed periodically (e.g., minutes, hours).
    2. Key Principle: Time slots are allocated statically for longer durations but can be switched based on network traffic patterns (e.g., heavier downlink traffic during peak hours).
    3. A frame typically lasts 10 ms and is divided into time slots for downlink (DL) and uplink (UL).
    4. "Guard" time slots are used to allow switching between transmission and reception.


4. Dynamic Time Division Duplex (Dynamic TDD):

    1. Downlink/Uplink: Time slots for uplink and downlink are dynamically adjusted in real time based on instantaneous traffic demands.
    2. Key Principle: Uplink and downlink time slots are flexible and can vary dynamically to optimize the usage of the available spectrum in real-time, depending on the traffic load.
    3. See the second diagram below for what "guard periods" are: basically short windows that create gaps so the downlink and uplink signals don't overlap.
 ┌──────────────┐                     ┌──────────────┐
 │  Base Station│                     │  Mobile Phone│
 │ (eNodeB/gNB) │                     │              │
 └──────────────┘                     └──────────────┘

     ───────────►  Time Slot 1 (Downlink)
     ───────────►  Time Slot 2 (Downlink)
     ───────────►  Time Slot 3 (Downlink)
     ◄───────────  Time Slot 4 (Uplink)
     ───────────►  Time Slot 5 (Downlink)
     
     ◄───────────  Time Slot 6 (Uplink)
     
    - More slots for downlink in scenarios with high download traffic (e.g., streaming video).
    - Dynamic slot assignment can change depending on the real-time demand.
 ┌──────────────┐                     ┌──────────────┐
 │  Base Station│                     │  Mobile Phone│
 │ (eNodeB/gNB) │                     │              │
 └──────────────┘                     └──────────────┘

     ───────────►  Time Slot 1 (Downlink)
     ───────────►  Time Slot 2 (Downlink)
     [Guard Period]                          (Switch from downlink to uplink)
     ◄───────────  Time Slot 3 (Uplink)
     [Guard Period]                          (Switch from uplink to downlink)
     ───────────►  Time Slot 4 (Downlink)
     
    - Guard periods allow safe switching from one direction to another.
    - Guard periods prevent signals from overlapping and causing interference (a toy sketch of slot assignment with guards follows).
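
Here is a toy Python sketch of the TDD ideas above: one shared frequency, a frame divided into slots, each slot assigned to downlink or uplink, and a guard period inserted whenever the direction flips. The slot pattern itself is made up for illustration.

# Toy TDD frame: one frequency, slots assigned to DL or UL, guards on direction switches.
FRAME_SLOTS = ["DL", "DL", "DL", "UL", "DL", "UL"]  # a made-up (semi-static) pattern

def with_guards(slots):
    """Insert a guard period whenever the link direction flips."""
    out, prev = [], None
    for direction in slots:
        if prev is not None and direction != prev:
            out.append("GUARD")
        out.append(direction)
        prev = direction
    return out

print(" | ".join(with_guards(FRAME_SLOTS)))
# DL | DL | DL | GUARD | UL | GUARD | DL | GUARD | UL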

Core

So I've written a lot about what the RAN does, but we haven't really touched on what the core network does. Basically, once the device registers with the base station using the random access procedure discussed above, the core network can do a bunch of the stuff we typically associate with "having a cellular plan".

For modern devices, when we say authentication we mean "mutual authentication": the device authenticates the network and the network authenticates the device. Typically the network sends a random challenge, and the device combines a subscriber-specific secret key with that random number to generate a response. The network also sends an authentication token, and the device compares it with the token it expects in order to authenticate the network. It looks like the following:

┌───────────────────────┐
│    Encryption &       │
│  Integrity Algorithms │
├───────────────────────┤
│  - AES (Encryption)   │
│  - SNOW 3G (Encrypt.) │
│  - ZUC (Encryption)   │
│  - SHA-256 (Integrity)│
└───────────────────────┘

- AES: Strong encryption algorithm commonly used in LTE/5G.
- SNOW 3G: Stream cipher used for encryption in mobile communications.
- ZUC: Encryption algorithm used in 5G.
- SHA-256: Integrity algorithm ensuring data integrity.

The steps of the core network are as follows:

  • Registration (also called attach procedure): The device connects to the core network (e.g., EPC in LTE or 5GC in 5G) to register and declare its presence. This involves the device identifying itself and the network confirming its identity.
  • Mutual Authentication: The network and device authenticate each other to ensure a secure connection. The device verifies the network’s authenticity, and the network confirms the device’s identity (a toy sketch of this exchange follows the list).
  • Security Activation: After successful authentication, the network and the device establish a secure channel using encryption and integrity protection to ensure data confidentiality and integrity.
  • Session Setup and IP Address Allocation: The device establishes a data session with the core network, which includes setting up bearers (logical paths for data) and assigning an IP address to enable internet connectivity.
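
As a toy illustration of the challenge-response idea in the mutual authentication step (not the real 3GPP AKA/MILENAGE algorithms), here is a hypothetical Python sketch where both sides hold the same subscriber secret, the network issues a random challenge, and each side verifies the value the other computed.

import hmac, hashlib, os

SUBSCRIBER_KEY = os.urandom(16)  # shared secret: stored on the SIM and in the operator's database

def mac(key, *parts):
    return hmac.new(key, b"|".join(parts), hashlib.sha256).digest()

# Network side: generate a challenge, the response it expects, and a token proving its own identity.
rand = os.urandom(16)
expected_res = mac(SUBSCRIBER_KEY, b"RES", rand)
network_token = mac(SUBSCRIBER_KEY, b"AUTN", rand)

# Device side: compute its response and the token it expects the network to send.
device_res = mac(SUBSCRIBER_KEY, b"RES", rand)
expected_token = mac(SUBSCRIBER_KEY, b"AUTN", rand)

assert hmac.compare_digest(device_res, expected_res)        # network authenticates the device
assert hmac.compare_digest(network_token, expected_token)   # device authenticates the network
print("mutual authentication succeeded (toy example)")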

How Data Gets To Phone

Alright we've talked about how the phone finds a tower to talk to, how the tower knows who the phone is and all the millions of steps involved in getting the mobile phone an actual honest-to-god IP address. How is data actually getting to the phone itself?

  1. Configuration for Downlink Measurement: Before downlink data transmission can occur, the mobile device (UE) must be configured to perform downlink measurements. This helps the network optimize transmission based on the channel conditions. Configuration messages are sent from the base station (eNodeB in LTE or gNB in 5G) to instruct the UE to measure certain DL reference signals.
  2. Reference Signal (Downlink Measurements): The mobile device receives reference signals from the network. These reference signals are used by the UE to estimate DL channel conditions. In LTE, Cell-specific Reference Signals (CRS) are used, and in 5G, Channel State Information-Reference Signals (CSI-RS) are used.
  3. DL Channel Conditions (CQI, PMI, RI): The mobile device processes the reference signals to assess the downlink channel conditions and generates reports such as CQI (Channel Quality Indicator), PMI (Precoding Matrix Indicator), and RI (Rank Indicator). These reports are sent back to the base station.
  4. DL Resource Allocation and Packet Transmission: Based on the UE’s channel reports (CQI, PMI, RI), the base station allocates appropriate downlink resources. It determines the modulation scheme, coding rate, MIMO layers, and frequency resources (PRBs) and sends a DL scheduling grant to the UE. The data packets are then transmitted over the downlink.
  5. Positive/Negative Acknowledgement (HARQ Feedback): After the UE receives the downlink data, it checks the integrity of the packets using CRC (Cyclic Redundancy Check). If the CRC passes, the UE sends a positive acknowledgement (ACK) back to the network. If the CRC fails, a negative acknowledgement (NACK) is sent, indicating that retransmission is needed.
  6. New Transmission or Retransmission (HARQ Process): If the network receives a NACK, it retransmits the packet using the HARQ process. The retransmission is often incremental (IR-HARQ), meaning the device combines the new transmission with previously received data to improve decoding.

Uplink is a little different but is basically the device asking for a timeslot to upload, getting a grant, sending the data up and then getting an ack that it is sent.
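
To make steps 5 and 6 concrete, here is a minimal Python sketch of the HARQ ACK/NACK loop, assuming a made-up channel that randomly corrupts packets. Real IR-HARQ combines retransmissions at the soft-bit level, which this toy version only hints at.

import random, zlib

def send_with_harq(payload: bytes, loss_rate: float = 0.3, max_attempts: int = 4) -> bool:
    """Pretend to transmit a packet, returning True once the receiver's CRC check passes."""
    sent_crc = zlib.crc32(payload)
    for attempt in range(1, max_attempts + 1):
        corrupted = random.random() < loss_rate          # the pretend radio channel damages some packets
        received_crc = sent_crc ^ 0xDEADBEEF if corrupted else sent_crc
        if received_crc == zlib.crc32(payload):
            print(f"attempt {attempt}: CRC ok -> ACK")
            return True
        print(f"attempt {attempt}: CRC failed -> NACK, retransmitting")
    return False

random.seed(1)
send_with_harq(b"downlink packet")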

Gs

So as everyone knows cellular networks have gone through a series of revisions over the years around the world. I'm going to talk about them and just try to walk through how they are different and what they mean.

1G

  • Starts in Japan, moves to Europe and then the US and UK.
  • Speeds up to 2.4kbps, using analog radio in bands in the hundreds of MHz (roughly 450-900MHz depending on the system).
  • Didn't work between countries, had low capacity, unreliable handoff and no security. Basically any receiver can listen to a conversation.

2G

  • Launched in 1991 in Finland
  • Allows for text messages, picture messages and MMS.
  • Speeds up to 14.4kbps between 900MHz and 1800MHz bands
  • Actual security between sender and receiver with messages digitally encrypted.

Wait, are text messages encrypted?

So this was completely new to me but I guess my old Nokia brick had some encryption on it. Here's how that process worked:

  1. Mobile device stores a secret key in the SIM card and the network generates a random challenge and sends it to the mobile device.
  2. The A3 algorithm is used to compute a Signed Response (SRES) using the secret key and the random value.
  3. Then the A8 algorithm is used with secret and the random value to generate a session encryption key Kc (64-bit key). This key will be used for encrypting data, including SMS.
  4. After the authentication process and key generation, encryption of SMS messages begins. GSM uses a stream cipher to encrypt both voice and data traffic, including text messages. The encryption algorithm used for SMS is either A5/1 or A5/2, depending on the region and network configuration.
    1. A5/1: A stronger encryption algorithm used in Europe and other regions.
    2. A5/2: A weaker variant used in some regions, but deprecated due to its vulnerabilities.
    3. The A5 algorithm generates a keystream that is XORed with the plaintext message (SMS) to produce the ciphertext, ensuring the confidentiality of the message.

So basically text messages from the phone to the base station were encrypted and then exposed there. However I honestly didn't even know that was happening.
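
Here is a toy Python sketch of the stream-cipher idea (a keystream XORed with the plaintext). The real A5/1 keystream comes from three clocked shift registers seeded with Kc and the frame number; this sketch swaps in a hypothetical hash-based keystream purely for illustration.

import hashlib
from itertools import count

def keystream(kc: bytes, frame_number: int):
    """Hypothetical stand-in for A5: an endless stream of pseudo-random bytes from Kc + frame number."""
    for block in count():
        yield from hashlib.sha256(kc + frame_number.to_bytes(4, "big") + block.to_bytes(4, "big")).digest()

def xor_cipher(data: bytes, kc: bytes, frame_number: int) -> bytes:
    ks = keystream(kc, frame_number)
    return bytes(b ^ next(ks) for b in data)

kc = b"\x01" * 8                      # the 64-bit session key Kc produced by the A8 step
sms = b"Running late, be there at 5"
ciphertext = xor_cipher(sms, kc, frame_number=42)
assert xor_cipher(ciphertext, kc, frame_number=42) == sms  # XOR with the same keystream decrypts
print(ciphertext.hex())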

TDMA and CDMA

I remember a lot of conversations about GSM vs CDMA when you were talking about cellular networks but at the time all I really knew was "GSM is European and CDMA is US".

  • GSM uses TDMA (Time Division Multiple Access), giving each user time slots on a shared channel
  • CDMA allocates each user a special code to communicate over multiple physical channels
  • GSM is where we see services like voice mail, SMS, call waiting

EDGE

So everyone who is old like me remembers EDGE on cellphones, including the original iPhone I waited in line for. EDGE was effectively a retrofit you could put on top of an existing GSM network, keeping the cost of adding it low. You got speeds of 9.6-200kbps.

3G

  • Welcome to the year 2000
  • Frequency spectrum of 3G transmissions is 1900-2025MHz and 2110-2200MHz.
  • UMTS takes over for GSM and CDMA2000 takes over from CDMA.
  • Maxes out around 8-10Mbps
  • IMT-2000 = 3G

So let's just recap quickly how we got here.

  • 2G (GSM): Initially focused on voice communication and slow data services (up to 9.6 kbps using Circuit Switched Data).
  • 2.5G (GPRS): Introduced packet-switched data with rates of 40-50 kbps. It allowed more efficient use of radio resources for data services.
  • 2.75G (EDGE): Enhanced the data rate by improving modulation techniques (8PSK). This increased data rates to around 384 kbps, making it more suitable for early mobile internet usage.

EDGE introduced 8-PSK (8-Phase Shift Keying) modulation, which allowed the encoding of 3 bits per symbol (as opposed to 1 bit per symbol with the original GSM’s GMSK (Gaussian Minimum Shift Keying) modulation). This increased spectral efficiency and data throughput.

EDGE had really high latency so it wasn't really usable for things like video streaming or online gaming.

  • 3G (WCDMA): Max data rate: 2 Mbps (with improvements over EDGE in practice). Introduced spread-spectrum (CDMA) technology with QPSK modulation.
  • 3.5G (HSDPA): Enhanced WCDMA by introducing adaptive modulation (AMC), HARQ, and NodeB-based scheduling. Max data rate: 14.4 Mbps (downlink).

So when we say 3G we actually mean a pretty wide range of technologies all underneath the same umbrella.

4G

  • 4G, or LTE as it is usually called, evolved from WCDMA. Instead of developing entirely new radio interfaces and technology in isolation, existing and newly developed wireless systems like GPRS, EDGE, Bluetooth, WLAN and HiperLAN were integrated together
  • In practice 4G delivers download speeds of around 67.65Mbps and upload speeds of around 29.37Mbps (theoretical LTE peaks are considerably higher)
  • One common 4G band (band 7) uses 2500-2570MHz for uplink and 2620-2690MHz for downlink, with channel bandwidths of 1.25-20MHz
  • 4G has a few key technologies, mainly OFDM, SDR and Multiple-Input Multiple-Output (MIMO). A toy OFDM sketch follows this list.
    • OFDM (Orthogonal Frequency Division Multiplexing)
      • Allows for more efficient use of the available bandwidth by breaking down data into smaller pieces and sending them simultaneously
      • Since each channel uses a different frequency, if one channel experiences interference or errors, the others remain unaffected.
      • OFDM can adapt to changing network conditions by dynamically adjusting the power levels and frequencies used for each channel.
    • SDR (Software Defined Radio)
      • Like it sounds, it is a technology that enables flexible and efficient implementation of wireless communication systems by using software algorithms to control and process radio signals in real-time. In cellular 4G, SDR is used to improve performance, reduce costs, and enable advanced features like multi-band support and spectrum flexibility.
    • MIMO (multiple-input multiple-output)
      • A technology used in cellular 4G to improve the performance and capacity of wireless networks. It allows for the simultaneous transmission and reception of multiple data streams over the same frequency band, using multiple antennas at both the base station and mobile device.
      • Works by having both the base station and the mobile device equipped with multiple antennas
      • Each antenna transmits and receives a separate data stream, allowing for multiple streams to be transmitted over the same frequency band
      • There is Spatial Multiplexing where multiple data streams are transmitted over the same frequency band using different antennas. Then Beamforming where advanced signal processing techniques to direct the transmitted beams towards specific users, improving signal quality and reducing interference. Finally Massive MIMO where you use a lot of antennas (64 or more) to improve capacity and performance.
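
As a rough sketch of the OFDM idea above (assuming numpy is available): a block of modulation symbols is placed on orthogonal subcarriers and a single inverse FFT turns them into one time-domain OFDM symbol, which the receiver undoes with an FFT. Real LTE adds cyclic prefixes, pilot signals, and a great deal more.

import numpy as np

rng = np.random.default_rng(0)
n_subcarriers = 64

# Random QPSK symbols, one per subcarrier (a stand-in for the user's data bits).
qpsk = (rng.choice([-1, 1], n_subcarriers) + 1j * rng.choice([-1, 1], n_subcarriers)) / np.sqrt(2)

time_domain = np.fft.ifft(qpsk)      # transmitter: all subcarriers sent at once as one OFDM symbol
recovered = np.fft.fft(time_domain)  # receiver: an FFT separates the subcarriers again

assert np.allclose(recovered, qpsk)
print(f"recovered {n_subcarriers} subcarriers with no interference between them")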

5G

  • The International Telecommunication Union (ITU) defines 5G as a wireless communication system that supports speeds of at least 20 Gbps (gigabits per second), with ultra-low latency of less than 1 ms (millisecond).
  • 5G operates on a much broader range of frequency bands than 4G
    • Low-band frequencies: These frequencies are typically below 3 GHz and are used for coverage in rural areas or indoor environments. Examples include the 600 MHz, 700 MHz, and 850 MHz bands.
    • Mid-band frequencies: These frequencies range from approximately 3-10 GHz and are used for both coverage and capacity in urban areas. Examples include the 3.5 GHz, 4.5 GHz, and 6 GHz bands.
    • High-band frequencies: These frequencies range from approximately 10-90 GHz and are used primarily for high-speed data transfer in dense urban environments. Examples include the 28 GHz, 39 GHz, and 73 GHz bands.
  • 5G network designs are a step up in complexity from their 4G predecessors: the control plane and the user plane are split, with each function implemented as a separate network function (the service-based architecture), whereas the 4G core bundles these roles into fewer, more monolithic nodes.
  • 5G uses advanced modulation schemes such as 256-Quadrature Amplitude Modulation (QAM) to achieve higher data transfer rates than 4G, which typically uses 64-QAM or 16-QAM
  • All the MIMO stuff discussed above.

What the hell is Quadrature Amplitude Modulation?

I know, it sounds like a Star Trek thing. It is a way to send digital information over a communication channel, like a wireless network or cable. It's a method of "modulating" the signal, which means changing its characteristics in a way that allows us to transmit data.

When we say 256-QAM, it refers to the specific type of modulation being used. Here's what it means:

  • Quadrature: This refers to the fact that the signal is built from two components that are 90 degrees out of phase with each other (the "quadratures"). Think of it like a coordinate system with x and y axes.
  • Amplitude Modulation: We change the amplitude of each of those two components to represent digital information, which ends up varying both the amplitude and the phase of the combined signal.
  • 256: This refers to the number of possible states, or constellation points, the signal can take on. Since 256 = 2^8, each symbol carries 8 bits.

Why does 5G want this?

  • More information per symbol: With 256-QAM, each "symbol" (or signal change) can represent one of 256 different values. This means we can pack more data into the same amount of time.
  • Faster transmission speeds: As a result, we can transmit data at higher speeds without compromising quality (a quick bits-per-symbol calculation follows).
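
A quick back-of-the-envelope calculation of why higher-order modulation matters: the bits carried per symbol is log2 of the constellation size, so 256-QAM carries 8 bits per symbol versus 6 for 64-QAM and 1 for GSM's GMSK (ignoring the much better signal-to-noise ratio the higher orders require).

import math

schemes = {
    "GMSK (GSM)":   2,
    "8-PSK (EDGE)": 8,
    "QPSK":         4,
    "16-QAM":       16,
    "64-QAM":       64,
    "256-QAM":      256,
}

for name, points in schemes.items():
    bits = int(math.log2(points))
    print(f"{name:>13}: {points:>3} constellation points = {bits} bits per symbol")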

Kubernetes and 5G

Kubernetes is a popular technology in 5G and is used for a number of functions, including the following:

  • Virtual Network Functions (VNFs): VNFs are software-based implementations of traditional network functions, such as firewalls or packet filters. Kubernetes is used to deploy and manage these VNFs.
  • Cloud-Native Network Functions (CNFs): CNFs are cloud-native applications that provide network function capabilities, such as traffic management or security filtering. Kubernetes is used to deploy and manage these CNFs.
  • Network Function Virtualization (NFV) Infrastructure: NFV infrastructure provides the underlying hardware and software resources for running VNFs and CNFs. Kubernetes is used to orchestrate and manage this infrastructure.

Conclusion

So one of the common sources of frustration for developers I've worked with when debugging cellular network problems is that often while there is plenty of bandwidth for what they are trying to do, the latency involved can be quite variable. If you look at all the complexity behind the scenes and then factor in that the network radio on the actual cellular device is constantly flipping between an Active and Idle state in an attempt to save battery life, this suddenly makes sense.

All of the complexity I'm talking about ultimately gets you back to the same TCP stack we've been using for years, with all the overhead involved in that back and forth; we still end up with a SYN -> SYN-ACK. There are ways to shorten this somewhat (TCP Fast Open, tuning the initial congestion window), but you are mostly dealing with the same level of overhead you always dealt with.
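
For example, on Linux you can experiment with TCP Fast Open from Python. This is a minimal client-side sketch, assuming a kernel with net.ipv4.tcp_fastopen enabled and a server that supports TFO; the request data rides along with the SYN instead of waiting for the handshake to complete.

import socket

# MSG_FASTOPEN is Linux-only; fall back to the raw flag value if the constant isn't exposed.
MSG_FASTOPEN = getattr(socket, "MSG_FASTOPEN", 0x20000000)
request = b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n"

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # sendto() with MSG_FASTOPEN sends the data in the SYN (or falls back to a normal handshake).
    sock.sendto(request, MSG_FASTOPEN, ("example.com", 80))
    print(sock.recv(4096)[:120])
finally:
    sock.close()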

Ultimately there isn't much you can do with this information, as developers have almost no control over the elements present here. However I think it's useful as cellular networks continue to become the dominant default Internet for the Earth's population that more folks understand the pieces happening in the background of this stack.


Replace Docker Compose with Quadlet for Servers

So for years I've used Docker Compose as my stepping stone to k8s. If the project is small, or mostly for my own consumption, OR if the business requirements don't really support the complexity of k8s, I use Compose. It's simple to manage with bash scripts for deployments and not hard to set up on fresh servers with cloud-init, and the process of removing a server from a load balancer, pulling the new container, then adding it back in has been bulletproof for teams with limited headcount or services where uptime is less critical than cost control and ease of long-term maintenance. You avoid almost all of the complexity of really "running" a server while being able to scale up to about 20 VMs while still having a reasonable deployment time.

What are you talking about

Sure, so one common issue I hear is "we're a small team, k8s feels like overkill, what else is on the market"? The issue is there are tons and tons of ways to run containers on virtually every cloud platform, but a lot of them are locked to that cloud platform. They're also typically billed at premium pricing because they remove all the elements of "running a server".

That's fine, but for small teams buying in too heavily to a vendor solution can be hard to get out of. Maybe they pick wrong and it gets deprecated, etc. So I try to push them towards a simpler stack that is more idiot-proof to manage. It varies by VPS provider but the basic stack looks like the following:

  • Debian servers setup with cloud-init to run all the updates, reboot, install the container manager of choice.
  • This also sets up Cloudflare tunnels so we can access the boxes securely and easily. Tailscale also works great/better for this. Avoids needing public IPs for each box.
  • Add a tag to each one of those servers so we know what it does (redis, app server, database)
  • Put them into a VPC together so they can communicate
  • Take the deploy script, have it SSH into the box and run the container update process

Linux updates involve a straightforward process of de-registering, destroying the VM and then starting fresh. Database is a bit more complicated but still doable. It's all easily done in simple scripts that you can tie to github actions if you are so inclined. Docker compose has been the glue that handles the actual launching and restarting of the containers for this sample stack.

When you outgrow this approach, you are big enough that you should have a pretty good idea of where to go now. Since everything is already in containers you haven't been boxed in and can migrate in whatever direction you want.

Why Not Docker

However I'm not thrilled with the current state of Docker as a full product. Even when I've paid for Docker Desktop I found it to be a profoundly underwhelming tool. It's slow, the UI is clunky, there's always an update pending, it's sort of expensive for what people use it for, and Windows users seem to hate it. When I've compared Podman vs Docker on servers or my local machines, Podman is faster, seems better designed and in general is trending in a stronger direction as a product. If I don't like Docker Desktop and prefer Podman Desktop, to me it's worth migrating the entire stack over and just dumping Docker as a tool I use. Fewer things to keep track of.

Now the problem is that while Podman has sort of a compatibility layer with Docker Compose, it's not a one-to-one replacement and you want to be careful using it. My testing showed it worked OK for basic examples, but with more complex stuff you start to run into problems. It also seems like work on the project has mostly been abandoned by the core maintainers. You can see it here: https://github.com/containers/podman-compose

I think podman-compose is the right solution for local dev, where you aren't using terribly complex examples and the uptime of the stack matters less. It's hard to replace Compose in this role because it's just so straightforward. As a production deployment tool I would stay away from it. This is important to note because right now the local dev container story often involves running k3s on your laptop. My experience is people loathe Kubernetes for local development and will go out of their way to avoid it.

The people I know who are all-in on Podman pushed me towards Quadlet as an alternative which uses systemd to manage the entire stack. That makes a lot of sense to me, because my Linux servers already have systemd and it's already a critical piece of software that (as far as I can remember) works pretty much as expected. So the idea of building on top of that existing framework makes more sense to me than attempting to recreate the somewhat haphazard design of Compose.

Wait I thought this already existed?

Yeah I was also confused. So there was a command, podman-generate-systemd, that I had used previously to run containers with Podman using systemd. That has been deprecated in favor of Quadlet units, which are more powerful and offer more of the Compose functionality, but are also more complex and less magically generated.

So if all you want to do is run a container or pod using Systemd, then you can still use podman-generate-systemd which in my testing worked fine and did exactly what it says on the box. However if you want to emulate the functionality of Compose with networks and volumes, then you want Quadlet.

What is Quadlet

The name comes from this excellent pun:

What do you get if you squash a Kubernetes kubelet?
A quadlet

Actually laughed out loud at that. Anyway Quadlet is a tool for running Podman containers under Systemd in a declarative way. It has been merged into Podman 4.4 so it now comes in the box with Podman. When you install Podman it registers a systemd-generator that looks for files in the following directories:

/usr/share/containers/systemd/
/etc/containers/systemd/
# Rootless users
$HOME/.config/containers/systemd/
$XDG_RUNTIME_DIR/containers/systemd/
$XDG_CONFIG_HOME/containers/systemd/
/etc/containers/systemd/users/$(UID)
/etc/containers/systemd/users/

You put unit files in whichever of these directories applies (creating the directory if it doesn't exist, which it probably doesn't), with the file extension telling you what kind of unit you are looking at.

For example, if I wanted a simple volume I would make the following file:

/etc/containers/systemd/example-db.volume

[Unit]
Description=Example Database Container Volume

[Volume]
Label=app=myapp

You have all the same options you would on the command line.

You can see the entire list here: https://docs.podman.io/en/latest/markdown/podman-systemd.unit.5.html

Here are the units you can create: name.container, name.volume, name.network, name.kube, name.image, name.build, name.pod

Workflow

So you have a pretty basic Docker Compose you want to replace with Quadlets. You probably need the following:

  • A network
  • Some volumes
  • A database container
  • An application container

The process is pretty straightforward.

Network

We'll make this one at: /etc/containers/systemd/myapp.network

[Unit]
Description=Myapp Network

[Network]
Label=app=myapp

Volume

/etc/containers/systemd/myapp.volume

[Unit]
Description=Myapp Container Volume

[Volume]
Label=app=myapp

/etc/containers/systemd/myapp-db.volume

[Unit]
Description=Myapp Database Container Volume

[Volume]
Label=app=myapp

Database

/etc/containers/systemd/postgres.container

[Unit]
Description=Myapp Database Container

[Service]
Restart=always

[Container]
Label=app=myapp
ContainerName=myapp-db
Image=docker.io/library/postgres:16-bookworm
Network=myapp.network
Volume=myapp-db.volume:/var/lib/postgresql/data
Environment=POSTGRES_PASSWORD=S3cret
Environment=POSTGRES_USER=user
Environment=POSTGRES_DB=myapp_db

[Install]
WantedBy=multi-user.target default.target

Application

/etc/containers/systemd/myapp.container

[Unit]
Description=Myapp Container
Requires=postgres.service
After=postgres.service

[Container]
Label=app=myapp
ContainerName=myapp
Image=wherever-you-get-this
Network=myapp.network
Volume=myapp.volume:/tmp/place_to_put_stuff
Environment=DB_HOST=postgres
Environment=WORDPRESS_DB_USER=user
Environment=WORDPRESS_DB_NAME=myapp_db
Environment=WORDPRESS_DB_PASSWORD=S3cret
PublishPort=9000:80

[Install]
WantedBy=multi-user.target default.target

Now you need to run

systemctl daemon-reload

and you should be able to use systemctl status to check each of these running processes (start them the first time with systemctl start). You don't need to run systemctl enable to get them to run on next boot IF you have the [Install] section defined. Also notice that when you set the dependencies (Requires, After), the target is called name-of-thing.service, not name-of-thing.container or .volume. It threw me off at first so I just wanted to call that out.
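
For reference, the loop I ended up with looks roughly like this (assuming the unit names from the examples above):

systemctl daemon-reload
systemctl start postgres.service myapp.service
systemctl status myapp.service
journalctl -u myapp.service -f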

One thing I want to call out

Containers support AutoUpdate, which means if you just want Podman to pull down the freshest image from your registry, that is supported out of the box. It's just AutoUpdate=registry. If you change that to local, Podman will restart the container when you build a new version of that image locally as part of a deployment. If you need more information about logging into registries with Podman you can find that here.

I find this very helpful for testing environments, where I can tell servers to just run podman auto-update and get the newest containers. It's also great because it has options to help handle rollbacks and failure scenarios, which are rare but can really blow up in your face with containers outside of k8s. You can see that here.
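
As a sketch, wiring this into the container unit from earlier and then running the update on whatever schedule you like looks roughly like this:

# In /etc/containers/systemd/myapp.container
[Container]
AutoUpdate=registry

# Then, from cron, a systemd timer, or by hand:
podman auto-update --dry-run   # show what would be updated
podman auto-update             # pull newer images and restart the affected units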

What if you don't store images somewhere?

So often with smaller apps it doesn't make sense to add a middle layer of building and storing the image in one place and then pulling it, versus just building the image on the machine you are deploying to with docker compose up -d --no-deps --build myapp

You can do the same thing with Quadlet build files. The unit files are similar to the ones above but with a .build extension, and the documentation makes it pretty simple to figure out how to convert whatever you are looking at.

I found this nice for quick testing so I could easily rsync changes to my test box and trigger a fast rebuild with the container layers mostly getting pulled from cache and only my code changes making a difference.
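
Here's a rough sketch of what such a build unit might look like, assuming your code and Containerfile get rsynced to /srv/myapp; treat the keys as a starting point and check the podman-systemd.unit docs for your Podman version:

/etc/containers/systemd/myapp.build

[Unit]
Description=Myapp Image Build

[Build]
ImageTag=localhost/myapp:latest
File=Containerfile
SetWorkingDirectory=/srv/myapp

[Install]
WantedBy=multi-user.target default.target

The .container unit should then be able to point Image= at the build unit (Image=myapp.build), so the container uses the freshly built image.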

How do secrets work?

So secrets are supported with Quadlets. Effectively they just build on top of podman secret or secrets in Kubernetes. Assuming you don't want to go the Kubernetes route for this purpose, you have a couple of options.

  1. Make a secret from a local file (probably bad idea): podman secret create my_secret ./secret.txt
  2. Make a secret from an environmental variable on the box (better idea): podman secret create --env=true my_secret MYSECRET
  3. Use stdin: printf <secret> | podman secret create my_secret -

Then you can reference these secrets inside of the .container file with Secret=name-of-podman-secret plus options. By default these secrets are mounted at /run/secrets/secretname as a file inside the container. You can configure it to be an environment variable (along with a bunch of other stuff) with the options outlined here.
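
As a sketch, referencing the secret created above from the postgres unit might look like this (the type/target options are described in the Quadlet docs):

# In /etc/containers/systemd/postgres.container
[Container]
Secret=my_secret,type=env,target=POSTGRES_PASSWORD

# Or, with no options, it shows up as a file at /run/secrets/my_secret inside the container:
Secret=my_secret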

Rootless

So my examples above were not rootless containers, which are best practice. You can get them to work, but the behavior is more complicated and has problems I wanted to call out. You need to use default.target instead of multi-user.target, and it also looks like you need loginctl enable-linger so your user can start the containers without being logged in.

Also remember that all of the systemctl commands need the --user argument and that you might need to change your sysctl parameters to allow rootless containers to run on privileged ports.

sudo sysctl net.ipv4.ip_unprivileged_port_start=80

Unblocks 80, for example.
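
Putting the rootless pieces together, the workflow looks roughly like this (unit files go under ~/.config/containers/systemd/ instead of /etc/containers/systemd/):

loginctl enable-linger $USER
systemctl --user daemon-reload
systemctl --user start myapp.service
systemctl --user status myapp.service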

Networking

So for rootless networking Podman previously used slirp4netns and now uses pasta. Pasta doesn't do NAT and instead copies the IP address from your main network interface to the container namespace. "Main" in this case is whichever interface has the default route. This can cause (obvious) problems with inter-container connections since it's all the same IP. You need to configure containers.conf to get around this problem.

[network]
pasta_options = ["-a", "10.0.2.0", "-n", "24", "-g", "10.0.2.2", "--dns-forward", "10.0.2.3"]

Also ping didn't work for me. You can fix that with the solution here.

That sounds like a giant pain in the ass.

Yeah I know. It's not actually the fault of the Podman team. The way rootless containers work is basically that they use user_namespaces to emulate the privileges needed to create containers. Inside the UserNS they can do things like create mount namespaces and set up networking. Outgoing connections are tricky because veth pairs cannot be created across UserNS boundaries without root. Inbound relies on port forwarding.

So tools like slirp4netns and pasta are used since they can translate Ethernet packets to unprivileged socket system calls by making a tap interface available in the namespace. However the end result is you need to account for a lot of potential strangeness in the configuration file. I'm confident this will get less fiddly as time goes on.

Podman also has a tutorial on how to get it set up here: https://github.com/containers/podman/blob/main/docs/tutorials/rootless_tutorial.md which did work for me. If you do the work of rootless containers now you have a much easier security story for the rest of your app, so I do think it ultimately pays off even if it is annoying in the beginning.

Impressions

So as a replacement for Docker Compose on servers, I've really liked Quadlet. I find the logging easier to figure out since we're just using the standard systemctl commands, and checking status is also easier and more straightforward. Getting the rootless containers running took... more time than I expected, because I didn't think about how they wouldn't start by default until the user logged back in without the linger work.

It does stink that this is absolutely not a solution for local dev for most places. I prefer that Podman remains daemonless and instead hooks into the existing functionality of systemd, but for people not running Linux as their local workstation (most people on Earth) you are either going to need to use the Podman Desktop Kubernetes functionality or use podman-compose and just be aware that it's not something you should use in actual production deployments.

But if you are looking for something that scales well, runs containers and is super easy to manage and keep running, this has been a giant hit for me.

Questions/comments/concerns: https://c.im/@matdevdug


Teaching to the Test. Why IT Security Audits Aren’t Making Stuff Safer

A lot has been written in the last few weeks about the state of IT security in the aftermath of the CrowdStrike outage. A range of opinions have emerged, ranging from blaming Microsoft for signing the CrowdStrike software (who in turn blame the EU for making them do it) to blaming the companies themselves for allowing all of these machines access to the Internet to receive the automatic template update. Bike-shedding among the technical community continues to be focused on the underlying technical deployment, which misses the forest for the trees.

The better question is: what was the forcing mechanism that convinced every corporation in the world that it was a good idea to install software like this on every single machine? Why is there such a cottage industry of companies that are effectively undermining operating system security with the argument that they are doing more "advanced" security features, allowing (often unqualified) security and IT departments to make fundamental changes to things like TLS encryption and basic OS functionality? How did all these smart people let a random company push updates to everyone on Earth with zero control? The justification often given is "to pass the audit".

These audits and certifications, of which there are many, are a fundamentally broken practice. The intent of the frameworks was good, allowing for the standardization of good cybersecurity practices while not relying on the expertise of an actual cybersecurity expert to validate the results. We can all acknowledge there aren't enough of those people on Earth to actually audit all the places that need to be audited. The issue is the audits don't actually fix real problems, but instead create busywork for people so it looks like they are fixing problems. It lets people cosplay as security experts without needing to actually understand what the stuff is.

I don't come to this analysis lightly. Between HIPAA, PCI, GDPR, ISO27001 and SOC2 I've seen every possible attempt to boil requirements down to a checklist that you can do. Add in the variations on these that large companies like to send out when you are attempting to sell them an enterprise SaaS and it wouldn't surprise me at all to learn that I've spent over 10,000 hours answering and implementing solutions to meet the arbitrary requirements of these documents. I have both produced the hundred page PDFs full of impressive-looking screenshots and diagrams AND received the PDFs full of diagrams and screenshots. I've been on many calls where it is clear neither of us understands what the other is talking about, but we agree that it sounds necessary and good.

I have also been there in the room when inept IT and Security teams use these regulations, or more specifically their interpretation of these regulations, to justify kicking off expensive and unnecessary projects. I've seen laptops crippled due to full filesystem scans looking for leaked AWS credentials and Social Security numbers, even if the employee has nothing to do with that sort of data. I've watched as TLS encryption is broken with proxies so that millions of files can be generated and stored inside of S3 for security teams to never ever look at again. Even I have had to reboot my laptop to apply a non-critical OS update in the middle of an important call. All this inflicted on poor people who had to work up the enthusiasm to even show up to their stupid jobs today.

Why?

Why does this keep happening? How is it that every large company keeps falling into the same trap of repeating the same expensive, bullshit processes?

  • The actual steps to improve cybersecurity are hard and involve making executives mad. You need to update your software, including planning ahead for end-of-life technology. Since this dark art is apparently impossible to do and would involve a lot of downtime to patch known-broken shit and reboot it, we won't do that. Better apparently to lose the entire Earth's personal data.
  • Everyone is terrified that there might be a government regulation with actual consequences, so the industry needs a solution to this problem that sounds impressive but carries no real punishments. If Comcast executives could go to jail for knowingly running out-of-date Citrix NetScaler software, it would have been fixed. So instead we get impressive-sounding things which can be held up as evidence of compliance, so that if they ultimately don't end up preventing leaks, the consequences are minor.
  • Nobody questions the justification of "we need to do x because of our certification". The actual requirements are too boring to read so it becomes this blank check that can be used to roll out nearly anything.
  • Easier to complete a million nonsense steps than it is to get in contact with someone who understands why the steps are nonsense. The number of times I've turned on silly "security settings" to pass an audit when the settings weren't applicable to how we used the product is almost too high to count.
  • Most Security teams aren't capable of stopping a dedicated attacker and, in their souls, know that to be true. Especially in large organizations, the number of conceivable attack vectors becomes too painful to even think about. Therefore too much faith is placed in companies like Zscaler and CrowdStrike to use "machine learning and AI" (read: magic) to close up all the possible exploits before they happen.
  • If your IT department works exclusively with Windows and spends their time working with GPOs and Powershell, every problem you hand them will be solved with Windows. If you handed the same problem to a Linux person, you'd get a Linux solution. People just use what they know. So you end up with a one-size-fits-all approach to problems. Like mice in a maze where almost every step is electrified, if Windows loaded up with bullshit is what they are allowed to deploy without hassles that is what you are going to get.

Future

We all know this crap doesn't work and the sooner we can stop pretending it makes a difference, the better. AT&T had every certification on the planet and still didn't take the incredibly basic step of enforcing 2FA on a database of all the most sensitive data it has in the world. If following these stupid checklists and purchasing the required software resulted in more secure platforms, I'd say "well at least there is a payoff". But time after time we see the exact same thing, which is that an audit is not an adequate replacement for someone who knows what they are doing looking at your stack and asking hard questions about your process. These audits aren't resulting in organizations doing the hard but necessary step of taking downtime to patch critical flaws, or even applying basic security settings across all of their platforms.

Because cryptocurrency now allows hacking groups to demand millions of dollars in payments (thanks crypto!), the financial incentives to cripple critical infrastructure have never been better. At the same time most regulations designed to encourage the right behavior are completely toothless. Asking the tech industry to regulate itself has failed, without question. All that does is generate a lot of pain and suffering for their employees, who most businesses agree are disposable and idiots, while doing nothing to secure personal data. Even in organizations that had smart security people asking hard questions, that advice is entirely optional. There is no stick when it comes to cybersecurity and businesses, especially now that almost all of them have made giant mistakes.

I don't know what the solution is, but I know this song and dance isn't working. The world would be better off if organizations stopped wasting so much time and money on these vendor solutions and instead stuck to much more basic solutions. Perhaps if we could just start with "have we patched all the critical CVEs in our organization" and "did we remove the shared username and password from the cloud database with millions of call records", then perhaps AFTER all the actual work is done we can have some fun and inject dangerous software into the most critical parts of our employees devices.

Find me at: https://c.im/@matdevdug


Sears

It was 4 AM when I first heard the tapping on the glass. I had been working for 30 minutes trying desperately to get everything from the back store room onto the sales floor when I heard a light knocking. Peeking out from the back I saw an old woman wearing sweat pants and a Tweetie bird jacket, oxygen tank in tow, tapping a cane against one of the big front windows. "WE DON'T OPEN UNTIL 5" shouted my boss, who shook her head and resumed stacking boxes. "Black Friday is the worst" she said to nobody as we continued to pile the worthless garbage into neat piles on the store floor.

What people know now but didn't understand then was the items for sale on Black Friday weren't our normal inventory. These were TVs so poorly made they needed time to let their CRT tubes warm up before the image became recognizable. Radios with dials so brittle some came out of the box broken. Finally a mixer that when we tested it in the back let out such a stench of melted plastic we all screamed to turn it off before we burned down the building. I remember thinking as I unloaded it from the truck certainly nobody is gonna want this crap.

Well here they were and when we opened the doors they rushed in with a violence you wouldn't expect from a crowd of mostly senior citizens. One woman pushed me to get at the TVs, which was both unnecessary (I had already hidden one away for myself and put it behind the refrigerators in the back) and not helpful as she couldn't lift the thing on her own. I watched in silence as she tried to get her hands around the box with no holes cut out, presumably a cost savings on Sears part, grunting with effort as the box slowly slid while she held it. At the checkout desk a man told me he was buying the radio "as a Christmas gift for his son". "Alright but no returns ok?" I said keeping a smile on my face.

We had digital cameras the size of shoe-boxes, fire-hazard blenders and an automatic cat watering dish that I just knew was going to break a lot of hearts when Fluffy didn't survive the family trip to Florida. You knew it was quality when the dye from the box rubbed off on your hands when you picked it up. Despite my jokes about worthless junk, people couldn't purchase it fast enough. I saw arguments break out in the aisles and saw Robert, our marine veteran sales guy, whisper "forget this" and leave for a smoke by the loading dock. When I went over to ask if I could help, the man who had possession of the digital camera spun around and told me to "either find another one of these cameras or butt the fuck out". They resumed their argument and I resumed standing by the front telling newcomers that everything they wanted was already gone.

Hours later I was still doing that, informing everyone who walked in that the item they had circled in the newspaper was already sold out. "See, this is such a scam, why don't you stock more of it? It's just a trick to get us into the store". Customer after customer told me variations on the above, including one very kind looking grandfather type informing me I could "go fuck myself" when I wished him a nice holiday.

Beginnings

The store was in my small rural farming town in Ohio, nestled between the computer shop where I got my first job and a carpet store that was almost certainly a money laundering front since nobody ever went in or out. I was interviewed by the owner, a Vietnam veteran who spent probably half our interview talking about his two tours in Vietnam. "We used to throw oil drums in the water and shoot at them from our helicopter, god that was fun. Don't even get me started about all the beautiful local woman." I nodded, unsure what this had to do with me but sensing this was all part of his process. In the years to come I would learn to avoid sitting down in his office, since then you would be trapped listening to stories like these for an hour plus.

After these tales of what honestly sounded like a super fun war full of drugs and joyrides on helicopters, he asked me why I wanted to work at Sears. "It's an American institution and I've always had a lot of respect for it" I said, not sure if he would believe it. He nodded and went on to talk about how Sears built America. "Those kit houses around town, all ordered from Sears. Boy we were something back in the day. Anyway fill out your availability and we'll get you out there helping customers." I had assumed at some point I would get training on the actual products, which never happened in the years I worked there. In the back were dust-covered training manuals which I was told I should look at "when I got some time". I obviously never did and still sometimes wonder about what mysteries they contained.

I was given my lanyard and put on the floor, which consisted of half appliances, one quarter electronics and the rest tools. Jane, one of the saleswomen, told me to "direct all the leads for appliances to her" and not check one out myself, since I didn't get commission. Most of my job consisted of swapping broken Craftsman tools since they had a lifetime warranty. You filled out a carbon paper form, dropped the broken tool into a giant metal barrel and then handed them a new one. I would also set up deliveries for rider lawnmowers and appliances, working on an ancient IBM POS terminal that required memorizing a series of strange keyboard shortcuts to navigate the calendar.

When there was downtime, I would go into the back and help Todd assemble the appliances and rider lawnmowers. Todd was a special needs student at my high school who was the entirety of our "expert assembly" service. He did a good job, carefully following the manual every time. Whatever sense of superiority I felt as an honor roll student disappeared when he watched me try to assemble a rider mower myself. "You need to read the instructions and then do what they say" he would helpfully chime in as I struggled to figure out why the brakes did nothing. His mowers always started on the first try while mine were safety hazards that I felt certain were going to end up on the news. "Tonight a Craftsman rider lawnmower killed a family of 4. It was assembled by this idiot." Then just my yearbook photo where I had decided to bleach my hair blonde like a chonky backstreet boy overlaid on top of live footage of blood splattered house siding.

Any feeling I had that people were wasting $200 by paying us to assemble their rider mowers disappeared when I saw the first one a customer had tried to assemble himself. If my mowers were death traps these were actual IEDs whose only conceivable purpose on Earth would be to trick innocent people into thinking they were rider lawnmowers until you turned the key and they blew you into the atmosphere. One guy brought his back with several ziplock bags full of screws, bashfully explaining that he tried his best but "there's just no way that's right". That didn't stop me from holding my breath every time someone drove a mower I had worked on up the ramp into the back of the truck. "Please god just don't fall apart right now, wait until they get it home" was my prayer to whatever deity looked after idiots in jobs they shouldn't have.

Sometimes actual adults with real jobs would come in asking me questions about tools, conversations that both of us hated. "I'm looking for an oil filter wrench" they would say, as if this item was something I knew about and could find. "Uh sure, could you describe it?" "It's a wrench, used for changing oil filters, has a loop on it." I'd nod and then feebly offer them up random items until they finally grabbed it themselves. One mechanic, when I offered up a claw hammer in response to his request for a cross-pein hammer, said "you aren't exactly handy, are you?" I shook my head and went back behind the counter, attempting to establish what little authority I had left with the counter. I might not know anything about the products we sell, but only one of us is allowed back here sir.

Sears Expert

As the months dragged on I was moved from the heavier foot traffic shifts to the night shifts. This was because customers "didn't like talking to me", a piece of feedback I felt was true but still unfair. I had learned a lot, like every incorrect way to assemble a lawn mower and that refrigerators are all the same except for the external panels. Night shifts were mostly getting things ready for the delivery company, a father and son team who were always amusing.

The father was a chain-smoking tough guy who would regularly talk about his "fuck up" of a son. "That idiot dents another oven when we're bringing it in I swear to god I'm going to replace him with one of those Japanese robots I keep seeing on the news." The son was the nicest guy on Earth, really hard working, always on time for deliveries and we got like mountains of positive feedback about him. Old ladies would tear up as they told me about the son hauling their old appliances away in a blizzard on his back. He would just sit there, smile frozen on his face while his father went on and on about how much of a failure he was. "He's just like this sometimes" the son would tell me by the loading dock, even though I would never get involved. "He's actually a nice guy". This was often punctuated by the father running into a minor inconvenience and flying off the handle. "What kind of jackass would sort the paperwork alphabetically instead of by order of delivery?" he'd scream from the parking lot.

When the son went off to college he was replaced by a Hispanic man who took zero shit. His response to customer complaints was always that they were liars and I think the father was afraid of him. "Oh hey don't bother Leo with that, he's not in the mood, I'll call them and work it out" the father would tell me as Leo glared at us from the truck. Leo was incredibly handy though, able to fix almost any dent or scratch in minutes. He popped the dent out of my car door by punching the panel, which is still one of the cooler things I've seen someone do.

Other than the father and son duo, I was mostly alone with a woman named Ruth. She fascinated me because her life was unspeakably bleak. She had been born and raised in this town and had only left the county once in her life, to visit the Sears headquarters in Chicago. She'd talk about it like she had been permitted to visit heaven. "Oh it was something, just a beautiful shiny building full of the smartest people you ever met. Boy I'd love to see it again sometime." She had married her high school boyfriend, had children and now worked here in her 60s as her reward for a life of hard work. She had such bad pain in her knees she had to lean on the stocking cart as she pushed it down the aisles, often stopping to catch her breath. The store would be empty except for the sounds of a wheezing woman and squeaky wheels.

When I would mention Chicago was a 4 hour drive and she could see it again, she'd roll her eyes at me and continue stocking shelves. Ruth was a type of rural person I encountered a lot who seemed to get off on the idea that we were actually isolated from the outside world by a force field. Mention leaving the county to go perhaps to the next county and she would laugh or make a comment about how she wasn't "that kind of person". Every story she would tell had these depressing endings that left me pondering what kind of response she was looking for. "My brother, well he went off to war and when he came back was just a shell of a man. Never really came back if you ask me. Anyway let's clean the counters."

She'd talk endlessly about her grandson, a 12 year old who was "stupid but kind". His incredibly minor infractions were relayed to me like she was telling me about a dark family scandal. "Then I said, who ate all the chips? I knew he had, but he just sat there looking at me and I told him you better wipe those crumbs off your t-shirt smartass and get back to your homework". He finally visited and I was shocked to discover there was also a granddaughter who I had never heard about. He smirked when he met me and told me that Ruth had said I was "a lazy snob".

I'll admit, I was actually a little hurt. Was I a snob compared to Ruth? Absolutely. To be honest with you I'm not entirely sure she was literate. I'd sneak books under the counter to read during the long periods where nothing was happening and she'd often ask me what they were about even if the title sort of explained it. "What is Battle Cry of Freedom: The Civil War Era about? Um well the Civil War." I'd often get called over to "check" documents for her, which typically included anything more complicated than a few sentences. I still enjoyed working with her.

Our relationship never really recovered after I went to Japan when I was 16. I went by myself and wandered around Tokyo, having a great time. When I returned full of stories and pictures of the trip, I could tell she was immediately sick of me. "Who wants to see a place like Japan? Horrible people" she'd tell me as I tried to tell her that things had changed a tiny bit since WWII. "No it's really nice and clean, the food was amazing, let me tell you about these cool trains they have". She wasn't interested and it was clear my getting a passport and leaving the US had changed her opinion of me.

So when her grandson confided that she had called me lazy AND a snob my immediate reaction was to lean over and tell him that she had called him "a stupid idiot". Now she had never actually said "stupid idiot", but in the heat of the moment I went with my gut. Moments after I did that the reality of a 16 year old basically bullying a 12 year old sunk in and I decided it was time for me to go take out some garbage. Ruth of course found out what I said and mentioned it every shift after that. "Saying I called my grandson a stupid idiot, who does that, a rude person that's who, a rude snob" she'd say loud enough for me to hear as the cart very slowly inched down the aisles. I deserved it.

Trouble In Paradise

At a certain point I was allowed back in front of customers and realized with a shock that I had worked there for a few years. The job paid very little, which was fine as I had nothing in the town to actually buy, but enough to keep my lime green Ford Probe full of gas. It shook violently if you exceeded 70 MPH, which I should have asked someone about but never did. I was paired with Jane, the saleswoman who was a devout Republican and liked to make fun of me for being a Democrat. This was during the George W Bush vs Kerry election and she liked to point out how Kerry was a "flipflopper" on things. "He just flips and flops, changes his mind all the time". I'd point out we had vaporized the country of Iraq for no reason and she'd roll her eyes and tell me I'd get it when I was older.

My favorite was when we were working together during Reagan's funeral, an event which elicited no emotion from me but drove her to tears multiple times. "Now that was a man and a president" she'd exclaim to the store while the funeral procession was playing on the 30 TVs. "He won the Cold War you know?" she'd shout at a woman looking for replacement vacuum cleaner bags. Afterwards she asked me what my favorite Reagan memory was. All I could remember was that he had invaded the small nation of Grenada for some reason, so I said that. "Really showed those people not to mess with the US" she responded. I don't think either one of us knew that Grenada is a tiny island nation with a population less than 200,000.

Jane liked to dispense country wisdom, witty one-liners that only sometimes were relevant to the situation at hand. When confronted with an angry customer she would often say afterwards that "You can't make a silk purse out of a sow's ear", which still means nothing to me. Whatever rural knowledge I was supposed to obtain through osmosis my brain clearly rejected. Jane would send me over to sell televisions since I understood what an HDMI cord was and the difference between SD and HD television.

Selling TVs was perhaps the only thing I did well, that and the fun vacuum demonstration where we would dump a bunch of dirt on a carpet tile and suck it up. Some poor customer would tell me she didn't have the budget for the Dyson and I'd put my hand up to silence her. "You don't have to buy it, just watch it suck up a bunch of pebbles. I don't make commission anyway so who cares." Then we'd both watch as the Dyson made a horrible screeching noise and sucked in a cup's worth of small rocks. "That's pretty cool huh?" and the customer would nod, probably terrified of what I would do if she said no.

Graduation

When I graduated high school and prepared to go off to college, I had the chance to say goodbye to everyone before I left. They had obviously already replaced me with another high school student, one that knew things about tools and was better looking. You like to imagine that people will miss you when you leave a job, but everyone knew that wasn't true here. I had been a normal employee who didn't steal and mostly showed up on time.

My last parting piece of wisdom from Ruth was not to let college "make me forget where I came from". Sadly for her I was desperate to do just that, entirely willing to adopt whatever new personality was presented to me. I hated rural life and still do: the spooky dark roads surrounded by corn, yelling at Amish teens to stop shoplifting during their Rumspringa when they would get dropped off in the middle of town and left to their own devices.

Still I'm grateful that I at least know how to assemble a rider lawnmower, even if it did take a lot of practice runs on customers' mowers.


A Eulogy for DevOps

We hardly knew ye.

DevOps, like many trendy technology terms, has gone from the peak of optimism to the depths of exhaustion. While many of the fundamental ideas behind the concept have become second-nature for organizations, proving it did in fact have a measurable outcome, the difference between the initial intent and where we ended up is vast. For most organizations this didn't result in a wave of safer, easier to use software but instead encouraged new patterns of work that centralized risk and introduced delays and irritations that didn't exist before. We can move faster than before, but that didn't magically fix all our problems.

The cause of its death was a critical misunderstanding over what was making software hard to write. The belief was that by removing barriers to deployment, more software would get deployed and things would be easier and better; effectively, that the issue was developers and operations teams being held back by ridiculous process and coordination. In reality these "soft problems" of communication and coordination are much more difficult to solve than the technical problems around pushing more code out into the world more often.

What is DevOps?

DevOps, when it was introduced around 2007, was a pretty radical concept of removing the divisions between the people who ran the hardware and the people who wrote the software. Organizations still had giant silos between teams, and I experienced a lot of that workflow myself.

Since all computer nerds also love space, it was basically us cosplaying as NASA, copying a lot of the procedures and ideas from NASA to try to increase the safety around pushing code out into the world. Different organizations would copy and paste different parts, but the basic premise was that every release was as close to bug free as time allowed. You were typically shooting for zero exceptions.

When I worked for a legacy company around that time, the flow for releasing software looked as follows.

  • Development team would cut a release of the server software with a release number in conjunction with the frontend team typically packaged together as a full entity. They would test this locally on their machines, then it would go to dev for QA to test, then finally out to customers once the QA checks were cleared.
  • Operations teams would receive a playbook of effectively what the software was changing and what to do if it broke. This would include how it was supposed to be installed, whether it did anything to the database and so on; it was a whole living document. The idea was that the people managing the servers, networking equipment and SANs had no idea what the software did or how to fix it, so they needed what were effectively step by step instructions. Sometimes you would even get this as a paper document.
  • Since these happened often inside of your datacenter, you didn't have unlimited elasticity for growth. So, if possible, you would slowly roll out the update and stop to monitor at intervals. But you couldn't do what people see now as a blue/green deployment because rarely did you have enough excess server capacity to run two versions at the same time for all users. Some orgs did do different datacenters at different times and cut between them (which was considered to be sort of the highest tier of safety).
  • You'd pick a deployment day, typically middle of the week around 10 AM local time and then would monitor whatever metrics you had to see if the release was successful or not. These were often pretty basic metrics of success, including some real eyebrow raising stuff like "is support getting more tickets" and "are we getting more hits to our uptime website". Effectively "is the load balancer happy" and "are customers actively screaming at us".
  • You'd finish the deployment and then the on-call team would monitor the progress as you went.

Why Didn't This Work

Part of the issue was this design was very labor-intensive. You needed enough developers coordinating together to put together a release. Then you needed a staffed QA team to actually take that software and ensure, on top of automated testing which was jusssttttt starting to become a thing, that the software actually worked. Finally you needed a technical writer working with the development team to figure out what the release playbook should look like, and then the Operations team had to receive the book, review it and implement the plan.

It was also slow. Features that were already done would often sit for months just because a more important feature had to go out first, or because an update was making major changes to the database and we didn't want to bundle six other things in with the one possibly catastrophic change. It's effectively the Agile vs Waterfall divide broken out into practical steps.

Waterfall vs Agile in software development infographic

A lot of the lip service around this time that was given as to why organizations were changing was, frankly, bullshit. The real reason companies were so desperate to change was the following:

  • Having lots of mandatory technical employees they couldn't easily replace was a bummer
  • Recruitment was hard and expensive.
  • Sales couldn't easily inject whatever last-minute deal requirement they had into the release cycle since that was often set in stone.
  • It provided an amazing opportunity for SaaS vendors to inject themselves into the process by offloading complexity into their stack so they pushed it hard.
  • The change also emphasized the strengths of cloud platforms at the time when they were starting to gobble market share. You didn't need lots of discipline, just allocate more servers.
  • Money was (effectively) free so it was better to increase speed regardless of monthly bills.
  • Developers were understandably frustrated that minor changes could take weeks to get out the door while they were being blamed for customer complaints.

So executives went to a few conferences and someone asked them if they were "doing DevOps" and so we all changed our entire lives so they didn't feel like they weren't part of the cool club.

What Was DevOps?

Often this image is used to sum it up:

DevOps Infinity Wheel

In a nutshell, the basic premise was that development teams and operations teams were now one team. QA was fired and replaced with this idea that because you could very quickly deploy new releases and get feedback on those releases, you didn't need a lengthy internal test period where every piece of functionality was retested and determined to still be relevant.

Often this is conflated with the concept of SRE from Google, which I will argue until I die is a giant mistake. SRE is in the same genre but a very different tune, with a much more disciplined and structured approach to this problem. DevOps instead is about the simplification of the stack such that any developer on your team can deploy to production as many times in a day as they wish with only the minimal amounts of control on that deployment to ensure it had a reasonably high chance of working.

In reality DevOps as a practice looks much more like how Facebook operated, with employees committing to production on their first day and relying extensively on real-world signals to determine success or failure vs QA and tightly controlled releases.

In practice it looks like this (with a toy sketch of the build-and-push step after the list):

  • Development makes a branch in git and adds a feature, fix, change, etc.
  • They open up a PR and then someone else on that team looks at it, sees it passes their internal tests, approves it and then it gets merged into main. This is effectively the only safety step, relying on the reviewer to have perfect knowledge of all systems.
  • This triggers a webhook to the CI/CD system which starts the build (often of an entire container with this code inside) and then once the container is built, it's pushed to a container registry.
  • The CD system tells the servers that the new release exists, often through a Kubernetes deployment, pushing a new version of an internal package or using the internal CLI of the cloud provider's specific "run a container as a service" platform. It then monitors and tells you about the success or failure of that deployment.
  • Finally there are release-aware metrics which allow that same team, who is on-call for their application, to see if something has changed since they released it. Is latency up, error count up, etc. This is often just a line in a graph saying this was old and this is new.
  • Depending on the system, this can either be something where every time the container is deployed it is on brand-new VMs or it is using some system like Kubernetes to deploy "the right number" of containers.
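
To make the middle of that list concrete, here is a toy sketch of the "webhook triggers a build and push" step as a tiny Flask handler. It isn't any real CI system; the endpoint path, registry name and payload fields are illustrative assumptions only.

import subprocess
from flask import Flask, request

app = Flask(__name__)
REGISTRY = "registry.example.com/myapp"  # hypothetical registry/image name

@app.route("/hooks/push", methods=["POST"])
def on_push():
    payload = request.get_json(silent=True) or {}
    # Only react to merges into main; a real CI system would also verify the webhook signature.
    if payload.get("ref") != "refs/heads/main":
        return {"ignored": payload.get("ref")}, 202
    sha = payload.get("after", "latest")
    image = f"{REGISTRY}:{sha}"
    # Build the container and push it to the registry; something else (Kubernetes,
    # a CD tool, a bash script) then notices the new tag and rolls it out.
    subprocess.run(["docker", "build", "-t", image, "."], check=True)
    subprocess.run(["docker", "push", image], check=True)
    return {"built": image}, 201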

The sales pitch was simple. Everyone can do everything so teams no longer need as many specialized people. Frameworks like Rails made database operations less dangerous, so we don't need a team of DBAs. Hell, use something like Mongo and you never need a DBA!

DevOps combined with Agile ended up with a very different philosophy of programming which had the following conceits:

  • The User is the Tester
  • Every System Is Your Specialization
  • Speed Of Shipping Above All
  • Catch It In Metrics
  • Uptime Is Free, SSO Costs Money (cheap-to-build features were sold as premium while expensive-to-provide availability wasn't charged for)
  • Logs Are Business Intelligence

What Didn't Work

The first cracks in this model emerged pretty early on. Developers were testing on their local Mac and Windows machines and then deploying code to Linux servers configured from Ansible playbooks and left running for months, sometimes years. Inevitably small differences in the running fleet of production servers emerged, either from package upgrades for security reasons or just from random configuration events. This could be mitigated by frequently rotating the running servers by destroying and rebuilding them as fresh VMs, but in practice this wasn't done as often as it should have been.

Soon you would see things like "it's running fine on box 1, 2, 4, 5, but 3 seems to be having problems". It wasn't clear in the DevOps model who exactly was supposed to go figure out what was happening or how. In the previous design someone who worked with Linux for years and with these specific servers would be monitoring the release, but now those team members often wouldn't even know a deployment was happening. Telling someone who is amazing at writing great Javascript to go "find the problem with a Linux box" turned out to be easier said than done.

Quickly feedback from developers started to pile up. They didn't want to have to spend all this time figuring out what Debian package they wanted for this or that dependency. It wasn't what they were interested in doing and also they weren't being rewarded for that work, since they were almost exclusively being measured for promotions by the software they shipped. This left the Operations folks in charge of "smoothing out" this process, which in practice often meant really wasteful practices.

You'd see really strange workflows around this time of doubling the number of production servers you were paying for by the hour during a deployment and then slowly scaling them down, all relying on the same AMI (server image) to ensure some baseline level of consistency. However since any update to the AMI required a full dev-stage-prod check, things like security upgrades took a very long time.

Soon you had just a pile of issues that became difficult to assign. Who "owned" platform errors that didn't result in problems for users? When a build worked locally but failed inside of Jenkins, what team needed to check that? The idea of we're all working on the same team broke down when it came to assigning ownership of annoying issues because someone had to own them or they'd just sit there forever untouched.

Enter Containers

DevOps got a real shot in the arm with the popularization of containers, which allowed the movement to progress past its awkward teenage years. Not only did this (mostly) solve the "it worked on my machine" thing but it also allowed for a massive simplification of the Linux server component part. Now servers were effectively dumb boxes running containers, either on their own with Docker compose or as part of a fleet with Kubernetes/ECS/App Engine/Nomad/whatever new thing that has been invented in the last two weeks.

Combined with the fact that you could move almost everything that might previously have been a networking team problem or a SAN problem into configuration inside of the cloud provider through tools like Terraform, you saw a real flattening of the skill curve. This greatly reduced the expertise required to operate these platforms and allowed for more automation. Soon you started to see what we now recognize as the current standard for development, which is "I push out a bajillion changes a day to production".

What Containers Didn't Fix

So there's a lot of other shit in that DevOps model we haven't talked about.

So far teams had improved the "build, test and deploy" parts. However operating the crap was still very hard. Observability was really really hard and expensive. Discoverability was actually harder than ever because stuff was constantly changing beneath your feet and finally the Planning part immediately collapsed into the ocean because now teams could do whatever they wanted all the time.

Operate

This meant someone going through and doing all the boring stuff. Upgrading Kubernetes, upgrading the host operating system, making firewall rules, setting up service meshes, enforcing network policies, running the bastion host, configuring the SSH keys, etc. What organizations quickly discovered was that this stuff was very time consuming to do and often required more specialization than the roles they had previously gotten rid of.

Before you needed a DBA, a sysadmin, a network engineer and some general Operations folks. Now you needed someone who not only understood databases but understood your specific cloud provider's version of that database. You still needed someone with the sysadmin skills, but in addition they needed to be experts in your cloud platform in order to ensure you weren't exposing your database to the internet. Networking was still critical but now it all existed at a level outside of your control, meaning weird issues would sometimes have to get explained as "well that sometimes happens".

Often teams would delay maintenance tasks out of a fear of breaking something like k8s or their hosted database, but that resulted in delaying the pain and making their lives more difficult. This was the era where every startup I interviewed with basically just wanted someone to update all the stuff in their stack "safely". Every system was well past EOL and nobody knew how to Jenga it all together.

Observe

As applications shipped more often, knowing whether they worked became more important so you could roll back if a release blew up in your face. However replacing simple uptime checks with detailed traces, metrics and logs was hard. These technologies are specialized and require a detailed understanding of what they do and how they work. A centralized syslog box works up to a point and then it doesn't. Prometheus scales to a certain volume of metrics and then no longer works on a single box. You needed someone with a detailed understanding of how metrics, logs and traces work, and of how to work with development teams to get them sending the correct signals to the right places at the right fidelity.

Or you could pay a SaaS a shocking amount to do it for you. The rise of companies like Datadog and the eye-watering bills that followed was proof that they understood how important what they were providing was. You quickly saw Observability bills exceed CPU and networking costs for organizations, as one team would misconfigure their application logs and suddenly you had blown through your monthly quota in a week.

Developers were expected to monitor with detailed precision what was happening with their applications without a full understanding of what they were seeing. How many metrics and logs were being dropped on the floor or sampled away, how the platform displayed those logs to them, how to write a query over terabytes of logs so you can surface what you need quickly: all of this was being passed around in Confluence pages written by desperate developers who were learning how all this shit works together while getting paged at 2AM.

Continuous Feedback

This to me is the same problem as Observe. It's about whether your deployment worked or not and whether you had signal from internal tests if it was likely to work. It's also about feedback from the team on what in this process worked and what didn't, but because nobody ever did anything with that internal feedback we can just throw that one directly in the trash.

I guess in theory this would be retros where we all complain about the same six things every sprint and then continue with our lives. I'm not an Agile Karate Master so you'll need to talk to the experts.

Discover

A big pitch of combining these teams was the idea of more knowledge sharing. Development teams and Operation teams would be able to cross-share more about what things did and how they worked. Again it's an interesting idea and there was some improvement to discoverability, but in practice that isn't how the incentives were aligned.

Developers weren't rewarded for discovering more about how the platform operated and Operations didn't have any incentive to sit down and figure out how the frontend was built. It's not a lack of intellectual curiosity by either party, just the finite amount of time we all have before we die and what we get rewarded for doing. Being surprised that this didn't work is like being surprised a mouse didn't go down the tunnel with no cheese just for the experience.

In practice I "discovered" that if NPM was down nothing worked, and the frontend team "discovered" that troubleshooting Kubernetes was a bit like the Warhammer 40k Adeptus Mechanicus: waving incense in front of machines they didn't understand in the hope that it would make the problem go away.

The Adeptus Mechanicus - Warhammer Universe (2024)
Try restarting the Holy Deployment

Plan

Maybe more than anything else, this lack of centralization impacted planning. Since teams weren't syncing on a regular basis anymore, things could continue in crazy directions unchecked. In theory PMs were syncing with each other to try and ensure there were railroad tracks in front of the train before it plowed into the ground at 100 MPH, but that was a lot to put on a small cadre of people.

We see this especially in large orgs with microservices where it is easier to write a new microservice to do something rather than figure out which existing microservice does the thing you are trying to do. This model was sustainable when money was free and cloud budgets were unlimited, but once that gravy train crashed into the mountain of "businesses need to be profitable and pay taxes" that stopped making sense.

The Part Where We All Gave Up

A lot of orgs solved the problems above by simply throwing bodies into the mix. More developers meant it was possible for teams to have someone (anyone) learn more about the systems and how to fix them. Adding more levels of PMs and overall planning staff meant even with the frantic pace of change it was...more possible to keep an eye on what was happening. While cloud bills continued to go unbounded, for the most part these services worked and allowed people to do the things they wanted to do.

Then the layoffs and budget cuts started. Suddenly it wasn't acceptable to spend unlimited money with your logging platform and your cloud provider on top of having a full team. Almost instantly I saw the shift as organizations started talking about "going back to basics". Part of this was a hard turn in the narrative around Kubernetes, where it went from an amazing technology that lets you grow to Google scale to a weight around an organization's neck that nobody understood.

Platform Engineering

Since there are no new ideas, just new terms, a successor to the throne has emerged. No longer are development teams expected to understand and troubleshoot the platforms that run their software, instead the idea is that the entire process is completely abstracted away from them. They provide the container and that is the end of the relationship.

From a certain perspective this makes more sense since it places the ownership for the operation of the platform with the people who should have owned it from the beginning. It also removes some of the ambiguity over what is whose problem. The development teams are still on-call for their specific application errors, but platform teams are allowed to enforce more global rules.

Well at least in theory. In practice this is another expansion of roles. You went from needing to be a Linux sysadmin to being a cloud-certified Linux sysadmin to being a Kubernetes-certified multicloud Linux sysadmin to finally being an application developer who can create a useful webUI for deploying applications on top of a multicloud stack that runs on Kubernetes in multiple regions with perfect uptime and observability that doesn't blow the budget. I guess at some point between learning the difference between AWS and GCP we were all supposed to go out and learn how to make useful websites.

This division of labor makes no sense but at least it's something I guess. Feels like somehow Developers got stuck with a lot more work and Operation teams now need to learn 600 technologies a week. Surprisingly tech executives didn't get any additional work with this system. I'm sure the next reorg they'll chip in more.

Conclusion

We are now seeing a massive contraction of the Infrastructure space. Teams are increasingly looking for simple, less platform specific tooling. In my own personal circles it feels like a real return to basics, as small and medium organizations abandon technology like Kubernetes and adopt much more simple and easy-to-troubleshoot workflows like "a bash script that pulls a new container".

In some respects it's a positive change, as organizations stop pretending they needed a "global scale" and can focus on actually servicing the users and developers they have. In reality a lot of this technology was adopted by organizations who weren't ready for it and didn't have a great plan for how to use it.

However Platform Engineering is not a magical solution to the problem. It is instead another fabrication of an industry desperate to show monthly growth for cloud providers, who know teams lack the expertise to create the kinds of tooling described by such practices. In reality organizations need to be more brutally honest about what they actually need vs what bullshit they've been led to believe they need.

My hope is that we keep the gains from the DevOps approach and focus on simplification and stability over rapid transformation in the Infrastructure space. I think we desperately need a return to basics ideology that encourages teams to stop designing with the expectation that endless growth is the only possible outcome of every product launch.


GitHub Copilot Workspace Review

I was recently invited to try out the beta for GitHub's new AI-driven web IDE and figured it could be an interesting time to dip my toes into AI. So far I've mostly avoided AI tooling, having tried the paid GitHub Copilot option and been frankly underwhelmed; it made more work for me than it saved. However this is free for me to try and I figured "hey why not".

Disclaimer: I am not and have never been an employee of GitHub, Microsoft, any company owned by Microsoft, etc. They don't care about me and likely aren't aware of my existence. Nobody from GitHub PR asked me to do this and probably won't like what I have to say anyway.

TL;DR

GitHub Copilot Workspace didn't work on a super simple task regardless of how easy I made the task. I wouldn't use something like this for free, much less pay for it. It sort of failed in every way it could at every step.

What is GitHub Copilot Workspace?

So this builds on GitHub Copilot, which seems to have been a success according to them:

In 2022, we launched GitHub Copilot as an autocomplete pair programmer in the editor, boosting developer productivity by up to 55%. Copilot is now the most widely adopted AI developer tool. In 2023, we released GitHub Copilot Chat—unlocking the power of natural language in coding, debugging, and testing—allowing developers to converse with their code in real time.

They have expanded on this feature set with GitHub Copilot Workspace, a combination of an AI tool with an online IDE....sorta. It's all powered by GPT-4 so my understanding is this is the best LLM money can buy. The workflow of the tool is strange and takes a little bit of explanation to convey what it is doing.

GitHub has the marketing page here: https://githubnext.com/projects/copilot-workspace and the docs here: https://github.com/githubnext/copilot-workspace-user-manual. It's a beta product and I thought the docs were nicely written.

Effectively you start with a GitHub Issue, the classic way maintainers are harassed by random strangers. I've moved my very simple demo site: https://gcp-iam-reference.matduggan.com/ to a GitHub repo to show what I did. So I open the issue here: https://github.com/matdevdug/gcp-iam-reference/issues/1

Very simple, makes sense. Then I click "Open in Workspaces", which brings me to a kind of GitHub Actions-inspired flow.

It reads the Issue and creates a Specification, which is editable.

Then you generate a Plan:

Finally it generates the files of that plan and you can choose whether to implement them or not and open a Pull Request against the main branch.

Implementation:

It makes a Pull Request:

Great right? Well except it didn't do any of it right.

  • It didn't add a route to the Flask app to expose this information
  • It didn't stick with the convention of storing the information in JSON files, writing it out to Markdown for some reason
  • It decided the way that it was going to reveal this information was to add it to the README
  • Finally it didn't get anywhere near all the machine types.
Before you ping me yes I tried to change the Proposed plan

Baby Web App

So the app I've written here is primarily for my own use and it is very brain dead simple. The entire thing is the work of roughly an afternoon of poking around while responding to Slack messages. However I figured this would be a good example of maybe a more simple internal tool where you might trust AI to go a bit nuts since nothing critical will explode if it messes up.

The way the site works is it relies on the output of the gcloud CLI tool to generate JSON of all the IAM permissions for GCP, which I then sort into categories so I can quickly look for the one I want. I found the official documentation to be slow and hard to use, so I made my own. It's a Flask app, which means it is pretty stupid simple.

import os
from flask import *
from all_functions import *
import json


app = Flask(__name__)

@app.route('/')
def main():
    items = get_iam_categories()
    role_data = get_roles_data()
    return render_template("index.html", items=items, role_data=role_data)

@app.route('/all-roles')
def all_roles():
    items = get_iam_categories()
    role_data = get_roles_data()
    return render_template("all_roles.html", items=items, role_data=role_data)

@app.route('/search')
def search():
    items = get_iam_categories()
    return render_template('search_page.html', items=items)

@app.route('/iam-classes')
def iam_classes():
    source = request.args.get('parameter')
    items = get_iam_categories()
    specific_items = get_specific_roles(source)
    print(specific_items)
    return render_template("iam-classes.html", specific_items=specific_items, items=items)

@app.route('/tsid', methods=['GET'])
def tsid():
    data = get_tsid()
    return jsonify(data)

@app.route('/eu-eea', methods=['GET'])
def eueea():
    country_code = get_country_codes()
    return is_eea(country_code)


if __name__ == '__main__':
    app.run(debug=False)

I also have an endpoint I use when I need to test some specific GDPR code, so I can curl it and see whether the IP address is coming from the EU/EEA, along with a TSID generator I used for a brief period of testing and don't need anymore. So again, pretty simple. It could be rewritten to be much better but I'm the primary user and I don't care, so whatever.

So effectively what I want to add is another route where I would also have a list of all the GCP machine types because their official documentation is horrible and unreadable. https://cloud.google.com/compute/docs/machine-resource

What I'm looking to add is something like this: https://gcloud-compute.com/

Look how information packed it is! My god, I can tell at a glance if a machine type is eligible for Sustained Use Discounts, how many regions it is in, Hour/Spot/Month pricing and the breakout per OS along with Clock speed. If only Google had a team capable of making a spreadsheet.

Nothing I enjoy more than nested pages with nested submenus that lack all the information I would actually need. I'm also not clear what a Tier_1 bandwidth is but it does seem unlikely that it matters for machine types when so few have it.

I could complain about how GCP organizes information all day but regardless the information exists. So I don't need anything to this level, but could I make a simpler version of this that gives me some of the same information? Seems possible.

How I Would Do It

First let's try to stick with the gcloud CLI approach.

gcloud compute machine-types list --format="json"

Only problem with this is that while it does output the information I want, for some reason it outputs a separate entry for every single zone.

  {
    "creationTimestamp": "1969-12-31T16:00:00.000-08:00",
    "description": "4 vCPUs 4 GB RAM",
    "guestCpus": 4,
    "id": "903004",
    "imageSpaceGb": 0,
    "isSharedCpu": false,
    "kind": "compute#machineType",
    "maximumPersistentDisks": 128,
    "maximumPersistentDisksSizeGb": "263168",
    "memoryMb": 4096,
    "name": "n2-highcpu-4",
    "selfLink": "https://www.googleapis.com/compute/v1/projects/sybogames-artifact/zones/africa-south1-c/machineTypes/n2-highcpu-4",
    "zone": "africa-south1-c"
  }

I don't know why but sure. However I don't actually need every region so I can cheat here. gcloud compute machine-types list --format="json" gets me some of the way there.
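
A minimal sketch of that cheat, assuming you just want one entry per machine type name and don't care which zone it came from (the function name is mine):

import json
import subprocess

def unique_machine_types():
    # Run the same gcloud command as above and parse its JSON output.
    raw = subprocess.run(
        ["gcloud", "compute", "machine-types", "list", "--format=json"],
        capture_output=True, text=True, check=True,
    ).stdout
    seen = {}
    for entry in json.loads(raw):
        # The same machine type shows up once per zone; keep the first one we see.
        seen.setdefault(entry["name"], entry)
    return sorted(seen.values(), key=lambda e: e["name"])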

Where's the price?

Yeah so Google doesn't expose pricing through the API as far as I can tell. You can download what is effectively a global price list for your account at https://console.cloud.google.com/billing/[your billing account id]/pricing. That's a 13 MB CSV that includes what your specific pricing will be, which is what I would use. So then I would combine the information from my region with the information from the CSV and then output the values. However since I don't know whether the pricing I have is relevant to you, I can't really use this to generate a public webpage.

Web Scraping

So realistically my only option would be to scrape the pricing page here: https://cloud.google.com/compute/all-pricing. Except of course it was designed in such a way as to make it as hard to do that as possible.

Boy it is hard to escape the impression GCP does not want me doing large-scale cost analysis. Wonder why?

So there's actually a tool called gcosts which seems to power a lot of these sites running price analysis. However it relies on a pricing.yml file which is automatically generated weekly. The work involved in generating this file is not trivial:

 +--------------------------+  +------------------------------+
 | Google Cloud Billing API |  | Custom mapping (mapping.csv) |
 +--------------------------+  +------------------------------+
               ↓                              ↓
 +------------------------------------------------------------+
 | » Export SKUs and add custom mapping IDs to SKUs (skus.sh) |
 +------------------------------------------------------------+
               ↓
 +----------------------------------+  +-----------------------------+
 | SKUs pricing with custom mapping |  | Google Cloud Platform info. |
 |             (skus.db)            |  |           (gcp.yml)         |
 +----------------------------------+  +-----------------------------+
                \                             /
         +--------------------------------------------------+
         | » Generate pricing information file (pricing.pl) |
         +--------------------------------------------------+
                              ↓
                +-------------------------------+
                |  GCP pricing information file |
                |          (pricing.yml)        |
                +-------------------------------+

Alright so looking through the GitHub Action that generates this pricing.yml file, here, I can see how it works and how the file is generated. But I can also just skip that part and pull the latest version for my use case whenever I regenerate the site. That can be found here.

Effectively with no assistance from AI, I have now figured out how I would do this (a rough sketch in code follows the list):

  • Pull down the pricing.yml file and parse it
  • Take that information and output it to a simple table structure
  • Make a new route on the Flask app and expose that information
  • Add a step to the Dockerfile to pull in the new pricing.yml with every Dockerfile build just so I'm not hammering the GitHub CDN all the time.
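
A rough sketch of those steps in Flask, just to show the shape of it. I haven't verified the schema of pricing.yml, so the nested key names below are placeholder assumptions, as are the template name and default region; in the real version the file would be pulled at Docker build time rather than per request.

import requests
import yaml
from flask import Flask, render_template

PRICING_URL = (
    "https://raw.githubusercontent.com/Cyclenerd/"
    "google-cloud-pricing-cost-calculator/master/pricing.yml"
)

app = Flask(__name__)

def load_pricing():
    # In practice, bake this into the image at build time instead of fetching per request.
    resp = requests.get(PRICING_URL, timeout=30)
    resp.raise_for_status()
    return yaml.safe_load(resp.text)

def machine_rows(pricing, region="europe-west1"):
    rows = []
    # Placeholder structure: pricing["compute"]["instance"][name]["cost"][region]["hour"]
    instances = pricing.get("compute", {}).get("instance", {})
    for name, info in sorted(instances.items()):
        hour = info.get("cost", {}).get(region, {}).get("hour")
        rows.append({"name": name, "region": region, "hour": hour})
    return rows

@app.route("/machine-types")
def machine_types():
    # New route exposing a simple table of machine types and hourly prices.
    return render_template("machine_types.html", rows=machine_rows(load_pricing()))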

Why Am I Saying All This?

So this is a perfect example of an operation that should be simple but, because the vendor doesn't want to make it simple, is actually pretty complicated. As we can now tell from the PR generated before, AI is never going to be able to understand all the steps we just walked through to figure out how one actually gets the prices for these machines. We've also learned that because of the hard work of someone else, we can skip a lot of the steps. So let's try it again.

Attempt 2

Maybe if I give it super specific information, it can do a better job.

You can see the issue here: https://github.com/matdevdug/gcp-iam-reference/issues/4

I think I've explained maybe what I'm trying to do. Certainly a person would understand this. Obviously this isn't the right way to organize this information, I would want to do a different view and sort by region and blah blah blah. However this should be easier for the machine to understand.

Note: I am aware that Copilot has issues making calls to the internet to pull files, even from GitHub itself. That's why I've tried to include a sample of the data. If there's a canonical way to pass the tool information inside of the issue let me know at the link at the bottom.

Results

So at first things looked promising.

It seems to understand what I'm asking and why I'm asking it. This is roughly the correct thing. The plan also looks ok:

You can see the PR it generated here: https://github.com/matdevdug/gcp-iam-reference/pull/5

So this is much closer but it's still not really "right". First like most Flask apps I have a base template that I want to include on every page: https://github.com/matdevdug/gcp-iam-reference/blob/main/templates/base.html

Then for every HTML file after that we extend the base:

{% extends "base.html" %}

{% block main %}

<style>
        table {
            border-collapse: collapse;
            width: 100%;
        }

        th, td {
            border: 1px solid #dddddd;
            text-align: left;
            padding: 8px;
        }

        tr:nth-child(even) {
            background-color: #f2f2f2;
        }
</style>

The AI doesn't understand that we've done this and is just re-implementing Bootstrap: https://github.com/matdevdug/gcp-iam-reference/pull/5/files#diff-a8e8dd2ad94897b3e1d15ec0de6c7cfeb760c15c2bd62d828acba2317189a5a5

It's not adding it to the menu bar; there are actually a lot of pretty basic misses here. I wouldn't accept this PR from a person, but let's see if it works!

 => ERROR [6/8] RUN wget https://raw.githubusercontent.com/Cyclenerd/google-cloud-pricing-cost-calculator/master/pricing.yml -O pricing.yml                                             0.1s
------
 > [6/8] RUN wget https://raw.githubusercontent.com/Cyclenerd/google-cloud-pricing-cost-calculator/master/pricing.yml -O pricing.yml:
0.104 /bin/sh: 1: wget: not found

No worries, easy to fix.

Alright fixed wget, let's try again!

2024-06-18 11:18:57   File "/usr/local/lib/python3.12/site-packages/gunicorn/util.py", line 371, in import_app
2024-06-18 11:18:57     mod = importlib.import_module(module)
2024-06-18 11:18:57           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-06-18 11:18:57   File "/usr/local/lib/python3.12/importlib/__init__.py", line 90, in import_module
2024-06-18 11:18:57     return _bootstrap._gcd_import(name[level:], package, level)
2024-06-18 11:18:57            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2024-06-18 11:18:57   File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
2024-06-18 11:18:57   File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
2024-06-18 11:18:57   File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
2024-06-18 11:18:57   File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
2024-06-18 11:18:57   File "<frozen importlib._bootstrap_external>", line 995, in exec_module
2024-06-18 11:18:57   File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
2024-06-18 11:18:57   File "/app/main.py", line 2, in <module>
2024-06-18 11:18:57     import yaml
2024-06-18 11:18:57 ModuleNotFoundError: No module named 'yaml'

Yeah I did anticipate this one. Alright let's add PyYAML so there's something to import. I'll give AI a break on this one, this is a dumb Python thing.

Ok so it didn't add it to the menu, it didn't follow the style conventions, but did it at least work? Also no.

I'm not sure how it could have done a worse job to be honest. I understand what it did wrong and why this ended up like it did, but the work involved in fixing it exceeds the amount of work it would take for me to do it myself from scratch. The point of this was to give it a pretty simple concept (parse a YAML file) and see what it did.

Conclusion

I'm sure this tool is useful to someone on Earth. That person probably hates programming and gets no joy out of it, looking for something that could help them spend less time doing it. I am not that person. Having a tool that makes stuff that looks right but ends up broken is worse than not having the tool at all.

If you are a person maintaining an extremely simple thing with amazing test coverage, I guess go for it. Otherwise this is just a great way to get PRs that look right and completely waste your time. I'm sure there are ways to "prompt engineer" this better and if someone wants to tell me what I could do, I'm glad to re-run the test. However as it exists now, this is not worth using.

If you want to use it, here are my tips:

  • Your source of data must be inside of the repo, it doesn't like making network calls
  • It doesn't seem to go check any sort of requirements file for Python, so assume the dependencies are wrong
  • It understands Dockerfiles but doesn't check whether a binary is present, so add that check yourself
  • It seems to do better with JSON than YAML

Questions/comments/concerns: https://c.im/@matdevdug


Simple Kubernetes Secret Encryption with Python

I was recently working on a new side project in Python with Kubernetes and I needed to inject a bunch of secrets. The problem with secret management in Kubernetes is you end up needing to set up a lot of it yourself and it's time consuming. When I'm working on a new idea, I typically don't want to waste a bunch of hours setting up "the right way" to do something that isn't related to the core of the idea I'm trying out.

For the record, the right way to do secrets in Kubernetes is the following:

  • Turn on encryption at rest for ETCD
  • Carefully set up RBAC inside of Kubernetes to ensure the right users and service accounts can access the secrets
  • Give up on trying to do that and end up setting up Vault or paying your cloud provider for their Secret Management tool
  • There is a comprehensive list of secret managers and approaches outlined here: https://www.argonaut.dev/blog/secret-management-in-kubernetes

However, especially when you are just trying ideas out, I wanted something more idiot-proof that didn't require any setup. So I wrote something simple with Python Fernet encryption that I thought might be useful to someone else out there.

You can see everything here: https://gitlab.com/matdevdug/example_kubernetes_python_encryption

Walkthrough

So the script works in a pretty straightforward way. It reads the .env file you generate as outlined in the README, with secrets in the following format:

Make a .env file with the following parameters:

KEY=Make a fernet key: https://fernetkeygen.com/
CLUSTER_NAME=name_of_cluster_you_want_to_use
SECRET-TEST-1=9e68b558-9f6a-4f06-8233-f0af0a1e5b42
SECRET-TEST-2=a004ce4c-f22d-46a1-ad39-f9c2a0a31619

The KEY is the secret key and the CLUSTER_NAME tells the Kubernetes library what kubeconfig target you want to use. Then the tool finds anything with the word SECRET in the .env file and encrypts it, then writes it to the .csv file.

The .csv file looks like the following:

I really like to keep some sort of record of what secrets are injected into the cluster outside of the cluster just so you can keep track of the encrypted values. Then the script checks the namespace you selected to see if there are secrets with that name already and, if not, injects it for you.
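
For a sense of the moving parts, here is a condensed sketch of that flow; the real script lives in the repo linked above, and details like using python-dotenv, the "default" namespace, the CSV path and the "value" data key are simplifications of mine.

import base64
import csv
from cryptography.fernet import Fernet
from dotenv import dotenv_values
from kubernetes import client, config
from kubernetes.client.rest import ApiException

env = dotenv_values(".env")
fernet = Fernet(env["KEY"])
config.load_kube_config(context=env["CLUSTER_NAME"])  # assuming CLUSTER_NAME maps to a kubeconfig context
api = client.CoreV1Api()
namespace = "default"  # the real script lets you pick the namespace

with open("secrets.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for name, value in env.items():
        if "SECRET" not in name:
            continue
        token = fernet.encrypt(value.encode())       # Fernet-encrypt the raw value
        encoded = base64.b64encode(token).decode()   # the k8s 'data' field wants base64
        k8s_name = name.lower()                      # Secret names must be lowercase
        writer.writerow([k8s_name, encoded])         # keep a record outside the cluster
        body = client.V1Secret(
            metadata=client.V1ObjectMeta(name=k8s_name),
            data={"value": encoded},
            immutable=True,
        )
        try:
            api.read_namespaced_secret(k8s_name, namespace)   # already there? skip it
        except ApiException as e:
            if e.status == 404:
                api.create_namespaced_secret(namespace, body)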

Some quick notes about the script:

  • Secret names in Kubernetes need a specific format: lowercase, with words separated by - or . characters. The script will take the uppercase names in the .env and convert them to lowercase. Just be aware it is doing that.
  • It does base64 encode the secret before it uploads it, so be aware that your application will need to decode it when it loads the secret.
  • Now the only secret you need to worry about is the Fernet secret that you can load into the application in a secure way. I find this is much easier to mentally keep track of than trying to build an infinitely scalable secret solution. Plus it's cheaper since many secret managers charge per secret.
  • The secrets are immutable which means they are lightweight on the k8s API and fast. Just be aware you'll need to delete the secrets if you need to replace them. I prefer this approach because I'd rather store more things as encrypted secrets and not worry about load.
  • It is easy to specify which namespace you intend to load the secrets into and I recommend using a different Fernet secret per application.
  • Mounting the secret works like it always does in k8s
apiVersion: v1
kind: Pod
metadata:
  name: my-pod
spec:
  containers:
    - name: my-container
      image: your-image:latest
      volumeMounts:
        - name: secret-volume
          mountPath: /path/to/secret/data
  volumes:
    - name: secret-volume
      secret:
        secretName: my-secret

Inside of your application, you need to load the Fernet secret and decrypt the secrets. With Python that is pretty simple.

decrypt = fernet.decrypt(token)
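
Fleshing that out a little, a minimal consuming-side sketch might look like this, assuming the Fernet key arrives via an environment variable (the variable name is mine) and the secret is mounted at the path from the Pod spec above under a "value" key:

import os
from cryptography.fernet import Fernet

fernet = Fernet(os.environ["FERNET_KEY"])  # hypothetical env var holding the Fernet key

with open("/path/to/secret/data/value", "rb") as f:
    token = f.read()

# Depending on how the secret was written, you may need a base64.b64decode()
# here first, as noted above, before handing the token to Fernet.
plaintext = fernet.decrypt(token).decode()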

Q+A

  • Why not SOPS? This is easier and also handles the process of making the API call to your k8s cluster to make the secret.
  • Is Fernet secure? As far as I can tell it's secure enough. Let me know if I'm wrong.
  • Would you make a CLI for this? If people actually use this thing and get value out of it, I would be more than happy to make it a CLI. I'd probably rewrite it in Golang if I did that, so if people ask it'll take me a bit of time to do it.

Questions/comments/concerns: https://c.im/@matdevdug


The Worst Website In The Entire World

The Worst Website In The Entire World

What if you set out to make the worst website you possibly could? So poorly designed and full of frustrating patterns that users would not only hate the experience of using this website, but would also come to hate your company. Could we make a web experience so terrible that it would express how much our company hated our users?

As a long-time Internet addict, I've encountered my fair share of terrible websites. There's Instagram, where half my feed is now advertisements for stupid t-shirts and the other half is empty black space.

Who in the fuck would ever wear this

Or ARNGREN.net, which is like if a newspaper ad threw up on my screen.

But Instagram still occasionally shows me pictures of people I follow and ultimately the stuff on ARNGREN is so cool I still want to buy it regardless of the layout.

No, I believe it is the crack team at Broadcom that have nailed it for the worst website in existence.

Lured in with free VMware

So through social media I discovered this blog post from VMware announcing that their popular virtualization software is now free for personal use. You can read that here. Great: I've used VMware Fusion before and it was OK, and maybe it will let me run Windows on an ARM Mac. Probably not, but let's try it out and see.

This means that everyday users who want a virtual lab on their Mac, Windows or Linux computer can do so for free simply by registering and downloading the latest build from the new download portal located at support.broadcom.com. With the new commercial model, we have reduced our product group offerings down to a single SKU (VCF-DH-PRO) for users who require commercial use licensing. This simplification eliminates 40+ other SKUs and makes quoting and purchasing VMware Desktop Hypervisor apps, Fusion Pro and Workstation Pro, easier than ever. The new Desktop Hypervisor app subscription can be purchased from any Broadcom Advantage partner.

I don't want to register at support.broadcom.com but it looks like I don't have a choice as this is the screen on the VMware site.

Now this is where the alarm bells start going crazy in my head. Nothing about this notice makes sense. "The store will be moving to a new domain". So it's...not...down for maintenance but actually is just gone? Or is it actually coming back? Because then you say "store will be shutdown" (just a quick note: you want "the store" and "will be shutting down on April 30th 2024"). Also, why don't you just redirect to the new domain? What is happening here?

Broadcom

So then I go to support.broadcom.com which is where I was told to register and make an account.

Never a great sign when there's a link to an 11 page PDF of how to navigate your website. That's the "Learn how to navigate Broadcom Support" link. You can download that killer doc here: https://support.broadcom.com/documents/d/ecx/broadcom-support-portal-getting-started-guide

Alright let's register.

First, the sentence "Enhance your skills through multiple self-service avenues by creating your Broadcom Account" leaps off the page as pure corporate nonsense. I've also never seen a less useful CAPTCHA; it looks like it is from 1998 and any modern text recognition software would defeat it. In fact, the Mac text recognition in Preview defeats 3 of the 4 characters with no additional work:

So, completely pointless and user hostile. Lots of points scored for the worst website ever. I'm also going to give some additional points for "Ask our chatbot for assistance", an idea so revolting that normally I'd just give up entirely. But of course I'm curious, so I click on the link for "Ask our chatbot" and.....

It takes me back to the main page.

Slow clap, Broadcom. Imagine being a customer so frustrated with the support portal that you actually click "Ask a chatbot", and the web developers at Broadcom come by and karate chop you right in the throat. Bravo. Now, in Broadcom's defense, there IS a chatbot icon in the corner, so I kinda see what happened here. Let's ask it a question.

I didn't say hello. I don't know why it decided I said hello to it. But in response to VMware it gives me this:

Did the chatbot just tell me to go fuck myself? Why did you make a chatbot if all you do is select a word from a list and it returns the link to the support doc? Would I like to "Type a Query"?? WHAT IS A CHATBOT IF NOT TYPING QUERIES?


Next Steps

I fill in the AI-proof CAPTCHA and hit next, only to be greeted with the following screen for 30 seconds.

Finally I'm allowed to make my user account.

Um....alright....seems like overkill, Broadcom, but you know what, this is your show. I have 1Password, so this won't be a problem. It's not letting me copy/paste from 1Password into this field, but if I do Command + \ it seems to let me insert. Then I get this.

What are you doing to me, Broadcom? Did I....wrong you in some way? I don't understand what is happening. OK, well, I refresh the page, try again, and it works this time. Except I can't copy/paste into the Confirm Password field.

I mean, they can't expect me to type out the impossibly complicated password they just had me generate, right? Except they do, and they've added a check to ensure that I don't disable JavaScript and treat it like a normal HTML form.

Hey front-end folks, just a quick note. Never ever ever ever ever mess with my browser. It's not yours, it's mine. I'm letting you use it for free to render your bloated sites. Don't do this to me. I get to copy paste whatever I want whenever I want. When you get your own browser you can do whatever you want but while you are living in my house under my rules I get to copy/paste whenever I goddamn feel like it.

Quickly losing enthusiasm for the idea of VMware

So after pulling up the password and typing it in, I'm treated to this absolutely baffling screen.

Do I need those? I feel like I might need those. eStore at least sounds like something I might want. I don't really want Public Semiconductors Case Management but I guess that one comes in the box. 44 seconds of this icon later

I'm treated to the following.

Broadcom, you clever bastards. Just when I thought I was out, they pulled me back in. Tricking users into thinking a link is going to help them and then telling them to get fucked by advising them to contact your sales rep? Genius.

So then I hit cancel and get bounced back to......you guessed it!

Except I'm not even logged into my newly created account. So then I go to log in with my new credentials and I finally make it to my customer portal. Well, no, first they need to redirect me back to the Broadcom Support main page again, with new icons.

Apparently my name was too long to show, and instead of fixing that or only showing my first name, Broadcom wanted to ensure the disrespect continued and just let it sort of trail off. Whatever, I'm finally in the Matrix.

Now where might I go to...actually download some VMware software. There's a search bar that says "Search the entire site", let's start there!

Nothing found except for a CVE. Broadcom, you are GOOD! For a second I thought you were gonna help me, and like Lucy with the football, you made me eat shit again.


My Downloads was also unhelpful.

But maybe I can add the entitlement to the account? Let's try All Products.

Of course the link doesn't work. What was I even thinking trying that? That one is really on me. However "All Products" on the left-hand side works and finally I find it. My white whale.

Except when I click on product details I'm brought back to....

The blank page with no information! Out of frustration I click on "My Downloads" again which is now magically full of links! Then I see it!

YES. Clicking on it I get my old buddy the Broadcom logo for a solid 2 minutes 14 seconds.

Now I have fiber internet with 1000 down, so this has nothing to do with me. Finally I click the download button and I get.....the Broadcom logo again.

30 seconds pass. 1 minute passes. 2 minutes pass. I'm not sure what to do.

No. No you piece of shit website. I've come too far and sacrificed too much of my human dignity. I am getting a fucking copy of VMware Fusion. Try 2 is the same thing. 3, 4, 5 all fail. Then finally.

I install it and like a good horror movie, I think it's all over. I've killed Jason. Except when I'm installing Windows I see this little link:

And think "wow I would like to know what the limitations are for Windows 11 for Arm!". Click on it and I'm redirected to...

Just one final fuck you from the team at Broadcom.

Conclusion

I've used lots of bad websites in my life. Hell, I've made a lot of bad websites in my life. But never before have I seen a website that so completely expresses pure hatred of its users like this one. Everything was as poorly designed as possible, with user-hostile design around every corner.

Honestly, Broadcom, I don't even know why you bothered buying VMware. It's impossible for anyone to ever get this product from you. Instead of migrating from the VMware store to this disaster, maybe just shut it down entirely. Destroy the backups of this dumpster fire and start fresh. Maybe just consider a Shopify site, because at least then an average user might have a snowball's chance in hell of ever finding something to download from you.

Do you know of a worse website? I want to see it. https://c.im/@matdevdug


The Time Linkerd Erased My Load Balancer

The Time Linkerd Erased My Load Balancer

A cautionary tale of K8s CRDs and Linkerd.

A few months ago I had the genius idea of transitioning our production load balancer stack from Ingress to Gateway API in k8s. For those unaware, Ingress is the classic way of writing a configuration that tells a load balancer which routes should hit which services, effectively how you expose services to the Internet. Gateway API is the re-imagined process for doing this, where the problem domain is scoped, allowing teams more granular control over their specific services' routes.

Ingress

apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  name: external-lb
spec:
  controller: example.com/ingress-controller
  parameters:
    apiGroup: k8s.example.com
    kind: IngressParameters
    name: external-lb

This is what defining the load balancer class looks like with Ingress.


After conversations with various folks at GCP it became clear to me that while Ingress wasn't deprecated or slated to be removed, Gateway API is where all the new development and features are moving to. I decided that we were a good candidate for the migration since we are a microservice based backend with lower and higher priority hostnames, meaning we could safely test the feature without cutting over all of our traffic at the same time.

I had this idea that we would turn on both Ingress and Gateway API and then cut between the two different IP addresses at the Cloudflare level. From my low-traffic testing this approach seemed to work OK, with me being able to switch between the two and then letting Gateway API bake for a week or two to shake out any problems. Then I decided to move to prod. Due to the lack of issues in the lower environments, I decided that I wouldn't set up Cloudflare load balancing between the two and would instead manage the cut-over in Terraform. This turned out to be a giant mistake.

The long and short of it is that the combination of Gateway API and Linkerd in GKE fell down under a high volume of requests. At low request volume there were no problems, but once we got to around 2k requests a second, the linkerd-proxy sidecar container's memory usage started to grow unbounded. When I attempted to cut back from Gateway API to Ingress, I encountered a GKE bug I hadn't seen before in the lower environments.

"Translation failed: invalid ingress spec: service "my_namespace/my_service" is type "ClusterIP", expected "NodePort" or "LoadBalancer";

What we were seeing was a mismatch between the annotations automatically added by GKE.

Ingress adds these annotations:  
cloud.google.com/neg: '{"ingress":true}'
cloud.google.com/neg-status: '{"network_endpoint_groups":{"80":"k8s1pokfef..."},"zones":["us-central1-a","us-central1-b","us-central1-f"]}'


Gateway adds these annotations:
cloud.google.com/neg: '{"exposed_ports":{"80":{}}}'
cloud.google.com/neg-status: '{"network_endpoint_groups":{"80":"k8s1-oijfoijsdoifj-..."},"zones":["us-central1-a","us-central1-b","us-central1-f"]}'

Gateway doesn't understand the Ingress annotations and vice-versa. This obviously caused a massive problem and blew up in my face. I had thought I had tested this exact failure case, but clearly prod surfaced a different behavior than I had seen in lower environments. Effectively no traffic was getting to pods while I tried to figure out what had broken.

I ended up having to manually modify the annotations to get things working, and had a pretty embarrassing blow-up in my face after what I had thought was careful testing (but clearly wasn't).
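
For the record, "manually modify" just means putting the NEG annotation back into the form the active controller expects on each affected Service. As a hedged illustration (not the exact fix I ran, and the Service name is made up), the equivalent with the kubernetes Python client would be something like:

from kubernetes import client, config

config.load_kube_config()
v1 = client.CoreV1Api()

# Overwrite the NEG annotation with the form the Ingress controller expects
patch = {"metadata": {"annotations": {"cloud.google.com/neg": '{"ingress":true}'}}}
v1.patch_namespaced_service(name="my-service", namespace="my-namespace", body=patch)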

Fast Forward Two Months

I had learned from my mistake regarding Gateway API and Ingress, and everything was functioning totally fine on Gateway API when I decided to attempt to solve the Linkerd issue. The issue I was seeing with Linkerd was that high-volume services' proxies were consuming unlimited memory, steadily growing over time, but only while on Gateway API. I was installing Linkerd with their Helm charts, which have two components: the Linkerd CRD chart here: https://artifacthub.io/packages/helm/linkerd2/linkerd-crds and the Linkerd control plane: https://artifacthub.io/packages/helm/linkerd2/linkerd-control-plane

Since debug logs and upgrades hadn't gotten me any closer to a solution as to why the proxies were consuming unlimited memory until they eventually were OOMkilled, I decided to start fresh. I removed the Linkerd injection from all deployments and removed the helm charts. Since this was a non-prod environment, I figured at least this way I could start fresh with debug logs and maybe come up with some justification for what was happening.

Except the second I uninstalled the charts, my graphs started to freak out. I couldn't understand what was happening. How did removing Linkerd break something? Did I have some policy set to require Linkerd? Why were my traffic levels quickly approaching zero in the non-prod environment?

Then a coworker said "oh it looks like all the routes are gone from the load balancer". I honestly hadn't even thought to look there, assuming the problem was some misaligned Linkerd policy where our deployments required encryption to communicate, or some mistake on my part in the removal of the Helm charts. But they were right, the load balancers didn't have any routes. kubectl confirmed it: no HTTPRoutes remained.

So of course I was left wondering "what just happened".

Gateway API

So, a quick crash course in "what is Gateway API". At a high level, as discussed before, it is a new way of defining Ingress that cleans up the annotation mess and allows for a clean separation of responsibility in an org.


So GCP defines the GatewayClass, I create the Gateway, and developers provide the HTTPRoutes. This means developers can safely change the routes to their own services without the risk that they will blow up the load balancer. It also provides a ton of great customization for how to route traffic to a specific service.


So first you make a Gateway like so in Helm or whatever:

---
kind: Gateway
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: {{ .Values.gateway_name }}
  namespace: {{ .Values.gateway_namespace }}
spec:
  gatewayClassName: gke-l7-global-external-managed
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        kinds:
        - kind: HTTPRoute
        namespaces:
          from: Same
    - name: https
      protocol: HTTPS
      port: 443
      allowedRoutes:
        kinds:
          - kind: HTTPRoute
        namespaces:
          from: All
      tls:
        mode: Terminate
        options:
          networking.gke.io/pre-shared-certs: "{{ .Values.pre_shared_cert_name }},{{ .Values.internal_cert_name }}"

Then you provide a different YAML of HTTPRoute for the redirect of http to https:

kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: redirect
  namespace: {{ .Values.gateway_namespace }}
spec:
  parentRefs:
  - namespace: {{ .Values.gateway_namespace }}
    name: {{ .Values.gateway_name }}
    sectionName: http
  rules:
  - filters:
    - type: RequestRedirect
      requestRedirect:
        scheme: https

Finally you can set policies.

---
apiVersion: networking.gke.io/v1
kind: GCPGatewayPolicy
metadata:
  name: tls-ssl-policy
  namespace: {{ .Values.gateway_namespace }}
spec:
  default:
    sslPolicy: tls-ssl-policy
  targetRef:
    group: gateway.networking.k8s.io
    kind: Gateway
    name: {{ .Values.gateway_name }}

Then your developers can configure traffic to their services like so:

kind: HTTPRoute
apiVersion: gateway.networking.k8s.io/v1beta1
metadata:
  name: store
spec:
  parentRefs:
  - kind: Gateway
    name: internal-http
  hostnames:
  - "store.example.com"
  rules:
  - backendRefs:
    - name: store-v1
      port: 8080
  - matches:
    - headers:
      - name: env
        value: canary
    backendRefs:
    - name: store-v2
      port: 8080
  - matches:
    - path:
        value: /de
    backendRefs:
    - name: store-german
      port: 8080

Seems Straightforward

Right? There isn't that much to the thing. So after I attempted to re-add the HTTPRoutes using Helm and Terraform (which of course didn't detect a diff even though the routes were gone, because Helm never seems to do what I want it to do in a crisis) and then ended up bumping the chart version to finally force it to do the right thing, I started looking around. What the hell had I done to break this?

When I removed the Linkerd CRDs, it had somehow taken out my HTTPRoutes. So I went to the Helm chart, trying to work backwards. Immediately I see this:

{{- if .Values.enableHttpRoutes }}
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  annotations:
    api-approved.kubernetes.io: https://github.com/kubernetes-sigs/gateway-api/pull/1923
    gateway.networking.k8s.io/bundle-version: v0.7.1-dev
    gateway.networking.k8s.io/channel: experimental
    {{ include "partials.annotations.created-by" . }}
  labels:
    helm.sh/chart: {{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}
    linkerd.io/control-plane-ns: {{.Release.Namespace}}
  creationTimestamp: null
  name: httproutes.gateway.networking.k8s.io
spec:
  group: gateway.networking.k8s.io
  names:
    categories:
    - gateway-api
    kind: HTTPRoute
    listKind: HTTPRouteList
    plural: httproutes
    singular: httproute
  scope: Namespaced
  versions:

Sure enough, the Linkerd CRD Helm chart has enableHttpRoutes set to true by default.

I also found this issue: https://github.com/linkerd/linkerd2/issues/12232

So yeah, Linkerd is, for some reason, pulling this CRD from a pull request from April 6th of last year that is marked as "do not merge". https://github.com/kubernetes-sigs/gateway-api/pull/1923

Linkerd is aware of the possible problem but presumes you'll catch the configuration option on the Helm chart: https://github.com/linkerd/linkerd2/issues/11586

To be clear I'm not "coming after Linkerd" here. I just thought the whole thing was extremely weird and wanted to make sure, given the amount of usage Linkerd gets out there, that other people were made aware of it before running the car into the wall at 100 MPH.

What are CRDs?

Kubernetes Custom Resource Definitions (CRDs) essentially extend the Kubernetes API to manage custom resources specific to your application or domain.

  • CRD Object: You create a YAML manifest file defining the Custom Resource Definition (CRD). This file specifies the schema, validation rules, and names of your custom resource.
  • API Endpoint: When you deploy the CRD, the Kubernetes API server creates a new RESTful API endpoint for your custom resource.
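
That second point is the part that matters here: once the CRD exists, HTTPRoute is just another resource the API server will happily serve, no matter which chart installed the definition. As a sketch with the kubernetes Python client (group and version taken from the manifests earlier in this post):

from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

# The CRD above creates this endpoint:
# /apis/gateway.networking.k8s.io/v1beta1/namespaces/<ns>/httproutes
routes = api.list_namespaced_custom_object(
    group="gateway.networking.k8s.io",
    version="v1beta1",
    namespace="default",
    plural="httproutes",
)
for item in routes["items"]:
    print(item["metadata"]["name"])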

Effectively, when I enabled Gateway API in GKE with the following, I hadn't considered that I could end up in a CRD conflict state with Linkerd:

  gcloud container clusters create CLUSTER_NAME \
    --gateway-api=standard \
    --cluster-version=VERSION \
    --location=CLUSTER_LOCATION

What I suspect happened is that, since I had Linkerd installed before I enabled gateway-api on GKE, when GCP attempted to install the CRD it failed silently. Since I didn't know there was a CRD conflict, I didn't understand that the CRD the HTTPRoutes relied on was actually the Linkerd-maintained one, not the GCP one. Presumably, had I attempted to do this the other way around, it would have thrown an error when the Helm chart tried to install a CRD that was already present.
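
Had I checked who actually owned the CRD before touching the Linkerd charts, the conflict would have been obvious. A quick sketch of that check (again with the kubernetes Python client, purely as an illustration):

from kubernetes import client, config

config.load_kube_config()
ext = client.ApiextensionsV1Api()

crd = ext.read_custom_resource_definition("httproutes.gateway.networking.k8s.io")
# The Linkerd chart stamps its labels and annotations on the CRD (see the template above),
# which is the giveaway that removing the chart removes the CRD and every HTTPRoute with it.
print(crd.metadata.labels)       # e.g. helm.sh/chart, linkerd.io/control-plane-ns
print(crd.metadata.annotations)  # e.g. gateway.networking.k8s.io/channel, bundle-version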

To be clear before you call me an idiot, I am painfully aware that the deletion of CRDs is a dangerous operation. I understand I should have carefully checked and I am admitting I didn't in large part because it just never occurred to me that something like Linkerd would do this. Think of my failure to check as a warning to you, not an indictment against Kubernetes or whatever.

Conclusion

If you are using Linkerd and Helm and intend to use Gateway API, this is your warning to go in there right now and flip enableHttpRoutes in the linkerd-crds Helm chart to false. Learn from my mistake.

Questions/comments/concerns: https://c.im/@matdevdug