Why Login Security Sucks

I've complained a lot about the gaps in offerings for login security in the past. The basic problem is that this domain of security serves a lot of masters. To get the widest level of buy-in from experts, the solution has to scale from normal logins to national security. This creates a frustrating experience for users because it is often overkill for the level of security they need. Basically, is it reasonable that you need Google Authenticator to access your gym website? In terms of communication, the solutions we hear about the most, i.e. the ones with the most marketing, allow for the insertion of SaaS services into the chain, so that an operation that was previously free now carries a monthly fee based on usage.

This creates a lopsided set of incentives where only the most technologically complex and extremely secure solutions are endorsed, and when teams are (understandably) overwhelmed by the requirements, a SaaS attempts to insert itself into a critical junction of their product.

The tech community has mostly agreed that usernames and passwords assigned by the user are not sufficient for even basic security. What we haven't done is precisely explain what it is that we want normal average non-genius developers to do about that. We've settled on this really weird place with the following rules:

Email accounts are always secure but SMS is never secure. You can always email a magic link and that's fine for some reason.
You should have TOTP but we've settled on very short time windows because I guess we decided NTP was a solved problem. There's no actual requirement the code changes every 30 seconds, we're just pretending that we're all spies and someone is watching your phone. Also consumers should be given recovery codes, which are basically just passwords you generate and give to them and only allow to be used once. It is unclear why generating a one-time password for the user is bad but if we call the password a "recovery code" it is suddenly sufficient.

TOTP serves two purposes. One is it ensures there is one randomly generated secret associated with the account that we don't hash (even though I think you could....but nobody seems to), so it's actually kind of a dangerous password that we need to encrypt and can't rotate. The other is we tacked on this stupid idea that it is multi-device, even though there's zero requirement that the code lives on another device. Just someone decided that because there is a QR code it is now multi-device because phones scan QR codes.
At some point we decided to add a second device requirement, but those devices live in entirely different ecosystems. Even if you have an iPhone and a work MacBook, they shouldn't be using the same Apple ID, so I'm not really clear how things would ever line up. It seems like most people sync things like TOTP with their personal Google accounts across different work devices over time. I can't imagine that was the intended functionality.
Passkeys are great but also their range of behavior is bizarre and unpredictable so if you implement them you will be expected to effectively build every other possible recovery flow into this system. Even highly technical users cannot be relied upon to know whether they will lose their passkey when they do something.
Offloading the task to a large corporation is good, but you cannot pick one big corporation. You must have a relationship with Apple and Facebook and Microsoft and Google and Discord and anyone else who happens to be wandering around when you build this. Their logins are secured with magic and unbreakable, but if they are bypassed you can go fuck yourself because that is your problem, not theirs.

All of this is sort of a way to talk around the basic problem. I need a username and a password for every user on my platform. That password needs to be randomly generated and never stored as plain text in my database. If I had a way to know that the browser generated and stored the password, this basic level of security is met. As far as I can tell, there's no way for me to know that for sure. I can guess based on the length of the password and how quickly it was entered into a form field.

Keep in mind all I am trying to do is build a simple login route on an application that is portable, somewhat future proof and doesn't require a ton of personal data from the user to resolve common human error problems. Ideally I'd like to be able to hand this to someone else, they generate a new secret and they too can enroll as many users as they want. This is a simple thing to build so it should be simple to solve the login story as well.

Making a simple CMS

The site you are reading this on is hosted on Ghost, a CMS that is written in Node. It supports a lot of very exciting features I don't use and comes with a lot of baggage I don't need. Effectively all I actually use it for is:

RSS
Writing posts in its editor
Fixing typos in the posts I publish (sometimes, my writing is not good)
Let me write a million drafts for every thing I publish
Minimize the amount of JS I'm inflicting on people and try whenever possible to stick to just HTML and CSS

Ghost supports a million things on top of the things I have listed and it also comes with some strange requirements like running MySQL. I don't really need a lot of that stuff and running a full MySQL for a CMS that doesn't have any sort of multi-instance scaling functionality seems odd. I also don't want to stick something this complicated on the internet for people to use for long periods of time without regular maintenance.

Before you say it I don't care for static site generators. I find it's easier for me to have a tab open, write for ten minutes, then go back to what I was doing before.

My goal with this is just to make a normal friendly baby CMS that I could share with a group of people, less technical people, so they could write stuff when they felt like it. We're not trading nuclear secrets here. The requirements are:

Needs to be open to the public internet with no special device enrollment or network segmentation
Not administered by me. Whatever normal problem arises it has to be solvable by a non-technical person.

Making the CMS

So in a day when I was doing other stuff I put this together: https://gitlab.com/matdevdug/ezblog. It's nothing amazing, just sort of a basic template I can build on top of later. It uses SQLite and it does the things you would expect it to do. I can:

Write posts in Quill
Save the posts as drafts or as published posts
Edit the posts after I publish them
Have a valid RSS feed of the posts
The whole frontend is just HTML/CSS so it'll load fast and be easy to cache

Then there is the whole workflow of draft to published.

For one day's work this seems to be roughly where I hoped to be. Now we get to the crux of the matter. How do I log in?

What you built is bad and I hate it

The point is I should be able to solve this problem quickly and easily for a hobby website, not that you personally like what I made. The examples are not fully-fleshed out examples, just templates to demonstrate the problem. Also I'm allowed to make stuff that serves no other function than it amuses me.

The default for most sites (including Ghost) is just a username and password. The reason for this: it's easy, works on everything and it's pretty simple to work out a fallback flow for users. Everyone understands it, and there are no concerns around data ownership or platform lock-in.

My login page:

{% extends 'base.html' %}

{% block header %}
  <h1>{% block title %}Log In{% endblock %}</h1>
{% endblock %}

{% block content %}
  <form method="post">
    <input type="hidden" name="csrf_token" value="{{ csrf_token() }}">
    <label for="username">Username</label>
    <input name="username" id="username" required>
    <label for="password">Password</label>
    <input type="password" name="password" id="password" required>
    <input type="submit" value="Log In">
  </form>
{% endblock %}

I've got a csrf_token in there and the rest is pretty straightforward. Server-side is also pretty easy.

@bp.route('/login', methods=('GET', 'POST'))
@limiter.limit("5 per minute")
def login():
    if request.method == 'POST':
        username = request.form['username']
        password = request.form['password']
        db = get_db()
        error = None
        user = db.execute(
            'SELECT * FROM user WHERE username = ?', (username,)
        ).fetchone()

        if user is None:
            error = 'Incorrect username.'
        elif not check_password_hash(user['password'], password):
            error = 'Incorrect password.'

        if error is None:
            session.clear()
            session['user_id'] = user['id']
            return redirect(url_for('index'))

        flash(error)

    return render_template('auth/login.html')

I'm not storing the raw password, just the hash. It requires almost no work to do. It works exactly the way I think it should. Great, fine.
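For anyone wondering what "just the hash" amounts to, here is a minimal standard-library sketch of the same idea. It is not how `check_password_hash` is implemented; the function names, the storage format and the iteration count are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 600_000  # assumed work factor; tune it for your own hardware


def hash_password(password: str) -> str:
    """Return 'salt_hex$hash_hex' for storage; the raw password is never kept."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return f"{salt.hex()}${digest.hex()}"


def check_password(stored: str, candidate: str) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", candidate.encode(), bytes.fromhex(salt_hex), ITERATIONS
    )
    return hmac.compare_digest(digest.hex(), digest_hex)
```

The per-user salt means two users with the same password end up with different stored values, and the slow derivation is what makes a leaked table expensive to brute-force.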

Why are passwords insufficient?

This has been talked to death but let's recap for the sake of me being able to say I did it and you can just kinda scroll quickly through this part.

Users reuse usernames and passwords, so even though I might not know the raw text of the password another website might be (somehow) even lazier than me and their database gets leaked and then oh no I'm hacked.
The password might be a bad password and it's just one people try and oh no they are in the system.
I have to build in a password reset flow because humans are bad at remembering things and that's just how it is.

Password Reset Flow

Everyone has seen this, but let's talk about what I would need to modify about this small application to allow more than one person to use it.

I would need to add a route that handles allowing the user to reset their password by requesting it through their email
To know where to send that email, I would need to receive and store the email address for every user
I would also need to verify the user's email to ensure it worked
All of this hinges on having a token I could send to that user that I could generate with something like the following:

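from flask import current_app
from itsdangerous import URLSafeTimedSerializer

def generate_reset_password_token(self):  # method on the user model
    serializer = URLSafeTimedSerializer(current_app.config["SECRET_KEY"])

    return serializer.dumps(self.email, salt=self.password_hash)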

Since I'm salting it with the hash of the current password which will change when they change the password, the token can only be used once. Makes sense.
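To make the "only once" property concrete, here is a standard-library sketch of the same trick without itsdangerous. The function names, the token layout and the one-hour default are assumptions for illustration, and the caller is assumed to have already looked up the current password hash of the account in question.

```python
import base64
import hashlib
import hmac
import time


def make_reset_token(email: str, password_hash: str, secret_key: str) -> str:
    """Sign 'email|issued_at' with a key that includes the current password hash."""
    payload = f"{email}|{int(time.time())}".encode()
    key = (secret_key + password_hash).encode()
    sig = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig


def check_reset_token(token, password_hash, secret_key, max_age=3600):
    """Return the email if the token is intact, unexpired and the hash is unchanged."""
    try:
        encoded, sig = token.split(".")
        payload = base64.urlsafe_b64decode(encoded)
        email, issued_at = payload.decode().rsplit("|", 1)
        issued = int(issued_at)
    except ValueError:
        return None  # malformed token
    key = (secret_key + password_hash).encode()
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None  # tampered with, or the password has changed since issue
    if time.time() - issued > max_age:
        return None  # expired
    return email
```

Once the user sets a new password the stored hash changes, the signing key changes with it, and every outstanding token stops verifying.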

Why is this bad?

For a ton of reasons.

I don't want to know an email address if I don't need it. There's no reason to store more personal information about a user that makes the database more valuable if someone were to steal it.
Email addresses change. You need to write another route which handles that process, which isn't hard but then you need to decide whether you need to confirm that the user has access to address 1 with another magic URL or if it is sufficient to say they are currently logged in.
Finally it sort of punts the problem to email and says "well I assume and hope your email is secure even if statistically you probably use the same password for both".

How do you fix this?

The problem can be boiled down to 2 basic parts:

I don't want the user to tell me a username, I want a randomly generated username so it further reduces the value of information stored in my database. It also makes it harder to do a random drive-by login attempt.
I don't want to own the password management story. Ideally I want the browser to do this on its side.
In a perfect world I want a response that says "yes we have stored these credentials somewhere under this user's control" and I can wash my hands of that until we get into the situation where somehow they've lost access to the sync account (which should hopefully be rare enough that we can just do that in the database).

The annoying thing is this technology already exists.

The Credential Management API does the things I am talking about. Effectively I would need to add some JavaScript to my registration page:

    <script>
        document.getElementById('register-form').addEventListener('submit', function(event) {
            event.preventDefault(); // Prevent form submission

            const username = document.getElementById('username').value;
            const password = document.getElementById('password').value;

            // Save credentials using the Credential Management API where available.
            // Some browsers expose navigator.credentials without PasswordCredential.
            if ('credentials' in navigator && window.PasswordCredential) {
                const cred = new PasswordCredential({
                    id: username,
                    password: password
                });

                // Store credentials in the browser's password manager
                navigator.credentials.store(cred).then(() => {
                    console.log('Credentials stored successfully');
                    // Proceed with registration, for example, send credentials to your server
                    registerUser(username, password);
                }).catch(error => {
                    console.error('Error storing credentials:', error);
                });
            } else {
                // No browser credential storage: still register the user
                registerUser(username, password);
            }
        });

        function registerUser(username, password) {
            // Simulate server registration request
            fetch('/register', {
                method: 'POST',
                headers: { 'Content-Type': 'application/json' },
                body: JSON.stringify({ username: username, password: password })
            }).then(response => {
                if (response.ok) {
                    console.log('User registered successfully');
                    // Redirect or show success message
                } else {
                    console.error('Registration failed');
                }
            });
        }
    </script>

Then on my login page something like this:

function attemptAutoLogin() {
    if ('credentials' in navigator) {
        navigator.credentials.get({password: true}).then(cred => {
            if (cred) {
                // Send the credentials to the server to log in the user
                fetch('/login', {
                    method: 'POST',
                    body: new URLSearchParams({
                        'username': cred.id,
                        'password': cred.password
                    })
                }).then(response => {
                    // Handle login success or failure
                    if (response.ok) {
                        console.log('User logged in');
                    } else {
                        console.error('Login failed');
                    }
                });
            }
        }).catch(error => {
            console.error('Error retrieving credentials:', error);
        });
    }
}

// Call the function when the page loads
document.addEventListener('DOMContentLoaded', attemptAutoLogin);

So great, I assign a random cred.id and cred.password, stick it in the browser and then I sorta wash my hands of it.

We know the password is stored somewhere and can be synced for free
We know the user can pull the password out and put it somewhere else if they want to switch platforms
Browsers handle password migrations for users

The problem with this approach is I don't know if I'm supposed to use it.

I have no idea what this means. Could this go away? In testing it does seem like the performance is all over the place. Firefox seems to have some issues with this, whereas Chrome seems to always nail it. iOS Safari also seems to have some problems. So this isn't seemingly reliable enough to use.

Please just make this a thing that works everywhere.

Before you yell at me about Math.random I think the following would make a good password:

function generatePassword(length) {
  const charset = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
  let password = "";
  for (let i = 0; i < length; i++) {
    const randomIndex = Math.floor(Math.random() * charset.length);
    password += charset.charAt(randomIndex);
  }
  return password;
}

const password = generatePassword(32);
console.log(password);

Source

But also maybe I'm wrong. I never have any idea if I'm generating random things in a safe way. Every time I do it someone tells me I did it wrong.
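One way to sidestep the argument is to draw from a generator that is documented for security use. `Math.random` makes no such promise; in the browser the analogous primitive is `crypto.getRandomValues`. If the credentials were generated server-side instead, a sketch with Python's `secrets` module could look like this (the helper names and the `user-` prefix are made up for illustration):

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits  # same 62 characters as the charset above


def generate_password(length: int = 32) -> str:
    """Pick each character with the OS CSPRNG instead of a general-purpose PRNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_username() -> str:
    """Hypothetical helper: a random, meaningless account identifier."""
    return "user-" + secrets.token_hex(8)
```

A 32-character password over 62 symbols carries roughly 190 bits of entropy, which is far more than a hobby CMS needs.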

TOTP

Alright so I can't get away with just a password, so I have to assume the password is bunk and use it as one element of login. Then I have to use either TOTP or HOTP.

From a user perspective TOTP works as follows:

Set up 2FA for your online account.
Get a QR code.
You scan this QR code with an authenticator app of your choice
Your app will immediately start generating these six-digit tokens.
The website asks you to provide one of these six-digit tokens.

Practically this is pretty straightforward. I add a few extra libraries:

import io
import pyotp
import qrcode
from flask import send_file

I have to generate a secret totp_secret = pyotp.random_base32() which then I have to store in the database. Then I have to generate a QR code to show the user so they can generate the time-based codes.

otp_uri = pyotp.totp.TOTP(totp_secret).provisioning_uri(username, issuer_name="ezblog")
qr = qrcode.make(otp_uri)
buf = io.BytesIO()
qr.save(buf)
buf.seek(0)

return send_file(buf, mimetype='image/png')

However the more you look into this, the more complicated it gets.

You actually don't need the token to be 6 digits. It can be up to 10. I don't know why I'd want more or less. Presumably more is better.
The token can be valid for longer than 30 seconds. From reading it seems like that makes the code less reliant on perfect time sync between client and server (great) but also increases the probability of someone stealing the TOTP and using it. That doesn't seem like a super likely attack vector here so I'll make it way longer. But then why don't more services use longer tokens if the only concern then is if someone sees my code? Is this just people being unspeakably annoying?
I need to add some recovery step in case you lose access to the TOTP code.
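The digit count, the step length and the tolerance for clock drift are all just parameters of the calculation. Below is a standard-library sketch of the RFC 6238 computation (HMAC-SHA1 variant) with those knobs exposed; `verify_totp` and `drift_steps` are names invented here. As far as I know, PyOTP exposes the same ideas as `digits`, `interval` and the `valid_window` argument to `verify`.

```python
import base64
import hashlib
import hmac
import struct
import time


def totp(secret_b32: str, at: float, digits: int = 6, interval: int = 30) -> str:
    """RFC 6238 code for the time step containing `at` (HMAC-SHA1 variant)."""
    key = base64.b32decode(secret_b32)
    counter = int(at) // interval
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation from RFC 4226
    number = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


def verify_totp(secret_b32, code, digits=6, interval=30, drift_steps=1, now=None):
    """Accept the code for the current step or up to `drift_steps` either side."""
    now = time.time() if now is None else now
    for step in range(-drift_steps, drift_steps + 1):
        expected = totp(secret_b32, now + step * interval, digits, interval)
        if hmac.compare_digest(expected.encode(), code.encode()):
            return True
    return False
```

A longer `interval` and a wider `drift_steps` both buy tolerance for bad clocks at the cost of a longer window in which an observed code can be replayed.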

How do you recover from a TOTP failure?

Effectively I'm back to my original problem. I can either:

Go back to the email workflow I don't want because again I don't want to rely on email as some sort of super-secure bastion and I really don't want to store email addresses.
Or I generate a recovery code and give you those codes which let you bypass the TOTP requirement. That at least lets me be like "this is no longer my fault". I like that.

How do I make a recovery code?

Honest to god I have no idea. As far as I can tell a "recovery code" is just a randomly generated value I hash and stick in the database and then when the user enters it on a form, check the hash. It's just another password. I don't know why all the recovery codes I see are numbers, since it seems to have no relationship to that and would likely work with any string.

Effectively all I need to do with the recovery code is ensure it gets burned once used. Which is fine, but now I'm confused. So I'm generating passwords for the user and then I give the passwords back to the user and tell them to store them somewhere? Why don't I just give them the one good password for the initial login and call it a day? Why is one forbidden and the other is mandatory?
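Taken at face value, the whole mechanism fits in a few lines. This is a sketch under assumptions: the set stands in for a database table, the function names are invented, and a plain SHA-256 is assumed to be acceptable only because the codes are random 64-bit values rather than human-chosen passwords.

```python
import hashlib
import hmac
import secrets


def new_recovery_codes(count: int = 10):
    """Return (codes to show the user once, hashes to store)."""
    codes = [secrets.token_hex(8) for _ in range(count)]
    hashes = {hashlib.sha256(c.encode()).hexdigest() for c in codes}
    return codes, hashes


def redeem(code: str, stored_hashes: set) -> bool:
    """Accept a code at most once by deleting its hash on first use."""
    digest = hashlib.sha256(code.strip().lower().encode()).hexdigest()
    for stored in stored_hashes:
        if hmac.compare_digest(stored, digest):
            stored_hashes.remove(stored)  # burn the code
            return True
    return False
```

Nothing here depends on the codes being numeric, which matches the observation that any random string would do.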

Does HOTP help?

I'm really still not clear how HOTP works. Like I understand the basics:

@app.route('/verify_2fa', methods=['GET', 'POST'])
def verify_2fa():
    if request.method == 'POST':
        hotp = pyotp.HOTP(user_data['hotp_secret'])
        otp = request.form['otp']
        if hotp.verify(otp, user_data['counter']):
            user_data['counter'] += 1  # Increment the counter after successful verification
            return redirect(url_for('index'))
        flash('Invalid OTP')
    return render_template('verify_2fa.html')

There is a secret per-user and a counter and then I increment the counter every single time the user logs in. As far as I can tell there isn't a forcing mechanism which keeps the client and the server in-sync, so basically you tap a button and generate a password and then if you accidentally tap the button again the two counters are off. It seems like then the server has to decide "are you a reasonable number of times off or an unreasonable amount of counts off". With the PyOTP library I don't see a way for me to control that:

verify(otp: str, counter: int) → bool

    Verifies the OTP passed in against the current counter OTP.

    Parameters:
        otp – the OTP to check against
        counter – the OTP HMAC counter

So I mean I could test it against a certain range of counters from the counter I know and then accept it if it falls within that window, but you still are either running a different application or an app on your phone to enter this code. I'm not sure exactly why I would ever use this over TOTP, but it definitely doesn't seem easier to recover from.
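Testing against a range of counters is, as it happens, what RFC 4226 describes as resynchronization with a look-ahead window. A standard-library sketch follows; the function names and the window size of 5 are assumptions, not anything PyOTP provides.

```python
import hashlib
import hmac
import struct


def hotp(key: bytes, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP value for one counter."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    number = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(number % 10 ** digits).zfill(digits)


def verify_with_look_ahead(key, otp, server_counter, look_ahead=5):
    """Return the next counter to store if `otp` matches within the window, else None."""
    for counter in range(server_counter, server_counter + look_ahead + 1):
        if hmac.compare_digest(hotp(key, counter).encode(), otp.encode()):
            return counter + 1  # resynchronise past the matched value
    return None
```

The server only ever looks forward, so a few accidental button presses are absorbed while an already-used code is never accepted again.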

So TOTP would work with the recovery code, but it seems aggressive to ask normal people to install a different program on their computer or phone in order to log in based on a time-based code which will stop working if the client and server (who have zero way to sync time with each other) drift too far apart. Then I need to give you recovery codes and just sorta hope you have somewhere good to put those.

That said, it is the closest to solving the problem because those are at least normal understandable human problems and it does meet my initial requirement of "the user has one good password". It's also portable and allows administrators to be like "well you fell through the one safety net, account is locked, make a new one".

What is the expected treatment of the TOTP secret?

When I was writing this out I became unsure if I'm allowed to hash this secret. Like in theory I should be able to, because I don't need to recover it. If the user was to go through a TOTP reset flow, then I would probably (presumably) want to generate a new secret in which case there's nothing stopping me from using a strong key derivation function.

None of the tutorials I was able to find seemed to have any opinion on this topic. It seems like using encryption is the SOP, which is fine (it's not sitting on disk as a plain string) but introduces another failure point. It seems odd there isn't a way to negotiate a rotation with a client or really provide any sort of feedback. It meets my initial requirement, but the more I read about TOTP the more surprised I was it hasn't been better thought out.
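One possible reason the tutorials lean on encryption rather than hashing: as RFC 4226 and RFC 6238 describe it, the code is an HMAC keyed by the shared secret, so the server has to feed the exact same bytes into the calculation that the authenticator app holds. The sketch below uses the RFC 4226 test secret, and `code_for` is a name invented here; it shows that a server keeping only a hash of the secret would compute different codes.

```python
import hashlib
import hmac
import struct


def code_for(key: bytes, counter: int) -> str:
    """Six-digit HOTP/TOTP value: an HMAC keyed by the shared secret."""
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F
    number = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return f"{number % 1_000_000:06d}"


secret = b"12345678901234567890"          # what the authenticator app holds
hashed = hashlib.sha256(secret).digest()  # what a hash-only server would hold

app_code = code_for(secret, 1)     # what the user types in
server_code = code_for(hashed, 1)  # what a hash-only server would expect
```

Unless the client also derived its key from the same hash, which no standard authenticator app is known to do, the two sides would never agree. That would explain why reversible encryption at rest is the usual advice.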

Things I would love from TOTP/HOTP

Some sort of secret rotation process would be great. It doesn't have to be common, but it would be nice if there was some standard way of informing the client.
Be great if we more clearly explained to people how long the codes should be valid for. Certainly 1 hour is sufficient for consumer-level applications right?
Explain like what would I do if the counters get off with HOTP. Certainly some human error must be accounted for by the designers. People are going to hit the button too many times at some point.

Use Google/Facebook/Apple

I'm not against using these sorts of login buttons except I can't offer just one, I need to offer all of them. I have no idea what login that user is going to have or what makes sense for them to use. It also means I need to manage some sort of app registration with each one of these companies for each domain that they can suspend approximately whenever they feel like it because they're giant megacorps.

So now I can't just spin up as many copies of this thing as I want with different URLs and I need to go through and test each one to ensure they work. I also need to come up with some sort of migration path for if one of them disappears and I need to authenticate the users into their existing accounts but using a different source of truth.

Since I cannot think of a way to do that which doesn't involve me basically emailing a magic link to the email address I get sent in the response from your corpo login and then allowing that form to update your user account with a different "real_user_id" I gotta abandon this. It just seems like a tremendous amount of work to not really "solve" the problem but just make the problem someone else's fault if it doesn't work.

Like if a user could previously log into a Facebook account and now no longer can, there's no customer service escalation they can go on. They can effectively go fuck themselves because nobody cares about one user encountering a problem. But that means you would still need some way of being like "you were a Facebook user and now you are a Google user". Or what if the user typically logs in with Google, clicks Facebook instead and now has two accounts? Am I expected to reconcile the two?

It's also important to note that I don't want any permissions and I don't want all the information I get back. I don't want to store email address or real name or anything like that, so again like the OAuth flow is overkill for my usage. I have no intention of requesting permissions on behalf of these users with any of these providers.

Use Passkeys

Me and passkeys don't get along super well, mostly because I think they're insane. I've written a lot about them in the past: https://matduggan.com/passkeys-as-a-tool-for-user-retention/ and I won't dwell on it except to say I don't think passkeys are designed with the first goal being an easy user experience.

But regardless passkeys do solve some of my problems.

Since I'm getting a public key I don't care if my database gets leaked
In theory I don't need an email address for fallback because on some platforms some of the time they sync
If users care a lot about ownership of personal data they can use a password manager sometimes if the password manager knows the right people and idk is friends with the mayor of passkeys or something. I don't really understand how that works, like what qualifies you to store the passkeys.

My issue with passkeys is I cannot conceive of an even "somewhat ok" fallback plan. So you set it up on an iPhone with a Windows computer at home. You break your iPhone and get an Android. It doesn't seem that crazy of a scenario to me to not have any solution for. Do I need your phone number on top of all of this? I don't want that crap sitting in a database.

Tell the users to buy a cross-platform password manager

Oh ok yeah absolutely normal people care enough about passwords to pay a monthly fee. Thanks for the awesome tip. I think everyone on Earth would agree they'd give up most of the price of a streaming platform full of fun content to pay for a password manager. Maybe I should tell them to spin up a docker container and run bitwarden while we're at it.

Anyway I have a hyper-secure biometric login as step 1 and then what is step 2, as the fallback? An email magic link? Based on what? Do I give you "recovery codes" like I did with TOTP? It seems crazy to layer TOTP on top of passkeys but maybe that...makes some sense as a fallback route? That seems way too secure but also possibly the right answer?

I'm not even trying to be snarky, I just don't understand what would be the acceptable position to take here.

What to do from here

Basically I'm left where I started. Here are my options:

Let the user assign a username and password and hope they let the browser or password manager do it and assume it is a good one.
Use the API in the browser to generate a good username and password and store it, hoping they always use a supported browser and that this API doesn't go away in the future.
Generate a TOTP but then also give them passwords called "recovery codes" and then hope they store those passwords somewhere good.
Use email magic links a lot and hope they remember to change their email address here when they lose access to an old email.
Use passkeys and then add on one of the other recovery systems and sort of hope for the best.

What basic stuff would I need to solve this problem forever:

The browser could tell me if it generated the password or if the user typed the password. If they type the password, force the 2FA flow. If not, don't. Let me tell the user "seriously let the system make the password". 1 good password criteria met.
Have the PasswordCredential API work everywhere all the time and I'll make a random username and password on the client and then we can just be done with this forever.
Passkeys but they live in the browser and sync like a normal password. Passkey lite. Passkey but not for nuclear secrets.
TOTP but if recovery codes are gonna be a requirement can we make it part of the spec? It seems like a made-up concept we sorta tacked on top.

I don't think these are crazy requirements. I just think if we want people to build more stuff and for that stuff to be secure, someone needs to sit down and realistically map out "how does a normal person do this". We need consistent reliable conventions I can build on top of, not weird design patterns we came up with because the initial concept was never tested on normal people before being formalized into a spec.