would still hash to the same fixed length even if the password were 1000 characters long
That’s a big assumption. At one point at work we had a system (thankfully dead now) for accessing customer data that limited passwords to a max of 8 characters, only supported three non-alphanumeric characters in those passwords, and shipped a .net dll with every desktop client install that had both the password encryption and decryption function in it and accessible to anyone who loaded the dll. If you had user table access and that dll, you owned the whole system.
Fucking clown show.
Then they finally allowed longer passwords on the software side, but it was causing problems on the DB side. Because different length passwords hashed to different length hashes. Not one to one, but something like 1-4 in 14 out, 4-12 in 23 out, 12-15 in 36 out. Some sort of pattern like that. That was fun to troubleshoot.
Point is, you can’t guarantee hash length is unaffected by password length.
Point is, you can’t guarantee hash length is unaffected by password length.
You can if the software you use makes use of a sane hashing algorithm.
A hash function is any function that can be used to map data of arbitrary size to fixed-size values, though there are some hash functions that support variable-length output (https://en.wikipedia.org/wiki/Hash_function)
But that’s exactly my point. You can’t rely on that.
The rest of this is a rambling attempt to back up my point.
We wouldn’t have car accidents if everyone drove and maintained their vehicles properly.
People, including software devs, make mistakes. They end up with deadlines, ridiculous shit expectations and decrees from management, any of a countless number of reasons why any piece of software might not be designed as it should have been. And something as ridiculously back-end as password hashing functionality isn’t liable to be seen by nearly anyone. Besides the other members of my team I warned about this system, no one else in the company knew of these flaws and quirks.
From unfortunate experience, if someone can confidently bullshit to the regulatory auditors, incredibly few of the auditors have the skills to truly verify claims about whether something is actually compliant or not. Actually safe or not. So many cybersecurity “professionals” I’ve encountered in my career are glorified run buttons for premade vuln scanners, unable to even check the mitigating factors in the enterprise systems they are responsible for the security of.
This wasn’t some in-house hacked together program, it was a piece of software created by a very large company in the financial technologies space, with a double digit number of corporations as customers just for this specific heinously insecure piece of software they sold. For decades.
I did my part and reported it to their security contact. Nothing happened. Beyond internal discussion with some of my teammates, I’ve not spoken about it until after the software was deprecated not only for my company, but globally.
I can’t really emphasize how big this system was for having these babby’s first software project level oversights. Financial transactions were initiated by this system.
Given that, and many of the other just absurdly insane things I’ve seen from professional, million dollar contract type pieces of software, in my decade plus in the industry… There’s how things should work. The theoretical ideal. And then there’s the actually implemented garbage we actually have to deal with.
Before I begin I want to point out that this is one of the very, very, very few times I will ever take such a hard stance on a subject. The reason I do is because it is absolutely one of the most important things in our society that we could easily get right, but yet somehow still don’t (as noted by your story).
First, let’s talk definitions. What you’re talking about is encryption. Yes, encryption is variable length because you need to be able to reliably decrypt the encrypted data back into its original form. This works really well for things like HTTPS, text messaging, and other stuff that needs to be decrypted.
Hashing is not encryption. There is absolutely zero use-case for needing to decrypt someone’s password; this is why passwords are to be hashed and not encrypted (yes, the distinction very much matters). As such, hashing (or 1-way hashing) is fixed-length based on the type of algorithm used. MD5 uses a 128-bit hash, whereas SHA-1 uses 160-bit and SHA-512 uses 512-bit hashes. The bigger the hash, the less likely you’ll run into something called collisions. A collision in hashing means that two (or more) different values generate the same hash. That’s very bad.
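To make the fixed-length point concrete, here is a minimal sketch using Python's standard hashlib. The helper name digest_bits and the sample inputs are made up for illustration. MD5 and SHA-1 appear only because they were named above; plain fast hashes like these are not what you would store passwords with.

```python
import hashlib

def digest_bits(algorithm: str, data: bytes) -> int:
    """Return the digest size in bits for the given algorithm and input."""
    # digest() returns raw bytes, so multiply by 8 to get bits.
    return len(hashlib.new(algorithm, data).digest()) * 8

short_pw = b"hunter2"     # 7 bytes (made-up sample input)
long_pw = b"x" * 1000     # 1000 bytes

for name in ("md5", "sha1", "sha512"):
    # Same algorithm, very different input lengths, same output size.
    print(name, digest_bits(name, short_pw), digest_bits(name, long_pw))
# md5 128 128
# sha1 160 160
# sha512 512 512
```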
Now, any “developer” that uses encryption for password storage, or tries to roll their own system, should be fired, physically branded with a hot iron on their forehead with the letter A (for dumbass), and sent back to grade school, because I guarantee you they’re doing it absolutely wrong, and they are one of the many preventable reasons why we have so many fucking data breaches these days.
Don’t roll your own encryption or password hashing. Don’t. I don’t care. There is absolutely no reason to do so. If you think there is, quit and go work a job more suited for your level of intelligence.
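For what "not rolling your own" can look like, here is a rough sketch that leans on PBKDF2 from Python's standard library. It is not a vetted implementation. The function names, the 16-byte salt and the iteration count are assumptions chosen for the example; a maintained password-hashing library is usually the better choice in production.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

DEFAULT_ITERATIONS = 600_000  # assumed work factor; tune for your hardware

def hash_password(password: str, salt: Optional[bytes] = None,
                  iterations: int = DEFAULT_ITERATIONS) -> Tuple[bytes, bytes]:
    """Derive a fixed-length hash from a password of any length."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest

def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = DEFAULT_ITERATIONS) -> bool:
    """Recompute the hash with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)
```

The stored digest is 32 bytes whether the password is 8 characters or 1000, so a length cap buys nothing on the database side.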
Wouldn’t a hash function whose output length is affected by the contents be less secure than a fixed-length one? It’s still hard to break as a hash, but why give any hints on anything? It also complicates the database part. Keep it simple.
You’re confusing hashing with encryption. A hash has a fixed length set by the algorithm, no matter what goes in, whereas encrypted output grows with the input. This is important because in hashing you’ll never need to know the original value. The original value is only used to generate the hash. It should be practically impossible to find two different values that produce the same hash (i.e. a collision). The weaker the hash (e.g. MD5), the easier collisions are to find.
Now with encryption, you do need to get back the original value (decryption), and as such the length of the encrypted data will vary based on the length of the original data. But encrypting passwords is bad because that means someone could potentially decrypt the encrypted data and learn your password, at which point nothing would stop them from accessing your system.
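As a rough illustration of why encrypted output tracks input length, here is the length arithmetic for a block cipher with PKCS#7 padding. The 16-byte block size and the padding scheme are assumptions for the sketch; nothing in this thread says what the system described earlier actually used.

```python
def padded_ciphertext_len(plaintext_len: int, block_size: int = 16) -> int:
    """Ciphertext length in bytes for a block cipher with PKCS#7 padding."""
    # PKCS#7 always adds between 1 and block_size bytes of padding, so an
    # input that is already a multiple of the block size gains a full block.
    return (plaintext_len // block_size + 1) * block_size

for n in (4, 8, 15, 16, 40):
    print(n, "->", padded_ciphertext_len(n))
# 4 -> 16
# 8 -> 16
# 15 -> 16
# 16 -> 32
# 40 -> 48
```

The output grows in steps, so the stored length tells anyone with table access roughly how long each password is.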
Now, that doesn’t mean that people can’t guess your password. In fact, there is a whole facet of information system security dedicated to what’s called brute-force cracking: rainbow tables, password salts, etc. This is why password entropy (entropy here meaning how unpredictable the password is, i.e. how many guesses an attacker would need) is important. The more entropy (i.e. the more characters allowed, and the longer the password), the more difficult and time-consuming it is to brute force your password. Limiting password length or characters because of storage space is based on a myth, lazy as hell, or worse, a sign of developer inexperience/ignorance, and it is dangerous as hell.
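A back-of-the-envelope way to see how length and character set feed into brute-force cost. This is a sketch that assumes every character is picked uniformly at random, which human-chosen passwords are not, so treat the numbers as an upper bound. The alphabet sizes are assumptions too: 65 is upper and lower case letters, digits and three symbols; 95 is printable ASCII.

```python
import math

def entropy_bits(alphabet_size: int, length: int) -> float:
    """Upper-bound entropy in bits of a uniformly random password."""
    # Each character contributes log2(alphabet_size) bits.
    return length * math.log2(alphabet_size)

# 8 characters drawn from 65 symbols versus 16 drawn from 95.
restricted = entropy_bits(65, 8)
relaxed = entropy_bits(95, 16)
print(f"restricted: {restricted:.1f} bits, relaxed: {relaxed:.1f} bits")

# Every extra bit doubles the number of guesses a brute-force search needs.
print(f"search space ratio: 2^{relaxed - restricted:.0f}")
```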
I understand the difference, hashes being one-way for a reason. The claim was that what is being hashed will control the size, which, if true, would give some hints about what was hashed. Reading through the wiki on hashing, though, the variations in size are chosen by the application for what it needs, not determined by the contents being hashed as was stated.
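A small sketch of that distinction, using SHAKE128 from Python's hashlib as one example of a hash with variable-length output. The helper name and sample inputs are made up for illustration. The caller picks the output size; the input length has no say in it.

```python
import hashlib

def shake_len(data: bytes, out_len: int) -> int:
    """Length in bytes of a SHAKE128 digest requested at out_len bytes."""
    return len(hashlib.shake_128(data).digest(out_len))

short_input = b"hunter2"
long_input = b"x" * 1000

for out_len in (16, 32, 64):
    # Output size follows the requested length, not the input.
    print(out_len, shake_len(short_input, out_len), shake_len(long_input, out_len))
# 16 16 16
# 32 32 32
# 64 64 64
```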