About a month into my internship at BlockFrame, I have been relearning a lot about cybersecurity and picking up plenty that is new to me. Some of my work involves hashed information.
Hashing takes a piece of information of any size and returns a fixed-size value that looks random. It’s mostly used to verify data integrity and allow for efficient lookup, and that is about the extent of what I knew before I started at BlockFrame. We used bcrypt in a few projects at Turing, and that combines hashing with salting (adding a unique, random value to the input before you hash it).
One of the fascinating aspects of hashing is its ability to verify data integrity. I had the opportunity to explore this firsthand by using a tool to calculate the hash of a piece of information, and then comparing it to the hash recorded for the original data to see if any alterations had been made. If the two hashes match, the data is unchanged; if even one character differs, the hashes will not match.
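As a rough sketch of that kind of integrity check, here is how it might look in Python with the standard `hashlib` module. The function names and the sample messages are made up for illustration; they are not from the tool I used.

```python
import hashlib
import hmac


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hash of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, expected_hash: str) -> bool:
    """Recompute the hash and compare it to the one recorded earlier."""
    # compare_digest compares the two strings in constant time
    return hmac.compare_digest(fingerprint(data), expected_hash)


# Hypothetical example data
original = b"transfer 10 coins to alice"
recorded_hash = fingerprint(original)  # saved when the data was known to be good
tampered = b"transfer 90 coins to alice"

print(is_unaltered(original, recorded_hash))  # True
print(is_unaltered(tampered, recorded_hash))  # False
```

The check only works if the recorded hash itself is stored somewhere an attacker cannot change it.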
If I take my name in all lowercase and run it through SHA-256, this is the hash:
brendan = 58abd882fba88d5d5d5f86aa60c1f825480353c496d0ebecb74760fc69001380
But if I capitalize the B…
Brendan = 689fb5f3d57afdfdf309732f805791ae2dfe9eca4c2d0d4da33a980247c9b78e
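If you want to try this yourself, something like the following should work in Python. It is only a sketch: it assumes the text is hashed as UTF-8 with no trailing newline, which may not match every online tool, and the helper names are my own. It also counts how many of the 256 output bits differ between the two hashes.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hash the UTF-8 bytes of text and return 64 hex characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions where two hex digests disagree."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


lower = sha256_hex("brendan")
upper = sha256_hex("Brendan")

print("brendan =", lower)
print("Brendan =", upper)
print(differing_bits(lower, upper), "of 256 bits differ")
```

For a well-behaved hash function, roughly half of the bits should flip for any change to the input, even a single letter's case.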
Looking at either hash, I would have no idea what the input was. I realize that a computer can crack it by guessing inputs, though, and I admittedly started writing this because I was curious how long that would take. According to a quick conversation with ChatGPT, cracking a hash could take seconds for a weak password or years for a strong one, assuming it’s plain SHA-256 and nothing else has been done. Using bcrypt (as mentioned earlier), which is salted and deliberately slow to compute, would take ‘an impractically long time’ (says ChatGPT).
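To make the weak-password case concrete, here is a small sketch of a dictionary attack against a plain SHA-256 hash, next to a salted, slow alternative. A few assumptions: the word list is invented, the function names are mine, and I use PBKDF2 from Python's standard library as a stand-in for bcrypt (bcrypt itself is a separate package and a different algorithm, but the idea of salt plus a deliberately expensive computation is similar).

```python
import hashlib
import os


def crack_unsalted_sha256(target_hex, candidates):
    """Hash each guess and return the first one matching the target, if any."""
    for guess in candidates:
        if hashlib.sha256(guess.encode("utf-8")).hexdigest() == target_hex:
            return guess
    return None


def slow_salted_hash(password, salt, iterations=200_000):
    """Stand-in for bcrypt: PBKDF2-HMAC-SHA256 with a salt and many iterations."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


# Hypothetical word list; real attacks use lists with millions of entries
wordlist = ["password", "letmein", "brendan", "qwerty"]
leaked_hash = hashlib.sha256(b"brendan").hexdigest()
print(crack_unsalted_sha256(leaked_hash, wordlist))  # brendan

# With a random salt, the same password gives a different stored value each time,
# and every guess costs the attacker 200,000 hash computations instead of one.
salt = os.urandom(16)
stored = slow_salted_hash("brendan", salt)
print(stored.hex())
```

A single SHA-256 call is very fast, so guessing is cheap. The salt and the iteration count are what make each guess expensive and stop an attacker from reusing precomputed tables.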
Anyway, SHA-256 stands for Secure Hash Algorithm 256-Bit and is part of the SHA-2 family of cryptographic hash functions designed by the NSA. It’s used widely, especially in blockchain (which I’m learning about at BlockFrame).
It’s deterministic, which means the same input always produces the same value. As the two hashes of my name show, a slight change in the input can result in a massive shift in the output. Here is an overview of how it works.
- It processes the input data in 512-bit (64 bytes) blocks and performs operations on them.
- The input is padded to ensure that its length is a multiple of 512 bits. Padding involves adding a single ‘1’ bit, the necessary number of ‘0’ bits, and finally the length of the original message as a 64-bit number.
- The padded input gets divided up into 512-bit blocks.
- Eight 32-bit words are used as initial hash values, and they are derived from the first 32 bits of the fractional parts of the square roots of the first eight primes.
- Each block is processed through a series of 64 rounds of operations, and the hash values are updated.
- After the blocks are processed, the final value is produced by concatenating the eight 32-bit words.
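The steps above can be sketched in plain Python. This is a learning exercise rather than something to use in real code (use `hashlib` for that). The helper names are mine, and the round details not covered in the list (the 64 round constants taken from the cube roots of the first 64 primes, and the rotate and XOR mixing functions) follow the published SHA-256 specification.

```python
import hashlib
import math

MASK = 0xFFFFFFFF  # keeps every value within 32 bits


def first_primes(n):
    """Return the first n primes by trial division."""
    primes, candidate = [], 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def icbrt(n):
    """Largest integer r with r**3 <= n, found by exact binary search."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


PRIMES = first_primes(64)
# First 32 fractional bits of the square roots of the first 8 primes
H0 = [math.isqrt(p << 64) & MASK for p in PRIMES[:8]]
# First 32 fractional bits of the cube roots of the first 64 primes
K = [icbrt(p << 96) & MASK for p in PRIMES]


def rotr(x, n):
    """Rotate a 32-bit word right by n bits."""
    return ((x >> n) | (x << (32 - n))) & MASK


def pad(message: bytes) -> bytes:
    """Append the '1' bit, the '0' bits, and the 64-bit message length."""
    bit_length = len(message) * 8
    padded = message + b"\x80"  # a '1' bit followed by seven '0' bits
    padded += b"\x00" * ((56 - len(padded) % 64) % 64)
    return padded + bit_length.to_bytes(8, "big")


def sha256(message: bytes) -> str:
    h = list(H0)
    data = pad(message)
    for start in range(0, len(data), 64):  # one 512-bit block at a time
        block = data[start:start + 64]
        # Sixteen words from the block, expanded to 64 words
        w = [int.from_bytes(block[i:i + 4], "big") for i in range(0, 64, 4)]
        for i in range(16, 64):
            s0 = rotr(w[i - 15], 7) ^ rotr(w[i - 15], 18) ^ (w[i - 15] >> 3)
            s1 = rotr(w[i - 2], 17) ^ rotr(w[i - 2], 19) ^ (w[i - 2] >> 10)
            w.append((w[i - 16] + s0 + w[i - 7] + s1) & MASK)
        a, b, c, d, e, f, g, hh = h
        for i in range(64):  # the 64 rounds
            big_s1 = rotr(e, 6) ^ rotr(e, 11) ^ rotr(e, 25)
            ch = (e & f) ^ (~e & g)
            t1 = (hh + big_s1 + ch + K[i] + w[i]) & MASK
            big_s0 = rotr(a, 2) ^ rotr(a, 13) ^ rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            t2 = (big_s0 + maj) & MASK
            a, b, c, d, e, f, g, hh = (
                (t1 + t2) & MASK, a, b, c, (d + t1) & MASK, e, f, g)
        # Fold this block's result into the running hash values
        h = [(x + y) & MASK for x, y in zip(h, (a, b, c, d, e, f, g, hh))]
    # Concatenate the eight 32-bit words into 64 hex characters
    return "".join(f"{x:08x}" for x in h)


# Sanity check against the standard library
print(sha256(b"brendan") == hashlib.sha256(b"brendan").hexdigest())  # True
```

I used integer square and cube roots instead of floating point so the constants come out exact rather than depending on rounding.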
There is more to it than that, but that is the overview I wrote. Maybe I’ll get into it more later. I have a lot of information to process from the last month or so, so whatever I write next will be on an adjacent topic, if nothing else.