UUIDs and Hashes: When to Use Which for Unique Identifiers
Every system needs unique identifiers. The question is whether to generate them randomly (UUIDs), derive them from content (hashes), or use something else entirely. The wrong choice leads to performance problems, security vulnerabilities, or subtle collision bugs. This guide covers the options and when each one fits.
UUIDs: universally unique identifiers
A UUID is a 128-bit identifier formatted as 32 hexadecimal digits in five groups: 550e8400-e29b-41d4-a716-446655440000. The key property is that any system can generate a UUID independently, without coordinating with any other system, and the probability of collision is negligibly small.
UUID v4: random
UUID v4 is the most commonly used version. It fills 122 of the 128 bits with random data (the remaining 6 bits encode the version and variant).
f47ac10b-58cc-4372-a567-0e02b2c3d479
              ^
              version 4
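As a quick illustration, Python's standard `uuid` module can generate a v4 UUID and expose the version and variant fields described above. This is a minimal sketch; the variable names are only for illustration.

```python
import uuid

# Generate a random (version 4) UUID with the standard library.
u = uuid.uuid4()

print(u)          # canonical 8-4-4-4-12 hexadecimal form
print(u.version)  # the version field, 4 for uuid4()

# The version is stored in the first hex digit of the third group.
third_group = str(u).split("-")[2]
version_digit = third_group[0]

# The variant bits mark this as a standard (RFC) UUID layout.
is_rfc_variant = u.variant == uuid.RFC_4122
print(version_digit, is_rfc_variant)
```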
Collision probability: With 122 random bits, you would need to generate approximately 2.71 * 10^18 UUIDs to have a 50% chance of a single collision. In practical terms, generating 1 billion UUIDs per second for 86 years gives you a 50% chance of one duplicate. For any real-world application, collisions are not a concern.
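The figures above follow from the birthday approximation n ≈ sqrt(2 · N · ln(1 / (1 − p))), where N is the number of possible values and p is the target collision probability. A small sketch to reproduce the arithmetic (the function name is hypothetical):

```python
import math

def ids_for_collision_probability(random_bits: int, p: float) -> float:
    """Approximate how many random IDs give collision probability p."""
    space = 2.0 ** random_bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - p)))

# 122 random bits, 50% chance of at least one collision.
n = ids_for_collision_probability(122, 0.5)

# Time needed at one billion UUIDs per second.
seconds = n / 1e9
years = seconds / (365.25 * 24 * 3600)

print(f"{n:.3g} UUIDs, about {years:.0f} years")
```

Running it should land close to 2.71 * 10^18 UUIDs and roughly 86 years. Passing a smaller `random_bits` value shows how quickly the margin shrinks for truncated identifiers.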
Use cases:
- Primary keys in distributed databases
- Session identifiers
- Temporary file names
- Correlation IDs in microservices
Downsides:
- Random values cause poor index locality in B-tree databases. Insertions scatter across the index, leading to more page splits and slower writes.
- Not sortable by creation time.
UUID v7: time-ordered
UUID v7 is the modern answer to v4's indexing problem. The first 48 bits encode a Unix timestamp in milliseconds; apart from the 6 version and variant bits, the remaining 74 bits are random. As a result, v7 UUIDs sort chronologically.
019078e5-d2b0-7cc0-b3a4-9e1c284f5a6a
^^^^^^^^^^^^^ ^
timestamp     version 7
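To make the layout concrete, here is a minimal sketch that assembles a v7-style UUID by hand: a 48-bit millisecond timestamp, the version, and random bits around the variant. The helper `uuid7_sketch` is hypothetical and omits the optional monotonic counter that production generators may use; Python 3.14 ships a built-in `uuid.uuid7()`, which is the better choice where available.

```python
import os
import time
import uuid
from typing import Optional

def uuid7_sketch(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Assemble a UUID using the v7 field layout (illustrative helper)."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits, 74 used
    rand_a = rand >> 68                            # top 12 bits
    rand_b = rand & ((1 << 62) - 1)                # low 62 bits

    value = (unix_ms & ((1 << 48) - 1)) << 80      # 48-bit timestamp
    value |= 0x7 << 76                             # 4-bit version (7)
    value |= rand_a << 64                          # 12 random bits
    value |= 0b10 << 62                            # 2-bit variant
    value |= rand_b                                # 62 random bits
    return uuid.UUID(int=value)

# Two IDs one millisecond apart sort in creation order.
earlier = uuid7_sketch(1_700_000_000_000)
later = uuid7_sketch(1_700_000_000_001)
print(earlier < later)

# The timestamp can be read back from the top 48 bits.
recovered_ms = later.int >> 80
print(recovered_ms)
```

The last two lines also show why v7 leaks creation time: anyone holding the ID can recover the timestamp.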
Advantages over v4:
- B-tree friendly: new UUIDs sort after (or, within the same millisecond, right next to) the most recent ones, so inserts land at the end of the index instead of scattering across it.
- Naturally sortable: ORDER BY id gives you chronological order.
- Still globally unique: the random component prevents collisions even at millisecond granularity.
Use cases:
- Primary keys in PostgreSQL, MySQL, SQLite where index performance matters.
- Event logs and audit trails where ordering is useful.
- Any place you would use v4 but also want time-based sorting.
When to prefer v4 over v7:
- When the creation time of an entity should not be inferrable from its ID (privacy concern).
- When you need compatibility with systems that only understand v4.
Hash functions: content-derived identifiers
Unlike UUIDs, hashes are deterministic — the same input always produces the same output. This makes them ideal for verifying integrity and deduplicating data.
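Determinism is what makes deduplication work: identical content always maps to the same key. A minimal sketch of a content-addressed store, using SHA-256 (covered below) and an in-memory dictionary as a stand-in for real storage; `put` and `store` are hypothetical names.

```python
import hashlib

# In-memory stand-in for a blob store: content hash -> content.
store = {}

def put(data: bytes) -> str:
    """Store data under its SHA-256 digest and return the key."""
    key = hashlib.sha256(data).hexdigest()
    store.setdefault(key, data)  # a second copy of the same bytes is a no-op
    return key

first = put(b"same bytes")
second = put(b"same bytes")
other = put(b"different bytes")

print(first == second)  # identical content, identical key
print(len(store))       # only distinct contents are stored
```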
MD5 (128-bit)
Input: "hello world"
MD5: 5eb63bbbe01eeed093cb22bb8f5acdc3
MD5 is fast but cryptographically broken. Collisions can be crafted deliberately, meaning an attacker can create two different inputs that produce the same hash.
Acceptable uses: Deduplication of trusted files, cache keys, non-security checksums. If an attacker can influence the input, use SHA-256 instead.
Never use for: Password hashing, digital signatures, security-sensitive integrity checks.
SHA-256 (256-bit)
Input: "hello world"
SHA-256: b94d27b9934d3e08a52e52d7da7dabfac484efe37a5380ee9088f7ace2efcde9
SHA-256 is part of the SHA-2 family and is the current standard for cryptographic hashing. No practical collisions have been found.
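The digests shown in this section can be reproduced with Python's `hashlib`. The sketch below also feeds the input in chunks, which is how large files are typically hashed without loading them into memory. Note that `hashlib.md5` may be unavailable on builds restricted to FIPS-approved algorithms.

```python
import hashlib

message = b"hello world"

md5_hex = hashlib.md5(message).hexdigest()
sha256_hex = hashlib.sha256(message).hexdigest()
print(md5_hex)     # 32 hex characters (128 bits)
print(sha256_hex)  # 64 hex characters (256 bits)

# Incremental hashing: feeding chunks gives the same digest as one call.
digest = hashlib.sha256()
for chunk in (b"hello ", b"world"):
    digest.update(chunk)
chunked_hex = digest.hexdigest()
print(chunked_hex == sha256_hex)
```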
Use cases:
- File integrity verification (checksums)
- Content-addressable storage (Git uses SHA-1, but the principle applies)
- API signature verification (HMAC-SHA256)
- Blockchain and certificate verification
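For the HMAC-SHA256 use case, a minimal sketch of signing and verifying a request body with a shared secret might look like this. The function names, the secret, and the payload are placeholders; real APIs define their own canonical message format and signature encoding.

```python
import hashlib
import hmac

def sign(secret: bytes, body: bytes) -> str:
    """Return the hex-encoded HMAC-SHA256 of body under secret."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify(secret: bytes, body: bytes, signature: str) -> bool:
    """Check a signature using a constant-time comparison."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected, signature)

secret = b"example-shared-secret"   # placeholder value
body = b'{"amount": 100}'           # placeholder payload

signature = sign(secret, body)
print(verify(secret, body, signature))             # untampered body
print(verify(secret, b'{"amount": 999}', signature))  # tampered body
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how many leading characters matched.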
SHA-512 (512-bit)
Twice the output length of SHA-256. It is slower on 32-bit systems but often faster on 64-bit systems for large inputs, because it works on 64-bit words internally. CPUs with dedicated SHA-256 instructions can reverse that advantage.
Use when: You need a longer hash for additional collision resistance, or your platform processes 64-bit words more efficiently.
Choosing the right tool
| Need | Solution |
|---|---|
| Primary key (single database) | Auto-increment integer or UUID v7 |
| Primary key (distributed system) | UUID v7 |
| Session token | UUID v4 from a cryptographically secure generator (random, unpredictable) |
| File checksum | SHA-256 |
| Cache key from content | SHA-256 or MD5 (speed over security) |
| Password storage | bcrypt, scrypt, or Argon2 (not raw SHA) |
| Deduplication | SHA-256 of the content |
| Short unique slug | First 8-12 chars of a UUID v4 or nanoid (check for collisions on insert) |
Password hashing: a special case
Raw hash functions (even SHA-256) are terrible for passwords because they are too fast. An attacker with a GPU can compute billions of SHA-256 hashes per second, brute-forcing many common passwords in minutes.
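Password hashing algorithms like bcrypt, scrypt, and Argon2 are intentionally slow, and scrypt and Argon2 are also memory-intensive. Each uses a per-password salt, so identical passwords do not produce identical hashes. The cost parameter controls how slow the hash is; increase it as hardware gets faster.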
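As a sketch of what this looks like in practice, Python's standard library exposes scrypt through `hashlib.scrypt` (it requires an OpenSSL build with scrypt support). The cost parameters below are assumed values for illustration, not a tuned recommendation, and the helper names are hypothetical.

```python
import hashlib
import hmac
import os

# Assumed cost parameters for illustration; tune them for your hardware.
N, R, P = 2**14, 8, 1
MAXMEM = 64 * 1024 * 1024  # allow up to 64 MiB for the computation

def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, derived_key) for storage."""
    salt = os.urandom(16)  # fresh random salt per password
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                         n=N, r=R, p=P, maxmem=MAXMEM, dklen=32)
    return salt, key

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    key = hashlib.scrypt(password.encode("utf-8"), salt=salt,
                         n=N, r=R, p=P, maxmem=MAXMEM, dklen=32)
    return hmac.compare_digest(key, expected)

salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```

Raising `N` doubles both time and memory per step, which is the knob to turn as hardware gets faster. Store the parameters alongside the salt so old hashes can still be verified after a change.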
Database primary key performance
If you are choosing between auto-increment integers and UUIDs for primary keys, here is the trade-off:
Auto-increment: Smallest storage (4 or 8 bytes), best index performance, but leaks creation order and cannot be generated client-side.
UUID v7: 16 bytes, good index performance (time-ordered), can be generated anywhere without coordination, but reveals creation time to the millisecond.
UUID v4: 16 bytes, worst index performance (random insertion), but does not leak any information.
For most new projects, UUID v7 offers the best balance of uniqueness, sortability, and distributed generation.
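The index-locality difference can be illustrated with a toy model: treat a sorted list as the index and count how often a new key lands at the very end. This is a simplified simulation, not a B-tree benchmark; the keys are v4-like and v7-like byte strings (version bits omitted), and one key per simulated millisecond is assumed.

```python
import bisect
import random

def append_fraction(keys) -> float:
    """Fraction of inserts that land at the end of a sorted index."""
    index = []
    appended = 0
    for key in keys:
        position = bisect.bisect_right(index, key)
        if position == len(index):
            appended += 1
        index.insert(position, key)
    return appended / len(keys)

rng = random.Random(42)  # fixed seed so runs are repeatable
count = 2000

# v4-like keys: 16 fully random bytes.
random_keys = [rng.getrandbits(128).to_bytes(16, "big") for _ in range(count)]

# v7-like keys: 6-byte millisecond timestamp prefix plus 10 random bytes.
ordered_keys = [
    (1_700_000_000_000 + i).to_bytes(6, "big")
    + rng.getrandbits(80).to_bytes(10, "big")
    for i in range(count)
]

random_fraction = append_fraction(random_keys)
ordered_fraction = append_fraction(ordered_keys)
print(f"random keys appended:       {random_fraction:.3f}")
print(f"time-ordered keys appended: {ordered_fraction:.3f}")
```

In this model every time-ordered key is an append, while almost every random key is inserted somewhere in the middle, which is the pattern that causes page splits in a real B-tree.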