I'm interested in opinions... but I think I know what I should do.

I produce files that have a 64-bit ID generated by an STM32's RNG. This seems to do a reasonable job at being random (no collisions yet, in ~5k files), but I don't fully trust it, and 64 bits isn't that big. [It's likely that future hardware will have a CSRNG, but the file IDs will probably remain 64-bit.]
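For a rough sense of scale, here is a minimal sketch of the birthday-bound approximation for 64-bit IDs. It assumes the IDs are uniformly random, which is exactly the property in doubt for the hardware RNG, so treat the result as a best case. The function name `collision_probability` is made up for illustration.

```python
import math

def collision_probability(n_ids: int, bits: int = 64) -> float:
    """Approximate chance of at least one collision among n_ids
    uniformly random IDs of the given width (birthday bound)."""
    space = 2 ** bits
    exponent = n_ids * (n_ids - 1) / (2 * space)
    # -expm1(-x) equals 1 - exp(-x), but stays accurate when x is tiny
    return -math.expm1(-exponent)

for n in (5_000, 1_000_000, 1_000_000_000):
    print(f"{n:>13,} IDs: p = {collision_probability(n):.3e}")
```

With an ideal RNG, ~5k files gives a probability on the order of 1e-12, so the lack of collisions so far says very little about the RNG's quality either way.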

When handling this data, it's sometimes split across multiple files, which share that 64-bit ID... this and some other heuristics (like timestamps) allow you to confidently associate these parts into a whole.

For some years, the files have been uploaded, processed, poked and prodded, but they've only ever lived in a filesystem structure, and that was it. Accessible by me with utilities, but not accessible to others.

Until now, the files have been "uniquely" identified and referred to by that 64-bit ID. And the uniqueness has persisted.

Now that we're building a Web UI for better accessibility, the details are being brought into a database, which has a basic unique 'id' column.

How do I refer to these files going forward?... by the hopefully-unique 64-bit ID, or by the actually-unique database ID?

I like the 64-bit ID, because it's the "source of truth", and it's familiar... but I'm not confident (enough) that they'll remain unique over time.

I like the database ID, because it's guaranteed to be unique within the system... but I don't necessarily want to depend on lookups via the database.

I've considered adding a small metadata file that identifies the file's database ID and its other parts, so lookup can remain a "filesystem only" activity with either ID.
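A minimal sketch of what such a metadata file could look like, assuming a JSON sidecar stored next to each data file. The `.meta.json` suffix, the field names, and both helper functions are hypothetical, not an existing layout.

```python
import json
import os
from pathlib import Path

def write_sidecar(data_file: Path, db_id: int, file_id: int, parts: list[str]) -> Path:
    """Write '<data_file>.meta.json' next to the data file (hypothetical layout)."""
    sidecar = data_file.with_name(data_file.name + ".meta.json")
    record = {
        "db_id": db_id,                # the database's unique row ID
        "file_id": f"{file_id:016x}",  # the 64-bit ID as fixed-width hex
        "parts": parts,                # names of the files forming the whole
    }
    tmp = sidecar.with_name(sidecar.name + ".tmp")
    tmp.write_text(json.dumps(record, indent=2))
    # Rename into place; atomic when tmp and sidecar share a filesystem
    os.replace(tmp, sidecar)
    return sidecar

def read_sidecar(data_file: Path) -> dict:
    """Load the sidecar record for a data file."""
    sidecar = data_file.with_name(data_file.name + ".meta.json")
    return json.loads(sidecar.read_text())
```

Storing the 64-bit ID as hex text sidesteps JSON readers that lose precision on integers above 2^53.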

I've also considered using a standard 128-bit UUID that is generated and stored on the filesystem and in the database.
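As a minimal sketch of the UUID option, Python's standard `uuid` module can generate a random (version 4) UUID. The wrapper name `new_file_uuid` is made up for illustration, and where the value gets stored (sidecar file, database column) is left open.

```python
import uuid

def new_file_uuid() -> uuid.UUID:
    """Generate a random (version 4) 128-bit UUID."""
    return uuid.uuid4()

file_uuid = new_file_uuid()
print(file_uuid)              # canonical 36-character text form
print(file_uuid.bytes.hex())  # the same 128 bits as 16 raw bytes, hex-encoded
```

Note that a version 4 UUID carries 122 random bits, since 6 bits are fixed for the version and variant. That's still a far larger space than 64 bits.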

I don't know why this decision is being so problematic for me. 🫣

Use the 64-bit hopefully-unique ID: 20%
Use the database's actually-unique ID: 20%
Use a 128-bit UUID: 50%
Something else: 10%
@attie just hash the file and use it as the id?

@katachora ... I quite like that idea!

There's more context here: the files are formed of 4 KiB chunks, and I use the SHA256 of the file's first chunk to store it in the filesystem (it's all we're guaranteed to have at first... files are uploaded progressively).

I'm much more confident that there won't be a collision here, due to what's in the files.
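A rough sketch of that first-chunk hashing, using Python's `hashlib`. The function name and the `CHUNK_SIZE` constant are made up here for illustration, and this isn't the actual code in use.

```python
import hashlib
from pathlib import Path

CHUNK_SIZE = 4096  # 4 KiB, matching the chunk size described above

def first_chunk_digest(path: Path) -> str:
    """Return the hex SHA-256 digest of the file's first 4 KiB chunk."""
    with path.open("rb") as f:
        first_chunk = f.read(CHUNK_SIZE)
    return hashlib.sha256(first_chunk).hexdigest()
```

Because only the first chunk is hashed, the digest is available as soon as the upload starts and doesn't change as later chunks arrive. The flip side is that two files sharing an identical first chunk would get the same digest.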