Can anyone point me to a serde data format that uses multiple distinct string representations?

Or a serde data format that isn't self-describing?

Or a serde data format that has distinct "top level" types that are different from inner types?

Unfortunately I'm implementing a data format that has all of these things and it's making serde hard

#rustlang #rust #serde

@asonix The closest example I can give is 8 RFC numbers off what you're doing:
In CBOR, there exists both a byte-string type, and an array type (that can, among other things, contain integers). So any CBOR serde binding, when faced with a &[u8], has to make a choice between "it's obviously bytes" (byte string) and "it's a special case of &[T]" (array of numbers).

I think what they do is make users go through newtypes on the serialization side: <https://docs.rs/minicbor/latest/minicbor/bytes/index.html>.
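To make the two choices concrete, here is a minimal std-only sketch of both encodings for the same bytes. It is not how minicbor or any serde binding is implemented: `Bytes`, `encode_as_array` and `encode_as_byte_string` are hypothetical names, and the sketch only handles lengths below 24.

```rust
/// Hypothetical newtype that marks a slice as "encode me as a CBOR byte string".
pub struct Bytes<'a>(pub &'a [u8]);

/// Initial byte for a CBOR item: major type in the top 3 bits, short length below.
fn header(major: u8, len: usize) -> u8 {
    assert!(len < 24, "this sketch only handles lengths below 24");
    (major << 5) | len as u8
}

/// Treat the slice as a special case of `&[T]`: a CBOR array (major type 4)
/// whose items are unsigned integers (major type 0).
pub fn encode_as_array(data: &[u8]) -> Vec<u8> {
    let mut out = vec![header(4, data.len())];
    for &b in data {
        if b < 24 {
            // Small unsigned integers fit directly in the initial byte.
            out.push(b);
        } else {
            // 0x18 means "unsigned integer, one-byte value follows".
            out.push(0x18);
            out.push(b);
        }
    }
    out
}

/// Treat the slice as "obviously bytes": a CBOR byte string (major type 2).
pub fn encode_as_byte_string(data: Bytes<'_>) -> Vec<u8> {
    let mut out = vec![header(2, data.0.len())];
    out.extend_from_slice(data.0);
    out
}
```

Calling `encode_as_array(&[1, 2, 200])` and `encode_as_byte_string(Bytes(&[1, 2, 200]))` gives two different byte sequences for the same input, so the type the caller picks is what decides the wire form.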

@asonix Not 100% sure if it really works that way, though: there are a lot of other friction points with serde, e.g. that it doesn't support integer keys in serialized structs, so maybe those newtypes only work when going through minicbor's own derives rather than through serde's.
@chrysn Thank you, I will have a look

@asonix @chrysn

Or derive Serialize on these?

enum TopChar {
    Char(char),
    String1(String),
    String2(OtherString),
}

struct TopString(Vec<TopChar>);

No string quotes in there...

@KingmaYpe @chrysn

The problem is that my multiple string types have a common subset of valid contents, but it's important that round-tripping a structure produces the same serialized format

I have 'bare strings' that can look like `hello`, and quoted strings that can look like `"hello"`, but if I deserialize these both into a String they'd both end up as `hello`, and then serializing again I would need to choose just one of the two formats for it, which will lead to problems

@asonix But that sounds just like what @KingmaYpe suggested:
enum FieldString { Quoted(String), Bare(String) }
The hard part is probably sending those through serde's interface, right? You could probably recognize the variant names in your serializer's serialize_newtype_variant and then emit the quotes (or not) around the inner value.
Whether that is a good use of #Serde, I do not know.
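As a rough sketch of why the enum keeps round-trips stable, here is a std-only version without serde at all. The `parse`, `emit` and `as_str` helpers are hypothetical, and the sketch assumes a quoted string is simply wrapped in `"` with no escape handling, which the real format may not allow.

```rust
/// The two spellings from the thread, kept apart so the original form survives.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum FieldString {
    Quoted(String),
    Bare(String),
}

impl FieldString {
    /// Hypothetical parser: a token wrapped in double quotes is `Quoted`,
    /// anything else is `Bare`. Escapes are not handled in this sketch.
    pub fn parse(token: &str) -> FieldString {
        match token
            .strip_prefix('"')
            .and_then(|rest| rest.strip_suffix('"'))
        {
            Some(inner) => FieldString::Quoted(inner.to_string()),
            None => FieldString::Bare(token.to_string()),
        }
    }

    /// Write the value back out in the same form it was read in.
    pub fn emit(&self) -> String {
        match self {
            FieldString::Quoted(s) => format!("\"{}\"", s),
            FieldString::Bare(s) => s.clone(),
        }
    }

    /// The logical contents, which are identical for both spellings.
    pub fn as_str(&self) -> &str {
        match self {
            FieldString::Quoted(s) | FieldString::Bare(s) => s,
        }
    }
}
```

With this, `FieldString::parse("hello")` and `FieldString::parse("\"hello\"")` both have `as_str()` equal to `hello`, but each one emits exactly the spelling it was parsed from. A serde serializer would need to recover the same distinction, for example from the variant name as suggested above.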

@asonix 🤔 serialization is by definition storing data together with its description; then there's marshalling and a few other approaches.

There are crates dealing with protocols built from structure definitions, but I don't remember any off the top of my head.

@asonix postcard (widely used in embedded Rust) isn't self describing
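For anyone unsure what "not self-describing" means in practice, here is a small std-only sketch. The `Reading` type and its fixed-width layout are made up for illustration and are not postcard's actual wire format; the point is only that the bytes carry no type tags, so the reader has to know the schema in advance.

```rust
/// Hypothetical record used only for this illustration.
#[derive(Debug, PartialEq)]
pub struct Reading {
    pub sensor: u8,
    pub value: u16,
}

/// Write the fields back to back with no tags, names or lengths.
pub fn encode(r: &Reading) -> [u8; 3] {
    let v = r.value.to_le_bytes();
    [r.sensor, v[0], v[1]]
}

/// Decoding works only because the reader already knows the layout.
pub fn decode(bytes: [u8; 3]) -> Reading {
    Reading {
        sensor: bytes[0],
        value: u16::from_le_bytes([bytes[1], bytes[2]]),
    }
}

/// A reader with a different schema in mind gets a different value from
/// the same bytes, and nothing in the data can tell it that it is wrong.
pub fn decode_as_three_u8(bytes: [u8; 3]) -> (u8, u8, u8) {
    (bytes[0], bytes[1], bytes[2])
}
```

This is why a deserializer for such a format cannot implement serde's `deserialize_any` meaningfully: the type information has to come from the Rust type being deserialized, not from the input.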