So, how does #RDFGraphGen work, and why was it needed?

If you’re interested, you can read the full paper linked above, or this short blog post: https://blog.mjovanovik.com/post/757050614006104064/new-preprint-rdfgraphgen-a-synthetic-rdf-graph

Earlier this year we introduced #RDFGraphGen, a general-purpose, domain-independent generator of synthetic RDF knowledge graphs, based on #SHACL constraints.

In July, we published a preprint detailing its design and implementation.

#RDF #KnowledgeGraphs #SyntheticData

https://arxiv.org/abs/2407.17941

RDFGraphGen: An RDF Graph Generator based on SHACL Shapes

Developing and testing modern RDF-based applications often requires access to RDF datasets with certain characteristics. Unfortunately, it is very difficult to publicly find domain-specific knowledge graphs that conform to a particular set of characteristics. Hence, in this paper we propose RDFGraphGen, an open-source RDF graph generator that uses characteristics provided in the form of SHACL (Shapes Constraint Language) shapes to generate synthetic RDF graphs. RDFGraphGen is domain-agnostic, with configurable graph structure, value constraints, and distributions. It also comes with a number of predefined values for popular schema.org classes and properties, for more realistic graphs. Our results show that RDFGraphGen is scalable and can generate small, medium, and large RDF graphs in any domain.
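To make the idea of shape-driven generation concrete, here is a minimal sketch in plain Python. It is not RDFGraphGen's code or API. The shape is a hypothetical dict that loosely mirrors a few SHACL constraints (sh:targetClass, sh:minCount, sh:maxCount, sh:in, sh:minInclusive, sh:maxInclusive) rather than a parsed SHACL file, and the names `PERSON_SHAPE`, `generate`, and `to_ntriples`, as well as the `example.org` IRIs, are assumptions made up for illustration.

```python
import random

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"

# Hypothetical, simplified stand-in for a SHACL node shape (an assumption,
# not the format RDFGraphGen actually consumes).
PERSON_SHAPE = {
    "target_class": "http://schema.org/Person",
    "properties": {
        # Exactly one given name, drawn from a fixed list (like sh:in).
        "http://schema.org/givenName": {
            "min_count": 1,
            "max_count": 1,
            "in": ["Ana", "Boris", "Chloe"],
        },
        # Optional integer age within an inclusive range.
        "http://example.org/age": {
            "min_count": 0,
            "max_count": 1,
            "min_inclusive": 18,
            "max_inclusive": 90,
        },
    },
}


def generate(shape, count, seed=0, base="http://example.org/entity/"):
    """Return (subject, predicate, object) terms for `count` synthetic entities."""
    rng = random.Random(seed)  # seeded so that runs are reproducible
    triples = []
    for i in range(count):
        subject = f"<{base}{i}>"
        triples.append((subject, f"<{RDF_TYPE}>", f"<{shape['target_class']}>"))
        for path, constraint in shape["properties"].items():
            # Pick how many values to emit, within the cardinality bounds.
            n_values = rng.randint(
                constraint.get("min_count", 0), constraint.get("max_count", 1)
            )
            for _ in range(n_values):
                if "in" in constraint:
                    obj = f'"{rng.choice(constraint["in"])}"'
                else:
                    value = rng.randint(
                        constraint["min_inclusive"], constraint["max_inclusive"]
                    )
                    obj = f'"{value}"^^<{XSD_INTEGER}>'
                triples.append((subject, f"<{path}>", obj))
    return triples


def to_ntriples(triples):
    """Serialize the generated terms as N-Triples lines."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(generate(PERSON_SHAPE, 3)))
```

The actual tool reads real SHACL shapes and supports far more constraint types and value distributions; the sketch only shows the general pattern of turning declared constraints into conforming triples.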
