Using Text Mining to Search for Neolithic Vlaardingen Culture Sites in the Rhine-Meuse-Scheldt Delta
This paper presents a study on Vlaardingen Culture (3400-2500 BCE) sites in the Rhine-Meuse-Scheldt delta using AGNES, an intelligent search engine for Dutch and Flemish archaeological grey literature. The aims of this paper are twofold: 1) to provide an up-to-date overview of Vlaardingen Culture sites; 2) to evaluate the performance of AGNES in searching for period-specific sites. Vlaardingen Culture (VLC) sites usually consist of artefact scatters without clearly discernible house plans. These scatters are often found amongst abundant remains from later periods. This type of ‘by-catch’ is usually not recorded in the metadata of archaeological reports and can only be recovered through full-text searches. AGNES uses Text Mining and Large Language Models to allow searches on archaeological concepts (in this case an archaeological culture) in full texts extracted from three major repositories for Dutch (DANS and ARCHIS) and Flemish (Onroerend Erfgoed) archaeology. This paper presents a search for VLC sites, and a comparison of the retrieved information with a recently compiled overview of VLC sites in the area. Using eight queries we retrieved 4532 hits, which were subdivided into: relevant hits (n=430); semi-relevant hits (n=2133); and irrelevant hits (n=1960). We recovered thirty previously unknown Vlaardingen Culture sites, amounting to 19% of the total number of VLC sites (n=158). Not all sites could be found in AGNES: older archaeological sites are often published in scientific and semi-scientific journals, theses, or books. These publications are absent from the repositories that can be accessed through AGNES and, by extension, cannot be retrieved. As such, AGNES does not provide an alternative to traditional search methods. Nevertheless, most of the newly found sites cannot be found by searching the metadata of reports in DANS and ARCHIS. Therefore, AGNES proved to be an essential and effective addition to traditional search methods. 
Finally, our study highlighted the fact that clear terminology to describe Vlaardingen Culture sites is presently lacking. As such, the study provided interesting insights into the terminologies employed in development-led archaeology.
