The TIB AV-Portal in 2025: New Infrastructure, AI-Based Media Analysis, and Audio-Only Support
As in previous years, we would once again like to provide an overview of the most important technical and functional enhancements of the TIB AV-Portal. In 2025, the Scrum team implemented a wide range of improvements that strengthened both the infrastructural foundation and the functional capabilities of the portal.
Several of these developments directly respond to feedback and concrete requirements raised by users. For some readers, this review may therefore be not only informative but also personally relevant – perhaps you will spot a feature that you yourself helped inspire.
From External Hosting to TIB-Owned Infrastructure
With the complete migration of video and audio delivery to servers operated by TIB in January 2025, the AV-Portal has taken a significant step forward in its infrastructural development. Previously, individual components for streaming, download, and delivery operated on external third-party systems; these processes are now conducted entirely within TIB infrastructure. Supplementary materials – such as presentations, scripts, research data, or additional teaching resources – are likewise hosted directly at TIB.
By running these services on its own servers, TIB not only controls all technical processes but also governs data flows, storage locations, and security standards. External dependencies – such as those related to availability or service levels – have been further reduced. This follows the principle that scientific data belongs in scientific infrastructure – under conditions that meet the requirements of research, teaching, and Open Science.
Adaptive Streaming with MPEG-DASH
Since January 2025, we have been generating adaptive derivatives in the MPEG-DASH format. This enables video quality to adjust dynamically to the user’s available bandwidth during playback.
Instead of delivering a single, statically encoded video, the AV-Portal provides multiple quality levels between which the player switches automatically.
The result is a significantly more stable streaming experience: delays, stutters, and playback interruptions are reduced, while always delivering the best possible resolution. At the same time, bandwidth usage decreases, as unnecessarily large files are no longer transmitted when a user’s connection cannot support them. MPEG-DASH thus represents an important step toward a modern, scalable streaming infrastructure.
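To illustrate the underlying principle, here is a minimal, hypothetical Python sketch of how an adaptive player chooses between quality levels; the bitrate ladder and selection logic are simplified examples, not the portal's actual player configuration.

```python
# Minimal sketch of the adaptive principle behind MPEG-DASH:
# the player keeps measuring throughput and, for every segment,
# picks the highest-quality representation it can sustain.
# The bitrates below are illustrative, not the portal's actual ladder.

REPRESENTATIONS = [
    {"label": "426x240",   "bitrate_kbps": 400},
    {"label": "854x480",   "bitrate_kbps": 1200},
    {"label": "1280x720",  "bitrate_kbps": 2500},
    {"label": "1920x1080", "bitrate_kbps": 5000},
]

def pick_representation(measured_kbps: float, safety_factor: float = 0.8) -> dict:
    """Return the best representation the measured bandwidth can sustain."""
    budget = measured_kbps * safety_factor
    affordable = [r for r in REPRESENTATIONS if r["bitrate_kbps"] <= budget]
    # Fall back to the lowest quality if even that exceeds the budget.
    return max(affordable, key=lambda r: r["bitrate_kbps"]) if affordable else REPRESENTATIONS[0]

if __name__ == "__main__":
    for throughput in (300, 1500, 6000):
        chosen = pick_representation(throughput)
        print(f"{throughput} kbps measured -> {chosen['label']}")
```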
[Image: Various quality levels for adaptive streaming]

Higher Resolutions for Scientific Content
Since April 2025, we have been generating resolutions beyond Full HD. These include high-quality rescans from a digitization project, available at 2048×1536 pixels and offering visibly more detail than standard HD formats. In addition, numerous videos are now available in 4K, which is particularly beneficial for visual material, animations, and complex scientific content.
Support for Audio-Only Files
Since the introduction of MPEG-DASH, the AV-Portal can not only generate audio streams as part of video derivatives but also, for the first time, produce dedicated audio-only formats. This significantly broadens the portal’s scope: in addition to traditional video content, users can now upload, analyze, and publish standalone audio sources – such as interviews, podcasts, lectures, or audio recordings from research projects.
[Image: Audio with searchable transcript]

To ensure reliable processing of audio-only files, the AV-Portal uses a unified technical procedure. The audio track is automatically extracted from an uploaded file and converted into M4A – a widely supported format that can be played on most devices.
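As a rough illustration of such an extraction step, the following sketch calls ffmpeg from Python; the codec and bitrate settings shown here are assumptions for demonstration purposes and may differ from the portal's actual encoding pipeline.

```python
# Minimal sketch of extracting the audio track from an upload and
# converting it to M4A (AAC) with ffmpeg. The codec and bitrate here
# are illustrative; the portal's actual encoding settings may differ.
import subprocess
from pathlib import Path

def extract_audio_to_m4a(source: Path, target: Path) -> None:
    """Drop the video stream (-vn) and encode the audio track as AAC in an M4A container."""
    subprocess.run(
        [
            "ffmpeg",
            "-i", str(source),   # uploaded file (video or audio-only)
            "-vn",               # ignore any video stream
            "-c:a", "aac",       # encode audio as AAC
            "-b:a", "192k",      # illustrative bitrate
            str(target),
        ],
        check=True,
    )

if __name__ == "__main__":
    extract_audio_to_m4a(Path("upload.mkv"), Path("upload.m4a"))
```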
With this enhancement, the AV-Portal now supports not only videos but also audio formats, making it a platform for both scientific audio and video media.
More Flexible and Extended Upload Process
With the latest enhancement of the upload function, significantly larger files can now be uploaded directly via the AV-Portal’s upload form. This is made possible by a new transfer process that automatically divides large files into smaller data chunks and uploads them incrementally. With this so-called “chunked upload”, video files up to 10 GB can be uploaded reliably.
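The following simplified client-side sketch shows the general idea of a chunked upload; the endpoint URL, chunk size, and headers are hypothetical and do not reflect the AV-Portal's actual upload API.

```python
# Simplified client-side sketch of a chunked upload: the file is read in
# fixed-size chunks and each chunk is sent in its own request. The endpoint
# URL, field names, and chunk size are hypothetical and only illustrate the idea.
import requests
from pathlib import Path

CHUNK_SIZE = 50 * 1024 * 1024  # 50 MiB per chunk (illustrative)

def upload_in_chunks(path: Path, url: str) -> None:
    total = path.stat().st_size
    with path.open("rb") as fh:
        index = 0
        while chunk := fh.read(CHUNK_SIZE):
            offset = index * CHUNK_SIZE
            headers = {
                # Tell the server which byte range this chunk covers.
                "Content-Range": f"bytes {offset}-{offset + len(chunk) - 1}/{total}",
            }
            response = requests.put(url, data=chunk, headers=headers)
            response.raise_for_status()
            index += 1

if __name__ == "__main__":
    upload_in_chunks(Path("lecture.mp4"), "https://example.org/upload/session-id")
```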
The workflow has also become more flexible: users can now select their video file and simultaneously enter the metadata in the form. This allows any waiting time during the upload to be used productively.
The expansion is rounded off by additional upload options: alongside the video or audio file, users may now supply their own transcripts and preview images.
OpenCLIP for Precise Image Content Analysis
To improve the discoverability of visual content in scientific videos, we have implemented a new generation of image-based search within the TIB AV-Portal. The technological foundation consists of OpenCLIP vectors, which we computed for every video frame in the portal.
On this basis, we developed a prototype for zero-shot queries that matches free-form textual input – across multiple languages – directly with the visual content. Even this initial prototype demonstrated that highly complex search phrases can return suitable image results.
Subsequently, we fundamentally renewed the visual concept detection (VCD) labeling. A curated list of visual concepts was created, covering both established and newly defined categories – such as “chemical experiment”, “microphotography”, or “robot”. For each of the current 86 concepts, we formulated specific prompts and generated corresponding text vectors. Using thresholds derived from a manually created ground truth, we determined at which point a concept can be considered present in the video material. Additionally, these visual concepts were linked to subject headings from the Integrated Authority File (GND).
For users, this means: the entire video collection can now be filtered using visual concepts, and the detail pages allow direct navigation to the exact timestamps where these concepts occur.
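To give a sense of how such zero-shot concept detection works in principle, here is a minimal sketch using the open-source open_clip library; the model name, prompts, and threshold are placeholder values, not our production configuration.

```python
# Minimal sketch of zero-shot concept detection with open_clip:
# a frame is embedded, compared to text prompts for visual concepts,
# and a concept counts as "present" once the similarity exceeds a
# threshold. Model name, prompts, and the threshold are placeholders,
# not the portal's production configuration.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")

CONCEPT_PROMPTS = {
    "chemical experiment": "a photo of a chemical experiment in a laboratory",
    "microphotography": "a microscope image of a biological sample",
    "robot": "a photo of a robot",
}
THRESHOLD = 0.25  # illustrative; in practice derived from a manual ground truth

def detect_concepts(frame_path: str) -> list[str]:
    image = preprocess(Image.open(frame_path)).unsqueeze(0)
    tokens = tokenizer(list(CONCEPT_PROMPTS.values()))
    with torch.no_grad():
        image_vec = model.encode_image(image)
        text_vecs = model.encode_text(tokens)
    # Cosine similarity between the frame and every concept prompt.
    image_vec = image_vec / image_vec.norm(dim=-1, keepdim=True)
    text_vecs = text_vecs / text_vecs.norm(dim=-1, keepdim=True)
    similarities = (image_vec @ text_vecs.T).squeeze(0)
    return [
        concept
        for concept, score in zip(CONCEPT_PROMPTS, similarities.tolist())
        if score >= THRESHOLD
    ]

if __name__ == "__main__":
    print(detect_concepts("frame_000123.jpg"))
```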
[Image: Image-content search with jump markers]

Perhaps the most significant progress is that the Scrum team can now define new VCD concepts at any time and integrate them directly into the portal. Since the underlying open-source software OpenCLIP is operated entirely on TIB servers, all data and processes remain fully under our control. This represents a major milestone, and additional OpenCLIP-based features are already under development.
Improved Display of GND Annotations
In the AV-Portal, speech, on-screen text, and visual content are automatically enriched with GND subject headings. These annotations are now displayed far more clearly on the detail pages: instead of appearing in a scattered layout, users now see an alphabetically sorted list of all detected entities, which can be searched and filtered by source – speech, text, or image.
[Image: Annotations from speech, text, and image]

A single click reveals where in the video the term appears – the matches are highlighted clearly in the timeline. Users can therefore jump directly to relevant scenes without having to navigate through the entire video.
New Subtitle Segmentation for Improved Readability
To further enhance subtitle quality, we introduced a new segmentation method for Whisper transcripts. This method is based on OpenNLP, an open-source toolkit for natural language processing, and considers not only punctuation but also part-of-speech information and natural speech pauses.
Additionally, a look-ahead algorithm evaluates all possible breakpoints within a preview window of 150 characters to determine the optimal cue boundary. Unlike simple heuristic approaches, the algorithm considers upcoming options to maximize overall subtitle quality. This reliably prevents unnaturally short segments – such as single words at the end of a subtitle line.
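A strongly simplified sketch of this look-ahead idea follows; the scoring function is a toy heuristic standing in for the actual combination of punctuation, OpenNLP part-of-speech information, and speech pauses.

```python
# Strongly simplified sketch of the look-ahead segmentation idea:
# within a 150-character preview window, every candidate breakpoint is
# scored and the best one is chosen, instead of breaking at the first
# opportunity. The scoring here is a toy heuristic; the actual method
# also uses OpenNLP part-of-speech tags and speech pauses.
MAX_WINDOW = 150  # preview window in characters

def score_breakpoint(text: str, pos: int) -> float:
    """Prefer breaks after sentence punctuation, then commas, then plain spaces."""
    char_before = text[pos - 1]
    if char_before in ".!?":
        return 3.0
    if char_before in ",;:":
        return 2.0
    return 1.0

def segment(text: str) -> list[str]:
    cues, start = [], 0
    while start < len(text):
        window_end = min(start + MAX_WINDOW, len(text))
        if window_end == len(text):
            cues.append(text[start:].strip())
            break
        # All candidate breakpoints (spaces) inside the preview window.
        candidates = [i for i in range(start + 1, window_end) if text[i] == " "]
        best = max(candidates, key=lambda i: score_breakpoint(text, i), default=window_end)
        cues.append(text[start:best].strip())
        start = best
    return cues

if __name__ == "__main__":
    transcript = ("Welcome to this lecture on adaptive streaming. Today we will look at "
                  "how a player switches between quality levels, and why that matters "
                  "for scientific video portals.")
    for cue in segment(transcript):
        print(cue)
```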
This improvement enhances readability for accessibility purposes and establishes the technical groundwork for potential text-to-speech functionality.
More Precise Sharing and Citing of Video Content
With recent releases, we have expanded and refined the functions for sharing and citing videos. The share dialog now includes an optional start timestamp, enabling video playback to begin at a specific point; the same option is available for the embed code. The citation dialog was likewise enhanced: the timestamp of a segment can now be displayed or removed as needed. As part of these improvements, we redesigned the share dialog to make the overall structure more intuitive.
[Image: Share dialog with start timestamp for the embed code]

Providing Metadata as Open Data
TIB promotes the use and visibility of its audiovisual holdings by publishing the AV-Portal’s metadata as Open Data. Once per week, the metadata and preview images of all legally eligible videos are automatically made available. On our Open Data page, the data is offered in two formats:
JSONL for efficient processing of large volumes, and Turtle as an RDF format suitable for semantic applications and Linked Data environments.
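Because JSONL stores one JSON object per line, the export can be processed record by record without loading the whole file into memory. The small sketch below assumes a hypothetical file name and makes no assumptions about the record fields.

```python
# Minimal sketch of working with the JSONL export: one JSON object per line,
# so the file can be processed record by record without loading it entirely.
# The file name used here is hypothetical.
import json

def iter_records(path: str):
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            if line.strip():
                yield json.loads(line)

if __name__ == "__main__":
    for record in iter_records("av-portal-metadata.jsonl"):
        # Print whatever keys the record actually carries.
        print(sorted(record.keys()))
        break
```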
Embedding Selected Metadata into the MP4 File
Metadata such as title, author, and the link to the detail page are now embedded directly into the downloadable MP4 file. These details remain available even when the video is saved locally, shared, or opened in other applications. This ensures that the origin of the video and the appropriate citation source can always be identified – without additional notes or manual research.
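One common way to embed such container-level tags is ffmpeg's -metadata option, shown in the illustrative sketch below; the field mapping and example values are assumptions and not necessarily the tooling used in our pipeline.

```python
# Illustrative sketch of embedding metadata into an MP4 container with
# ffmpeg's -metadata option while copying the streams unchanged.
# The tool and field mapping used in the portal's pipeline may differ.
import subprocess

def embed_metadata(source: str, target: str, title: str, author: str, url: str) -> None:
    subprocess.run(
        [
            "ffmpeg",
            "-i", source,
            "-c", "copy",                    # no re-encoding, container-level change only
            "-metadata", f"title={title}",
            "-metadata", f"artist={author}",
            "-metadata", f"comment={url}",   # link back to the detail page
            target,
        ],
        check=True,
    )

if __name__ == "__main__":
    # Example values only; the detail-page URL is a placeholder.
    embed_metadata("video.mp4", "video_tagged.mp4",
                   "Example Lecture", "Jane Doe", "https://av.tib.eu/media/12345")
```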
[Image: Display of embedded metadata in the downloaded MP4 file (VLC Player)]

Outlook for 2026
Stella as an Evaluation Framework for Video Recommendations
Stella is a living-lab infrastructure for evaluating experimental retrieval and recommendation systems with real users; the TIB AV-Portal is a product partner in this project. In 2025, we created the technical foundations for integrating Stella into the portal; the live deployment is planned for the coming year.
With Stella, various recommendation algorithms can be compared directly within the portal using interleaved A/B tests: users are shown recommendations alternating between our existing approach (Solr MoreLikeThis) and experimental recommenders. The resulting clicks serve as anonymized feedback. This enables an empirical determination of which algorithm performs better in real-world use.
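As a simplified illustration of how interleaved comparisons credit clicks to the competing systems, consider the following toy sketch; Stella's actual interleaving method and logging differ in detail.

```python
# Simplified sketch of interleaving two recommendation lists for an A/B
# comparison: results from system A (e.g. the existing Solr MoreLikeThis
# recommender) and system B (an experimental recommender) are alternated,
# and clicks are credited to the system that contributed the clicked item.
# Stella's actual interleaving and logging may differ from this toy version.
def interleave(list_a: list[str], list_b: list[str], length: int = 10) -> list[tuple[str, str]]:
    """Alternate items from both systems, skipping duplicates, tagged by origin."""
    result, seen = [], set()
    sources = [("A", iter(list_a)), ("B", iter(list_b))]
    while len(result) < length:
        progressed = False
        for origin, items in sources:
            for item in items:
                if item not in seen:
                    result.append((item, origin))
                    seen.add(item)
                    progressed = True
                    break
        if not progressed:
            break
    return result

def credit_click(interleaved: list[tuple[str, str]], clicked_item: str) -> str | None:
    """Return which system contributed the clicked recommendation."""
    return next((origin for item, origin in interleaved if item == clicked_item), None)

if __name__ == "__main__":
    mixed = interleave(["v1", "v2", "v3"], ["v2", "v9", "v7"])
    print(mixed)
    print("click on v9 credits system:", credit_click(mixed, "v9"))
```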
The Visual Analytics research group at TIB will continue to develop and provide additional recommender experiments, ensuring that all required components are available in-house to continuously evaluate and improve the recommendation system.
Prompt-Based Image Search in the AV-Portal
Building on the OpenCLIP developments of 2025, we aim to implement a full-fledged image search in the AV-Portal in 2026. In future, users will not be limited to filtering by predefined visual concepts but will be able to search the visual content of our videos directly using freely formulated text queries (zero-shot search). Our current considerations involve offering this prompt-based search both across the entire portal and on the detail pages of videos. This would create a novel way of accessing scientific videos, making visual content as intuitively and precisely searchable as textual content.
#AVMedia #AVPortal #LizenzCCBY40INT #scientificVideos







