The TIB AV-Portal in 2025: New Infrastructure, AI-Based Media Analysis, and Audio-Only
As in previous years, we would once again like to provide an overview of the most important technical and functional enhancements of the TIB AV-Portal. In 2025, the Scrum team implemented a wide range of improvements that strengthened both the infrastructural foundation and the functional capabilities of the portal.
Several of these developments directly respond to feedback and concrete requirements raised by users. For some readers, this review may therefore be not only informative but also personally relevant: perhaps you will spot a feature that you yourself helped inspire.
From External Hosting to TIB-Owned Infrastructure
With the complete migration of video and audio delivery to servers operated by TIB in January 2025, the AV-Portal has taken a significant step forward in its infrastructural development. Previously, individual components for streaming, download, and delivery operated on external third-party systems; these processes are now conducted entirely within TIB infrastructure. Supplementary materials (such as presentations, scripts, research data, or additional teaching resources) are likewise hosted directly at TIB.
By running these services on its own servers, TIB not only controls all technical processes but also governs data flows, storage locations, and security standards. External dependencies, such as those related to availability or service levels, have been further reduced. This follows the principle that scientific data belongs in scientific infrastructure, under conditions that meet the requirements of research, teaching, and Open Science.
Adaptive Streaming with MPEG-DASH
Since January 2025, we have been generating adaptive derivatives in the MPEG-DASH format. This enables video quality to adjust dynamically to the user's available bandwidth during playback.
Instead of delivering a single, statically encoded video, the AV-Portal provides multiple quality levels between which the player switches automatically.
The result is a significantly more stable streaming experience: delays, stutters, and playback interruptions are reduced, while the player still delivers the best resolution the connection can sustain. At the same time, bandwidth usage decreases, as unnecessarily large files are no longer transmitted when a user's connection cannot support them. MPEG-DASH thus represents an important step toward a modern, scalable streaming infrastructure.
Various quality levels for adaptive streaming
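The switching logic described above can be illustrated with a minimal sketch. The quality ladder, the bitrates, and the safety margin below are assumptions chosen for illustration; they are not the portal's actual encoding settings, and real DASH players use more elaborate heuristics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # average bitrate needed to play this level


# Hypothetical quality ladder; the portal's real levels may differ.
LADDER = [
    Representation(360, 800),
    Representation(720, 2500),
    Representation(1080, 5000),
    Representation(2160, 16000),
]


def pick_representation(measured_kbps: float, safety: float = 0.8) -> Representation:
    """Return the highest level whose bitrate fits within a safety margin."""
    budget = measured_kbps * safety
    fitting = [r for r in LADDER if r.bitrate_kbps <= budget]
    if fitting:
        return max(fitting, key=lambda r: r.bitrate_kbps)
    # Fall back to the lowest level when even that exceeds the budget.
    return min(LADDER, key=lambda r: r.bitrate_kbps)
```

A player would call `pick_representation` again whenever its bandwidth estimate changes, which is what lets the quality move up or down during playback.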
Higher Resolutions for Scientific Content
Since April 2025, we have been generating resolutions beyond Full HD. These include high-quality rescans from a digitization project, available at 2048×1536 pixels and offering visibly more detail than standard HD formats. In addition, numerous videos are now available in 4K, which is particularly beneficial for visual material, animations, and complex scientific content.
Support for Audio-Only Files
Since the introduction of MPEG-DASH, the AV-Portal can not only generate audio streams as part of video derivatives but also, for the first time, produce standalone audio formats. This significantly broadens the portal's scope: in addition to traditional video content, users can now upload, analyze, and publish standalone audio sources such as interviews, podcasts, lectures, or audio recordings from research projects.
Audio with searchable transcript
To ensure reliable processing of audio-only files, the AV-Portal uses a unified technical procedure. The audio track is automatically extracted from an uploaded file and converted into M4A, a widely supported format that can be played on most devices.
With this enhancement, the AV-Portal now supports not only videos but also audio formats, becoming a platform for scientific sound and image media alike.
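As a rough illustration of such an extraction step, the sketch below builds an ffmpeg command line. The use of ffmpeg, the AAC encoder, and the bitrate are assumptions; the text does not say which tools the portal actually uses.

```python
from pathlib import Path


def build_m4a_command(source: str, bitrate: str = "128k") -> list[str]:
    """Build an ffmpeg call that drops video and stores the audio track as M4A."""
    target = Path(source).with_suffix(".m4a")
    return [
        "ffmpeg",
        "-i", source,     # uploaded file (video or audio)
        "-vn",            # ignore any video stream
        "-c:a", "aac",    # encode the audio with ffmpeg's AAC encoder
        "-b:a", bitrate,  # assumed target bitrate
        str(target),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.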
More Flexible and Extended Upload Process
With the latest enhancement of the upload function, significantly larger files can now be uploaded directly via the AV-Portal's upload form. This is made possible by a new transfer process that automatically divides large files into smaller data chunks and uploads them incrementally. With this so-called "chunked upload", video files up to 10 GB can be uploaded reliably.
The workflow has also become more flexible: users can now select their video file and simultaneously enter the metadata in the form. This allows any waiting time during the upload to be used productively.
The expansion is rounded off by additional upload options: alongside the video or audio file, users may now supply their own transcripts and preview images.
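The idea behind a chunked upload can be sketched as follows. The chunk size and the `send` callback are hypothetical; the portal's actual transfer protocol is not described here.

```python
import io
from typing import BinaryIO, Callable, Iterator

CHUNK_SIZE = 8 * 1024 * 1024  # assumed chunk size of 8 MiB


def iter_chunks(stream: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[tuple[int, bytes]]:
    """Yield (byte_offset, chunk) pairs until the stream is exhausted."""
    offset = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield offset, chunk
        offset += len(chunk)


def upload(stream: BinaryIO, send: Callable[[int, bytes], None],
           chunk_size: int = CHUNK_SIZE) -> int:
    """Pass every chunk to `send(offset, chunk)` and return the bytes sent."""
    total = 0
    for offset, chunk in iter_chunks(stream, chunk_size):
        send(offset, chunk)  # hypothetical transport, e.g. one HTTP request
        total += len(chunk)
    return total
```

Because each chunk carries its byte offset, a failed request only needs to be repeated for that chunk rather than for the whole file, which is what makes large uploads reliable.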
OpenCLIP for Precise Image Content Analysis
To improve the discoverability of visual content in scientific videos, we have implemented a new generation of image-based search within the TIB AV-Portal. The technological foundation consists of OpenCLIP vectors, which we computed for every video frame in the portal.
On this basis, we developed a prototype for zero-shot queries that matches free-form textual input, across multiple languages, directly with the visual content. Even this initial prototype demonstrated that highly complex search phrases can return suitable image results.
Subsequently, we fundamentally renewed VCD labeling. A curated list of visual concepts was created, covering both established and newly defined categories such as "chemical experiment", "microphotography", or "robot". For each of the current 86 concepts, we formulated specific prompts and generated corresponding text vectors. Using thresholds derived from a manually created ground truth, we determined at which point a concept can be considered present in the video material. Additionally, these visual concepts were linked to subject headings from the Integrated Authority File (GND).
For users, this means that the entire video collection can now be filtered using visual concepts, and the detail pages allow direct navigation to the exact timestamps where these concepts occur.
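Conceptually, a zero-shot query compares the vector of the text prompt with the vector of each frame and returns the closest frames. The sketch below assumes the vectors already exist (for example, produced by an OpenCLIP model) and uses tiny made-up vectors; the function names are illustrative only.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equally long vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_frames(query_vec: list[float],
                frame_vecs: dict[str, list[float]],
                top_k: int = 3) -> list[tuple[str, float]]:
    """Return the top_k (frame_id, similarity) pairs, best match first."""
    scored = [(frame_id, cosine(query_vec, vec)) for frame_id, vec in frame_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]
```

With real embeddings, `query_vec` would come from the text encoder and `frame_vecs` from the image encoder of the same model, so that both live in one shared vector space.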
Image-content search with jump markers
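How thresholded concept scores could turn into jump markers is sketched below. The input format, the threshold value, and the merging gap are assumptions for illustration; the portal derives its thresholds from a manually created ground truth.

```python
def concept_segments(scores: list[tuple[float, float]],
                     threshold: float,
                     max_gap: float = 1.0) -> list[tuple[float, float]]:
    """Turn per-frame similarities into (start, end) segments in seconds.

    `scores` holds (timestamp, similarity) pairs sorted by time. Frames at or
    above `threshold` count as hits; hits at most `max_gap` seconds apart are
    merged into one segment.
    """
    segments: list[tuple[float, float]] = []
    for timestamp, similarity in scores:
        if similarity < threshold:
            continue
        if segments and timestamp - segments[-1][1] <= max_gap:
            segments[-1] = (segments[-1][0], timestamp)  # extend current segment
        else:
            segments.append((timestamp, timestamp))      # open a new segment
    return segments
```

The start of each segment is a natural target for a jump marker on the detail page.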
Perhaps the most significant progress is that the Scrum team can now define new VCD concepts at any time and integrate them directly into the portal. Since the underlying open-source software OpenCLIP is operated entirely on TIB servers, all data and processes remain fully under our control. This represents a major milestone, and additional OpenCLIP-based features are already under development.
Improved Display of GND Annotations
In the AV-Portal, speech, on-screen text, and visual content are automatically enriched with GND subject headings. These annotations are now displayed far more clearly on the detail pages: instead of appearing in a scattered layout, users now see an alphabetically sorted list of all detected entities, which can be searched and filtered by language, text, or image.
Annotations from speech, text, and image
A single click reveals where in the video the term appears, and the matches are highlighted clearly in the timeline. Users can therefore jump directly to relevant scenes without having to navigate through the entire video.
New Subtitle Segmentation for Improved Readability
To further enhance subtitle quality, we introduced a new segmentation method for Whisper transcripts. This method is based on OpenNLP, an open-source toolkit for natural language processing, and considers not only punctuation but also part-of-speech information and natural speech pauses.
Additionally, a look-ahead algorithm evaluates all possible breakpoints within a preview window of 150 characters to determine the optimal cue boundary. Unlike simple heuristic approaches, the algorithm considers upcoming options to maximize overall subtitle quality. This reliably prevents unnaturally short segments, such as single words at the end of a subtitle line.
This improvement enhances readability for accessibility purposes and establishes the technical groundwork for potential text-to-speech functionality.
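The principle of scoring every breakpoint in a look-ahead window can be sketched as follows. This is a simplified illustration, not the portal's algorithm: it uses punctuation and length only, whereas the real method also draws on part-of-speech information and speech pauses. The scoring weights and `MIN_CUE` are assumptions.

```python
WINDOW = 150   # look-ahead window in characters, as stated above
MIN_CUE = 15   # assumed minimum length of a cue


def score_break(text: str, start: int, pos: int) -> float:
    """Score a break after text[start:pos]; higher is better."""
    length = pos - start
    remaining = len(text[pos:].strip())  # what would be left for later cues
    score = length / WINDOW              # prefer fuller cues
    if text[pos - 1] in ".!?":
        score += 2.0                     # sentence end
    elif text[pos - 1] in ",;:":
        score += 1.0                     # clause boundary
    if length < MIN_CUE or 0 < remaining < MIN_CUE:
        score -= 3.0                     # avoid tiny cues and orphaned tails
    return score


def segment(text: str) -> list[str]:
    """Split text into cues, choosing the best-scoring break in each window."""
    cues, start = [], 0
    while len(text) - start > WINDOW:
        window_end = start + WINDOW
        # Candidate breakpoints: every space inside the look-ahead window.
        candidates = [i for i in range(start + 1, window_end + 1) if text[i] == " "]
        if not candidates:               # no space at all: hard break
            candidates = [window_end]
        best = max(candidates, key=lambda pos: score_break(text, start, pos))
        cues.append(text[start:best].strip())
        start = best + 1 if text[best] == " " else best
    tail = text[start:].strip()
    if tail:
        cues.append(tail)
    return cues
```

The penalty on a short remainder is what distinguishes this from a greedy split: a break that fills the current cue completely is rejected if it would leave only a word or two for the next one.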
More Precise Sharing and Citing of Video Content
With recent releases, we have expanded and refined the functions for sharing and citing videos. The share dialog now includes an optional start timestamp, enabling video playback to begin at a specific point; the same option is available for the embed code. The citation dialog was likewise enhanced: the timestamp of a segment can now be displayed or removed as needed. As part of these improvements, we redesigned the share dialog to make the overall structure more intuitive.
Share dialog with start timestamp for the embed code
Providing Metadata as Open Data
TIB promotes the use and visibility of its audiovisual holdings by publishing the AV-Portal's metadata as Open Data. Once per week, the metadata and preview images of all legally eligible videos are automatically made available. On our Open Data page, the data is offered in two formats:
JSONL for efficient processing of large volumes, and Turtle as an RDF format suitable for semantic applications and Linked Data environments.
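JSONL is convenient for large volumes because each line is a complete JSON object, so a file can be processed record by record without loading it into memory at once. A minimal reader might look like this; the `title` field in the usage example is a hypothetical placeholder, not the actual schema of the Open Data export.

```python
import io
import json
from typing import Iterator, TextIO


def iter_records(lines: TextIO) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


# Usage with an in-memory sample; a real export would be opened with open().
sample = io.StringIO('{"title": "A"}\n\n{"title": "B"}\n')
titles = [record["title"] for record in iter_records(sample)]
```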
Embedding Selected Metadata into the MP4 File
Metadata such as title, author, and the link to the detail page are now embedded directly into the downloadable MP4 file. These details remain available even when the video is saved locally, shared, or opened in other applications. This ensures that the origin of the video and the appropriate citation source can always be identified, without additional notes or manual research.
Display of embedded metadata in the downloaded MP4 file (VLC Player)
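One common way to write such tags is ffmpeg's `-metadata` option combined with stream copy, sketched below. Whether the portal uses ffmpeg, and which tag keys it writes, is not stated in this post; the mapping of the detail-page link to the `comment` tag is an assumption.

```python
def build_metadata_command(source: str, target: str,
                           title: str, author: str, detail_url: str) -> list[str]:
    """Build an ffmpeg call that copies all streams and adds container tags."""
    return [
        "ffmpeg",
        "-i", source,
        "-c", "copy",                           # no re-encoding, only remuxing
        "-metadata", f"title={title}",
        "-metadata", f"artist={author}",
        "-metadata", f"comment={detail_url}",   # assumed place for the link
        target,
    ]
```

Because the streams are copied rather than re-encoded, adding the tags does not change the audio or video quality.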
Outlook for 2026
Stella as an Evaluation Framework for Video Recommendations
Stella is a living-lab infrastructure for evaluating experimental retrieval and recommendation systems with real users; the TIB AV-Portal is a product partner in this project. In 2025, we created the technical foundations for integrating Stella into the portal; the live deployment is planned for the coming year.
With Stella, various recommendation algorithms can be compared directly within the portal using interleaved A/B tests: users are shown recommendations alternating between our existing approach (Solr MoreLikeThis) and experimental recommenders. The resulting clicks serve as anonymized feedback. This enables an empirical determination of which algorithm performs better in real-world use.
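The interleaving idea can be illustrated with a small sketch. It is not Stella's implementation: the strict alternation, the labels, and the click-counting scheme are simplifying assumptions, and production interleaving methods typically randomize which system picks first.

```python
from itertools import zip_longest


def interleave(baseline: list[str], experimental: list[str],
               size: int) -> list[tuple[str, str]]:
    """Alternate items from both rankings, skipping duplicates.

    Returns (item, source) pairs so that later clicks can be attributed.
    """
    merged: list[tuple[str, str]] = []
    seen: set[str] = set()
    for base_item, exp_item in zip_longest(baseline, experimental):
        for item, source in ((base_item, "baseline"), (exp_item, "experimental")):
            if item is not None and item not in seen and len(merged) < size:
                merged.append((item, source))
                seen.add(item)
    return merged


def credit_clicks(merged: list[tuple[str, str]], clicked: set[str]) -> dict[str, int]:
    """Count how many clicked items each recommender contributed."""
    wins = {"baseline": 0, "experimental": 0}
    for item, source in merged:
        if item in clicked:
            wins[source] += 1
    return wins
```

Aggregated over many sessions, the click counts per source indicate which recommender users prefer, without any user ever seeing only one system's output.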
The Visual Analytics research group at TIB will continue to develop and provide additional recommender experiments, ensuring that all required components are available in-house to continuously evaluate and improve the recommendation system.
Prompt-Based Image Search in the AV-Portal
Building on the OpenCLIP developments of 2025, we aim to implement a full-fledged image search in the AV-Portal in 2026. In future, users will not be limited to filtering by predefined visual concepts but will be able to search the visual content of our videos directly using freely formulated text queries (zero-shot search). Our current considerations involve offering this prompt-based search both across the entire portal and on the detail pages of videos. This would create a novel way of accessing scientific videos, making visual content as intuitively and precisely searchable as textual content.
#AVMedia #AVPortal #LizenzCCBY40INT #scientificVideos