The TIB AV-Portal in 2025: New Infrastructure, AI-Based Media Analysis, and Audio-Only

Read this article in German

As in previous years, we would once again like to provide an overview of the most important technical and functional enhancements of the TIB AV-Portal. In 2025, the Scrum team implemented a wide range of improvements that strengthened both the infrastructural foundation and the functional capabilities of the portal.

Several of these developments directly respond to feedback and concrete requirements raised by users. For some readers, this review may therefore be not only informative but also personally relevant – perhaps you will spot a feature that you yourself helped inspire.

From External Hosting to TIB-Owned Infrastructure

With the complete migration of video and audio delivery to servers operated by TIB in January 2025, the AV-Portal has taken a significant step forward in its infrastructural development. Previously, individual components for streaming, download, and delivery operated on external third-party systems; these processes are now conducted entirely within TIB infrastructure. Supplementary materials – such as presentations, scripts, research data, or additional teaching resources – are likewise hosted directly at TIB.

By running these services on its own servers, TIB not only controls all technical processes but also governs data flows, storage locations, and security standards. External dependencies – such as those related to availability or service levels – have been further reduced. This follows the principle that scientific data belongs in scientific infrastructure – under conditions that meet the requirements of research, teaching, and Open Science.

Adaptive Streaming with MPEG-DASH

Since January 2025, we have been generating adaptive derivatives in the MPEG-DASH format. This enables video quality to adjust dynamically to the user’s available bandwidth during playback.

Instead of delivering a single, statically encoded video, the AV-Portal provides multiple quality levels between which the player switches automatically.

The result is a significantly more stable streaming experience: delays, stutters, and playback interruptions are reduced, while always delivering the best possible resolution. At the same time, bandwidth usage decreases, as unnecessarily large files are no longer transmitted when a user’s connection cannot support them. MPEG-DASH thus represents an important step toward a modern, scalable streaming infrastructure.
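
To make the idea concrete, the sketch below shows how such quality levels could be generated and packaged as MPEG-DASH with the open-source tool ffmpeg. The AV-Portal's actual encoding pipeline is not public, so the chosen bitrates, resolutions, and segment length are purely illustrative assumptions.

```python
# Hypothetical sketch: encode three quality levels from one source file and
# package them as MPEG-DASH with ffmpeg. All parameters are assumptions.
import subprocess

def create_dash(source: str, manifest: str) -> None:
    """Write an MPEG-DASH manifest (.mpd) plus segment files for three renditions."""
    subprocess.run([
        "ffmpeg", "-i", source,
        # Three video renditions at decreasing resolution/bitrate, plus one audio track.
        "-map", "0:v", "-map", "0:v", "-map", "0:v", "-map", "0:a",
        "-c:v", "libx264",
        "-b:v:0", "4500k", "-s:v:0", "1920x1080",
        "-b:v:1", "2500k", "-s:v:1", "1280x720",
        "-b:v:2", "1000k", "-s:v:2", "854x480",
        "-c:a", "aac", "-b:a", "128k",
        # Cut the streams into short segments; the player switches between them.
        "-f", "dash", "-seg_duration", "4",
        "-adaptation_sets", "id=0,streams=v id=1,streams=a",
        manifest,
    ], check=True)

create_dash("lecture.mp4", "lecture.mpd")
```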

Various quality levels for adaptive streaming

Higher Resolutions for Scientific Content

Since April 2025, we have been generating resolutions beyond Full HD. These include high-quality rescans from a digitization project, available at 2048×1536 pixels and offering visibly more detail than standard HD formats. In addition, numerous videos are now available in 4K, which is particularly beneficial for visual material, animations, and complex scientific content.

Support for Audio-Only Files

Since the introduction of MPEG-DASH, the AV-Portal can not only generate audio streams as part of video derivatives but also, for the first time, produce genuine audio-only formats. This significantly broadens the portal’s scope: in addition to traditional video content, users can now upload, analyze, and publish standalone audio sources – such as interviews, podcasts, lectures, or audio recordings from research projects.

Audio with searchable transcript

To ensure reliable processing of audio-only files, the AV-Portal uses a unified technical procedure. The audio track is automatically extracted from an uploaded file and converted into M4A – a widely supported format that can be played on most devices.
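
A minimal sketch of this extraction step, assuming ffmpeg is used for the conversion (the portal's actual tooling and bitrate are not documented here):

```python
# Hypothetical sketch: extract the audio track from an uploaded file and
# convert it to M4A (AAC); the bitrate is an illustrative assumption.
import subprocess

def extract_m4a(source: str, target: str) -> None:
    subprocess.run(
        ["ffmpeg", "-i", source,
         "-vn",                        # drop any video stream
         "-c:a", "aac", "-b:a", "192k",
         target],
        check=True,
    )

extract_m4a("upload.mkv", "upload.m4a")
```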

With this enhancement, the AV-Portal now supports not only videos but also audio formats, becoming a platform for scientific sound and image media alike.

More Flexible and Extended Upload Process

With the latest enhancement of the upload function, significantly larger files can now be uploaded directly via the AV-Portal’s upload form. This is made possible by a new transfer process that automatically divides large files into smaller data chunks and uploads them incrementally. With this so-called “chunked upload”, video files up to 10 GB can be uploaded reliably.
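
From the client's perspective, a chunked upload can be sketched roughly as follows; the endpoint, field names, and chunk size are hypothetical and do not describe the AV-Portal's actual API.

```python
# Illustrative client-side sketch of a chunked upload. The URL, form fields,
# and 50 MB chunk size are assumptions, not the AV-Portal's real interface.
import os
import requests

CHUNK_SIZE = 50 * 1024 * 1024  # 50 MB per chunk (assumed)

def chunked_upload(path: str, url: str) -> None:
    total = os.path.getsize(path)
    with open(path, "rb") as f:
        index = 0
        while chunk := f.read(CHUNK_SIZE):
            requests.post(
                url,
                files={"chunk": chunk},
                data={"filename": os.path.basename(path),
                      "chunk_index": index,
                      "total_size": total},
            ).raise_for_status()
            index += 1

chunked_upload("talk_4k.mp4", "https://example.org/upload/chunk")
```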

The workflow has also become more flexible: users can now select their video file and simultaneously enter the metadata in the form. This allows any waiting time during the upload to be used productively.

The expansion is rounded off by additional upload options: alongside the video or audio file, users may now supply their own transcripts and preview images.

OpenCLIP for Precise Image Content Analysis

To improve the discoverability of visual content in scientific videos, we have implemented a new generation of image-based search within the TIB AV-Portal. The technological foundation consists of OpenCLIP vectors, which we computed for every video frame in the portal.

On this basis, we developed a prototype for zero-shot queries that matches free-form textual input – across multiple languages – directly with the visual content. Even this initial prototype demonstrated that highly complex search phrases can return suitable image results.
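
For readers who want to experiment themselves, a minimal zero-shot matching sketch with the open-source open_clip package could look like this; the model choice and the single-frame setup are simplifying assumptions, not the portal's production pipeline.

```python
# Minimal zero-shot sketch with open_clip: compare a free-form text query
# against one video frame. Model name and weights are assumptions.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k")
tokenizer = open_clip.get_tokenizer("ViT-B-32")

def match(query: str, frame_path: str) -> float:
    """Return the cosine similarity between a text query and a video frame."""
    image = preprocess(Image.open(frame_path)).unsqueeze(0)
    text = tokenizer([query])
    with torch.no_grad():
        img_vec = model.encode_image(image)
        txt_vec = model.encode_text(text)
    img_vec /= img_vec.norm(dim=-1, keepdim=True)
    txt_vec /= txt_vec.norm(dim=-1, keepdim=True)
    return (img_vec @ txt_vec.T).item()

print(match("a chemical experiment with glassware", "frame_00142.jpg"))
```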

Subsequently, we fundamentally renewed the visual concept detection (VCD) labeling. A curated list of visual concepts was created, covering both established and newly defined categories – such as “chemical experiment”, “microphotography”, or “robot”. For each of the current 86 concepts, we formulated specific prompts and generated corresponding text vectors. Using thresholds derived from a manually created ground truth, we determined at which point a concept can be considered present in the video material. Additionally, these visual concepts were linked to subject headings from the Integrated Authority File (GND).
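
In simplified form, this threshold-based labeling can be sketched as below; the vectors and threshold values are placeholders, since the real prompt vectors come from OpenCLIP and the real thresholds from the manually created ground truth.

```python
# Toy sketch of threshold-based visual concept detection. In production, the
# concept vectors are OpenCLIP text embeddings of curated prompts and the
# thresholds are derived from ground truth; everything below is illustrative.
import numpy as np

rng = np.random.default_rng(0)

concept_vectors = {
    name: v / np.linalg.norm(v)
    for name, v in {"chemical experiment": rng.normal(size=512),
                    "microphotography": rng.normal(size=512),
                    "robot": rng.normal(size=512)}.items()
}
thresholds = {"chemical experiment": 0.27, "microphotography": 0.25, "robot": 0.24}

def detect_concepts(frame_vec: np.ndarray) -> list[str]:
    """Return every concept whose similarity to the frame exceeds its threshold."""
    frame_vec = frame_vec / np.linalg.norm(frame_vec)
    return [name for name, vec in concept_vectors.items()
            if float(frame_vec @ vec) >= thresholds[name]]

print(detect_concepts(rng.normal(size=512)))
```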

For users, this means: the entire video collection can now be filtered using visual concepts, and the detail pages allow direct navigation to the exact timestamps where these concepts occur.

Image-content search with jump markers

Perhaps the most significant progress is that the Scrum team can now define new VCD concepts at any time and integrate them directly into the portal. Since the underlying open-source software OpenCLIP is operated entirely on TIB servers, all data and processes remain fully under our control. This represents a major milestone, and additional OpenCLIP-based features are already under development.

Improved Display of GND Annotations

In the AV-Portal, speech, on-screen text, and visual content are automatically enriched with GND subject headings. These annotations are now displayed far more clearly on the detail pages: instead of appearing in a scattered layout, users now see an alphabetically sorted list of all detected entities, which can be searched and filtered by language, text, or image.

Annotations from speech, text, and image

A single click reveals where in the video the term appears – the matches are highlighted clearly in the timeline. Users can therefore jump directly to relevant scenes without having to navigate through the entire video.

New Subtitle Segmentation for Improved Readability

To further enhance subtitle quality, we introduced a new segmentation method for Whisper transcripts. This method is based on OpenNLP, an open-source toolkit for natural language processing, and considers not only punctuation but also part-of-speech information and natural speech pauses.

Additionally, a look-ahead algorithm evaluates all possible breakpoints within a preview window of 150 characters to determine the optimal cue boundary. Unlike simple heuristic approaches, the algorithm considers upcoming options to maximize overall subtitle quality. This reliably prevents unnaturally short segments – such as single words at the end of a subtitle line.
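
In heavily simplified form, such a look-ahead might be sketched as follows; the scoring function is a toy stand-in for the production logic, which additionally draws on OpenNLP's part-of-speech tags and detected speech pauses.

```python
# Toy look-ahead sketch: score every candidate breakpoint inside a
# 150-character preview window and pick the best one, instead of cutting at
# the first acceptable position. The scoring heuristic is illustrative only.
LOOKAHEAD = 150

def best_breakpoint(text: str) -> int:
    """Return the index of the character after which the current cue should end."""
    window = text[:LOOKAHEAD]
    candidates = [i for i, ch in enumerate(window) if ch in ".!?,;: "]

    def score(i: int) -> float:
        s = i / LOOKAHEAD                 # prefer fuller subtitle lines ...
        if window[i] in ".!?":
            s += 1.0                      # ... and sentence-final punctuation
        elif window[i] in ",;:":
            s += 0.5
        if len(window[i + 1:].split()) == 1:
            s -= 2.0                      # avoid stranding a single word
        return s

    return max(candidates, key=score, default=len(window) - 1)

cut = best_breakpoint("This method considers punctuation, part-of-speech "
                      "information, and natural speech pauses before choosing.")
```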

This improvement enhances readability for accessibility purposes and establishes the technical groundwork for potential text-to-speech functionality.

More Precise Sharing and Citing of Video Content

With recent releases, we have expanded and refined the functions for sharing and citing videos. The share dialog now includes an optional start timestamp, enabling video playback to begin at a specific point; the same option is available for the embed code. The citation dialog was likewise enhanced: the timestamp of a segment can now be displayed or removed as needed. As part of these improvements, we redesigned the share dialog to make the overall structure more intuitive.

Share dialog with start timestamp for the embed code

Providing Metadata as Open Data

TIB promotes the use and visibility of its audiovisual holdings by publishing the AV-Portal’s metadata as Open Data. Once per week, the metadata and preview images of all legally eligible videos are automatically made available. On our Open Data page, the data is offered in two formats:

JSONL for efficient processing of large volumes, and Turtle as an RDF format suitable for semantic applications and Linked Data environments.
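
Consuming the weekly export is straightforward; the sketch below reads the JSONL variant line by line (the download URL and the record fields are placeholders, not the actual Open Data endpoint).

```python
# Sketch of processing the weekly JSONL export; URL and field names are
# placeholders for the actual Open Data download.
import json
import urllib.request

URL = "https://example.org/av-portal-metadata.jsonl"  # placeholder URL

with urllib.request.urlopen(URL) as response:
    for line in response:
        record = json.loads(line)
        print(record.get("title"))
```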

Embedding Selected Metadata into the MP4 File

Metadata such as title, author, and the link to the detail page are now embedded directly into the downloadable MP4 file. These details remain available even when the video is saved locally, shared, or opened in other applications. This ensures that the origin of the video and the appropriate citation source can always be identified – without additional notes or manual research.
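
Technically, such container-level tagging can be done without re-encoding; the sketch below uses ffmpeg for illustration, with assumed field values (the portal's exact embedding mechanism and fields are not detailed here).

```python
# Illustrative sketch: write title, author, and detail-page link into the MP4
# container with ffmpeg. Field values and the detail-page URL are assumptions.
import subprocess

subprocess.run([
    "ffmpeg", "-i", "video.mp4",
    "-c", "copy",  # copy streams unchanged; only container metadata is rewritten
    "-metadata", "title=Example Lecture",
    "-metadata", "artist=Jane Doe",
    "-metadata", "comment=https://av.tib.eu/media/12345",  # hypothetical link
    "video_tagged.mp4",
], check=True)
```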

Display of embedded metadata in the downloaded MP4 file (VLC Player)

Outlook for 2026

Stella as an Evaluation Framework for Video Recommendations

Stella is a living-lab infrastructure for evaluating experimental retrieval and recommendation systems with real users; the TIB AV-Portal is a product partner in this project. In 2025, we created the technical foundations for integrating Stella into the portal; the live deployment is planned for the coming year.

With Stella, various recommendation algorithms can be compared directly within the portal using interleaved A/B tests: users are shown recommendations alternating between our existing approach (Solr MoreLikeThis) and experimental recommenders. The resulting clicks serve as anonymized feedback. This enables an empirical determination of which algorithm performs better in real-world use.
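
As a toy illustration, a simple alternating interleaving of two ranked lists might look like this; real living-lab setups typically use more refined schemes (such as team-draft interleaving), so this sketch only conveys the basic idea.

```python
# Toy sketch of interleaving two recommendation lists. Clicks on the merged
# list can be attributed to the system that contributed the clicked item.
def interleave(baseline: list[str], experimental: list[str], k: int = 10):
    """Alternately merge two ranked lists, remembering each item's origin."""
    iters = {"baseline": iter(baseline), "experimental": iter(experimental)}
    merged, origin, seen, exhausted = [], {}, set(), set()
    turn = 0
    while len(merged) < k and len(exhausted) < 2:
        name = "baseline" if turn % 2 == 0 else "experimental"
        turn += 1
        if name in exhausted:
            continue
        for item in iters[name]:
            if item not in seen:
                seen.add(item)
                merged.append(item)
                origin[item] = name
                break
        else:
            exhausted.add(name)
    return merged, origin

ranking, origin = interleave(["v1", "v2", "v3"], ["v2", "v4", "v5"])
# A click on "v4" would count as feedback for the experimental recommender.
```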

The Visual Analytics research group at TIB will continue to develop and provide additional recommender experiments, ensuring that all required components are available in-house to continuously evaluate and improve the recommendation system.

Prompt-Based Image Search in the AV-Portal

Building on the OpenCLIP developments of 2025, we aim to implement a full-fledged image search in the AV-Portal in 2026. In future, users will not be limited to filtering by predefined visual concepts but will be able to search the visual content of our videos directly using freely formulated text queries (zero-shot search). Our current considerations involve offering this prompt-based search both across the entire portal and on the detail pages of videos. This would create a novel way of accessing scientific videos, making visual content as intuitively and precisely searchable as textual content.

#AVMedia #AVPortal #LizenzCCBY40INT #scientificVideos

The Project “OER with Ukraine” – Empowering Education in Challenging Times

In an era marked by unprecedented challenges, the importance of open, accessible education has never been clearer. TIB – Leibniz Information Centre for Science and Technology – together with the Leibniz University Hannover (LUH) Research Center L3S, is proud to have been part of the Project “OER with Ukraine”, a collaborative initiative fostering Open Educational Resources (OER) to support Ukrainian academics, teachers, and students.

What is the Project OER with Ukraine?

The project “Open Educational Resources with Ukraine” (2022-2025) is funded by the German Academic Exchange Service (DAAD) and the Federal Ministry of Education and Research (BMBF) under the program line “Ukraine Digital”. It aims to support Ukrainian higher education during times of crisis by involving Leibniz University Hannover’s partner universities in Kyiv, Dnipro, Kharkiv, and Lviv. The project assists these institutions in maintaining, further developing, and digitalising teaching courses in the fields of biomedical engineering, biology, materials science, history of science, computer science, and information technology.

For this purpose, a total of 486 teaching and learning videos were produced in subject-specific, cross-location working groups. They were published under Creative Commons (CC-BY) licences and integrated into the ongoing teaching activities of the partner universities. The total number of media views exceeds 26,000 (as of July 2025).

The videos were translated and subtitled, and their content was adapted and prepared as Open Educational Resources in accordance with the UNESCO definition. The videos were published under Creative Commons licences (CC-BY) on the TIB AV-Portal. Each video was assigned a DOI, is permanently archived, and has been enriched with semantic data and standardized metadata. The metadata are incorporated into the OER search index OERSI via open interfaces. The OERSI user interface was translated into Ukrainian and enables cross-lingual searching, for example in Ukrainian. This ensures the resources are accessible, easily discoverable, citable, and free to use for the long term. In this way, the project contributes to the internationalisation, digitalisation, and openness of education, offering new opportunities for both teachers and students.

Example of a highly recommended lecture video from the Cryotechnology series

Enhancing the impact through networking

The project has actively connected Ukrainian institutions with international OER communities, fostering exchange, mutual support, and new partnerships. By involving Ukrainian educators and students in these global networks, we are expanding the reach and impact of their work. Through workshops and webinars, we’ve enabled Ukrainian educators to create and adapt OER – not only in response to the current emergency, but as a sustainable, long-term solution.

Members at a project meeting of OER with Ukraine at TIB. Picture: Nataliya Butych

Looking ahead

The OER project with Ukraine stands for the values of openness, solidarity, and innovation that characterize the global OER movement. TIB is honored to stand with our Ukrainian colleagues as we work together to secure the future of education – no matter what challenges lie ahead. We are confident that these connections will last. The videos are sustainably available, are actively used, and will also be integrated into the DigiUni project (Digital University – Open Ukrainian Initiative).

The project “Open Educational Resources with Ukraine” has also been added to the OER World Map.


#AVPortal #LizenzCCBY40INT #OER #OpenAccess #OpenEducationalResources #scientificVideos #Ukraine

The TIB AV-Portal in 2024: Advances in Delivery, Data Sovereignty, and Mobile Use

Read this article in German

The TIB AV-Portal is an open and free platform for scientific videos that provides a wide range of services for the professional use of audiovisual media in research and academia. These include permanent citability, long-term archiving, and precise searches within video content. The ad-free portal offers a secure and privacy-compliant environment tailored specifically to the needs of the academic community. Since 2020, we have been providing an annual overview of the developments and new features of the TIB AV-Portal. Here is a review of the innovations and highlights of 2024.

Adaptive Streaming and Hosting at TIB

MPEG-DASH and HLS are widely used streaming protocols that enable the efficient delivery of video content over the internet. Both technologies divide video files into smaller segments that can be streamed in varying quality depending on the available bandwidth and the performance of the device. This ensures a smooth video experience and is an essential component of modern video platforms.

Throughout 2024, we initially created MPEG-DASH and HLS derivatives in the development environment and prepared both the frontend and backend for their use. On 8 January 2025, adaptive streaming was successfully launched. The delivery of the new derivatives no longer relies on the previously used external Media Asset Management system but instead takes place via TIB’s own servers. This step strengthens our data sovereignty and continues the strategy initiated in 2019 and 2020 with the migration of the frontend and backend to the TIB infrastructure (see blog post from 2020).

In the production system, adaptive derivatives are currently being generated for newly published videos.

Server-Side Rendering

For the AV-Portal, we introduced server-side rendering (SSR) with Nuxt.js, a framework based on Vue.js that significantly simplifies the development of modern web applications. Through SSR, the HTML content of the pages is rendered directly on the server rather than in the user’s browser. This results in faster loading times, improved search engine optimization, and an overall smoother user experience, particularly in environments with limited bandwidth.

Scientific Audios in the AV-Portal

The backend and frontend of the AV-Portal were expanded to support audio files such as MP3 and WAV. In addition to scientific videos, we now also welcome scientific audio content in our collection, with approximately 50 titles already available. The audio files can be supplemented with a static image, and subtitles are displayed just as with videos.

As part of this expansion, we adapted our wording in many parts of the website, opting for either more general phrasing or making distinctions between audio and video.

Transcription and Translation of Spoken Content

Since July 2023, we have been using OpenAI’s speech recognition software Whisper to transcribe the original language of the videos. The transcripts are used for both subtitling and content-based search (see blog post from 2023). In 2024, we also created speech transcripts for older videos in the collection that previously lacked transcriptions.

In March 2024, we integrated Whisper’s translation feature, which translates all non-English videos into English. This enables English-speaking users to understand all videos – whether in German, Ukrainian, Spanish, French, Japanese, or other languages – thanks to subtitles and to search them specifically via the transcripts.

More Efficient Speech Recognition

At the beginning of 2024, we installed Faster-Whisper, an optimized and accelerated version of Whisper. This version is four times faster than Whisper, requires significantly less memory, and features an automatic silence filter. This has made the transcription of our videos much more efficient. Additionally, the silence filter helps to minimize the typical “hallucinations” often associated with AI.
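
For illustration, a minimal transcription call with the faster-whisper package might look like this; the model size and compute settings are assumptions, not our production configuration.

```python
# Minimal faster-whisper sketch with the built-in silence (VAD) filter
# mentioned above. Model size and compute type are assumed values.
from faster_whisper import WhisperModel

model = WhisperModel("large-v2", device="cpu", compute_type="int8")

segments, info = model.transcribe(
    "lecture.mp4",
    vad_filter=True,      # skip silent passages, which also curbs hallucinations
    task="transcribe",    # task="translate" would produce English instead
)
for seg in segments:
    print(f"[{seg.start:.2f} -> {seg.end:.2f}] {seg.text}")
```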

Automatic Abstract Generation

A significant portion of the videos in the AV-Portal was provided without abstracts by the contributors. However, abstracts are essential as they offer a concise summary of the content, helping users quickly identify relevant material. To meet this need, we experimented with the large language model (LLM) Llama to generate abstracts based on the speech transcripts created by Whisper. The initial results were of promising quality.

However, it became evident that even smaller LLM models have such high memory requirements that our CPU reached its limits. In the long term, we are considering relying on a centralized GPU service, which could be used by multiple teams at TIB. This would allow resource-intensive tasks such as abstract generation to be performed more efficiently, making optimal use of the available infrastructure.
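
As a rough illustration of the experiment, a locally run Llama model can be prompted with a transcript as sketched below; the model file, prompt, and parameters are assumptions, and long transcripts would in practice have to be shortened or chunked to fit the context window.

```python
# Hedged sketch of abstract generation with a local Llama model via the
# llama-cpp-python bindings. Model file and prompt are illustrative; very long
# transcripts would need to be truncated or summarized in stages.
from llama_cpp import Llama

llm = Llama(model_path="llama-3-8b-instruct.Q4_K_M.gguf", n_ctx=8192)

transcript = open("transcript.txt", encoding="utf-8").read()
prompt = ("Summarize the following lecture transcript as one concise abstract "
          "of at most 120 words:\n\n" + transcript)

result = llm(prompt, max_tokens=256)
print(result["choices"][0]["text"].strip())
```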

Demonstrator of the Scrum Team for Abstract Generation

Stella: Evaluating and Improving Video Recommendations

We are a practice partner in the DFG-funded project STELLA, a platform for living-lab experiments with ranking and recommendation systems. The Scrum Team of the AV-Portal plans to evaluate and improve the recommendation system for video recommendations using STELLA based on real user feedback.

Locally, we have already successfully implemented and tested the STELLA system. The first recommender algorithm trained by GESIS is expected to be available in the first quarter of 2025 and will then be evaluated by us.

Optimizing Mobile Use

Mobile usage has become increasingly important with the proliferation of smartphones and mobile internet and is now an integral part of everyday digital life. To make our video portal more user-friendly, we are continuously making adjustments for mobile use.

A particular focus was placed on redesigning the player: The play/pause function can now be activated by simply tapping on the video. To avoid confusion, the control bar only appears after touching the video, while a second touch on an element of the bar executes the desired function. Additionally, timeline preview images were adjusted for better display.

On small devices, the player’s control bar appeared overloaded due to the number of elements. Therefore, we restructured the arrangement and usability of the control elements to make the control bar look “tidier” and easier to use in mobile contexts.

Mobile view of a video

Highlighting Search Hits in Transcript and Segment Bar

Speech transcripts can be searched for specific terms. Found hits are highlighted in the transcript, and the exact locations are marked in red in the segment bar. This allows users to navigate directly to the relevant parts, facilitating targeted searches.

Highlighting search hits in transcript and segment bar

Optimizations for Sharing Videos

The iFrame of the embedded player allows videos from the AV-Portal to be embedded on external websites. This element can be found in the sharing dialog on the video’s detail page. We made the iFrame more responsive, enabling the embedded player to dynamically adapt to the browser window size. This ensures consistent and user-friendly video display on both desktop and mobile devices. Additionally, we integrated a preview function in the sharing dialog, allowing users to preview the embedded player in fixed or responsive sizes.

Sharing dialog with preview of the embedded player

Outlook for 2025

In 2025, we will gradually create adaptive derivatives for our entire video collection and host them on TIB servers. Simultaneously, we plan to replace older, lower-quality automatic transcripts in the collection with high-quality Whisper transcripts.

In addition, we plan to use OpenCLIP to generate image vectors for all frames of our videos in the development system and store them in a database. This opens up the possibility of implementing zero-shot searches in the AV-Portal.

As a result, the portal could process natural language search queries and directly identify relevant video scenes based on their visual content, without requiring manual tagging or additional model training. This is possible because OpenCLIP is pre-trained and possesses generalized zero-shot capabilities that cover a wide range of concepts. How and where OpenCLIP will be used in the AV-Portal is still in the planning and coordination phase.

#AVMedia #AVPortal #LizenzCCBY40INT #openness #scientificVideos

Homepage - TIB AV-Portal

Scientific videos and audio: technology/engineering, architecture, chemistry, information technology, mathematics, and physics.

Science Sketches

Our Mission
Empowering scientists to communicate with the world using big markers and small words.
We are addressing two huge needs in science communication:
First, the public needs stepping stones from their general vocabulary to the language of science.
Second, there is a huge desire in the scientific community to contribute to outreach and communication. However, there are very few well-defined opportunities to do so.

#scientificVideos