The TIB AV-Portal in 2025: New Infrastructure, AI-Based Media Analysis, and Audio-Only

As in previous years, we would once again like to provide an overview of the most important technical and functional enhancements of the TIB AV-Portal. In 2025, the Scrum team implemented a wide range of improvements that strengthened both the infrastructural foundation and the functional capabilities of the portal.

Several of these developments directly respond to feedback and concrete requirements raised by users. For some readers, this review may therefore be not only informative but also personally relevant – perhaps you will spot a feature that you yourself helped inspire.

From External Hosting to TIB-Owned Infrastructure

With the complete migration of video and audio delivery to servers operated by TIB in January 2025, the AV-Portal has taken a significant step forward in its infrastructural development. Previously, individual components for streaming, download, and delivery operated on external third-party systems; these processes are now conducted entirely within TIB infrastructure. Supplementary materials – such as presentations, scripts, research data, or additional teaching resources – are likewise hosted directly at TIB.

By running these services on its own servers, TIB not only controls all technical processes but also governs data flows, storage locations, and security standards. External dependencies – such as those related to availability or service levels – have been further reduced. This follows the principle that scientific data belongs in scientific infrastructure – under conditions that meet the requirements of research, teaching, and Open Science.

Adaptive Streaming with MPEG-DASH

Since January 2025, we have been generating adaptive derivatives in the MPEG-DASH format. This enables video quality to adjust dynamically to the user’s available bandwidth during playback.

Instead of delivering a single, statically encoded video, the AV-Portal provides multiple quality levels between which the player switches automatically.

The result is a significantly more stable streaming experience: delays, stutters, and playback interruptions are reduced, while always delivering the best possible resolution. At the same time, bandwidth usage decreases, as unnecessarily large files are no longer transmitted when a user’s connection cannot support them. MPEG-DASH thus represents an important step toward a modern, scalable streaming infrastructure.
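The switching logic described above can be sketched in a few lines: the player measures its current throughput and picks the highest representation that fits. The bitrate ladder below is purely illustrative, not the portal's actual encoding settings.

```python
# Minimal sketch of the client-side logic behind adaptive streaming:
# pick the best quality level the measured bandwidth can sustain.
# Labels and bitrates are illustrative assumptions.

REPRESENTATIONS = [  # (label, required bandwidth in kbit/s), high to low
    ("1080p", 5000),
    ("720p", 2500),
    ("480p", 1000),
    ("360p", 500),
]

def select_representation(measured_kbits):
    """Return the highest quality whose bitrate fits the measured bandwidth."""
    for label, required in REPRESENTATIONS:
        if measured_kbits >= required:
            return label
    return REPRESENTATIONS[-1][0]  # fall back to the lowest rung

print(select_representation(3000))  # a 3 Mbit/s connection gets "720p"
```

In a real DASH player this decision is re-evaluated continuously during playback, segment by segment.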

Various quality levels for adaptive streaming

Higher Resolutions for Scientific Content

Since April 2025, we have been generating resolutions beyond Full HD. These include high-quality rescans from a digitization project, available at 2048×1536 pixels and offering visibly more detail than standard HD formats. In addition, numerous videos are now available in 4K, which is particularly beneficial for visual material, animations, and complex scientific content.

Support for Audio-Only Files

Since the introduction of MPEG-DASH, the AV-Portal can not only generate audio streams as part of video derivatives but also, for the first time, produce genuine audio-only formats. This significantly broadens the portal’s scope: in addition to traditional video content, users can now upload, analyze, and publish standalone audio sources – such as interviews, podcasts, lectures, or audio recordings from research projects.

Audio with searchable transcript

To ensure reliable processing of audio-only files, the AV-Portal uses a unified technical procedure. The audio track is automatically extracted from an uploaded file and converted into M4A – a widely supported format that can be played on most devices.
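Such an extraction step could, for example, be realized with ffmpeg. The sketch below only builds the command line; the flags are standard ffmpeg options, but the exact pipeline the portal uses is an assumption.

```python
# Hypothetical sketch of the audio-extraction step: drop any video
# stream from an upload and encode the audio as M4A (AAC in an MP4
# container). Only the argument list is constructed here.

def m4a_extract_args(src, dst, bitrate="128k"):
    """Build an ffmpeg command line that discards video and encodes AAC."""
    return [
        "ffmpeg", "-i", src,
        "-vn",            # discard any video stream
        "-c:a", "aac",    # encode audio as AAC
        "-b:a", bitrate,  # target audio bitrate
        dst,
    ]

print(" ".join(m4a_extract_args("upload.mov", "audio.m4a")))
```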

With this enhancement, the AV-Portal now supports audio formats alongside videos, becoming a platform for scientific audio and visual media alike.

More Flexible and Extended Upload Process

With the latest enhancement of the upload function, significantly larger files can now be uploaded directly via the AV-Portal’s upload form. This is made possible by a new transfer process that automatically divides large files into smaller data chunks and uploads them incrementally. With this so-called “chunked upload”, video files up to 10 GB can be uploaded reliably.
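The chunked transfer described above can be sketched as follows: a large file is split into fixed-size parts that are uploaded one after another (and can be retried individually). The chunk size here is an illustrative assumption.

```python
# Sketch of the "chunked upload" idea: read a large file in
# fixed-size parts instead of transferring it in one piece.

import io

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per part (assumed)

def iter_chunks(stream, chunk_size=CHUNK_SIZE):
    """Yield successive byte chunks from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk

# A 20 MiB in-memory "file" yields three parts: 8 + 8 + 4 MiB.
data = io.BytesIO(b"\0" * (20 * 1024 * 1024))
sizes = [len(c) for c in iter_chunks(data)]
print(sizes)
```

Because each part is small, a failed transfer only needs to resend the affected chunk, which is what makes uploads of up to 10 GB reliable.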

The workflow has also become more flexible: users can now select their video file and simultaneously enter the metadata in the form. This allows any waiting time during the upload to be used productively.

The expansion is rounded off by additional upload options: alongside the video or audio file, users may now supply their own transcripts and preview images.

OpenCLIP for Precise Image Content Analysis

To improve the discoverability of visual content in scientific videos, we have implemented a new generation of image-based search within the TIB AV-Portal. The technological foundation consists of OpenCLIP vectors, which we computed for every video frame in the portal.

On this basis, we developed a prototype for zero-shot queries that matches free-form textual input – across multiple languages – directly with the visual content. Even this initial prototype demonstrated that highly complex search phrases can return suitable image results.

Subsequently, we fundamentally renewed the labeling of visual concepts (visual concept detection, VCD). A curated list of visual concepts was created, covering both established and newly defined categories – such as “chemical experiment”, “microphotography”, or “robot”. For each of the current 86 concepts, we formulated specific prompts and generated corresponding text vectors. Using thresholds derived from a manually created ground truth, we determined at which point a concept can be considered present in the video material. Additionally, these visual concepts were linked to subject headings from the Integrated Authority File (GND).
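The matching step can be illustrated schematically: a concept's text vector is compared with per-frame image vectors via cosine similarity, and the concept counts as present wherever the score exceeds the threshold. The vectors and threshold below are toy values, not real OpenCLIP embeddings.

```python
# Schematic of zero-shot concept detection: cosine similarity between
# a concept's text vector and per-frame image vectors, thresholded.

import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

concept_vec = [0.9, 0.1, 0.2]        # text vector for e.g. "robot" (toy)
frames = {0.0: [0.8, 0.2, 0.1],      # timestamp -> frame vector (toy)
          4.0: [0.1, 0.9, 0.4],
          8.0: [0.7, 0.0, 0.3]}
THRESHOLD = 0.9                      # tuned against ground truth

hits = [t for t, v in sorted(frames.items())
        if cosine(concept_vec, v) >= THRESHOLD]
print(hits)  # → [0.0, 8.0]
```

In practice the threshold is chosen per concept from the manually created ground truth, so that precision and recall are balanced for that concept.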

For users, this means: the entire video collection can now be filtered using visual concepts, and the detail pages allow direct navigation to the exact timestamps where these concepts occur.

Image-content search with jump markers

Perhaps the most significant progress is that the Scrum team can now define new VCD concepts at any time and integrate them directly into the portal. Since the underlying open-source software OpenCLIP is operated entirely on TIB servers, all data and processes remain fully under our control. This represents a major milestone, and additional OpenCLIP-based features are already under development.

Improved Display of GND Annotations

In the AV-Portal, speech, on-screen text, and visual content are automatically enriched with GND subject headings. These annotations are now displayed far more clearly on the detail pages: instead of appearing in a scattered layout, users now see an alphabetically sorted list of all detected entities, which can be searched and filtered by language, text, or image.

Annotations from speech, text, and image

A single click reveals where in the video the term appears – the matches are highlighted clearly in the timeline. Users can therefore jump directly to relevant scenes without having to navigate through the entire video.

New Subtitle Segmentation for Improved Readability

To further enhance subtitle quality, we introduced a new segmentation method for Whisper transcripts. This method is based on OpenNLP, an open-source toolkit for natural language processing, and considers not only punctuation but also part-of-speech information and natural speech pauses.

Additionally, a look-ahead algorithm evaluates all possible breakpoints within a preview window of 150 characters to determine the optimal cue boundary. Unlike simple heuristic approaches, the algorithm considers upcoming options to maximize overall subtitle quality. This reliably prevents unnaturally short segments – such as single words at the end of a subtitle line.
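A strongly simplified version of this look-ahead can be sketched as follows: every candidate breakpoint inside the 150-character window is scored (sentence end beats clause boundary beats plain space) and the best-scoring, latest position wins. The real method additionally uses part-of-speech tags and speech pauses, which are omitted here.

```python
# Simplified sketch of look-ahead subtitle segmentation within a
# 150-character preview window. Scoring scheme is an assumption.

WINDOW = 150

def best_break(text, window=WINDOW):
    """Return the index after which to cut, within the preview window."""
    preview = text[:window]
    best_pos, best_score = len(preview), 0
    for i, ch in enumerate(preview):
        if ch in ".!?":
            score = 3      # sentence boundary: ideal cue end
        elif ch in ",;:":
            score = 2      # clause boundary: acceptable
        elif ch == " ":
            score = 1      # word boundary: last resort
        else:
            continue
        if score >= best_score:  # later positions win ties
            best_pos, best_score = i + 1, score
    return best_pos

text = "This is the first sentence. And here a clause, then more words"
print(text[:best_break(text)])  # → "This is the first sentence."
```

Because the whole window is inspected before cutting, the algorithm never strands a single word at the end of a cue, unlike a greedy character-count split.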

This improvement enhances readability for accessibility purposes and establishes the technical groundwork for potential text-to-speech functionality.

More Precise Sharing and Citing of Video Content

With recent releases, we have expanded and refined the functions for sharing and citing videos. The share dialog now includes an optional start timestamp, enabling video playback to begin at a specific point; the same option is available for the embed code. The citation dialog was likewise enhanced: the timestamp of a segment can now be displayed or removed as needed. As part of these improvements, we redesigned the share dialog to make the overall structure more intuitive.

Share dialog with start timestamp for the embed code

Providing Metadata as Open Data

TIB promotes the use and visibility of its audiovisual holdings by publishing the AV-Portal’s metadata as Open Data. Once per week, the metadata and preview images of all legally eligible videos are automatically made available. On our Open Data page, the data is offered in two formats:

• JSONL for efficient processing of large volumes
• Turtle as an RDF format suitable for semantic applications and Linked Data environments
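JSONL lends itself to streaming consumption, since each line is a complete JSON object. A minimal sketch of processing such a dump (the field names are illustrative, not the export's actual schema):

```python
# Sketch of consuming a JSONL Open Data export line by line, so that
# even very large dumps never need to be loaded in one piece.

import io
import json

dump = io.StringIO(
    '{"title": "Lecture on graph theory", "language": "en"}\n'
    '{"title": "Chemisches Experiment", "language": "de"}\n'
)

titles = []
for line in dump:              # stream one record at a time
    record = json.loads(line)  # each line is a complete JSON object
    titles.append(record["title"])

print(titles)
```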

Embedding Selected Metadata into the MP4 File

Metadata such as title, author, and the link to the detail page are now embedded directly into the downloadable MP4 file. These details remain available even when the video is saved locally, shared, or opened in other applications. This ensures that the origin of the video and the appropriate citation source can always be identified – without additional notes or manual research.
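Embedding such container-level tags could, for instance, be done with ffmpeg without re-encoding the video. The sketch below only builds the command line; which tags the portal actually writes, and the example URL, are assumptions.

```python
# Hypothetical sketch of writing citation metadata into an MP4:
# the streams are copied ("-c copy"), only the container tags change.

def mp4_metadata_args(src, dst, title, author, url):
    args = ["ffmpeg", "-i", src, "-c", "copy"]  # remux, no re-encode
    for key, value in (("title", title),
                       ("artist", author),
                       ("comment", url)):
        args += ["-metadata", f"{key}={value}"]
    return args + [dst]

cmd = mp4_metadata_args("in.mp4", "out.mp4",
                        "Example lecture", "Jane Doe",
                        "https://av.tib.eu/")  # illustrative values
print(" ".join(cmd))
```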

Display of embedded metadata in the downloaded MP4 file (VLC Player)

Outlook for 2026

Stella as an Evaluation Framework for Video Recommendations

Stella is a living-lab infrastructure for evaluating experimental retrieval and recommendation systems with real users; the TIB AV-Portal is a product partner in this project. In 2025, we created the technical foundations for integrating Stella into the portal; the live deployment is planned for the coming year.

With Stella, various recommendation algorithms can be compared directly within the portal using interleaved A/B tests: users are shown recommendations alternating between our existing approach (Solr MoreLikeThis) and experimental recommenders. The resulting clicks serve as anonymized feedback. This enables an empirical determination of which algorithm performs better in real-world use.
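The interleaving idea can be illustrated with team-draft interleaving, a common technique for such experiments: the two rankings take turns contributing their best not-yet-shown item, and each click is credited to the system that supplied the clicked item. This is a generic illustration, not Stella's actual implementation.

```python
# Sketch of team-draft interleaving for online A/B comparison of two
# rankers. Each merged item remembers which system contributed it.

import random

def team_draft(ranking_a, ranking_b, rng):
    """Merge two rankings; the team with fewer picks (ties broken at
    random) adds its highest-ranked item not yet in the result."""
    merged, credit = [], {}
    picks = {"A": 0, "B": 0}
    lists = {"A": ranking_a, "B": ranking_b}
    while any(item not in credit for item in ranking_a + ranking_b):
        if picks["A"] != picks["B"]:
            team = "A" if picks["A"] < picks["B"] else "B"
        else:
            team = rng.choice("AB")
        candidates = [x for x in lists[team] if x not in credit]
        if not candidates:                     # this team is exhausted
            team = "B" if team == "A" else "A"
            candidates = [x for x in lists[team] if x not in credit]
        merged.append(candidates[0])
        credit[candidates[0]] = team
        picks[team] += 1
    return merged, credit

rng = random.Random(0)
merged, credit = team_draft(["v1", "v2", "v3"], ["v2", "v4"], rng)
print(merged, credit)
```

Counting clicks per team over many sessions then yields an unbiased preference signal between the two recommenders.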

The Visual Analytics research group at TIB will continue to develop and provide additional recommender experiments, ensuring that all required components are available in-house to continuously evaluate and improve the recommendation system.

Prompt-Based Image Search in the AV-Portal

Building on the OpenCLIP developments of 2025, we aim to implement a full-fledged image search in the AV-Portal in 2026. In future, users will not be limited to filtering by predefined visual concepts but will be able to search the visual content of our videos directly using freely formulated text queries (zero-shot search). Our current considerations involve offering this prompt-based search both across the entire portal and on the detail pages of videos. This would create a novel way of accessing scientific videos, making visual content as intuitively and precisely searchable as textual content.

#AVMedia #AVPortal #LizenzCCBY40INT #scientificVideos

10 Years of a Scientific Video Platform: The History of the TIB AV-Portal from its Beginnings to the Present from a Development Perspective

The TIB AV-Portal celebrates its tenth anniversary on 29 April 2024. Since its launch, the portal has developed into a central platform for scientific videos and has steadily gained in importance in the academic community. TIB’s role in managing the portal has also changed significantly: it has evolved from a client commissioning requirements into an active implementer and operator. In addition, numerous services related to the AV-Portal were established at the TIB, considerably expanding the range of services offered.

As a long-time companion of the project since 2013, I have been able to witness the impressive evolution of the portal at close quarters. In this blog article, I would like to provide a detailed insight into the development history of the portal from its beginnings to the present day.

The AV-Portal as the KNM’s first project

In 2010, the Competence Center for Non-Textual Materials (KNM) was established at the TIB after receiving a positive assessment from a special evaluation by the Leibniz Association. The aim of the competence center was to develop and operate infrastructures, tools and services for multimedia objects that improve the retrieval, searching and citation of these objects.

The KNM’s first project was the development of the TIB AV-Portal. This ambitious project aimed to develop an innovative video portal that would provide web-based access to scientific videos. The range of content included simulations, experiments, interviews, video abstracts, lectures and conferences – primarily from the fields of natural sciences and technology. It was also intended that the videos could be linked to further research information such as scientific articles, research data and persistent identifiers of academic authors. In order to understand and fulfill the user needs for such a video portal, the KNM conducted extensive expert interviews, environment analyses and focus groups in 2010.

Development of the portal until the launch in April 2014

The KNM (later renamed Lab Non-Textual Materials) developed the AV-Portal from July 2011 to April 2014 in cooperation with the Hasso Plattner Institute for Software Systems Engineering. During this phase, various video analyses were integrated, including shot detection, video OCR, visual concept detection and automatic video annotation. The graphical user interface was designed according to the user needs that emerged from the TIB’s requirements analysis.

An authorization layer ensured that access to certain films was restricted in order to protect copyright or personal rights. FlowWorks’ Flow Center was used as the media management system. In addition, interfaces and workflows were created to integrate automatically generated language transcripts and digital object identifiers (DOIs) into the system and to export metadata and video files to the digital long-term archiving system.

At the same time, the TIB set up numerous services for video management. These initially included DOI registration, long-term archiving, metadata management, video acquisition, rights clearance and licensing consulting. After three years of intensive development work, the AV-Portal went online on 29 April 2014. A highlight of the launch was the cross-lingual retrieval, which made it possible to search the content in both German and English.

The development of the AV-Portal over the years can be illustrated more clearly using a few images. The following wireframe from 2011 is an example of the first sketch of the basic structure and layout of the user interface. Wireframes are usually the first steps in the design process and are used to visualize ideas before the actual development and programming takes place.

Wireframe of the AV-Portal from 2011

When the AV-Portal officially went online in April 2014, it was presented in a gray-blue design. The homepage was predominantly static and focused on text and images. Interactive content for exploration was only sparsely represented at that time.

AV-Portal homepage in the old look from 2014

Redesign of the AV-Portal for the TIB foundation in January 2016

When the TIB became a foundation in January 2016, the AV-Portal underwent a comprehensive redesign of its user interface. This revision not only included an optimization of the interactive elements according to usability criteria, but also the introduction of a responsive design that automatically adapts to different screen sizes and device types. Since then, the TIB AV-Portal has been presented in a color scheme of red, white and grey. Despite these changes, the portal’s homepage has remained largely static.

Homepage of the AV-Portal in the new 2016 screen design

Integrating the portal into our own infrastructure

In September 2018, the Scrum team AV-Portal was founded at the TIB – with the goal of migrating the individual components of the AV-Portal, which were previously hosted by external providers, to the in-house infrastructure and developing them independently from then on. The team consists of four developers, a product owner and a scrum master. The first milestone was the migration of the frontend, including the search function and the graphical user interface, in 2019. The following year, the team completed the migration of the backend, which includes the video analyses. Since then, the Scrum team has been able to implement all requirements internally in both the frontend and the backend.

The in-house development of the AV-Portal led to a significantly faster and more efficient further development of the platform. A key factor in this success is the close collaboration between the Scrum team and TIB’s internal stakeholders. In monthly sprint reviews, the team presents the latest implementations to its stakeholders and collects valuable feedback. This feedback often flows directly into the next development phase and thus contributes to the continuous improvement of the portal.

In 2020, the Scrum team completely restructured the homepage of the AV-Portal. Various video categories were introduced, such as New content and Recommendations based on your viewing activity. These categories are updated automatically and therefore always offer users current and interesting content. In addition, the team has made it possible for the portal’s editorial team to create their own categories to highlight current topics and content. Blog articles and news from the TIB’s social media have also been integrated into the homepage and information tiles about the services and offers of the AV-Portal have been added. These features offer users a dynamic and interactive experience when browsing the content.

The following image shows the upper part of the portal’s revised homepage.

Homepage with dynamic content since 2020

Milestones in development since 2018

The Scrum team has implemented numerous other requirements in recent years. Here is a small excerpt:

• Integration of frontend and backend in a Kubernetes cluster
• Scaling the frontend
• Creation of all video formats
• Introduction of series as a new media type
• Provision of a separate space for publishers
• Revision of the search function, including an update to SolrCloud
• Extension of synonyms to all subject areas
• Implementation of a new video player
• Video subtitling and targeted search in the transcript (Whisper)
• Translation of non-English videos into English (Whisper)
• Migration to a new frontend framework (Vue.js)

More detailed information can be found in the various blog articles on the further development of the AV-Portal.

Current focus of development

The team is currently concentrating on delivering the videos via TIB servers and implementing adaptive streaming using MPEG-DASH. At the same time, the frontend is being converted to Nuxt, which is based on Vue.js. Vue.js enables a more seamless user experience, as it eliminates the need to completely reload the page when navigating and instead dynamically updates only the relevant parts of the page. Nuxt.js extends Vue.js with server-side rendering, which improves search engine optimization, performance and loading times of the portal – especially with large video content. Finally, the Scrum team is working closely with TIB’s Visual Analytics research group to extend the labeling of visual concepts based on an OpenCLIP model.

Since the launch of the AV-Portal in 2014, the TIB has become increasingly independent of external service providers. This independence is to be further expanded in the coming years. In addition to portal development, other services have been established at the TIB in recent years, such as the conference recording service TIB ConRec, Community Building for cooperation partners and scientific communities and the research group Visual Analytics, all of which expand the AV-Portal’s service portfolio.

#10JahreTIBAVPortal #AVMedia #AVPortal #LizenzCCBY40INT #OpenAccess #Videos

The TIB AV-Portal in 2024: Advances in Delivery, Data Sovereignty, and Mobile Use

The TIB AV-Portal is an open and free platform for scientific videos that provides a wide range of services for the professional use of audiovisual media in research and academia. These include permanent citability, long-term archiving, and precise searches within video content. The ad-free portal offers a secure and privacy-compliant environment tailored specifically to the needs of the academic community. Since 2020, we have been providing an annual overview of the developments and new features of the TIB AV-Portal. Here is a review of the innovations and highlights of 2024.

Adaptive Streaming and Hosting at TIB

MPEG-DASH and HLS are widely used streaming protocols that enable the efficient delivery of video content over the internet. Both technologies divide video files into smaller segments that can be streamed in varying quality depending on the available bandwidth and the performance of the device. This ensures a smooth video experience and is an essential component of modern video platforms.

Throughout 2024, we initially created MPEG-DASH and HLS derivatives in the development environment and prepared both the frontend and backend for their use. On 8 January 2025, adaptive streaming was successfully launched. The delivery of the new derivatives no longer relies on the previously used external Media Asset Management system but instead takes place via TIB’s own servers. This step strengthens our data sovereignty and continues the strategy initiated in 2019 and 2020 with the migration of the frontend and backend to the TIB infrastructure (see blog post from 2020).

In the production system, adaptive derivatives are currently being generated for newly published videos.

Server-Side Rendering

For the AV-Portal, we introduced server-side rendering (SSR) with Nuxt.js, a framework based on Vue.js that significantly simplifies the development of modern web applications. Through SSR, the HTML content of the pages is rendered directly on the server rather than in the user’s browser. This results in faster loading times, improved search engine optimization, and an overall smoother user experience, particularly in environments with limited bandwidth.

Scientific Audios in the AV-Portal

The backend and frontend of the AV-Portal were expanded to support audio files such as MP3 and WAV. In addition to scientific videos, we now also welcome scientific audio content in our collection, with approximately 50 titles already available. The audio files can be supplemented with a static image, and subtitles are displayed just as with videos.

As part of this expansion, we adapted our wording in many parts of the website, opting for either more general phrasing or making distinctions between audio and video.

Transcription and Translation of Spoken Content

Since July 2023, we have been using OpenAI’s speech recognition software Whisper to transcribe the original language of the videos. The transcripts are used for both subtitling and content-based search (see blog post from 2023). In 2024, we also created speech transcripts for older videos in the collection that previously lacked transcriptions.

In March 2024, we integrated Whisper’s translation feature, which translates all non-English videos into English. This enables English-speaking users to understand all videos – whether in German, Ukrainian, Spanish, French, Japanese, or other languages – thanks to subtitles and to search them specifically via the transcripts.

More Efficient Speech Recognition

At the beginning of 2024, we installed Faster-Whisper, an optimized and accelerated version of Whisper. This version is four times faster than Whisper, requires significantly less memory, and features an automatic silence filter. This has made the transcription of our videos much more efficient. Additionally, the silence filter helps to minimize the typical “hallucinations” often associated with AI.

Automatic Abstract Generation

A significant portion of the videos in the AV-Portal was provided without abstracts by the contributors. However, abstracts are essential as they offer a concise summary of the content, helping users quickly identify relevant material. To meet this need, we experimented with the large language model (LLM) Llama to generate abstracts based on the speech transcripts created by Whisper. The initial results were of promising quality.

However, it became evident that even smaller LLM models have such high memory requirements that our CPU reached its limits. In the long term, we are considering relying on a centralized GPU service, which could be used by multiple teams at TIB. This would allow resource-intensive tasks such as abstract generation to be performed more efficiently, making optimal use of the available infrastructure.

Demonstrator of the Scrum Team for Abstract Generation

Stella: Evaluating and Improving Video Recommendations

We are a practice partner in the DFG-funded project STELLA, a platform for living-lab experiments with ranking and recommendation systems. The Scrum Team of the AV-Portal plans to evaluate and improve the recommendation system for video recommendations using STELLA based on real user feedback.

Locally, we have already successfully implemented and tested the STELLA system. The first recommender algorithm trained by GESIS is expected to be available in the first quarter of 2025 and will then be evaluated by us.

Optimizing Mobile Use

Mobile usage has become increasingly important with the proliferation of smartphones and mobile internet and is now an integral part of everyday digital life. To make our video portal more user-friendly, we are continuously making adjustments for mobile use.

A particular focus was placed on redesigning the player: The play/pause function can now be activated by simply tapping on the video. To avoid confusion, the control bar only appears after touching the video, while a second touch on an element of the bar executes the desired function. Additionally, timeline preview images were adjusted for better display.

On small devices, the player’s control bar appeared overloaded due to the number of elements. Therefore, we restructured the arrangement and usability of the control elements to make the control bar look “tidier” and easier to use in mobile contexts.

Mobile view of a video

Highlighting Search Hits in Transcript and Segment Bar

Speech transcripts can be searched for specific terms. Found hits are highlighted in the transcript, and the exact locations are marked in red in the segment bar. This allows users to navigate directly to the relevant parts, facilitating targeted searches.

Highlighting search hits in transcript and segment bar

Optimizations for Sharing Videos

The iFrame of the embedded player allows videos from the AV-Portal to be embedded on external websites. This element can be found in the sharing dialog on the video’s detail page. We made the iFrame more responsive, enabling the embedded player to dynamically adapt to the browser window size. This ensures consistent and user-friendly video display on both desktop and mobile devices. Additionally, we integrated a preview function in the sharing dialog, allowing users to preview the embedded player in fixed or responsive sizes.

Sharing dialog with preview of the embedded player

Outlook for 2025

In 2025, we will gradually create adaptive derivatives for our entire video collection and host them on TIB servers. Simultaneously, we plan to replace older, lower-quality automatic transcripts in the collection with high-quality Whisper transcripts.

In addition, we plan to use OpenCLIP to generate image vectors for all frames of our videos in the development system and store them in a database. This opens up the possibility of implementing zero-shot searches in the AV-Portal.

As a result, the portal could process natural language search queries and directly identify relevant video scenes based on their visual content, without requiring manual tagging or additional model training. This is possible because OpenCLIP is pre-trained and possesses generalized zero-shot capabilities that cover a wide range of concepts. How and where OpenCLIP will be used in the AV-Portal is still in the planning and coordination phase.

#AVMedia #AVPortal #LizenzCCBY40INT #openness #scientificVideos

Homepage - TIB AV-Portal

Scientific videos and audio: technology/engineering, architecture, chemistry, information technology, mathematics, and physics.