💡 Databricks Advanced Security with RBAC, RLS & ABAC

Our newest blog post summarizes authorization patterns on the Databricks platform. How do role-based and attribute-based access control mix with row- and column-level security? All you need to know in a concise little write-up:

🔗 https://www.nextlytics.com/blog/master-databricks-security-with-rbac-rls-abac

#databricks #dataengineering #datascience #sapdatabricks #businessintelligence #blog #azuredatabricks #datawarehouse #datagovernance #unitycatalog

Master Databricks Security with RBAC, RLS & ABAC

Explore advanced Databricks security with RBAC, RLS, and ABAC. Learn how to manage data access effectively while ensuring compliance and governance.

📰 Introduction to Ontologies for Data Engineers: How They Differ from the Semantic Layer and How to Divide Responsibilities (👍 28)

🇬🇧 Intro for data engineers on what ontologies are, how they differ from semantic layers, and how each should divide responsibilities in data and AI s...
🇰🇷 An introduction to ontologies for data engineers. It explains what an ontology is, how it differs from a semantic layer, and how to divide roles between them in data and AI systems.

🔗 https://zenn.dev/bare64/articles/ecac1bbf510ce4

#DataEngineering #Ontology #Zenn

Zenn

The OpenSearch Data Prepper maintainers are happy to announce the release of Data Prepper 2.15. This version adds the ability to ingest data from Apache Iceberg, making it easier to keep OpenSearch in sync without custom pipelines. It also extends Prometheus support with a remote-write source and the ability to send data to open source Prometheus.

Big kudos to the community for keeping this project moving forward. Read the full blog here: https://bit.ly/4sotUhD

#OpenSearch #DataEngineering

The tricky part of machine learning isn’t training the model, but putting it into production.
The serverless pattern using Docker, FastAPI, AWS Lambda, API Gateway and ECR offers a clean, scalable architecture with no servers to manage.

A modern approach to deploying models as real, reproducible and easy-to-maintain APIs.
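
As a rough illustration of the Lambda side of this pattern, here is a minimal sketch of a handler for an API Gateway proxy event. It uses only the Python standard library. The `predict` function is a hypothetical stand-in for a real model, and the FastAPI layer and the Docker/ECR packaging named in the post are left out, so treat this as a simplified picture and not the post's actual setup.

```python
import json


def predict(features):
    # Hypothetical stand-in for a trained model: returns the mean of the inputs.
    return sum(features) / len(features)


def _response(status, payload):
    # API Gateway proxy integrations expect statusCode, headers and a string body.
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def handler(event, context):
    """Entry point in the shape Lambda calls for an API Gateway proxy event."""
    try:
        body = json.loads(event.get("body") or "{}")
        features = [float(x) for x in body["features"]]
        if not features:
            raise ValueError("features must not be empty")
    except (KeyError, TypeError, ValueError) as exc:
        # Malformed JSON, a missing key or non-numeric values all end up here.
        return _response(400, {"error": str(exc)})
    return _response(200, {"prediction": predict(features)})
```

Calling `handler({"body": '{"features": [1, 2, 3]}'}, None)` locally exercises the same code path that API Gateway would trigger, which makes the function easy to test before it is deployed.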
#MachineLearning #MLOps #AWS #Serverless #Docker #FastAPI #DataEngineering #FediTech

ETL In-Flight vs At Rest

In-Flight (Streaming):

Transform while data moves
Real-time results
Higher cost, lower latency
Kafka, Flink, Spark Streaming

At Rest (Batch):

Store first, transform later
Scheduled processing
Lower cost, higher latency
SQL, dbt, Spark Batch

Real-time or cost-effective? Your call!
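
The two modes above can be sketched in a few lines of plain Python. This is only a toy illustration: the `transform` function, the record fields and the sample events are made up for the example, and a real pipeline would use one of the tools listed above.

```python
def transform(record):
    # The same transformation is used in both modes.
    return {
        "name": record["name"].strip().lower(),
        "amount_cents": int(round(record["amount"] * 100)),
    }


def in_flight(source):
    """Streaming style: transform each record while it moves through."""
    for record in source:
        yield transform(record)


def at_rest(source):
    """Batch style: store the raw data first, transform it later in one pass."""
    staged = list(source)                  # store first
    return [transform(r) for r in staged]  # transform later


events = [{"name": " Ada ", "amount": 10.0}, {"name": "BOB", "amount": 5.0}]

stream = in_flight(iter(events))
first = next(stream)            # ready before the rest of the source is read
batch = at_rest(iter(events))   # nothing is ready until everything is staged
```

Both paths produce the same rows. The difference is when each row becomes available, which is where the latency and cost trade-off comes from.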

#ETL #DataEngineering #Streaming #Batch

Data Inlining in DuckLake: Unlocking Streaming for Data Lakes – DuckLake
https://ducklake.select/2026/04/02/data-inlining-in-ducklake/

#DuckLake #DataEngineering

DuckLake’s data inlining stores small updates directly in the catalog, eliminating the “small files problem” and making continuous streaming into data lakes practical. Our benchmark shows 926× faster queries and 105× faster ingestion when compared to Iceberg.
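
To make the idea concrete, here is a toy model of inlining in plain Python. It is not DuckLake's API or storage format: the `InliningTable` class, the `inline_limit` threshold and the `flush` method are hypothetical names chosen for this sketch, and lists stand in for the catalog and for files on storage.

```python
class InliningTable:
    """Toy model of data inlining; not DuckLake's actual implementation."""

    def __init__(self, inline_limit=3):
        self.inline_limit = inline_limit  # hypothetical threshold, in rows
        self.inlined_rows = []            # small writes kept in the "catalog"
        self.data_files = []              # each entry stands in for one file

    def insert(self, rows):
        rows = list(rows)
        if len(rows) <= self.inline_limit:
            self.inlined_rows.extend(rows)  # small write: no new file created
        else:
            self.data_files.append(rows)    # large write: gets its own file

    def flush(self):
        """Move the accumulated inlined rows into a single data file."""
        if self.inlined_rows:
            self.data_files.append(self.inlined_rows)
            self.inlined_rows = []

    def scan(self):
        """Readers see rows from files and inlined rows together."""
        return [r for f in self.data_files for r in f] + self.inlined_rows


table = InliningTable(inline_limit=3)
for i in range(5):
    table.insert([{"id": i}])   # five tiny streaming writes

files_before_flush = len(table.data_files)
table.flush()
files_after_flush = len(table.data_files)
```

In this sketch five single-row writes create no files until `flush` is called, after which they sit in one file. Without inlining, the same five writes would each have produced a file of their own, which is the small files problem the post refers to.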

DuckLake

AI coding tools generate plausible but wrong SQL constantly. The fix isn't waiting for a smarter model.

AI skills are markdown files that encode domain knowledge into coding tools. No framework, just structured text in a repo.

The loop: code → review → handoff → skills update. Every session makes the next one smarter. One aggregation bug became a permanent rule enforced automatically.

Dori Wilson broke it all down at Data Debug SF. Full writeup:
https://blog.reccehq.com/ai-skills-for-analytics-eng

#DataEngineering #AI

A Practical Guide to AI Skills for Analytics Engineering

I built a self-improving AI skill system for analytics engineering at Recce. Here's the framework, a real bug it caught, and how we scaled it.

Project Manager for Data Science -- Arnaout Lab @UCSF

jobRxiv