57 upcoming events worldwide
Data engineering conferences cover the full data stack — from streaming pipelines and data warehousing to analytics engineering and real-time BI. Whether you're working with Spark, dbt, Snowflake, Databricks, or building a modern data platform, these events are where the community meets.
Leading data engineering events in 2026 include Data + AI Summit (Databricks, San Francisco), dbt Coalesce (the annual dbt community conference), Big Data & AI World (London), Current (by Confluent, the Kafka and Flink event), and the Data Council conference. For the open-source data ecosystem, Community Over Code (formerly ApacheCon) covers Spark, Flink, and Kafka project updates. PyData conferences bridge scientific Python with data engineering. Subsurface LIVE covers Dremio and Iceberg. Our full list is updated continuously with verified official links.
The modern data stack at conferences in 2026 centers on Apache Iceberg and Delta Lake as open table formats replacing proprietary table and storage formats, DuckDB for in-process analytics (displacing Spark for smaller datasets), Apache Flink for real-time streaming (replacing Spark Streaming in many architectures), dbt for analytics engineering and data transformation, Apache Airflow and Dagster for orchestration, Polars as a high-performance pandas replacement, and vector databases (Pinecone, Weaviate, pgvector) for AI-adjacent data workloads. The lakehouse architecture, which combines data lake flexibility with data warehouse performance, is the dominant architectural pattern.
Analytics engineering sits between data engineering and data analysis: it applies software engineering practices (version control, testing, CI/CD) to data transformation and modeling. dbt (data build tool) is the central technology, and dbt Coalesce is the dedicated conference for this community. Topics include data modeling patterns (dimensional modeling vs. One Big Table, or OBT), dbt macros and packages, data contracts, semantic layers, and metric stores. Analytics engineering content is also growing at Data Council, Big Data & AI World, and PyData events.
Yes. Current (formerly Kafka Summit, run by Confluent) is the primary event for Apache Kafka and Flink practitioners. It covers stream processing patterns, Kafka Connect and Single Message Transforms (SMTs), Kafka Streams, Schema Registry, and ksqlDB. Flink Forward is the dedicated Apache Flink conference covering real-time analytics, stateful stream processing, and the Table API. Current and Flink Forward often co-locate. Streaming data is also a major track at Data + AI Summit and Community Over Code (formerly ApacheCon).
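For readers new to the topic, stateful stream processing can be illustrated with a tumbling-window count. The sketch below is plain Python, not the Flink or Kafka Streams API. The function name `tumbling_window_counts` and the sample events are made up for illustration, and it ignores late events, watermarks, and fault tolerance, which real engines handle.

```python
from collections import defaultdict
from typing import Iterable


def tumbling_window_counts(
    events: Iterable[tuple[int, str]], window_ms: int
) -> dict[tuple[int, str], int]:
    """Count events per key in fixed, non-overlapping event-time windows.

    Each event is (timestamp_ms, key). The running counts are the "state"
    that a streaming engine would keep between events.
    """
    counts: dict[tuple[int, str], int] = defaultdict(int)
    for timestamp_ms, key in events:
        # Align the timestamp to the start of the window it falls into.
        window_start = (timestamp_ms // window_ms) * window_ms
        counts[(window_start, key)] += 1
    return dict(counts)


# Hypothetical sample stream: (event time in ms, event type).
events = [
    (1_000, "click"),
    (4_500, "click"),
    (5_200, "view"),
    (9_900, "click"),
    (10_100, "click"),
]
result = tumbling_window_counts(events, window_ms=5_000)
print(result)
```

With 5-second windows, the two clicks before the 5,000 ms mark land in the first window, and the click at 10,100 ms opens a third.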
Data engineering conferences focus on infrastructure: ingestion pipelines, storage formats, transformation, orchestration, and query engines, the plumbing that makes data available. ML conferences (NeurIPS, ICML) focus on algorithms and model architecture. The overlap is MLOps: managing data and model pipelines in production. MLOps World, the MLOps Community Summit, and the ML Platform tracks at Data + AI Summit specifically address this intersection: feature stores, model registries, data versioning (DVC, lakeFS), and inference infrastructure.
Data governance is a growing track at data engineering conferences. Key topics include: data catalogs (Apache Atlas, DataHub, Atlan, Collibra), data lineage and observability (Monte Carlo, Soda, Great Expectations, dbt tests), data contracts between producers and consumers, column-level lineage, GDPR and CCPA compliance in data pipelines, and privacy-preserving analytics (differential privacy, synthetic data). dbt Coalesce and Data Council have dedicated governance tracks. The shift from reactive data quality to proactive data contracts is a major theme in 2026.
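To make the idea of a proactive data contract concrete, here is a minimal sketch in plain Python, assuming a contract is simply a list of expected fields with types and nullability. The names `FieldSpec`, `ORDERS_CONTRACT`, and `violations` are hypothetical and do not come from any specific tool or contract specification. Real implementations usually add versioning, ownership, and semantic checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    """One field the producer promises to deliver (hypothetical structure)."""
    name: str
    dtype: type
    nullable: bool = False


# Hypothetical contract agreed between an orders producer and its consumers.
ORDERS_CONTRACT = [
    FieldSpec("order_id", int),
    FieldSpec("amount", float),
    FieldSpec("coupon", str, nullable=True),
]


def violations(record: dict, contract: list[FieldSpec]) -> list[str]:
    """Return a list of contract violations for one record (empty if valid)."""
    problems = []
    for spec in contract:
        if spec.name not in record:
            problems.append(f"missing field: {spec.name}")
            continue
        value = record[spec.name]
        if value is None:
            if not spec.nullable:
                problems.append(f"null not allowed: {spec.name}")
        elif not isinstance(value, spec.dtype):
            problems.append(
                f"wrong type for {spec.name}: "
                f"expected {spec.dtype.__name__}, got {type(value).__name__}"
            )
    return problems


good = violations({"order_id": 1, "amount": 9.5, "coupon": None}, ORDERS_CONTRACT)
bad = violations({"order_id": "1", "amount": None}, ORDERS_CONTRACT)
print(good)
print(bad)
```

Running the check on the producer side, before data is published, is what makes the approach proactive: a bad record is rejected at the source instead of being discovered downstream.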