About the conference

SmartData 2020, a data engineering conference, took place on December 9-12, 2020.

Streaming

  • Flink;
  • Spark;
  • Kafka.

DBMS and big data storage

Using classic relational, columnar, NoSQL, and SMP/MPP storage systems to build a DWH:

  • Hive, Impala, Presto, Vertica, ClickHouse, Cassandra;
  • Teradata, Redshift, Greenplum, Exadata;
  • MSSQL, PostgreSQL;
  • MongoDB, DynamoDB;
  • S3, ADLS, GCS, HDFS.

DWH architecture

  • Data modeling;
  • Examples of building corporate data warehouses;
  • Operational analytics;
  • Ad-hoc reporting;
  • Hadoop.

Data governance

  • Data security;
  • Data quality;
  • Metadata and catalog management;
  • Master data management;
  • Data migration.

ETL building technologies

  • Spark;
  • Hadoop MapReduce;
  • Sqoop;
  • Performance analysis and optimization.

Orchestration and MLOps

  • Airflow, NiFi, Luigi, Azkaban, Oozie;
  • MLflow.

Other

  • Boxed cloud solutions;
  • Data engineering not for data engineers;
  • CI/CD for data pipelines;
  • Testing.

If you are interested in data engineering and want to be among the first to learn about emerging technologies, join us!

Speakers

Jeff Zhang Alibaba Group

Jeff has 11 years of experience in the big data industry. He is an open source veteran who started using Hadoop in 2009, a PMC member of the Apache Tez, Livy, and Zeppelin projects, and a committer on Apache Pig. His experience covers not only big data infrastructure but also how to use these tools to gain insight. He has spoken several times at big data conferences such as Hadoop Summit and Strata + Hadoop World. He now works at Alibaba Group as a staff engineer. Before that, he worked at Hortonworks, where he developed these popular big data tools.

Jacek Laskowski

Jacek is an IT freelancer specializing in Apache Spark, Delta Lake, Apache Kafka and Kafka Streams (with brief forays into the wider data engineering space, e.g. Presto). Jacek offers software development and consultancy services, along with hands-on, in-depth workshops and mentoring. He is best known for his online books, available free of charge at https://books.japila.pl/.

Neville Li Spotify

Neville is a data infrastructure engineer at Spotify and the creator of Scio. Over the years at Spotify he has been driving the adoption of Scala and new tools for data processing, including Scalding, Spark, Storm, Parquet, and now Apache Beam and Scio. Before that, he worked on search quality at Yahoo! and on old-school distributed systems like MPI.

Evgeny Legky Retable

Evgeny is a founder and CEO at Retable — a powerful cloud platform for visually exploring, cleaning and preparing data for Data Scientists and Data Engineers.

Evgeny is also a strategy consultant for high-growth Silicon Valley startups, helping them build scalable data stacks and data-oriented products.

Before that, he was a founder and CEO of Segmento, one of the biggest RTB startups in Russia (acquired by Sberbank), and a co-founder of the Hintlab AI laboratory. He has also worked as a software developer at JetBrains and LG.

Pavel Yakunin Russian Tech Centre Deutsche Bank

Lead developer and team lead of the big data team at Deutsche Bank's Investment division.

Pavel joined Deutsche Bank in 2014. Before that, he earned a Ph.D. in quantum optics and worked as a developer at a small hedge fund and then at Yandex. Pavel has been building big data systems at Deutsche Bank with his team for almost four years and is happy to share his experience.

Mikhail Maryfich Mail.Ru Group

Machine Learning Engineer at Mail.Ru Group, specializing in deep learning. Mikhail has worked in machine learning for over 4 years and solves problems end to end, from problem formulation through production rollout to ongoing system support. In his professional work, he values reproducible results and good development processes above all.

Olga Makarova ivi

Product analyst at ivi and Yandex. Ivi Big Data team manager.

Stanislav Bogatyrev NEO Saint Petersburg Competence Center

Co-founder and CIO of NEO Saint Petersburg Competence Center, where he's a lead of NeoFS development.

Before that, for over 15 years he worked in infrastructure and storage systems at Samsung Research, Clodo.ru, and DellEMC.

Nikolay Averin Miro

For the last 3 years, Nikolay has been working at Miro, where he migrates the service's data from Redis to PostgreSQL, implements a multi-tenant storage architecture at the application layer, and works on database scaling and fault tolerance. Half backend engineer, half DBA.

Moon soo Lee Staroid, Inc.

Lee Moon soo is a founder of staroid.com, a platform that bridges the gap between the open source community and enterprise users.

He has been working on building a sustainable open source ecosystem since he created the open source project Apache Zeppelin and built a business around it.

Evgeny Ermakov Yandex Go

More than 10 years of experience in IT. Architect of data warehouses and analysis systems at Mail.ru Group and Yandex Go. Candidate of Technical Sciences, author of more than 10 papers in data analysis, co-author of a monograph on the theory and practice of parallel database analysis.

Vladimir Verstov Yandex.Go

Lead of DMP development at Yandex.Go, with more than 10 years of experience in IT. At university, Vladimir worked on parallel and distributed computing, developed his own CAD system, and defended a Ph.D. in two specialties. He then spent 5 years in enterprise development in consulting. Vladimir has progressed from systems analyst to team and tech lead. For the last 4 years he has been working in data engineering at Yandex.Go.

Partners

We would not be able to hold SmartData on a regular basis without the tremendous support of our partners. Our conference is growing and evolving thanks to their efforts.

If you want to become a partner of our conference, please contact us via email: partners@cppconf.ru.