Metadataday2020

Metadata Day 2020 - Metaspeak Meetup

#metaspeak2020

December 14, 2020 at 4:00 PM PST

Join the conversation

The industry is tackling growing challenges in enabling productive data science in data-driven enterprises while maintaining data governance and compliance. Several popular open-source and commercial projects have emerged over the last few years, employing graph-based practices for managing and leveraging metadata. This event brings together these projects along with a larger community of metadata experts to help chart our common ground and work ahead.

Practitioners in this space have been invited to participate in an “unconference”-style workshop to compare notes and develop reports about both (1) the metadata use cases their organizations have and how their practices have evolved, and (2) the metadata challenges their organizations face and how they address them.

This meetup is the public forum where the reports from the Metaspace practitioners workshop will be presented, along with live audience Q&A in which a panel of expert industry practitioners answers metadata questions.

In addition, we’ll have a few selected lightning talks made available online ahead of time, to help prompt further audience discussions. We’ll also make an online survey form available for submitting questions for Q&A.

The moderators, speakers and influencers:

Paco Nathan

Metadata Organizing Committee / Managing Partner

Derwen, Inc

Shirshanka Das

Metadata Organizing Committee / Principal Staff Software Engineer
Creator: DataHub

Nadiya Hayes

Metadata Organizing Committee / Chief of Staff

Joe Hellerstein

Chief Strategy Officer and Professor

Trifacta and University of California, Berkeley

Kapil Surlaker

Vice President Of Engineering

Chris Williams

Staff Engineer, Data Visualization

Co-creator: Airbnb Dataportal

Airbnb

Natasha Noy

Research Scientist

Google

Mandy Chessell

Distinguished Engineer

IBM

Daniella Lowenberg

Product Manager & Make Data Count Principal Investigator

Dryad

Ian Mulvany

Chief Technology Officer

BMJ

Mark Grover

Product Manager

Co-creator: Amundsen

Lyft

Alejandro Saucedo

Engineering Director, Machine Learning

Seldon

Deborah McGuinness

Tetherless World Senior Constellation Chair, Professor, and CEO

RPI and McGuinness Associates Consulting

Ted Habermann

Small Business Owner

Metadata Game Changers

Charles Smith

Director - Data Platform Architecture

Netflix

Julien Le Dem

CTO & Co-Founder

Datakin

Deepak Chandramouli

Engineering Lead | Data Platform Services

PayPal

Igor Perisic

Chief Data Officer

Sunheng Taing

Senior Software Engineer

Tech Lead: Uber Databook

Uber

Satyen Sangani

CEO & Founder

Alation

Aaron Kalb

Co-Founder & CDAO

Alation

Daniel Rincon Silva

Product Manager

PayPal

Lightning Talk: Graph of Enterprise Metadata

Modern enterprises not only have a myriad of data sources, spanning real-time events, transactional systems, Big Data platforms, and many others, but also boast a rich ecosystem of thousands of APIs and a treasure of deep technical metadata. How do you organize and gain insights from all of this? In addition, there is a trove of metadata from sources such as data transformations, SQL queries, security scans, Slack chats, thousands of user hierarchies, orgs and locations, access controls, wiki pages, JIRA tickets, and more. Normally, these sources are all disconnected from each other, and valuable insights are missed. Enterprise metadata is central to data management and leads to effective data governance. In this context, we believe enterprise metadata transcends the traditional data catalog. We envision the Graph of Enterprise Metadata as a comprehensive knowledge graph that connects all the critical metadata under one umbrella, which will eventually lead us to effective data governance.

Speakers: Daniel Rincon Silva & Deepak Chandramouli

Lightning Talk: OpenLineage

As data becomes core to every product, data operations become critical. The OpenLineage API enables data pipeline observability.

Speaker: Julien Le Dem is the CTO and Co-Founder of Datakin. He co-created Apache Parquet and is involved in several open source projects including Marquez (LF AI), Apache Pig, Apache Arrow, Apache Iceberg and a few others. Previously, he was a senior principal at WeWork; principal architect at Dremio; tech lead for Twitter’s data processing tools, where he also obtained a two-character Twitter handle (@J_); and a principal engineer and tech lead working on content platforms at Yahoo, where he received his Hadoop initiation. His French accent makes his talks particularly attractive.

Lightning Talk: Use-Cases for Metadata at LinkedIn

Most people think that a catalog is only good for search and discovery.
But what could you do if you had an amazing metadata platform sitting behind your search and discovery application?
Shirshanka explores just that by talking about the different use-cases for metadata that are powered by DataHub’s metadata platform at LinkedIn.

Speaker: Shirshanka Das is a Principal Staff Software Engineer in the Data team at LinkedIn. He is responsible for creating and driving the vision for the LinkedIn DataHub and Apache Gobblin projects which power metadata and big data management at the company.

Lightning Talk: Measuring Metadata

Improving metadata for Earth observations is a long-term goal of mine. Assessing metadata completeness against community recommendations helps us understand past decisions by researchers and repositories and learn lessons for improving metadata today. Examples of assessments from several large repositories demonstrate how the assessment process can motivate and facilitate continuous metadata improvement efforts.

Speaker: Ted Habermann created Metadata Game Changers to focus on helping organizations improve metadata for data discovery, access, and understanding. Previously he was the Director of Earth Science at The HDF Group and worked for many years to improve data management, access, interoperability and documentation at NOAA’s National Centers for Environmental Information.

Lightning Talk: Metadata as Signifier

This lightning talk will take a brief look at the surprisingly social life of metadata. In research, the academic citation is the most important type of metadata, but why is it that the way a citation looks can affect the way an entire industry behaves? We will take a quick look!

Speaker: Ian Mulvany is CTO at BMJ. Previously he was head of transformation at SAGE Publishing. He helped set up SAGE’s methods innovation incubator, SAGE Ocean, following a lean product development approach. He ran technology operations for eLife, was head of product for Mendeley, and ran a number of early Web 2.0 products for Nature Publishing Group.

He is passionate about creating digital tools that support the research enterprise. He is interested in the interplay between different stakeholders that can lead to the sustainability of these kinds of tools.

Program Schedule:

4:00pm - 4:15pm Welcome address

Topic 1: Use-Cases for Leveraging Metadata

4:15pm - 4:30pm Presentation of outcomes from the Metaspace practitioners workshop

4:30pm - 5:00pm Panel discussion with audience Q&A

Topic 2: Hurdles in Practice for “Excellent” Metadata

5:00pm - 5:15pm Presentation of outcomes from the Metaspace practitioners workshop

5:15pm - 5:45pm Panel discussion with audience Q&A

5:45pm - 6:00pm Closing remarks
