You've been using MongoDB to store your application data for some time and now need to analyse that data as it changes. Debezium, an open source collection of Change Data Capture (CDC) connectors, is a fantastic way to get your data moving through Kafka. But there's a catch: MongoDB only tells you the new values of each change, not what they were before. Enter Kafka Streams. By treating our MongoDB replication topic as a stream, we can build a KTable in which each row represents a document from MongoDB. From this table we can create a new stream that carries both the old and the new values in a single record, enabling the change analysis we were after. In this talk we'll introduce Debezium and Kafka Streams and demonstrate, through worked examples, the configuration and code required to enable our change analysis. A fully working example system with MongoDB and Kafka will be made available on GitHub for attendees to continue experimenting with and to start using what they've learned.
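The core idea can be sketched without a running Kafka cluster. Conceptually, the KTable is a keyed store of the latest value per document, and each incoming change event is paired with the previous value before the store is updated. The plain-Java sketch below simulates that pattern with a HashMap standing in for the KTable's state store; the class name, record shapes, and output format are illustrative, not taken from the talk's actual code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Simplified stand-in for the pattern described above: the HashMap plays
// the role of the KTable's state store (latest document state per key),
// and each change event is paired with the prior value before the store
// is updated, yielding an "old -> new" record downstream.
public class ChangePairing {
    private final Map<String, String> table = new HashMap<>(); // the "KTable"
    private final List<String> output = new ArrayList<>();     // the derived stream

    // Process one CDC event: key = document id, newValue = new document state.
    public void onChange(String key, String newValue) {
        String oldValue = table.get(key); // null for a fresh insert
        table.put(key, newValue);         // update the latest-state "table"
        output.add(key + ": " + oldValue + " -> " + newValue);
    }

    public List<String> output() {
        return output;
    }

    public static void main(String[] args) {
        ChangePairing p = new ChangePairing();
        p.onChange("doc1", "{\"qty\": 1}");
        p.onChange("doc1", "{\"qty\": 2}");
        p.output().forEach(System.out::println);
    }
}
```

In real Kafka Streams code the same effect is achieved with a fault-tolerant state store (for example via an aggregation or a processor with an attached store) rather than an in-memory map, so the old/new pairing survives restarts and scales across partitions.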
April 28 - 8:50 AM - Analysing Changes in MongoDB with Debezium and Kafka Streams
Mike Fowler is an SRE in the Public Cloud Practice at Claranet. Combining his software and systems engineering skills with his system administration experience and passion for automation, he works on behalf of Claranet's customers to help them adopt Big Data and Machine Learning, often as part of cloud migration projects. Driven by a belief that humans should only do interesting things, Mike has spent many years automating aspects of his own job and business processes to make life better for himself and his colleagues. Mike is an open source advocate, having contributed to PostgreSQL, Terraform, Bigshift and YAWL.