Data Ingesting Pipeline with Roy Kim

From IoT Hub to SQL: Real-Time Factory Data Processing in Azure (an Access User Group talk with Roy Kim)

This talk explores how a modern IoT solution can collect, process, and analyze data from manufacturing environments in real time.

In this Access User Group presentation, Azure MVP Roy Kim provides a detailed walkthrough of a real-world IoT implementation for monitoring tape dispensing machines in manufacturing facilities. The solution demonstrates how to build a scalable data pipeline that can handle hundreds of thousands of data points from multiple factories, processing them through Azure services before storing them in Azure SQL Database for analysis.

Whether you're an Access developer looking to integrate IoT data into your applications or simply interested in understanding how modern cloud architectures handle massive real-time data collection, this presentation offers valuable insights into enterprise-scale data processing solutions.

Architecture Overview

Core Components

  • Azure IoT Hub for device communication
  • Event Hub for message queuing
  • Stream Analytics for data processing
  • Azure SQL Database for data storage
  • Custom web application for reporting

Key Features

  • Scalable to handle 100,000+ endpoints
  • Real-time data processing
  • Robust error handling
  • Message queuing for reliability
  • Secure device communication

Device Communication

Gateway Setup

  • Cellular modems used at each factory
  • Avoids interference with local WiFi networks
  • Acts as collection point for local devices
  • Uses MQTT protocol for messaging
  • Secure authentication via SAS (shared access signature) tokens
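
To make the token-based authentication above more concrete, here is a minimal sketch of how a shared access signature for IoT Hub is typically built: the resource URI and an expiry time are signed with HMAC-SHA256 using the device key. The hub name, device ID, and key in the usage note are made up for illustration, and the `now` parameter is an addition of this sketch so the result is repeatable. This is not taken from Roy's implementation.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse


def generate_sas_token(uri, key, policy_name=None, expiry_secs=3600, now=None):
    """Build a SAS token string for the given resource URI.

    uri: resource being accessed, e.g. "<hub>.azure-devices.net/devices/<id>"
    key: base64-encoded symmetric key for the device or policy
    now: optional Unix time override (an assumption of this sketch, for testing)
    """
    issued = int(time.time()) if now is None else int(now)
    expiry = issued + expiry_secs

    # The string to sign is the URL-encoded resource URI, a newline, and the expiry.
    to_sign = "%s\n%d" % (urllib.parse.quote_plus(uri), expiry)
    digest = hmac.new(
        base64.b64decode(key), to_sign.encode("utf-8"), hashlib.sha256
    ).digest()
    signature = base64.b64encode(digest).decode("utf-8")

    fields = {"sr": uri, "sig": signature, "se": str(expiry)}
    if policy_name:
        fields["skn"] = policy_name
    return "SharedAccessSignature " + urllib.parse.urlencode(fields)
```

A gateway would pass a token like this as the MQTT password when connecting. For example, `generate_sas_token("myhub.azure-devices.net/devices/dev1", device_key)` yields a token valid for one hour, where both the hub name and `device_key` are placeholders.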

Device Registration

  • Devices must be registered with IoT Hub
  • Gateway handles local device management
  • Each device has unique identifier
  • Supports multiple devices per gateway
  • Configurable through gateway interface

Data Processing Pipeline

Message Flow

  • Devices send JSON-formatted data
  • IoT Hub routes messages to Event Hub
  • Event Hub provides message queuing
  • Stream Analytics processes raw data
  • Processed data stored in SQL Database
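
As an illustration of the first step in that flow, the sketch below builds a JSON-formatted telemetry message the way a device or gateway might before sending it to IoT Hub. The field names (`deviceId`, `timestamp`, `readings`, `tapeLengthMm`) are hypothetical; the talk's actual message schema is not reproduced here.

```python
import json
from datetime import datetime, timezone


def build_telemetry(device_id, readings, now=None):
    """Serialize one telemetry message as UTF-8 encoded JSON.

    All field names are illustrative assumptions, not the schema from the talk.
    """
    stamp = now if now is not None else datetime.now(timezone.utc)
    message = {
        "deviceId": device_id,
        "timestamp": stamp.isoformat(),  # timezone-aware, so downstream can convert
        "readings": readings,
    }
    return json.dumps(message).encode("utf-8")


payload = build_telemetry("dispenser-01", {"tapeLengthMm": 120})
```

The resulting bytes are what would travel over MQTT; everything downstream (Event Hub, Stream Analytics, SQL) sees the same JSON document.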

Data Handling

  • UTC time standardization
  • JSON parsing and flattening
  • Data transformation rules
  • Error handling and recovery
  • Scalable processing units
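
The first two items, UTC standardization and JSON flattening, can be sketched in a few lines of plain Python. In the pipeline described here this work happens inside Stream Analytics, so the code below only illustrates the idea. The field names and the rule that timestamps without an offset are treated as UTC are assumptions of this sketch.

```python
from datetime import datetime, timezone


def to_utc(timestamp):
    """Convert an ISO 8601 timestamp string to UTC."""
    parsed = datetime.fromisoformat(timestamp)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).isoformat()


def flatten(obj, prefix=""):
    """Flatten nested dictionaries into one level, joining keys with '_'."""
    flat = {}
    for key, value in obj.items():
        name = prefix + key
        if isinstance(value, dict):
            flat.update(flatten(value, name + "_"))
        else:
            flat[name] = value
    return flat


message = {
    "deviceId": "dispenser-01",
    "timestamp": "2024-05-01T08:30:00-04:00",
    "readings": {"tapeLengthMm": 120, "motor": {"tempC": 41.5}},
}
row = flatten(message)
row["timestamp"] = to_utc(row["timestamp"])
```

After this step `row` has one key per column (`deviceId`, `timestamp`, `readings_tapeLengthMm`, `readings_motor_tempC`), which maps naturally onto a single SQL table row.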

Monitoring and Management

Key Metrics

  • Message throughput
  • Processing delays
  • Resource utilization
  • Error rates
  • Backlog monitoring

Administrative Features

  • Real-time monitoring dashboards
  • Error notifications
  • Performance scaling
  • Message replay capabilities
  • Diagnostic tools

Conclusion

Roy's presentation demonstrates how Azure's IoT platform can be used to build industrial-scale data collection systems that are both robust and scalable. Although the architecture is complex, the solution provides a reliable foundation for collecting and analyzing manufacturing data across multiple facilities, with built-in features for handling errors, scaling, and maintenance.

Recording

The full recording is available on YouTube.

Join Live!

Want to get even more out of these presentations? Join the live Access User Group events! The next upcoming events are listed on the AUG Event Calendar.

Attending live gives you the opportunity to:

  • Interact directly with presenters during Q&A sessions
  • Network with other Access developers
  • Share your own experiences and challenges
  • Get immediate answers to your specific questions
  • Participate in group discussions

With multiple user groups across different time zones (and languages!), you're sure to find a meeting time that works for your schedule.

Acknowledgements

  • Base cover image generated by FLUX-schnell
  • Initial draft generated by Claude-3.5-Sonnet

All original code samples by Mike Wolfe are licensed under CC BY 4.0