Data Ingesting Pipeline with Roy Kim
From IoT Hub to SQL: Real-Time Factory Data Processing in Azure (an Access User Group talk with Roy Kim)

This talk is a fascinating exploration of how modern IoT solutions can collect, process, and analyze data from manufacturing environments in real time.
In this Access User Group presentation, Azure MVP Roy Kim provides a detailed walkthrough of a real-world IoT implementation for monitoring tape dispensing machines in manufacturing facilities. The solution demonstrates how to build a scalable data pipeline that can handle hundreds of thousands of data points from multiple factories, processing them through Azure services before storing them in Azure SQL Database for analysis.
Whether you're an Access developer looking to integrate IoT data into your applications or simply interested in understanding how modern cloud architectures handle massive real-time data collection, this presentation offers valuable insights into enterprise-scale data processing solutions.
Architecture Overview
Core Components
- Azure IoT Hub for device communication
- Event Hub for message queuing
- Stream Analytics for data processing
- Azure SQL Database for data storage
- Custom web application for reporting
Key Features
- Scalable to handle 100,000+ endpoints
- Real-time data processing
- Robust error handling
- Message queuing for reliability
- Secure device communication
Device Communication
Gateway Setup
- Cellular modems used at each factory
- Avoids interference with local WiFi networks
- Acts as collection point for local devices
- Uses MQTT protocol for messaging
- Secure authentication via SAS (shared access signature) tokens
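To make the token-based authentication above more concrete, here is a minimal sketch of how a shared access signature (SAS) token for a single IoT Hub device is commonly built: an HMAC-SHA256 signature over the URL-encoded resource URI and an expiry time. The hub name, device ID, and key below are made-up placeholders, and the `now` parameter is an addition for repeatable output. The talk does not show the gateway's actual code, so treat this as an illustration only.

```python
import base64
import hashlib
import hmac
import time
import urllib.parse


def generate_sas_token(resource_uri, device_key_b64, ttl_seconds=3600, now=None):
    """Build a SAS token string for the given resource URI.

    resource_uri   -- e.g. "<hub>.azure-devices.net/devices/<device id>"
    device_key_b64 -- the device's symmetric key as base64 text
    now            -- optional Unix time, only here to make the output repeatable
    """
    issued_at = int(time.time()) if now is None else int(now)
    expiry = issued_at + ttl_seconds

    # The string to sign is the URL-encoded resource URI, a newline, and the expiry.
    encoded_uri = urllib.parse.quote_plus(resource_uri)
    string_to_sign = f"{encoded_uri}\n{expiry}".encode("utf-8")

    # The key is stored as base64 text and must be decoded before signing.
    key = base64.b64decode(device_key_b64)
    digest = hmac.new(key, string_to_sign, hashlib.sha256).digest()
    signature = base64.b64encode(digest).decode("ascii")

    # urlencode percent-encodes the URI and the signature for us.
    fields = {"sr": resource_uri, "sig": signature, "se": str(expiry)}
    return "SharedAccessSignature " + urllib.parse.urlencode(fields)


if __name__ == "__main__":
    # Placeholder values, not real credentials.
    placeholder_key = base64.b64encode(b"not-a-real-key").decode("ascii")
    print(generate_sas_token(
        "example-hub.azure-devices.net/devices/dispenser-001",
        placeholder_key,
        ttl_seconds=600,
    ))
```

When a device connects over MQTT, a token like this is typically supplied as the password, so the device key itself never travels over the network and each token expires on its own.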
Device Registration
- Devices must be registered with IoT Hub
- Gateway handles local device management
- Each device has unique identifier
- Supports multiple devices per gateway
- Configurable through gateway interface
Data Processing Pipeline
Message Flow
- Devices send JSON-formatted data
- IoT Hub routes messages to Event Hub
- Event Hub provides message queuing
- Stream Analytics processes raw data
- Processed data stored in SQL Database
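As a rough illustration of the first step in this flow, the sketch below builds a JSON-formatted telemetry message of the kind a device or gateway might send. The field names (`deviceId`, `recordedAt`, `reading`, `tapeDispensedMm`, `status`) are hypothetical; the talk does not publish the real message schema.

```python
import json
from datetime import datetime, timezone


def build_telemetry_message(device_id, tape_dispensed_mm, status, recorded_at=None):
    """Serialize one reading as compact UTF-8 JSON (hypothetical schema)."""
    recorded_at = recorded_at or datetime.now(timezone.utc)
    payload = {
        "deviceId": device_id,
        # Always send the timestamp in UTC so downstream steps need no guessing.
        "recordedAt": recorded_at.astimezone(timezone.utc).isoformat(),
        "reading": {
            "tapeDispensedMm": tape_dispensed_mm,
            "status": status,
        },
    }
    return json.dumps(payload, separators=(",", ":")).encode("utf-8")


if __name__ == "__main__":
    print(build_telemetry_message("dispenser-001", 125.5, "ok").decode("utf-8"))
```

Keeping the reading nested under its own key is one possible design choice. It keeps device metadata separate from measurements, at the cost of needing a flattening step before the data lands in a relational table.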
Data Handling
- UTC time standardization
- JSON parsing and flattening
- Data transformation rules
- Error handling and recovery
- Scalable processing units
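The first two items can be sketched in a few lines. In the solution described in the talk this work is done by Stream Analytics; the Python below is only a stand-in that shows the shape of the transformation, and the message fields are hypothetical. It also assumes that a timestamp without a UTC offset is already in UTC, which a real pipeline would need to confirm per device.

```python
from datetime import datetime, timezone


def to_utc_iso(timestamp_text):
    """Parse an ISO 8601 timestamp and return it normalized to UTC."""
    parsed = datetime.fromisoformat(timestamp_text)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).isoformat()


def flatten(record, parent_key="", sep="_"):
    """Turn nested dictionaries into a single level of column-like keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


if __name__ == "__main__":
    message = {
        "deviceId": "dispenser-001",
        "recordedAt": "2024-01-15T08:30:00-05:00",
        "reading": {"tapeDispensedMm": 125.5, "status": "ok"},
    }
    row = flatten(message)
    row["recordedAt"] = to_utc_iso(row["recordedAt"])
    print(row)
```

After this step each message is a flat set of key/value pairs with a single, unambiguous timestamp, which maps directly onto the columns of a SQL table.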
Monitoring and Management
Key Metrics
- Message throughput
- Processing delays
- Resource utilization
- Error rates
- Backlog monitoring
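To show how these metrics relate to one another, here is a small sketch that derives them from raw counts over one monitoring window. The `WindowStats` structure and its field names are invented for this example. In practice the Azure services report comparable figures themselves, so this is about the arithmetic and not about how the monitoring was implemented.

```python
from dataclasses import dataclass, field


@dataclass
class WindowStats:
    """Raw counts for one monitoring window (hypothetical structure)."""
    received: int          # messages that arrived in the window
    processed: int         # messages successfully written to storage
    failed: int            # messages that raised an error
    window_seconds: float  # length of the window
    delays_seconds: list = field(default_factory=list)  # arrival-to-storage time per message


def summarize(stats):
    """Derive throughput, error rate, backlog, and worst-case delay."""
    return {
        "throughput_per_second": stats.processed / stats.window_seconds,
        "error_rate": stats.failed / stats.received if stats.received else 0.0,
        # Whatever was received but neither processed nor failed is still waiting.
        "backlog": stats.received - stats.processed - stats.failed,
        "max_delay_seconds": max(stats.delays_seconds, default=0.0),
    }


if __name__ == "__main__":
    print(summarize(WindowStats(1000, 900, 20, 60.0, [0.5, 2.0, 1.2])))
```

A backlog that keeps growing from one window to the next is usually the earliest sign that processing capacity needs to be scaled up.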
Administrative Features
- Real-time monitoring dashboards
- Error notifications
- Performance scaling
- Message replay capabilities
- Diagnostic tools
Conclusion
Roy's presentation demonstrates how Azure's IoT platform can be leveraged to build industrial-scale data collection systems that are both robust and scalable. While complex in architecture, the solution provides a reliable foundation for collecting and analyzing manufacturing data across multiple facilities, with built-in features for handling errors, scaling, and maintenance.
Recording
The full recording is available on YouTube.
Join Live!
Want to get even more out of these presentations? Join the live Access User Group events! Upcoming events are listed on the AUG Event Calendar.
Attending live gives you the opportunity to:
- Interact directly with presenters during Q&A sessions
- Network with other Access developers
- Share your own experiences and challenges
- Get immediate answers to your specific questions
- Participate in group discussions
With multiple user groups across different time zones (and languages!), you're sure to find a meeting time that works for your schedule.
Acknowledgements
- Base cover image generated by FLUX-schnell
- Initial draft generated by Claude-3.5-Sonnet