Azure Databricks shows up everywhere in Azure's recommended reference cases. Please don't ask me what Databricks is! Tonight I'm sharing one AI architecture: Batch scoring of Spark models on Azure Databricks.
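This scenario is very common. An asset-heavy industrial business wants to cut production costs and increase uptime, so it needs to reduce unexpected mechanical failures. To do that, it can use IoT data collected from its machines to build a machine learning model for predictive maintenance. Maintenance and repairs can then happen before a failure occurs, so the equipment keeps running, and earning money, for longer! The whole workflow runs on Databricks/Spark: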
- Data ingestion: Ingest the data from the external data store onto an Azure Databricks data store.
- Modeling: Train a machine learning model by transforming the data into a training data set, then building a Spark MLlib model. MLlib consists of the most common machine learning algorithms and utilities, optimized to take advantage of Spark's data scalability capabilities.
- Prediction: Apply the trained model to predict (classify) component failures by transforming the data into a scoring data set. Score the data with the Spark MLlib model.
- Store results: Store results on the Databricks data store for post-processing consumption.
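The GitHub repo has Jupyter/IPython notebooks: just open them, tweak a few things, and they run! The author must really love the red bricks from the brickyard! A whole screen full of bricks, a tribute to the classic game Battle City!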