Application of data mining methods to establish systems for early warning and proactive control in food supply chain networks

Y. Li

Research output: Thesis, internal PhD, WU

Abstract

Food quality problems in Food Supply Chain Networks (FSCN) have not only brought losses to the food industry, but have also posed risks to the health of consumers. Information systems are widely used in current FSCN. These systems contain data about various aspects of food production (e.g. primary inputs, operations) at different stages of FSCN. By applying Data Mining (DM) methods to those data sets, managers can identify the causes of newly encountered problems, and also predict and prevent those problems. However, managers are often non-experts in the DM area. In this research, a framework for Early Warning and Proactive Control (EWPC) systems has been designed, and a prototype system based on this framework has been implemented. Such systems enable managers to employ the power of DM methods to predict and prevent the problems they encounter. Moreover, such systems enable managers to accumulate the knowledge they obtain from data analysis in a Knowledge Base, so that other managers can use it when they encounter similar types of problems. This research has two major objectives:
Objective 1.
To design a framework for EWPC systems that can:
• analyze relations between problems and causes
• predict upcoming problems
• suggest control actions to prevent upcoming problems
• use existing databases in FSCN
• support non-expert users in applying DM methods
• have an extendable knowledge base
The framework should describe the necessary components as well as the relations between those components in EWPC systems.
Objective 2.
To build a prototype system based on the framework to enable managers in FSCN, as non-experts in DM, to use DM methods for Early Warning and Proactive Control on the supply chain level.
Research questions
In order to realize those objectives, six research questions were formulated.
1. What are the requirements for EWPC system design considering current practice of FSCN management?
2. What components should be included in the EWPC systems, and how should those components cooperate to enable managers to achieve EWPC in FSCN?
3. What Data Mining methods are available and applicable for EWPC in FSCN?
4. What support needs to be provided to managers in order to enable them to use Data Mining methods for EWPC?
5. What kind of structure is suitable for the Knowledge Base in EWPC systems?
6. What is the validity of the designed framework and prototype system?
To answer these questions, we used both literature review and case analysis. We studied the literature from areas such as Decision Support Systems, Data Mining, Supply Chain Management, Ontology Engineering, and Knowledge Engineering. The cases we analyzed came from two food companies. From these cases we studied what kind of system would enable managers to realize EWPC in FSCN. During case analysis, we communicated with managers in those companies about the problems they encountered, the relevant data sets, and the objectives they wanted to achieve. The data sets obtained from those cases normally have more than ten fields and millions of records. By applying different DM methods to those cases, we accumulated knowledge on the applicability of those methods as well as on the generic processes of applying those methods for EWPC. Moreover, we categorized the types of knowledge obtained from problem investigation in order to design a proper structure for the Knowledge Base.
Regarding the first research question (what are the requirements for EWPC system design considering current practice of FSCN management?), our study distinguished three types of requirements: performance requirements concerning the time needed for using this system, specific quality requirements concerning the sufficiency and comprehensibility of the assistance this system can offer, and functional requirements. There are six functional requirements:
1) facilitate the quantitative formulation of problems;
2) guide data joining and data preparation;
3) guide managers in using DM methods for quantitative modeling;
4) predict the problem as early as possible;
5) support the evaluation of different control measures;
6) provide relevant knowledge for encountered problems, and accommodate new knowledge obtained during problem solving and decision making.

Regarding the second research question (what components should be included in the EWPC systems, and how should those components cooperate to enable managers to achieve EWPC in FSCN?), our study defined the following major components for the framework:
• Task classifier and Template approaches: to direct managers to follow the correct processes when they deal with an encountered problem. The Task classifier helps users to quickly identify their task type: identifying a problem, finding relevant data, exploring potential causal factors for the problem, predicting upcoming problems, evaluating alternative control measures, or consulting the Knowledge Base. Each task is supported by a corresponding Template approach.
• Knowledge Base: stores information (e.g. causal factors, causal relations) on previously encountered problems in FSCN for easy reference by other users.
• DM methods library and Expert System: the DM methods library stores information (function, model format, and requirements on data sets) about the DM methods that can be used for EWPC. The Expert System gives suggestions on which DM methods to use and explains its reasoning.
• Explorer and Predictor: the Explorer component allows users to explore potential causal factors for the problems in FSCN. The Predictor warns about problems that are about to occur in FSCN. It is used for decision evaluation as well. Users can employ models built previously to compare results of different available decisions and choose the best one.
In addition to the specification of those components, we also defined the steps that are needed for using the system, as well as the correct sequence between those steps.
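The decision-evaluation use of the Predictor can be illustrated with a minimal Python sketch. It assumes that a previously built model is available as a function returning a predicted problem risk. The function `toy_risk_model`, its coefficients, the field names `storage_temp_c` and `transport_hours`, and the candidate decisions are all hypothetical placeholders and are not taken from the thesis or its prototype.

```python
from typing import Callable, Mapping

Decision = Mapping[str, float]


def toy_risk_model(decision: Decision) -> float:
    """Hypothetical stand-in for a previously built predictive model.

    Returns a predicted problem risk between 0 and 1. The coefficients
    are made up purely for illustration.
    """
    risk = (
        0.05
        + 0.04 * decision["storage_temp_c"]
        + 0.01 * decision["transport_hours"]
    )
    return max(0.0, min(1.0, risk))


def evaluate_decisions(
    model: Callable[[Decision], float],
    decisions: Mapping[str, Decision],
) -> list[tuple[str, float]]:
    """Score every candidate decision and sort by predicted risk, lowest first."""
    scored = [(name, model(settings)) for name, settings in decisions.items()]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical control measures a manager might want to compare.
candidates = {
    "keep current settings": {"storage_temp_c": 8.0, "transport_hours": 12.0},
    "lower storage temperature": {"storage_temp_c": 4.0, "transport_hours": 12.0},
    "shorten transport": {"storage_temp_c": 8.0, "transport_hours": 6.0},
}

ranking = evaluate_decisions(toy_risk_model, candidates)
best_name, best_risk = ranking[0]
for name, risk in ranking:
    print(f"{name}: predicted risk {risk:.2f}")
```

Because the model is passed in as an argument, the same comparison would work with any model that maps a decision to a predicted outcome.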
Regarding the third research question (what Data Mining methods are available and applicable for EWPC in FSCN?), our study identified six requirements at the DM method level: prediction, problem detection, finding determinant factors, representing complex structure, different representation forms, and extensibility with new knowledge. The first four functional requirements deal with functions of DM methods. In the DM area, functions are categorized differently (e.g. classification, regression). Our study provided a mapping between these two kinds of functions. After that, we selected a list of widely used DM methods and identified which method can accomplish which DM function. The last two functional requirements relate to the representation form of DM methods. Our study provided another mapping between the representation forms of those DM methods and their extensibility with new knowledge.
Regarding the fourth research question (what support needs to be provided to managers in order to enable them to use Data Mining methods for EWPC?), our study found two aspects of support that are needed to enable managers to use DM methods. The first is how to find a proper DM method. This can be supported with an Expert System for DM method selection and a DM methods library. Managers get suggestions on which DM method is proper after they specify their case situation and data set characteristics to the Expert System. The second is how to use the selected DM method for EWPC. This is supported with Template approaches for data analysis. Those Template approaches tell users how to execute each step, which performance indicator to look at, and what to do if a particular situation occurs.
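The interplay between a DM methods library and a rule-based method selector can be sketched as follows. This is a minimal illustration under stated assumptions, not the Expert System of the prototype: the library entries, the `handles_missing` attribute and its values, and the function `suggest_methods` are hypothetical, and whether a method tolerates missing values depends in practice on the implementation used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MethodEntry:
    name: str
    function: str          # DM function, e.g. "classification" or "regression"
    model_format: str      # form of the resulting model
    handles_missing: bool  # simplified data set requirement (assumed value)


# Illustrative library only; entries and attribute values are assumptions.
LIBRARY = [
    MethodEntry("decision tree", "classification", "tree", True),
    MethodEntry("logistic regression", "classification", "equation", False),
    MethodEntry("linear regression", "regression", "equation", False),
]


def suggest_methods(function, has_missing_values, library=LIBRARY):
    """Return (method name, suitable?, reason) for every method in the library.

    The reason string is a simple way of letting the selector explain itself.
    """
    report = []
    for method in library:
        if method.function != function:
            reason = f"performs {method.function}, not {function}"
            report.append((method.name, False, reason))
        elif has_missing_values and not method.handles_missing:
            reason = "data set has missing values, assumed unsupported"
            report.append((method.name, False, reason))
        else:
            reason = (
                f"performs {function} and yields a model "
                f"in {method.model_format} form"
            )
            report.append((method.name, True, reason))
    return report


# A manager describes the case: a classification task on data with gaps.
report = suggest_methods("classification", has_missing_values=True)
for name, suitable, reason in report:
    print(f"{name}: {'suggested' if suitable else 'not suggested'} ({reason})")
```

Returning a reason for rejected methods as well as for suggested ones keeps the advice transparent for users who are not DM experts.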
Regarding the fifth research question (what kind of structure is suitable for the Knowledge Base in EWPC systems?), our study defined a structure with two parts: a rule base and an inference structure. The rule base allows managers to store the knowledge they have obtained, by specifying which causal relations and/or remedies have been found. The rule base should contain an ontology that guarantees a consistent semantic meaning of the terms in each rule. The inference structure allows managers to quickly identify relevant knowledge. It communicates with users and uses the inference mechanism to find applicable knowledge.
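A two-part structure of this kind can be sketched in a few lines of Python. The sketch assumes a very simple ontology (a mapping from accepted synonyms to canonical terms) and a very simple inference step (a lookup by canonical problem term). The class `KnowledgeBase`, the `Rule` fields, and all example terms and remedies are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass

# Hypothetical ontology: each accepted synonym maps to one canonical term.
ONTOLOGY = {
    "product returns": "product returns",
    "returned products": "product returns",
    "storage temperature": "storage temperature",
    "cold store temperature": "storage temperature",
}


@dataclass(frozen=True)
class Rule:
    problem: str  # canonical term for the problem
    cause: str    # canonical term for the causal factor
    remedy: str   # free-text remedy


class KnowledgeBase:
    def __init__(self, ontology):
        self.ontology = ontology
        self.rules = []

    def canonical(self, term):
        """Translate a user's term into its canonical ontology term."""
        key = term.strip().lower()
        if key not in self.ontology:
            raise KeyError(f"term not in ontology: {term!r}")
        return self.ontology[key]

    def add_rule(self, problem, cause, remedy):
        """Store a rule; unknown terms are rejected to keep meanings consistent."""
        rule = Rule(self.canonical(problem), self.canonical(cause), remedy)
        self.rules.append(rule)
        return rule

    def find_rules(self, problem):
        """Minimal inference step: return all rules about the given problem."""
        target = self.canonical(problem)
        return [rule for rule in self.rules if rule.problem == target]


kb = KnowledgeBase(ONTOLOGY)
# One manager records a finding using their own wording.
kb.add_rule("Returned products", "cold store temperature", "check the cooling unit")
# Another manager later queries with a different synonym.
matches = kb.find_rules("product returns")
```

Normalising terms on both storage and lookup is what lets the second manager find the rule even though the two managers used different wording.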
Regarding the sixth research question (what is the validity of the designed framework and prototype system?), our study first assigned appropriate Key Performance Indicators (KPIs) to the different design aspects of the system (e.g. framework, design methodology), and then employed Expert Validation to evaluate the performance of each design aspect on each KPI. The results of the Expert Validation roughly met our expectations. The Expert Validation also showed that experts in FSCN management attach particular importance to the potential of the developed system.
The main contribution of this research is that it integrates different scientific areas into one decision support architecture. This architecture presents a new means to monitor and control the processes for Supply Chain Management. Managers with EWPC can handle new problems with existing data resources. The scope of problems that can be solved is not restricted beforehand. For Data Mining, this study extends the existing research on the applicability of DM methods by creating a bridge between real-world needs and the DM area. For Knowledge Engineering, this study creates a suitable Knowledge Base structure for sharing knowledge among non-expert users by linking Knowledge Management with Ontology Engineering.
As far as the impact of this research for supply chain managers is concerned, we advise them to ensure that the requirements for effective proactive control are fulfilled. The framework presented in this thesis supports supply chain managers by providing them with usable DM methods for obtaining new insights through modelling and application of new data. For food quality managers in FSCN, the implication is that an EWPC system can be used to explore causal factors when a problem occurs. Food quality managers can also verify hypotheses about the causes with the EWPC system. By using this system together with other problem investigation strategies, such as field investigation, managers can improve the efficiency and effectiveness of problem solving. We advise information technology managers to use such a system to enforce correct and continuous data collection mechanisms. In this system, the facilities for handling outliers and missing values enable managers to easily identify problems in the collected data. FSCN managers from practice recognize the potential of the system and of the knowledge stored in it for improving decision support by making Data Mining applicable for non-experts.


Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Wageningen University
Supervisors/Advisors
  • Beulens, Adrie, Promotor
  • van der Vorst, Jack, Promotor
Award date: 21 May 2010
Place of Publication: [S.l.
Print ISBNs: 9789085856382
Publication status: Published - 2010

Keywords

  • control
  • food supply
  • management
  • food industry
  • data mining
  • supply chain management
  • agro-industrial chains
  • control systems
  • decision support systems
  • knowledge systems
  • business management
