Artificial Neural Network Approach for Business Decision Making applied to a Corporate Relocation Problem

This paper presents a new Artificial Neural Network approach to making a business decision. A corporate relocation problem is considered as an example of a business decision and the new approach is applied to select a city for corporate relocation and to rank a set of potential alternatives. Selecting the location of corporate real estate can be key to optimizing an organization’s success. This is the first time Artificial Neural Networks have been used for this sort of business application. The Neural Network behaved satisfactorily and provided 91.76% accuracy when tested against randomly generated test sets. Five potential cities were considered: New York City, Washington D.C., Atlanta, Los Angeles and Portland. Decision makers identified six criteria: Financial Considerations, Employee Availability, Support Services, Cultural Opportunities, Leisure Activities, and Climate. A suitable city is recommended that provides an appropriate solution and the outcome of the new approach is also used to rank potential cities based on their suitability.


INTRODUCTION
This paper presents a new Artificial Neural Network (ANN) approach to assist with business decisions. The new approach is applied to a real world corporate relocation problem considered in [1]. Results from applying the new approach are validated against results presented in [1] and advantages of the new approach are presented.
Reducing operating costs and enhancing commercial competitiveness have often been reasons for corporate relocation. Christersson and Rothe [4] suggested five reasons for corporate relocation: Economic; Social; Environmental; Organizational identity, culture and image. Hassanain et al. [5] claimed that the main reason for relocation was often to reduce cost. Corporate relocation decisions are important long-term strategic decisions in the lifetime of any organization. Relocation decisions often directly affect an organization's existence and effectiveness [2][3][4].
Many researchers have used the term "Black Box" to represent the corporate relocation decision process [6]. Haddad et al. [1] presented a framework that demystified part of that "Black Box" and helped decision makers to understand the corporate relocation decision process and to select an appropriate location that achieved their anticipated goals using Multiple Criteria Decision Making (MCDM) methods. A real world example was presented for ranking five cities in the United States of America based on their suitability for corporate relocation. The new approach described in this paper used MCDM concepts to calculate the overall score for the alternatives used in training and testing the Neural Network. This is the first time ANNs have been used in this way and for this sort of application.
The next Section briefly describes corporate relocation decisions. Section 3 describes LSTM Neural Networks and Section 4 presents the new approach to making business decisions and applies it to a corporate relocation problem. Section 5 validates the results of the new approach, Section 6 discusses the results and Section 7 presents some concluding remarks and future work.

CORPORATE RELOCATION DECISIONS
Corporate relocation is often considered as the physical move of an organization from old sites to new sites and often includes all the processes and services required to successfully complete the relocation. A corporate relocation decision is often seen as a major and long-term strategic decision in the life span of an organization. Relocation decisions often affect an organization's existence and competitiveness [2][3][4].
Several factors should be taken into consideration when making relocation decisions: cost of relocation, employees' satisfaction and disruption to work progress during the relocation process. Researchers have suggested that the main reason for corporate relocation was to improve corporate profitability [7 & 8]. Others claimed that cultural, geographical, and legal factors could be the most important reasons for relocation [4, 9 & 10]. Relocation has a number of effects on an organization and its employees. Glatte [3] proposed four corporate relocation concepts; this paper is concerned with site selection. Christersson et al. [10] considered relocation decisions as an added value function for Real Estate Management. Rothe and Heywood [11] claimed that the corporate relocation process was not part of an organization's "day-to-day" activities, and stressed that it was the function of a Corporate Real Estate (CRE) department since it often included physical relocation. All the services and processes required to successfully complete the relocation, and all the post-move settling in and adjustments required after the relocation, are conducted by the CRE department.
Corporate relocation decisions can be prone to distortion as decisions can often be made in fuzzy, risky and uncertain conditions, where many assumptions might be considered, and outcomes could have major effects. Using more complex and computationally inexpensive scientific algorithms, for example AI and ANN, could help achieve better decisions.
Corporate relocation decisions involve many factors and have major effects on organization survival. ANNs could produce reliable decisions when a large number of inputs are considered.

LSTM NEURAL NETWORKS
Despite their biologically inspired name, ANNs are mathematical algorithms often used to model non-linear and complex functions. The basic unit of an ANN is called a perceptron and one is shown in Figure 1. Perceptrons are responsible for conducting mathematical operations and modeling non-linear functions. ANNs are often constructed by combining several perceptrons to create a layer of perceptrons, then adding multiple layers to create a Network as shown in Figure 2.

Figure 2. ANN model
A perceptron can be trained on a set of samples using a "learning algorithm". To train a Network, the perceptron weights are adjusted in proportion to the difference between the correct output and the perceptron output for each sample. The aim is to minimize the difference between the correct output and the Network output. This is achieved by adjusting the weights by small amounts, scaled by the "learning rate", and running all the samples again. Running all the samples in a set and adjusting the weights to minimize the difference is "one epoch".
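The training procedure described above can be illustrated with a minimal single-perceptron sketch. It is not the Network used in this paper: the threshold activation, the zero initial weights, the learning rate of 0.1 and the logical AND samples are all assumptions chosen only to keep the example small.

```python
def step(z):
    """Threshold activation: output 1 when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0


def predict(weights, bias, inputs):
    """Perceptron output for one sample."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_perceptron(samples, learning_rate=0.1, epochs=100):
    """Adjust weights in proportion to (correct output - perceptron output)."""
    weights = [0.0] * len(samples[0][0])  # assumption: start from zero weights
    bias = 0.0
    for _ in range(epochs):  # one pass over all samples is one epoch
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical toy samples (logical AND), not data from the paper.
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_SAMPLES)
outputs = [predict(weights, bias, inputs) for inputs, _ in AND_SAMPLES]
print(outputs)
```

Each weight update is the learning rate multiplied by the output error and the corresponding input, so samples that are already classified correctly leave the weights unchanged.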
An ANN often consists of an input layer, an output layer and hidden layers in between. Different types of ANN are used for different types of applications. This paper considered an LSTM Neural Network. LSTM Neural Networks are often regarded as a type of Recurrent Neural Network. They were first introduced by Hochreiter and Schmidhuber [12] in 1997. Since then, many researchers have worked on simplifying their structure, and enhancing their efficiency and accuracy [13 & 14]. Due to the growth of statistical modeling in science, researchers are using more complex methods and algorithms, such as ANNs, to deal with problems concerned with patterns and prediction [15]. ANNs can show excellent predictive power compared to traditional approaches. ANNs are receiving increased attention as powerful, flexible modeling techniques for predicting patterns in data [16]. LSTMs have been successfully used in many fields of science including handwriting recognition, text completion, vehicle trajectory prediction and pattern recognition [16 & 17]. LSTM networks evaluate mappings from an input sequence to an output sequence by calculating the network units' activations [18]. Deep LSTM Networks are often created by stacking several LSTM layers. Input sets go through multiple non-linear layers with each layer identifying specific features of a set. After the data are processed through the layers, the network can recognize the appropriate identifiers for classifying the data into appropriate classes [19]. Due to their classification and predictive capabilities, ANNs could produce suitable and reliable outcomes for corporate relocation problems.

THE NEW APPROACH APPLIED TO CORPORATE RELOCATION
To demonstrate the new business decision making approach, it was applied to a corporate relocation problem considered in [1]. The corporate relocation problem identified six decision criteria: Financial Considerations, Employee Availability, Support Services, Cultural Opportunities, Leisure Activities, and Climate (Seasonal, and Year Round). Five potential cities were identified and are shown in Figure 3. The five cities were: New York City; Washington D.C.; Atlanta, Georgia; Los Angeles, California; and Portland, Oregon. Decision makers identified the most important criteria and provided weights showing their relative importance. Performance measures for all the alternatives (cities) with respect to all decision criteria were also identified. The performance measures signified how suitable the alternatives were with respect to each criterion. A high performance measure indicated good suitability. If two alternatives showed the same performance with respect to a specific criterion, then their performance measures with respect to that criterion were set equal. The performance measures of the five cities and the criteria weights are shown as a decision matrix in Table 1. Financial Considerations was identified as the most important criterion and was given the highest weight, 0.428. It was followed by Employee Availability and Support Services, which were equally important and were each given a weight of 0.207. The fourth most important criterion was Leisure Activities, with a weight of 0.063, followed by Climate, with a weight of 0.053. The least important criterion was Cultural Opportunities, which was given the lowest weight, 0.041.
Los Angeles scored highest with respect to Financial Considerations followed by New York City, Atlanta and Washington D.C. respectively. Portland scored lowest with respect to Financial Considerations. Los Angeles scored highest with respect to Employee Availability followed by Portland, Atlanta and Washington D.C. respectively. New York City scored lowest with respect to Employee Availability. New York City scored highest with respect to Support Services followed by Los Angeles, Portland and Atlanta respectively. Washington D.C. scored lowest with respect to Support Services. Los Angeles scored highest with respect to Cultural Opportunities followed by New York City, Washington D.C. and Atlanta respectively. Portland scored lowest with respect to Cultural Opportunities. Portland scored highest with respect to Leisure Activities followed by Los Angeles, Atlanta and Washington D.C. respectively. New York City scored lowest with respect to Leisure Activities. Los Angeles scored highest with respect to Climate followed by Portland and Atlanta respectively. Washington D.C. and New York City had similar climate. Both cities were given the same value and scored lowest with respect to Climate.
The first LSTM Neural Network used in this paper considered thirty-six inputs and five outputs. Inputs to the Network were the six criteria weights and the thirty performance measures in Table 1 comparing all alternatives with respect to all criteria, as shown in Figure 4, where Wj represented the weight of criterion j and ai,j represented the performance measure of alternative i with respect to criterion j. The structure of the LSTM Neural Network used is shown in Figure 5. It consisted of five layers, including a Sequence Input Layer with thirty-six inputs (Layer 1) and a Classification Layer.
MS Excel was used to create randomly generated criteria weights and performance measures. Twenty thousand sets were generated, each set containing six criteria weights and thirty performance measures. Randomly generated sets and the overall scores were used as input sets to train a Long Short Term Memory (LSTM) Neural Network. The overall score of the alternatives was calculated using the Weighted Sum Model (WSM):
$$P_i = \sum_{j=1}^{n} W_j \, a_{ij}$$
In the above equation Pi represents the overall score of alternative i, Wj represents the weight of criterion j, aij represents the performance measure of alternative i with respect to criterion j, and n is the number of criteria. The best solution to a problem was often the alternative that maximized the value of Pi. The overall scores were used to identify the best alternative and to rank the set of alternatives.
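The generation of labelled sets described above could be sketched as follows. The paper used MS Excel; the use of Python's `random` module, the uniform distribution on [0, 1), the normalisation of weights so that they sum to one, and the helper names `wsm_scores`, `random_set` and `labelled_set` are all assumptions made for illustration.

```python
import random

N_CRITERIA = 6
N_ALTERNATIVES = 5


def wsm_scores(weights, measures):
    """Weighted Sum Model: P_i = sum over j of W_j * a_ij, for each alternative i."""
    return [sum(w * a for w, a in zip(weights, row)) for row in measures]


def random_set(rng):
    """One random set: six criteria weights and a 5x6 matrix of measures."""
    raw = [rng.random() for _ in range(N_CRITERIA)]
    total = sum(raw)
    weights = [r / total for r in raw]  # assumption: weights sum to 1
    measures = [[rng.random() for _ in range(N_CRITERIA)]
                for _ in range(N_ALTERNATIVES)]
    return weights, measures


def labelled_set(rng):
    """Thirty-six input values plus the index of the best alternative."""
    weights, measures = random_set(rng)
    scores = wsm_scores(weights, measures)
    label = max(range(N_ALTERNATIVES), key=lambda i: scores[i])
    features = weights + [a for row in measures for a in row]
    return features, label


rng = random.Random(42)  # fixed seed so the sketch is repeatable
dataset = [labelled_set(rng) for _ in range(20000)]
print(len(dataset), len(dataset[0][0]))
```

Each row holds the six weights followed by the thirty performance measures, and the label is the alternative with the highest overall score.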
MATLAB was used to set the training options of the LSTM Network, the layers of the LSTM Network and the structure of the layers. An LSTM Neural Network was created with thirty-six inputs, 100 hidden units in the BiLSTM layer, and five output classes. An adaptive moment estimation (Adam) algorithm was used in this architecture with an initial learning rate of 0.01 and a maximum of 100 epochs.
A (20,000x36) matrix was imported to MATLAB and used as training and testing sets. The matrix was split in a ratio 3:1 for training and testing sets respectively. Figure 6 shows Network training progress, with the Network accuracy shown as a blue curve in the upper part of Figure 6 and the Network loss shown as a red curve in the lower part of Figure 6. For good performance, the Network accuracy should be maximized and the Network loss should be minimized.
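A 3:1 split of 20,000 rows gives 15,000 training rows and 5,000 testing rows. The sketch below shows one way to produce such a split; shuffling before splitting and the helper name `split_three_to_one` are assumptions, since the paper does not state how rows were assigned.

```python
import random


def split_three_to_one(rows, seed=0):
    """Return (training, testing) lists in a 3:1 ratio."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # assumption: rows are shuffled first
    cut = (3 * len(shuffled)) // 4
    return shuffled[:cut], shuffled[cut:]


# Row indices stand in for the rows of the (20,000x36) matrix.
train_rows, test_rows = split_three_to_one(range(20000))
print(len(train_rows), len(test_rows))  # 15000 5000
```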
With the initial learning rate set to 0.01 and the maximum number of epochs set to 100, the Network training accuracy increased and the Network loss decreased as training progressed. By the end of 100 epochs the Network training accuracy was above 70%. The Network accuracy reached 69.4% when tested with the testing set. Figure 8 shows the confusion chart produced from testing the Network with the testing set. A confusion chart shows the predictions produced from the Network (Vertical Axis) and the number of correct labels from the testing set (Horizontal axis). Network correct predictions are shown diagonally in the darker blue boxes. A1 was correctly predicted by the Network 695 times but was incorrectly predicted as A2 56 times, 89 times as A3, 91 times as A4 and 70 times as A5. A2 was correctly predicted by the Network 650 times but was incorrectly predicted 100 times as A1, 89 times as A3, 91 times as A4 and 72 times as A5. A3 was correctly predicted by the Network 696 times but was incorrectly predicted 75 times as A1, 62 times as A2, 96 times as A4 and 72 times as A5. A4 was correctly predicted by the Network 788 times but was incorrectly predicted 49 times as A1, 61 times as A2, 73 times as A3 and 71 times as A5. A5 was correctly predicted by the Network 641 times but was incorrectly predicted 84 times as A1, 61 times as A2, 85 times as A3 and 73 times as A4. The Network produced acceptable accuracy. Different values for the initial learning rate and the maximum number of epochs were considered, and a compromise between learning time and overall accuracy of the Network was chosen.
Inputs to the LSTM Network were modified to improve Network accuracy. Performance measures were multiplied by the criteria weights to reduce the number of inputs to the Network. The performance measure of each alternative with respect to each criterion was scaled by that criterion's weight to create thirty inputs to the Network, as shown in Figure 9, where Wj represented the weight of criterion j and ai,j represented the performance measure of alternative i with respect to criterion j. The same Network structure shown in Figure 5 was used. An adaptive moment estimation (Adam) algorithm was used in this architecture with an initial learning rate of 0.001 and a maximum of 100 epochs. A (20,000x30) matrix was imported to MATLAB and used as training and testing sets. The matrix was split in a ratio 3:1 for training and testing sets respectively. Figure 10 shows Network training progress with an initial learning rate of 0.001 and 100 epochs considering thirty inputs and five output classes. As Network training progressed, the Network training accuracy increased, shown as a blue curve in the upper section of Figure 10, and the Network loss decreased, shown as a red curve in the lower section of Figure 10. Figure 11 shows the training outcome: the Network required 13 minutes and 48 seconds to complete 100 epochs using an initial learning rate of 0.001, and by the end of the 100 epochs the Network training accuracy reached 85.96%. The Network accuracy reached 86.58% when tested with the testing set. Figure 12 shows the confusion chart produced from testing the Network with the testing set considering thirty inputs and five output classes. The confusion chart shows the predictions produced from the Network. Network correct predictions are shown diagonally in the darker blue boxes.
A1 was correctly predicted by the Network 904 times but was incorrectly predicted as A2 25 times, 23 times as A3, 35 times as A4 and 14 times as A5. A2 was correctly predicted by the Network 880 times but was incorrectly predicted 48 times as A1, 42 times as A3, 26 times as A4 and 16 times as A5. A3 was correctly predicted by the Network 873 times but was incorrectly predicted 47 times as A1, 27 times as A2, 42 times as A4 and 12 times as A5. A4 was correctly predicted by the Network 912 times but was incorrectly predicted 46 times as A1, 21 times as A2, 39 times as A3 and 24 times as A5. A5 was correctly predicted by the Network 760 times but was incorrectly predicted 45 times as A1, 37 times as A2, 52 times as A3 and 50 times as A4. Comparing results from Figures 8 and 12 shows that the Network produced more accurate results when the number of inputs was reduced to thirty and the initial learning rate was decreased to 0.001. As Figure 11 showed, the Network required 13 minutes and 48 seconds to complete 100 epochs with an initial learning rate of 0.001; using a smaller initial learning rate required more time to complete the 100 epochs.
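The pre-processing step that reduces thirty-six inputs to thirty can be sketched as below. The criteria weights are those reported for the relocation problem; the performance measures are hypothetical placeholders, not the values in Table 1, and the helper name `weighted_inputs` is an assumption.

```python
# Criteria order: Financial, Employee, Support, Cultural, Leisure, Climate.
WEIGHTS = [0.428, 0.207, 0.207, 0.041, 0.063, 0.053]

# Hypothetical performance measures (five alternatives x six criteria).
# These are NOT the values from Table 1.
MEASURES = [
    [0.20, 0.10, 0.30, 0.25, 0.10, 0.15],
    [0.15, 0.15, 0.10, 0.20, 0.15, 0.15],
    [0.18, 0.20, 0.15, 0.15, 0.20, 0.20],
    [0.30, 0.30, 0.25, 0.30, 0.25, 0.30],
    [0.17, 0.25, 0.20, 0.10, 0.30, 0.20],
]


def weighted_inputs(weights, measures):
    """Flatten W_j * a_ij over all alternatives i and criteria j."""
    return [w * a for row in measures for w, a in zip(weights, row)]


inputs = weighted_inputs(WEIGHTS, MEASURES)

# Summing each alternative's six weighted values recovers its WSM score.
n = len(WEIGHTS)
scores = [sum(inputs[i * n:(i + 1) * n]) for i in range(len(MEASURES))]
print(len(inputs), [round(s, 3) for s in scores])
```

Because each input already carries its criterion weight, the weights no longer need to be supplied to the Network as six separate inputs.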
The Network produced higher accuracy when the data was pre-processed so that the number of inputs was reduced to thirty, as shown in Figure 11. Different values for the initial learning rate and the maximum number of epochs were considered, and a compromise between training time and overall accuracy of the Network was sought. The initial learning rate and maximum number of epochs were set to 0.001 and 150 respectively. Figure 13 shows Network training progress with an initial learning rate of 0.001 and 150 epochs considering thirty inputs and five output classes. As Network training progressed, the Network training accuracy increased, shown as a blue curve in the upper section of Figure 13, and the Network loss decreased, shown as a red curve in the lower section of Figure 13. By the end of 150 epochs the Network training accuracy was around 90%. The Network accuracy reached 91.76% when tested with the testing set. Figure 15 shows the confusion chart produced from testing the Network with the testing set considering thirty inputs and five output classes after completing 150 epochs. The confusion chart shows the predictions produced from the Network; the Network correct predictions are shown diagonally in the darker blue boxes. A1 was correctly predicted by the Network 879 times but was incorrectly predicted as A2 25 times, 32 times as A3, 30 times as A4 and 35 times as A5. A2 was correctly predicted by the Network 943 times but was incorrectly predicted 14 times as A1, 26 times as A3, 12 times as A4 and 17 times as A5. A3 was correctly predicted by the Network 904 times but was incorrectly predicted 21 times as A1, 22 times as A2, 23 times as A4 and 31 times as A5. A4 was correctly predicted by the Network 973 times but was incorrectly predicted 11 times as A1, 12 times as A2, 16 times as A3 and 30 times as A5. A5 was correctly predicted by the Network 889 times but was incorrectly predicted 3 times as A1, 15 times as A2, 16 times as A3 and 21 times as A4.
The confusion chart shown in Figure 15 shows a higher number of correct predictions (shown diagonally in the darker blue boxes) than the confusion charts in Figures 8 and 12. Comparing results from Figures 8, 12 and 15 shows that the Network produced more accurate results when the number of inputs was reduced to thirty, the initial learning rate was decreased to 0.001 and the maximum number of epochs was increased to 150.
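The overall test accuracy can be recomputed from the confusion chart counts reported for the 150-epoch run. In the sketch below each row holds the counts for one labelled class (A1 to A5) and each column one predicted class; that row and column orientation is an assumption about how the counts map onto the chart.

```python
# Counts reported for the 150-epoch run (thirty inputs, five classes).
# Row k: samples labelled A(k+1); column k: samples predicted as A(k+1).
CONFUSION = [
    [879, 25, 32, 30, 35],
    [14, 943, 26, 12, 17],
    [21, 22, 904, 23, 31],
    [11, 12, 16, 973, 30],
    [3, 15, 16, 21, 889],
]

correct = sum(CONFUSION[k][k] for k in range(len(CONFUSION)))  # diagonal
total = sum(sum(row) for row in CONFUSION)
accuracy = correct / total

# Share of each labelled class that was predicted correctly.
per_class = [row[k] / sum(row) for k, row in enumerate(CONFUSION)]

print(correct, total, f"{accuracy:.2%}")  # 4588 5000 91.76%
print([round(p, 3) for p in per_class])
```

The diagonal total of 4,588 out of 5,000 test samples reproduces the reported accuracy of 91.76%.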

THE NEW APPROACH APPLIED TO A CORPORATE RELOCATION PROBLEM
Haddad et al. [1] applied two well-known MCDM methods to the corporate relocation problem shown in Table 1: The Analytical Hierarchy Process (AHP) and the Preference Ranking Organization METHod for Enrichment Evaluations II (PROMETHEE II).
AHP is an MCDM method created by Thomas L. Saaty in 1971-1975 [20]. AHP helps in solving multiple conflicting criteria problems. AHP breaks down a composite problem into simpler sub-problems, then combines the solutions of all sub-problems into one overall solution [21]. AHP applies pairwise comparisons between alternatives to assess the level by which one alternative dominates another with respect to each criterion [20]. Since its development, AHP has been used to solve multiple criteria problems in most fields of science [20 & 21]. PROMETHEE II is an outranking MCDM method used to produce a total ranking of alternatives. PROMETHEE II consists of a preference function describing each criterion and weights characterizing their significance. The idea behind PROMETHEE II is to apply pairwise comparisons among alternatives with respect to each criterion, then broadly assess them with respect to all criteria [22].
The aim behind applying MCDM methods to this problem was to rank the set of alternatives (cities) based on their suitability using pairwise comparisons to achieve a total order of alternatives (cities). AHP and PROMETHEE II delivered the same ranking of cities: Los Angeles > New York City > Atlanta > Portland > Washington D.C.
The new approach presented in this paper was applied to the same corporate relocation problem considering six decision criteria and five potential cities to relocate to [1]. Criteria weights and performance measures for all the alternatives (cities) with respect to all the criteria are shown as a decision matrix in Table 1.
The trained and tested Network was used to provide an overall outcome for each city based on the criteria weights and performance measures. WSM was used to create inputs for the LSTM Neural Network. Thirty inputs were considered.

DISCUSSION AND RESULTS
The new approach identified A4 (Los Angeles) as the best city to relocate to and produced the following ranking of cities: Los Angeles > New York City > Atlanta > Portland > Washington D.C. The overall scores of the cities were: New York City = 0.211, Washington D.C. = 0.181, Atlanta = 0.189, Los Angeles = 0.236 and Portland = 0.182.
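The ranking follows directly from sorting the reported overall scores in descending order, as the short sketch below shows. Only the scores come from the paper; the variable names are illustrative.

```python
# Overall scores reported for the five cities.
overall_scores = {
    "New York City": 0.211,
    "Washington D.C.": 0.181,
    "Atlanta": 0.189,
    "Los Angeles": 0.236,
    "Portland": 0.182,
}

# Highest overall score first.
ranking = sorted(overall_scores, key=overall_scores.get, reverse=True)
print(" > ".join(ranking))
# Los Angeles > New York City > Atlanta > Portland > Washington D.C.
```

Portland and Washington D.C. differ by only 0.001, so their relative order is the most sensitive part of the ranking.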
Los Angeles was identified as the best city to relocate to by AHP and PROMETHEE II in [1]. The new approach also provided the same ranking of cities based on their suitability for relocation. In this way the results were validated against results published in [1].
MCDM methods do not produce a "correct" decision, rather they aid decision makers to find an appropriate solution that corresponds to their aims as well as their appreciation of the problem [22]. Since there was no perfect decision making method, Glatte [3] suggested using more than one decision making method to validate the outcome of the decision process, and to check for the