OPTIMIZATION OF OIL EXTRACTION FROM GARCINIA KOLA USING ARTIFICIAL NEURAL NETWORK AND RESPONSE SURFACE METHODOLOGY

The aim of this study was to model and optimize selected process parameters for the extraction of oil from Garcinia kola. Artificial neural network (ANN) and Box-Behnken design (BBD) in response surface methodology (RSM) were used for the modelling and optimization of the process parameters. The optimized process values for RSM and ANN respectively were 397.86 mL and 399.99 mL for solvent volume, 109.32 min and 107.55 min for extraction time, and 72.64 g and 70 g for sample mass, giving maximum yields of 20.839 wt% and 20.488 wt%. The high positive correlation between the experimental and predicted values validated the models.


INTRODUCTION
Over the years, plants have been of great benefit to humankind especially because of the many health benefits of the chemicals found in them [1]. Plants are made up of various biologically active chemicals called phytochemicals. These phytochemicals are produced through the process of primary and secondary metabolism. However, because the health benefits of some of these phytochemicals have not been scientifically proven, they are regarded as research compounds rather than essential nutrients. Different parts of the plants have been claimed to have been used over the years for the treatment of flu and its associated illnesses [2]. Garcinia kola is traditionally used as a medication because of the various bioactive components such as tannins, flavonoids and alkaloids which are believed to have anti-parasitic, purgative, and anti-microbial properties [3], and it is also being investigated for possible hop substitution in beer production [4]. Consumption and use of bitter kola in Nigeria are low due to inadequate information on the physicochemical and nutritive properties [5].
Bioactive components are commonly extracted by traditional methods. Over the last three decades, however, attention has shifted from these traditional methods to techniques such as ultrasound-assisted solvent extraction and conventional solvent extraction [6].
The yield of oil from Garcinia kola obtained from solvent extraction processes is greatly affected by the type of solvent used, the mass of solid, particle size of the seed, volume of solvent, extraction time, pressure and temperature [7]. However, a statistical design of experiments is needed to show clearly how the process parameters interact. Such a design reveals whether a parameter has a significant effect on the response, and how that effect changes as the values of the other parameters change [8][9][10][11][12][13]. The extraction of oil can thus be improved by optimizing these parameters to maximize the yield. This can be achieved by combining a design of experiments with Response Surface Methodology (RSM) and Artificial Neural Network (ANN) modelling.
Although several studies have been carried out on the extraction of oil from Garcinia kola, there is however a dearth of information on the optimum conditions for the extraction of the oil. The present study was therefore undertaken to model the yield of oil from Garcinia kola via Soxhlet extraction, considering the interaction of mass of Garcinia kola, extraction time and volume of extraction solvent as process parameters, using RSM and ANN, in order to determine the optimal extraction conditions.

Feedstock assemblage, preparation and pretreatment
Fresh Garcinia kola seeds were procured from a vendor in Edo State, Nigeria. The clean seeds were cut manually into smaller pieces to facilitate the drying process. The Garcinia kola seeds were dried continuously under the sun for three weeks and thereafter dehulled. The dehulled dried samples were then milled and sieved to the particle size of 250 µm or less to increase the surface area to facilitate extraction. The sample was stored under dry conditions before use.

Extraction and characterization of oil
A Soxhlet apparatus was used to extract the oil from the prepared sample, with ethanol as the solvent. The extraction was carried out at 70 °C, using different volumes of solvent and amounts of ground Garcinia kola, at different extraction times according to the design of experiments.
Garcinia kola oil yield was calculated using equation (1).
The extracted oil was characterized in terms of its physicochemical properties. The colour of the oil was noted by visual inspection, the specific gravity was determined using a pycnometer, and the peroxide value was determined using the AOAC International methods of analysis [14].
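Equation (1) itself is not reproduced in this text. Assuming the conventional mass-basis definition of yield, which is consistent with the wt% units reported for the results, it would take the following form. The symbols m_oil (mass of oil recovered) and m_sample (mass of ground sample charged to the extractor) are introduced here for illustration only.

```latex
\begin{equation}
\text{Oil yield (wt\%)} = \frac{m_{\text{oil}}}{m_{\text{sample}}} \times 100
\end{equation}
```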

Experimental design
A Box-Behnken experimental design with three levels was generated for the three selected factors using Design Expert 7.0.0 (Stat-Ease, Inc., Minneapolis, USA). Table 1 shows the levels (high, low and centre points) of the independent process parameters. The values of the levels were chosen based on results from preliminary experiments. The coded values are related to the actual values by the relation given in equation (2). In the model equation (3), Y is the predicted response, Xi and Xj are the independent process parameters, b0 is the offset (intercept) term, bi and bij are the regression coefficients for all possible effects, and E is the error term.
The model equation (3) was used to determine all possible individual and combined effects on the oil yield. Regression coefficients were calculated for the linear and interaction terms of the model. The R² value, the error associated with the replicate measurements and the lack of fit were also calculated. Analysis of variance (ANOVA) was used to assess the statistical significance of the model fit.
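Equations (2) and (3) are not reproduced in this text. The forms usually used with a Box-Behnken design are sketched below as an assumption, not as the authors' exact expressions. Here x_i is the coded value of factor i, X_{i,0} is its centre-point value and ΔX_i is the step between the centre and high levels (all three symbols are introduced for illustration). The double sum runs over j ≥ i, so the b_ij terms with i = j are the squared terms of a quadratic model.

```latex
\begin{align}
x_i &= \frac{X_i - X_{i,0}}{\Delta X_i} \\
Y &= b_0 + \sum_{i=1}^{3} b_i X_i + \sum_{i=1}^{3} \sum_{j=i}^{3} b_{ij} X_i X_j + E
\end{align}
```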
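The design and regression steps described above can be illustrated with a minimal sketch. It assumes a three-factor Box-Behnken design in coded units with five centre points (17 runs) and fits a full quadratic model by ordinary least squares using NumPy. The response values and the "true" coefficients are synthetic and purely hypothetical; this is not the Design Expert procedure or the experimental data of this study.

```python
import itertools
import numpy as np

def box_behnken_3(n_center=5):
    """Three-factor Box-Behnken design in coded units (-1, 0, +1)."""
    runs = []
    # Each pair of factors takes all +/-1 combinations; the third stays at 0.
    for i, j in itertools.combinations(range(3), 2):
        for a, b in itertools.product((-1, 1), repeat=2):
            row = [0, 0, 0]
            row[i], row[j] = a, b
            runs.append(row)
    runs += [[0, 0, 0]] * n_center  # replicated centre points
    return np.array(runs, dtype=float)

def quadratic_terms(X):
    """Columns: intercept, linear, two-factor interaction and squared terms."""
    x1, x2, x3 = X.T
    return np.column_stack([np.ones(len(X)), x1, x2, x3,
                            x1 * x2, x1 * x3, x2 * x3,
                            x1 ** 2, x2 ** 2, x3 ** 2])

def fit_quadratic(X, y):
    """Least-squares fit; returns coefficients, R^2 and adjusted R^2."""
    A = quadratic_terms(X)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    n, p = A.shape[0], A.shape[1] - 1  # p excludes the intercept
    r2 = 1 - ss_res / ss_tot
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    return coef, r2, adj_r2

X = box_behnken_3()
# Hypothetical coefficients and noise, for illustration only.
rng = np.random.default_rng(0)
true_coef = np.array([20.0, 0.5, 0.8, 1.0, 0.1, -0.3, 0.4, -1.2, -0.9, -0.7])
y = quadratic_terms(X) @ true_coef + rng.normal(0, 0.1, len(X))
coef, r2, adj_r2 = fit_quadratic(X, y)
```

Working in coded units keeps the regression coefficients directly comparable across factors that have different physical units (g, min, mL).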

ANN
NeuralPower Professional Version 2.5, a commercial ANN software package, was used for modelling and for determining the optimal conditions. The ANN was developed from the same experimental data that were used for the RSM design. The input variables were mass of Garcinia kola, extraction time and volume of extraction solvent, with Garcinia kola oil yield as the target.
Two neural network architectures were chosen and tested to predict the yield of Garcinia kola oil. The networks were trained on the experimental data with five selected training algorithms available in the ANN platform. The selected model was the one with the lowest RMSE value and the highest R² value.
The ANN architecture comprised an input layer with three neurons, one hidden layer, and an output layer with a single neuron. The optimum network topology was determined using only one hidden layer, while the number of neurons in this layer and the transfer functions of the hidden and output layers were determined iteratively by developing different networks. Each ANN was trained using the default stopping criterion of 100,000 iterations. The experimental data from the Box-Behnken design were divided into two data sets, one for training and the other for testing. The first set, consisting of thirteen runs, was used for training, while the second set, consisting of four runs, was used for testing.

Physicochemical properties of the extracted oil
Table 2 gives a summary of the physicochemical properties of the oil extracted from Garcinia kola seed. The colour of the extracted oil is light brown, as shown in Table 2. The specific gravity of the extracted oil is 0.905, indicating that it is less dense than water. Peroxide value gives a measure of the extent to which an oil sample has undergone primary oxidation. The peroxide value of 9.25 meq/kg is within the maximum permissible limit of 15 meq/kg for virgin oils set by the Codex standards for fats and oils from vegetable sources, which suggests that the extracted oil will have a good shelf life.

Modeling using response surface methodology
Different models were systematically examined to determine the model which most appropriately relates the response to the input process parameters. Table 3 gives a summary of the statistics of the models. From Table 3, it was observed that the quadratic model has very high R² and adjusted R² values of 0.9901 and 0.9774 respectively. It was therefore concluded that the quadratic model best relates the response to the independent process parameters, and it was also the model suggested by the software.
Regression analysis was performed to fit the oil yield data. The model developed relates Garcinia kola oil yield (Y) to the sample mass (X1), extraction time (X2) and volume of the solvent (X3). The quadratic model obtained by applying multiple regression to the experimental data is given in Equation (4). To test whether the second-order polynomial gives a significant fit in predicting the yield of Garcinia kola oil, an analysis of variance (ANOVA) was carried out, with the results shown in Table 4 and Table 5. The results in Table 4 suggested that the model appropriately described the relationship between the response and the process parameters that were varied. The model was considered statistically significant since it has a p-value of less than 0.05 and a high F-value of 77.85 [15]. From Table 4, it was observed that only the X1X2 term had an insignificant effect, as its p-value was greater than 0.05.
The R² value of 0.9901 (Table 5) is a clear indication that this model satisfactorily represents how changes in the values of the process parameters (volume of solvent, extraction time and sample mass) affect the response (Garcinia kola oil yield). The R² value indicates that the model explained 99.01 % of the variability, leaving only 0.99 % unexplained. The coefficient of variation (CV) is the ratio of the standard error of the estimate to the mean value of the response. A low CV indicates high reliability and precision of the experiment. A model is considered reproducible when its CV is not greater than 10 % [9,16]. The CV obtained in this work was 1.42 %. Adequate precision measures the signal-to-noise ratio. It is generally desirable to have an adequate precision value greater than 4 [17]. The adequate precision value obtained was 32.023, which indicates an adequate signal.
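As a consistency check on the two statistics, the adjusted R² can be recomputed from R². The sketch below assumes n = 17 experimental runs (the thirteen training and four testing runs reported for the ANN data split) and p = 9 model terms excluding the intercept (three linear, three interaction and three squared terms). Neither count is stated alongside Table 3, so both are assumptions.

```latex
\begin{align}
R^2_{\text{adj}} &= 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1} \\
                 &= 1 - (1 - 0.9901)\,\frac{17 - 1}{17 - 9 - 1}
                  = 1 - 0.0099 \times \frac{16}{7} \approx 0.9774
\end{align}
```

Under these assumptions the result agrees with the reported adjusted R² of 0.9774.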

Modeling using ANN
Table 6 gives the R² and root mean square error (RMSE) values obtained for the multilayer normal feed forward (MNFF) and multilayer full feed forward (MFFF) networks trained with the incremental back propagation (IBP), batch back propagation (BBP), quickprop (QP), genetic algorithm (GA) and Levenberg-Marquardt (LM) algorithms. From Table 6, IBP was the best training algorithm for predicting the Garcinia kola oil yield. Different transfer functions were also tested iteratively to establish the optimal network topology, and the optimal number of neurons in the network was then determined using the best transfer function. The Hyperbolic-Tangent transfer function gave the highest R² and the lowest RMSE values for both the hidden and output layers compared with the other transfer functions. The selected network gave the lowest RMSE value of 0.0696 and an R² value of 0.9992, which is the closest to 1.
Table 7 gives the R² values for MFFF and MNFF with one to four neurons in the hidden layer. The highest R² value of 0.9992 was obtained with three neurons for MFFF and with four neurons for both MNFF and MFFF. The best ANN model obtained in this study was therefore the MFFF network trained by IBP, with Hyperbolic-Tangent as the transfer function for both the hidden and output layers, since it gave this R² value with only three hidden neurons. This network topology was used in further studies to predict the yield of Garcinia kola oil at varying conditions.
Table 8 gives the Box-Behnken experimental design matrix, the experimental yields of Garcinia kola oil and the yields predicted by RSM and ANN.
Comparison of the predicted yields from the two models with the actual yields obtained from the experiments showed very little deviation. This small deviation between the experimental and predicted yields indicates the goodness of fit of the predictive models.
RMSE was chosen as the model performance indicator. The RMSE values of the predictive empirical models, calculated by comparing the predicted data with the experimental data, are shown in Table 8. These values show that the empirical models derived from the experimental data were able to predict the yields of oil from Garcinia kola quite well. The two models produced excellent predictions, as indicated by their low RMSE values of 0.1641 and 0.0675 for RSM and ANN respectively. The ANN, with the lower RMSE value of 0.0675, gave the better predictive model.
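To make the selected topology concrete, the following minimal sketch builds a network with three inputs, one hidden layer of three neurons and one output neuron, with hyperbolic-tangent transfer functions in both layers, and trains it by incremental (per-sample) back propagation. Several simplifications are assumed: it is a plain feed-forward network without the direct input-to-output connections of a full feed-forward (MFFF) network, it is not NeuralPower's implementation, and the data are synthetic. The learning rate, epoch count and all function names are hypothetical choices.

```python
import numpy as np

def init_network(n_in=3, n_hidden=3, seed=0):
    """Small random weights for a 3-3-1 network (assumed initialization)."""
    rng = np.random.default_rng(seed)
    return {"W1": rng.normal(0, 0.5, (n_hidden, n_in)),
            "b1": np.zeros(n_hidden),
            "W2": rng.normal(0, 0.5, n_hidden),
            "b2": 0.0}

def forward(net, x):
    """Return hidden activations and the output, both passed through tanh."""
    h = np.tanh(net["W1"] @ x + net["b1"])
    y = np.tanh(net["W2"] @ h + net["b2"])
    return h, y

def rmse(net, X, T):
    preds = np.array([forward(net, x)[1] for x in X])
    return float(np.sqrt(np.mean((preds - T) ** 2)))

def train_incremental(net, X, T, lr=0.05, epochs=2000):
    """Incremental back propagation: weights are updated after every sample."""
    for _ in range(epochs):
        for x, t in zip(X, T):
            h, y = forward(net, x)
            delta_out = (y - t) * (1 - y ** 2)                # output-layer error term
            delta_hid = delta_out * net["W2"] * (1 - h ** 2)  # hidden-layer error terms
            net["W2"] -= lr * delta_out * h
            net["b2"] -= lr * delta_out
            net["W1"] -= lr * np.outer(delta_hid, x)
            net["b1"] -= lr * delta_hid
    return net

# Synthetic data only: 17 runs with inputs in coded units and a target
# scaled into (-1, 1) so that the tanh output layer can reach it.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, (17, 3))
T = 0.5 * np.tanh(0.8 * X[:, 0] + 0.5 * X[:, 1] - 0.3 * X[:, 2])
X_train, T_train = X[:13], T[:13]   # thirteen runs for training
X_test, T_test = X[13:], T[13:]     # four runs for testing

net = init_network()
rmse_before = rmse(net, X_train, T_train)
train_incremental(net, X_train, T_train)
rmse_after = rmse(net, X_train, T_train)
rmse_test = rmse(net, X_test, T_test)
```

Because the output neuron also uses tanh, real yields in wt% would have to be rescaled into the interval (-1, 1) before training and mapped back afterwards.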
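For reference, the RMSE used as the performance indicator can be computed as in the short sketch below. The yields in the usage lines are hypothetical placeholders, not the values of Table 8, and the function name is an illustrative choice.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between experimental and predicted yields."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical yields in wt%, for illustration only.
experimental = [18.2, 19.5, 20.1]
model_prediction = [18.3, 19.4, 20.3]
error = rmse(experimental, model_prediction)
```

RMSE carries the same units as the response, so the reported values of 0.1641 and 0.0675 are in wt% of oil yield.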

Comparison of the prediction by RSM and ANN
The Garcinia kola oil yields predicted by the RSM and ANN models were compared run by run using the absolute percentage error, as shown in Table 9. The absolute percentage errors of the yields predicted by ANN were generally lower than those predicted by RSM, except for runs 5, 9, 11 and 14. Furthermore, the average absolute percentage error of 0.151 % for the ANN predictions is lower than the average of 0.776 % for the RSM predictions. This is a clear indication that ANN gave a better model for the prediction of the yield of oil from Garcinia kola.
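The run-by-run comparison can be expressed as a short sketch. It assumes that the percentage error is taken relative to the experimental yield, which is consistent with the validation errors reported later in the paper. All numerical values are hypothetical placeholders and do not reproduce Table 9.

```python
def abs_pct_errors(actual, predicted):
    """Absolute percentage error of each run, relative to the actual value."""
    return [abs(a - p) / a * 100 for a, p in zip(actual, predicted)]

def mean(values):
    return sum(values) / len(values)

# Hypothetical yields in wt%, for illustration only.
actual = [20.0, 18.0, 16.0, 19.0]
rsm_pred = [19.8, 18.3, 16.1, 19.2]
ann_pred = [20.1, 18.1, 16.2, 19.0]

rsm_err = abs_pct_errors(actual, rsm_pred)
ann_err = abs_pct_errors(actual, ann_pred)

# Runs (numbered from 1) in which ANN is not more accurate than RSM.
exceptions = [i + 1 for i, (r, a) in enumerate(zip(rsm_err, ann_err)) if a >= r]
mean_rsm, mean_ann = mean(rsm_err), mean(ann_err)
```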

Numerical optimization using ANN and RSM
The optimal yields of oil from Garcinia kola from the numerical optimization using RSM and ANN were 20.839 wt % and 20.488 wt % respectively. These optimal yields of Garcinia kola oil were obtained at a solvent volume of 397.86 mL, extraction time of 109.32 minutes and a sample mass of 72.64 g for RSM; and at a solvent volume of 399.9 mL, extraction time of 107.5 minutes and sample mass of 70 g for ANN.

Validation of models
To ascertain the validity of the models developed, confirmatory experiments were carried out in triplicate at the optimal parameter values obtained for the maximum yield of oil from Garcinia kola. Experiments conducted at the RSM optimal conditions showed no significant deviation between the actual Garcinia kola oil yield of 20.664 wt% and the yield of 20.839 wt% predicted by the RSM model, giving an error of 0.847 %. The second set of replicate experiments, conducted at the ANN optimal conditions, likewise showed no significant deviation between the experimental maximum yield of 20.344 wt% and the value of 20.488 wt% predicted by the ANN model, giving an error of 0.708 %. The high positive correlation between the predicted yields and the values obtained from actual experiments validated both models.
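The two validation errors can be reproduced from the reported yields if the error is taken relative to the experimental value, which appears to be the convention used here:

```latex
\begin{align}
\text{Error}_{\text{RSM}} &= \frac{|20.839 - 20.664|}{20.664} \times 100 \approx 0.847\,\% \\
\text{Error}_{\text{ANN}} &= \frac{|20.488 - 20.344|}{20.344} \times 100 \approx 0.708\,\%
\end{align}
```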

CONCLUSIONS
The extraction of oil from Garcinia kola was investigated using a three-variable Box-Behnken design. RSM and ANN were used to develop models that were subsequently used to investigate the effect of solvent volume, extraction time and mass of ground Garcinia kola on the yield of oil. Solvent volume, extraction time and mass of ground Garcinia kola all had a significant effect on the yield of oil, as did their interactions, with the exception of the interaction between sample mass and extraction time (X1X2). The models developed using both RSM and ANN predicted the yield of oil from Garcinia kola with a considerable degree of reliability, as the RMSE values were less than 0.2. Optimization of the oil yield from Garcinia kola using RSM and ANN gave 20.839 wt% and 20.488 wt% respectively. Although optimization using RSM gave a higher predicted oil yield than ANN, the ANN model gave the better prediction, as demonstrated by its lower RMSE value.