Tuesday, September 24, 2019

Rewrite Essay Example | Topics and Well Written Essays - 1250 words - 1

WEKA offers two options for the decision tree, pruned or unpruned, as shown in the figure.

Figure 1: Properties of the decision tree in WEKA (J48)

In addition to the above features, WEKA provides test options that determine which data is used to evaluate the classifier.

Use training set: The classifier is evaluated on how well it predicts the class of the instances it was trained on.
Supplied test set: The classifier is evaluated on how well it predicts the class of a set of instances loaded from a file.
Cross-validation: The classifier is evaluated by cross-validation, using the number of folds entered in the Folds text field of the WEKA Explorer.
Percentage split: The classifier is evaluated on how well it predicts a percentage of the data that is held out for testing. The value in the percentage field specifies how much of the data is used for training, and the remainder is reserved for testing. By default, the percentage split is 66%, so 66% of the data is used for training and the remaining 34% is used for testing.

Figure 2: WEKA with testing options

The performance of the decision tree is determined by examining cross-validation and percentage split on the provided medical dataset.

Use of cross-validation to generate the decision tree: The J48 decision tree is flexible in that factors such as the size of the training set and the confidence factor can be controlled during cross-validation. The confidence factor is used to reduce the classification error rate and to address the problem of tree pruning. Increasing the confidence factor and removing noise from the training data gives the classifier an opportunity to classify instances more accurately.
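The percentage split and cross-validation test options described above can be illustrated with a minimal Python sketch. This is not WEKA's implementation: the function names `percentage_split` and `cross_validation_folds`, the seed value, and the toy data are all assumptions made for illustration, and WEKA's own randomisation and stratification are not reproduced here.

```python
import random


def percentage_split(instances, train_percent=66.0, seed=1):
    """Shuffle the data, keep train_percent of it for training
    and reserve the remainder for testing (hypothetical helper)."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_percent / 100.0)
    return shuffled[:cut], shuffled[cut:]


def cross_validation_folds(instances, folds=10):
    """Yield (train, test) pairs so that every instance is used
    for testing exactly once (hypothetical helper, no stratification)."""
    items = list(instances)
    for i in range(folds):
        test = items[i::folds]
        train = [x for j, x in enumerate(items) if j % folds != i]
        yield train, test


# Toy data standing in for the instances of a dataset.
data = list(range(100))

train, test = percentage_split(data)
print(len(train), len(test))  # 66 34

fold_pairs = list(cross_validation_folds(data, folds=10))
print(len(fold_pairs))  # 10
```

With the default of 66%, a dataset of 100 instances yields 66 training instances and 34 test instances, which matches the default split described above. In the cross-validation sketch, each of the 10 folds serves once as the test set while the other 9 are used for training.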
A confidence factor of 95% was used for the dataset and led to an outstanding outcome: 89.2% of the instances were classified correctly and only 10.7% were classified incorrectly, as shown in the following figure.

Figure 3: Results generated by WEKA using cross-validation with the J48 decision tree option

The figure above shows the J48 decision tree results in detail, including the correctly classified values. The confusion matrix is an important part of the figure; it describes the ways in which a classifier makes errors when predicting a class type. According to Dunham (2003), the confusion matrix indicates the correctness of the solution for a given classification problem. An alternative term for the confusion matrix is the contingency table. For a dataset with two classes, the confusion matrix has two rows and two columns, as shown in Figure 4.

Figure 4: Confusion Matrix (axes: Predicted, Actual)

Here FP represents the number of negatives incorrectly classified as positives; these are called commission errors. TP represents the number of positives classified correctly. TN represents the number of negatives classified correctly, and FN represents the number of positives incorrectly classified as negatives; these are called omission errors. Predictive accuracy is a way of measuring the performance of a classifier. Predictive accuracy is known as the success rate and is determined by the use of the confusion mat
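As a rough illustration of the four counts defined above, the following Python sketch tallies TP, FP, TN and FN for a two-class problem and derives predictive accuracy with the standard formula (TP + TN) / (TP + TN + FP + FN). The function names, the class labels "yes" and "no", and the six toy predictions are assumptions made for this example; they are not taken from the medical dataset or from the WEKA output.

```python
def confusion_counts(actual, predicted, positive="yes"):
    """Count TP, FP, TN and FN for a two-class problem (hypothetical helper)."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1  # positive correctly classified
        elif p == positive:
            fp += 1  # negative classified as positive (commission error)
        elif a == positive:
            fn += 1  # positive classified as negative (omission error)
        else:
            tn += 1  # negative correctly classified
    return tp, fp, tn, fn


def predictive_accuracy(tp, fp, tn, fn):
    """Share of all instances that were classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


# Toy labels, invented for illustration only.
actual = ["yes", "yes", "no", "no", "yes", "no"]
predicted = ["yes", "no", "no", "yes", "yes", "no"]

tp, fp, tn, fn = confusion_counts(actual, predicted)
print(tp, fp, tn, fn)  # 2 1 2 1
accuracy = predictive_accuracy(tp, fp, tn, fn)
print(round(accuracy, 3))  # 0.667
```

In this toy case, four of the six instances lie on the diagonal of the confusion matrix (TP and TN), so the predictive accuracy is 4/6.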
