Decision Tree Learning

 


 

Creating a Decision Tree

There are two ways to obtain a data set from which to build a decision tree: create a new data set and enter the examples yourself, or load a sample data set. To create a new data set, see the detailed help section on that topic.

To load a sample data set, click "Load Sample Data Set" from the "File" menu. Then select a data set from the drop-down menu and click "Load." The next step is to view and manipulate the examples in the data set. Click the "View/Edit Examples" button near the top of the left-side control panel. This opens a dialog window that can be used to add examples, remove examples, and transfer examples between the training set (left side) and the test set (right side). Make sure there are several examples in the test set before proceeding to build the decision tree.
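For readers who want to see the training/test division outside the tool, here is a minimal Python sketch. It is not the tool's own code: the function name `split_examples`, the `test_fraction` parameter, and the toy weather data are all assumptions made for illustration.

```python
import random

def split_examples(examples, test_fraction=0.25, seed=0):
    """Shuffle a copy of the examples and divide it into (training, test) lists."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    # Always hold out at least one example so the test set is never empty.
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical toy data set: each example maps attribute names to values.
examples = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
]

training, test = split_examples(examples)
print(len(training), "training examples,", len(test), "test examples")
```

Holding out several examples, as recommended above, matters because error measured on the training set alone says little about how the tree handles unseen data.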

Now that you have a data set ready, you can build the decision tree automatically by clicking the "Step" button until the tree is complete. Alternatively, you can use the "Split Node" and "Set as Leaf" graph modes on the control panel to construct the tree yourself. The other graph modes ("View Node Information," "View Mapped Examples," and "Toggle Histogram") show information about the data at a particular node, which can guide your splitting.
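When splitting nodes by hand, one common way to compare candidate attributes is information gain. The sketch below assumes that criterion purely for illustration; this page does not state which criterion the "Step" button uses. The function names and toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(examples, target):
    """Entropy (in bits) of the target attribute over a list of examples."""
    counts = Counter(ex[target] for ex in examples)
    total = len(examples)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(examples, attribute, target):
    """Reduction in entropy obtained by splitting the examples on `attribute`."""
    groups = {}
    for ex in examples:
        groups.setdefault(ex[attribute], []).append(ex)
    total = len(examples)
    remainder = sum(len(g) / total * entropy(g, target) for g in groups.values())
    return entropy(examples, target) - remainder

def best_split(examples, attributes, target):
    """Return the attribute with the highest information gain."""
    return max(attributes, key=lambda a: information_gain(examples, a, target))

# Hypothetical examples mapped to one node: "outlook" separates the classes, "windy" does not.
node_examples = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

for attribute in ("outlook", "windy"):
    print(attribute, information_gain(node_examples, attribute, "play"))
print("best split:", best_split(node_examples, ["outlook", "windy"], "play"))
```

A node whose examples all share one class has zero entropy, which is a natural point to use "Set as Leaf" rather than splitting further.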

Before constructing a decision tree, you may wish to click the "Show Plot" button to view the changes in training set and test set error as the tree is built.
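To give a rough idea of what such a plot tracks, the following sketch grows a tree one level at a time and records the error on each set. It rests on several assumptions: attributes are split in the order listed rather than by any score, leaves predict the majority class, and all names and data are hypothetical. It does not reproduce the tool's own procedure.

```python
from collections import Counter

def majority(examples, target):
    """Most frequent target value among the examples."""
    return Counter(ex[target] for ex in examples).most_common(1)[0][0]

def build(examples, attributes, target, depth):
    """Grow a tree to at most `depth` levels; a leaf is just a class label."""
    labels = {ex[target] for ex in examples}
    if depth == 0 or len(labels) == 1 or not attributes:
        return majority(examples, target)
    attr = attributes[0]  # assumption: split in the listed order, not by a score
    branches = {}
    for value in {ex[attr] for ex in examples}:
        subset = [ex for ex in examples if ex[attr] == value]
        branches[value] = build(subset, attributes[1:], target, depth - 1)
    # Keep the node's majority class as a fallback for unseen attribute values.
    return (attr, branches, majority(examples, target))

def predict(tree, example):
    """Walk from the root to a leaf and return its class label."""
    while isinstance(tree, tuple):
        attr, branches, default = tree
        tree = branches.get(example[attr], default)
    return tree

def error_rate(tree, examples, target):
    """Fraction of examples the tree classifies incorrectly."""
    wrong = sum(predict(tree, ex) != ex[target] for ex in examples)
    return wrong / len(examples)

# Hypothetical data: the class is "yes" when a or b is "1".
training = [
    {"a": "0", "b": "0", "class": "no"},
    {"a": "0", "b": "1", "class": "yes"},
    {"a": "1", "b": "0", "class": "yes"},
    {"a": "1", "b": "1", "class": "yes"},
]
test = [
    {"a": "0", "b": "0", "class": "no"},
    {"a": "1", "b": "1", "class": "yes"},
]

training_errors, test_errors = [], []
for depth in range(3):
    tree = build(training, ["a", "b"], "class", depth)
    training_errors.append(error_rate(tree, training, "class"))
    test_errors.append(error_rate(tree, test, "class"))
    print(f"depth {depth}: training error {training_errors[-1]:.2f}, "
          f"test error {test_errors[-1]:.2f}")
```

On real data the two curves need not fall together: training error typically keeps dropping as the tree grows, while test error can level off or rise, which is the pattern worth watching for in the plot.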

Testing a Decision Tree

Once the decision tree is built, select the "Test" tab at the top of the control panel, then click the "Test" button to see how well your decision tree classifies the test examples. This opens a window showing which examples were predicted correctly, which were predicted incorrectly, and which the tree could not classify. The pie chart at the bottom provides a quick indication of the decision tree's performance.
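The three outcomes reported by the test window can be sketched in a few lines of Python. This is an illustration only: the nested-dictionary tree format, the function names, and the rule that "no prediction" means no branch matches the example's attribute value are all assumptions, not a description of the tool's internals.

```python
def classify(tree, example):
    """Follow branches from the root; return None if no branch matches."""
    while isinstance(tree, dict):
        value = example.get(tree["attribute"])
        if value not in tree["branches"]:
            return None  # assumed meaning of "cannot make a prediction"
        tree = tree["branches"][value]
    return tree  # a leaf is a plain class label

def evaluate_tree(tree, test_examples, target):
    """Sort test examples into correct, incorrect, and no-prediction groups."""
    results = {"correct": [], "incorrect": [], "no prediction": []}
    for ex in test_examples:
        predicted = classify(tree, ex)
        if predicted is None:
            results["no prediction"].append(ex)
        elif predicted == ex[target]:
            results["correct"].append(ex)
        else:
            results["incorrect"].append(ex)
    return results

# Hypothetical one-level tree with no branch for "overcast".
tree = {"attribute": "outlook", "branches": {"sunny": "yes", "rainy": "no"}}

test_examples = [
    {"outlook": "sunny", "play": "yes"},
    {"outlook": "rainy", "play": "yes"},
    {"outlook": "overcast", "play": "no"},
]

results = evaluate_tree(tree, test_examples, "play")
for outcome, group in results.items():
    print(outcome, len(group))
```

The counts in the three groups are the same kind of summary a pie chart of test performance would display.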

You can also use the graph modes on the control panel to inspect individual nodes: view their "Node Information" and "Mapped Examples," or toggle a probability distribution view.