- Column Usage Type:
- Not selected: the field is not used in processing but is passed to the resulting data set unchanged.
- Input: the field values are used as input values.
- Input (external binning): available only if the External binning ranges port is configured. The input field values are binned using the parameters of the configured external binning.
- Input (locked): available after the Coarse Classes node has been executed and trained for the first time. When the node is reconfigured or retrained, only the statistics are recalculated, and the coarse classes binning stays the same for all such columns. It is recommended when the user does not intend to change the coarse classes binning to account for new data.
- Output: the field values are used as output values (target).
- Input Continuous Field Settings:
- Prequanting sets the initial number of quantization bins (fine classes) for the input indicator; the coarse classes are then generated from these bins according to the configured parameters. It is recommended when the continuous field has many unique values.
- Include upper bin bounds: when the checkbox is selected, the upper bound value belongs to the current bin. When it is cleared, the upper bound value belongs to the next bin, as its lower bound. For example, checkbox selected: 10 < x ≤ 20; checkbox cleared: 10 ≤ x < 20.
- Bin count: the number of prequanting bins.
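The effect of the "Include upper bin bounds" checkbox can be illustrated with a minimal Python sketch. It is not the node's implementation: the function `bin_index` and the sample bounds are hypothetical, and the standard `bisect` module is used only to show which bin a value lying exactly on a bound falls into.

```python
from bisect import bisect_left, bisect_right

def bin_index(value, bounds, include_upper=True):
    """Return the zero-based bin index of value for sorted inner bounds.

    include_upper=True  -> bins look like 10 < x <= 20 (bound stays in the current bin)
    include_upper=False -> bins look like 10 <= x < 20 (bound moves to the next bin)
    Hypothetical helper for illustration only.
    """
    if include_upper:
        return bisect_left(bounds, value)
    return bisect_right(bounds, value)

bounds = [10, 20]  # assumed bin bounds: (..., 10], (10, 20], (20, ...)

idx_included = bin_index(20, bounds, include_upper=True)   # 20 stays in the bin ending at 20
idx_excluded = bin_index(20, bounds, include_upper=False)  # 20 opens the next bin
idx_inner = bin_index(15, bounds, include_upper=False)     # inner values are unaffected
```

Only values that coincide with a bound change bins; all other values are assigned identically in both modes.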
- Input Discrete Field Settings:
- Fine classes as binning: the unique values (fine classes) are used as coarse classes. That is, the number of coarse classes equals the number of unique input values, up to a maximum of 1000.
- Configure Output Field:
- Custom "event" value: select the value of the binary target variable that is treated as the event. The choice depends on the objective and logic of the task and affects how the WoE analysis results are interpreted. It is recommended to always assign the rare class as the event (this option is offered by default).
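The following Python sketch suggests why the choice of the event value affects interpretation of the WoE results. It assumes one common convention, WoE = ln(share of events in the class / share of non-events in the class); the formula used by the node may differ in sign or form. The function `woe` and the sample data are hypothetical.

```python
import math

def woe(class_labels, event):
    """Weight of evidence per class under the assumed convention
    ln(event share / non-event share). Assumes every class contains
    at least one event and one non-event."""
    total_event = sum(labels.count(event) for labels in class_labels)
    total_other = sum(len(labels) - labels.count(event) for labels in class_labels)
    result = []
    for labels in class_labels:
        events = labels.count(event)
        others = len(labels) - events
        result.append(math.log((events / total_event) / (others / total_other)))
    return result

# Hypothetical target values grouped by coarse class; "bad" is the rare class.
classes = [
    ["bad"] * 10 + ["good"] * 40,
    ["bad"] * 10 + ["good"] * 140,
]

woe_bad = woe(classes, event="bad")    # rare class chosen as the event
woe_good = woe(classes, event="good")  # frequent class chosen as the event
```

Under this convention, swapping the event value reverses the sign of every WoE value, so the same class reads as "higher risk" or "lower risk" depending on the selection.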
- Configure External Binning:
- External binning identifier: available if the "Input (external binning)" usage type is selected, that is, when the binning is taken from a table rather than calculated by the algorithm from the current data (refer to Configure External Binning).
- Algorithm Settings:
- Minimum class weight, %. The class weight is the ratio of the number of observations whose input indicator value falls into the class to the total number of observations in the source data set. The default is 5%. Classes with a weight below the set value are not generated. A low class weight indicates that the class has low significance and should be merged with another class.
- Maximum class count. The maximum number of classes that the handler may generate for a column. The default is 5, and the value can be changed. A high class count reduces the class weights, whereas a low one reduces the information value. The number of generated classes can be lower than the set value because of the class weight restriction.
- Uniformity. Defines how the variation range of the input indicator is partitioned into classes (bins). The value ranges from 0 to 1 and is 0 by default. At 0, the partitioning maximizes the information value. At 1, the algorithm generates classes that contain roughly the same number of observations. Thus, the first case increases the significance of the indicator, whereas the second increases the interpretability of the coarse classes.
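The class weight definition above can be checked with a short Python sketch. It only computes weights and flags classes below the minimum; it does not reproduce the node's binning algorithm. The function `class_weights`, the constant `MIN_CLASS_WEIGHT`, and the observation counts are hypothetical.

```python
MIN_CLASS_WEIGHT = 5.0  # percent; matches the default described above

def class_weights(counts):
    """Return the weight of each class as a percentage of all observations."""
    total = sum(counts)
    return [100.0 * count / total for count in counts]

# Hypothetical number of observations in each candidate class.
counts = [480, 300, 180, 40]

weights = class_weights(counts)
# Indexes of classes that are too light and would need merging with a neighbor.
too_light = [i for i, w in enumerate(weights) if w < MIN_CLASS_WEIGHT]
```

Here the last class holds 40 of 1000 observations, so its weight falls below the 5% minimum and it would have to be merged with another class, leaving fewer classes than the configured maximum.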