The usual way to do feature selection in the GLM framework is to apply L1 regularization. This type of regularization tends to drive the weights of irrelevant variables to zero. Although this is a reasonable approach, adding L1 usually decreases model quality.
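As a baseline, this is roughly what plain L1-regularized training looks like in Vowpal Wabbit. The file names and the `--l1` value are hypothetical; the regularization strength needs to be tuned for your data.

```shell
# L1-regularized logistic regression in Vowpal Wabbit.
# train.vw is a placeholder dataset in VW input format.
vw --loss_function logistic --l1 1e-6 -d train.vw -f model.vw
```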
Vowpal Wabbit has two workarounds. One of them is the FTRL-Proximal optimization algorithm. This algorithm accumulates updates in a buffer vector and changes the actual model weights only when the accumulated updates are large enough. As a result, the produced model tends to be very sparse. In order to benefit from this option you still need to set L1. This is also the algorithm described in the well-known Google paper "Ad Click Prediction: a View from the Trenches".
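A sketch of enabling FTRL-Proximal via VW's `--ftrl` flag, combined with L1 as the text suggests. File names and the L1 value are again placeholders.

```shell
# FTRL-Proximal with L1: tends to produce a much sparser model
# than plain online gradient descent at the same --l1 setting.
vw --ftrl --loss_function logistic --l1 1e-6 -d train.vw -f model_ftrl.vw
```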
The other option is a feature mask. The idea is to train a model with very strong L1 regularization in a first pass to do the feature selection. In a second pass you then use the first model to zero out the irrelevant features, which lets you train without any L1 regularization at all.
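The two-pass scheme above can be sketched with VW's `--feature_mask` option; the L1 strength and file names here are illustrative, not prescriptive.

```shell
# Pass 1: heavy L1 to select features; save the resulting sparse model.
vw --loss_function logistic --l1 1e-4 -d train.vw -f mask.vw

# Pass 2: use the first model as a feature mask and drop L1 entirely.
# Only features with non-zero weight in mask.vw are updated.
vw --loss_function logistic --feature_mask mask.vw -d train.vw -f model_final.vw
```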
Copyright © 2017. All rights reserved.