Using machine learning to select variables in data envelopment analysis : Simulations and application using electricity distribution data
Agencies that regulate electricity providers often apply nonparametric data envelopment analysis (DEA) to assess the relative efficiency of each firm. The reliability and validity of DEA are contingent upon selecting relevant input variables. In the era of big (wide) data, the assumptions of traditional variable selection techniques are often violated due to challenges related to high-dimensional