Decision Analytics Journal 3 (2022) 100061

Journal homepage: www.elsevier.com/locate/dajour

Stochastic Data Envelopment Analysis applied to the 2015 Brazilian energy distribution benchmarking model

Marcelo Azevedo Costa, Cláudio Vítor Maquiné Salvador, Aline Veronese da Silva

Department of Production Engineering, Federal University of Minas Gerais, 6627 Antônio Carlos Avenue, Belo Horizonte, MG, Brazil

ARTICLE INFO

Keywords: Data envelopment analysis; Stochastic data envelopment analysis; Stochastic frontier analysis; Benchmarking

ABSTRACT

The Brazilian energy regulator has been applying data envelopment analysis to set operating costs for distribution system operators since 2013. In addition to data envelopment analysis, further adjustments using Bootstrap and a reference efficiency, estimated as 79%, allow companies to have cost efficiencies above their observed costs. Thus, companies are allowed to cross the efficiency frontier. Similarly, stochastic frontier models also allow companies to cross the efficiency frontier. This work proposes the use of stochastic data envelopment analysis as an alternative for estimating efficient costs, thus providing a much simpler alternative. A new estimation algorithm is proposed in which the number of companies crossing the frontier comprises one important parameter. Simulation studies provide convergence evidence of the proposed model, and results using the Brazilian database show that the stochastic data envelopment analysis is a promising model for upcoming tariff review cycles.

1. Introduction

The Brazilian regulator, ANEEL (Agência Nacional de Energia Elétrica), has been estimating regulatory revenues for energy distribution companies, such as regulatory operating costs, since 2003, during the first tariff review cycle (1TRC). Operating costs comprise a small part of the energy tariff, and the regulatory operating cost, or efficient cost, represents the price-cap value that each
company (hereafter named DSO, Distribution System Operator) can charge consumers. In 2011, the regulator started to estimate regulatory operating costs using efficiency frontier methods such as corrected ordinary least squares (COLS) and data envelopment analysis (DEA). The proposed benchmarking methods estimate the efficient cost based on DSO characteristics such as number of consumers, distribution network length, energy market, non-technical losses and observed operating costs, among others. Further details about the Brazilian DSOs benchmarking models are found in Costa et al. [1], da Silva et al. [2] and Lopes et al. [3].

In 2015, during the fourth tariff review cycle (4TRC), the regulator proposed only one frontier model, a DEA model, to estimate efficient costs. The current model is expected to be revised in the upcoming tariff review cycle, starting in 2020. The current DEA model uses non-decreasing returns to scale, with operating costs as the input variable and number of consumers, weighted power consumption, high level network extension, low level network extension, underground network extension, non-technical losses and duration of interruption of energy as output variables. In addition, weight restrictions [4] are included in the DEA model. The total number of DSOs is 61, and the database comprises average yearly data observed for each DSO between tariff review cycles (every 4 to 5 years). Thus, the sample size is 61. Estimated cost efficiencies vary from 27% to 100%, meaning that some DSOs must reduce their observed cost by as much as 73%. Cost efficiency is the ratio between efficient cost and observed cost. da Silva et al. [2], Costa et al. [5], Gil et al. [6] and Lopes et al. [3] have argued that the lower efficiencies are due to the lack of important output variables in the model and the lack of environmental adjustments.

Corresponding author. E-mail address: macosta@ufmg.br (M.A. Costa).

In addition to the DEA model, the regulator is applying secondary ad-hoc adjustments. After estimating the cost efficiencies using DEA, a reference cost efficiency is
estimated using the average of the cost efficiencies above 55%. In the last TRC, the reference cost efficiency was estimated as 79%. In addition, a Bootstrap simulation, proposed by Simar and Wilson [7] and Bogetoft and Otto [8], is applied to generate confidence intervals for the cost efficiencies. Finally, the DEA cost efficiencies, the confidence intervals and the reference cost efficiency are combined, generating final cost efficiencies varying from 37% to 119%. Further details are found in Technical Note 66/2015 [9]. The regulator argues that fully efficient companies, i.e., DSOs with cost efficiencies of 100% estimated using the DEA model, must be rewarded for being fully efficient. Thus, their final efficiencies can be greater than 100%. This ad-hoc procedure also increases the minimum value of the cost efficiencies. The regulator also argues that the ad-hoc procedure adjusts for potential missing variables in the DEA model. One may argue that the ad-hoc procedure simply allows DSOs to cross the efficiency frontier.

https://doi.org/10.1016/j.dajour.2022.100061
Received 1 March 2022; Received in revised form 17 April 2022; Accepted 25 April 2022; Available online 4 May 2022
2772-6622/© 2022 The Authors. Published by Elsevier Inc. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

An alternative class of benchmarking methods, known as stochastic frontier methods (SFA, Stochastic Frontier Analysis), allows DSOs to cross the efficiency frontier. Stochastic frontier methods were originally developed by Aigner et al. [10] using a parametric equation for the efficiency frontier and the sum of two independent random components. One random component represents the technical inefficiency and the second random component represents the noise. The noise component allows the crossing of the efficiency frontier. Banker [11] first proposed a non-parametric stochastic method using DEA, known as Stochastic Data
Envelopment Analysis (SDEA), in which the analyst chooses the number of points crossing the frontier in advance; thus, the efficient frontier is estimated using a linear programming model. Banker and Maindiratta [12] presented the use of maximum likelihood to estimate the SDEA model and claim that "SDEA with multiple-outputs is somewhat harder to solve. Therefore, we leave the development of efficient solution method to future research". Alternatively, Kuosmanen and Kortelainen [13] proposed the Stochastic Non-Smooth Envelopment of Data (StoNED), which can be seen as an SDEA based on maximum likelihood but with an estimation algorithm based on Modified Ordinary Least Squares (MOLS) [14]. The StoNED was adopted to regulate electricity distribution companies in Finland in 2012 [15], achieving better performance than DEA and SFA. Both SDEA and StoNED are semi-parametric frontier models that combine a piecewise linear efficiency frontier and a stochastic homoskedastic composite error. Recently, Jradi and Ruggiero [16] compared the deterministic and stochastic DEA frontiers using simulations, considering the error component as normally distributed and the inefficiency as half-normally distributed. The authors proposed an algorithm to estimate the efficiency frontier using maximum likelihood.

The present work proposes the use of SDEA to estimate the Brazilian DSO cost efficiencies. A new algorithm is presented, based on the work of Jradi and Ruggiero [16]. Simulation studies provide convergence properties of the proposed algorithm under different returns-to-scale assumptions. Results show that the proposed SDEA model achieves cost efficiencies similar to those of the ANEEL ad-hoc methodology. Therefore, we advocate the use of SDEA in upcoming tariff review cycles.

This paper is organized as follows. Section 2 presents the Brazilian electricity benchmarking model, the literature review and the proposed Stochastic Data Envelopment Analysis algorithm. Section 3 presents the simulation results and the case study.
Section 4 presents the conclusion.

2. Materials and methods

2.1. Historical background

The Brazilian electricity distribution sector comprises a monopoly market with 61 regulated companies. Each company, or distribution system operator (DSO), provides service to its own concession area within the 26 Brazilian states. To protect consumers from abusive tariffs, energy regulation is provided by the regulatory agency, ANEEL.

Until 1993, every consumer would pay the same energy price regardless of the state. Companies with negative revenues would get subsidies from the federal government. In 1994, different prices were set for each company based on their own characteristics, such as number of consumers, length of the distribution network and cost of the energy. In 2011, ANEEL started to apply benchmarking methodologies to regulate prices, so that the energy costs could be covered by the revenues while protecting the consumers from abusive tariffs.

The methodology for the estimate of energy tariffs is reviewed every 4 or 5 years in a process named tariff review cycle (TRC). At the beginning of the TRC, tariff prices are defined for each company. Thus, companies can optimize their costs and reach profitability. At the end of the TRC, tariff prices are revised and new values are defined. The fourth TRC was concluded in 2015, and the duration of the cycles is set individually for each company, making the process more effective.

Table 1. Trade-offs (i.e., weight restrictions) between input and output variables imposed by ANEEL in the 2015 DEA-NDRS model.

| Trade-offs | Lower and upper bounds (weight restrictions) |
|---|---|
| Input versus Network Distribution | $580 \leq v_{netdist}/u \leq 2200$ |
| Underground Network versus Network Distribution | $1.00 \leq v_{undernet}/v_{netdist} \leq 2.00$ |
| High Level Network versus Network Distribution | $0.40 \leq v_{highnet}/v_{netdist} \leq 1.00$ |
| Input versus Total number of consumers | $30 \leq v_{cons}/u \leq 145$ |
| Input versus Delivered MWh | $1 \leq v_{MWh}/u \leq 60$ |
| Input versus Non-Technical Losses | $10 \leq v_{NonTechLoss}/u \leq 150$ |
| Input versus Interrupted services | $v_{interrupt}/u \leq 2$ |

2.2. The 2015 Brazilian electricity distribution regulation model

During the fourth tariff review cycle (2015-2018), the regulator applied a DEA-NDRS model to calculate cost efficiencies for each DSO. The Brazilian database is provided by ANEEL and comprises information about 61 electricity distribution companies. The input variable is operational cost. The output variables are number of consumers, weighted power consumption, high level network extension, low level network extension (network distribution), underground network extension, non-technical losses (energy loss) and duration of interruption of energy (amount of time without electricity service), as previously mentioned. The database comprises mean values from 2014 to 2016. A large number of companies have efficiency equal to one; thus, weight restrictions are imposed on the DEA-NDRS model, limiting upper and lower values on the trade-offs between inputs and outputs. The DEA-NDRS model, which is currently used by ANEEL [9], is shown in Eq. (1):

$$
\begin{aligned}
\max_{u,\,v,\,\varphi}\quad & h_0 = \sum_{j=1}^{m_1} v_j y_j^{0} - \sum_{i=1}^{m_2} v_i y_i^{0} + \varphi \\
\text{subject to}\quad & u\,x^{0} = 1, \\
& \sum_{j=1}^{m_1} v_j y_j^{n} - \sum_{i=1}^{m_2} v_i y_i^{n} + \varphi - u\,x^{n} \leq 0, \quad n = 1, 2, \ldots, N, \\
& v_r - \alpha_r u \geq 0, \quad r = 1, \ldots, R, \\
& v_t - \beta_t u \leq 0, \quad t = 1, \ldots, T, \\
& u,\ v_j,\ \varphi \geq 0,
\end{aligned}
\tag{1}
$$

where $h_0$ is the efficiency of the DSO under analysis, $N$ is the total number of DSOs, $m_1$ is the total number of positive outputs, $m_2$ is the total number of negative outputs, $y_j^{n}$ is the $j$th output of DSO $n$, $x_i^{n}$ is the $i$th input of DSO $n$, $u_i$ is the input parameter, $v_j$ is the $j$th output parameter, $\varphi$ is the scale parameter, $\alpha_r$ is the lower bound weight restriction between the parameters $v_r$ and $u$, $\beta_t$ is the upper bound weight restriction between the parameters $v_t$ and $u$, $R$ is the total number of lower bound weight restrictions and $T$ is the total number of upper bound weight restrictions.

Table 1 shows the weight restrictions, where $u$ is the input parameter related to the operational cost and the $v$'s are the output parameters. Further details about the DEA-NDRS are available in Technical Note 162/2017-SRM/ANEEL [17].

After calculating the efficiency scores using the DEA-NDRS model with weight restrictions, as
shown previously, the regulator applies additional steps to calculate the final cost efficiencies [17]. First, a confidence interval for the efficiency score is estimated for each DSO using a bootstrap method. Thus, lower ($\theta^{i}_{inf}$) and upper ($\theta^{i}_{sup}$) bounds for the efficiencies are estimated, i.e., $\theta^{i}_{inf} \leq \theta^{i} \leq \theta^{i}_{sup}$. Although the cost efficiency methodology is applied separately for each DSO at different years, under the hypothetical scenario in which all cost efficiencies for all DSOs are estimated in the first year of the TRC, it can be shown that the final efficiencies are calculated using Eq. (2):

$$
\theta^{i}_{final} = \min\left\{ \max\left\{ 1,\ \frac{\theta^{i}_{inf}}{\theta_{ref}} \right\},\ \frac{\theta^{i}_{sup}}{\theta_{ref}} \right\}
\tag{2}
$$

where $\theta^{i}_{final}$ is the new adjusted efficiency score for the $i$th DSO, $\theta^{i}_{inf}$ is the lower bound of the cost efficiency, $\theta^{i}_{sup}$ is the upper bound of the cost efficiency and $\theta_{ref}$ is the reference score, estimated in the 4TRC as 0.79 (79%). As mentioned, the value of $\theta_{ref}$ is calculated as the mean value of the efficiencies greater than 0.55 (55%) generated by the DEA-NDRS model with weight restrictions. The value of 55% was arbitrarily chosen by the regulator.

The complete procedure (DEA-NDRS + Bootstrap + Eq. (2)) generates larger efficiencies as compared to the originals. In some cases, final efficiencies are greater than 1 (100%), which comprises DSOs crossing the efficiency frontier. Briefly, a DSO with bootstrap lower bound greater than the reference score has its efficiency score calculated as $\theta^{i}_{inf}/\theta_{ref}$. Consequently, the final efficiency is greater than 1. A DSO with the reference score within the interval $\theta^{i}_{inf} \leq \theta_{ref} \leq \theta^{i}_{sup}$ has its final efficiency equal to 1. Finally, a DSO with bootstrap upper bound lower than the reference score ($\theta^{i}_{sup} < \theta_{ref}$) has its efficiency calculated as $\theta^{i}_{sup}/\theta_{ref}$. Consequently, the final efficiency is greater than or equal to the original.

Fig. 1. Comparison between cost efficiencies using the Brazilian DEA-NDRS model with weight restrictions and the procedure using DEA-NDRS + Bootstrap + Eq. (2).

In the last
tariff review cycle (2015-2018), the DEA-NDRS resulted in 6 companies with efficiency scores equal to 1. After the recalculation, 11 companies (18.03%) achieved efficiencies greater than 1 and 15 companies (24.59%) achieved efficiencies equal to 1. Thus, 42.62% of the companies achieved efficiencies greater than or equal to 1. This is illustrated in Fig. 1, which compares the cost efficiencies calculated by the DEA-NDRS and using Eq. (2), as proposed by ANEEL. The DSOs were sorted in increasing order of the DEA-NDRS cost efficiencies. The horizontal line represents the 100% cost efficiency. Thus, points located below the horizontal line comprise companies with efficiencies below 100%, points located on the horizontal line comprise companies with efficiencies of 100%, and points located above the horizontal line comprise companies with efficiencies greater than 100%.

In this scenario, the Stochastic Data Envelopment Analysis (SDEA) can be a more suitable option, since the SDEA allows a portion of DSOs to cross the frontier. Furthermore, SDEA has the advantage of estimating linear equations to calculate the efficient cost for each DSO. Thus, the analyst can compare the linear coefficients among DSOs, evaluating the variables affecting their efficient costs.

2.3. Data envelopment analysis

Data Envelopment Analysis (DEA) is a benchmarking tool, proposed by Charnes et al. [18] and extended by Banker et al. [19], applied worldwide, which uses mathematical linear programming to measure the efficiency of DSOs using input and output variables. In general, DEA can be applied to minimize inputs or maximize outputs. Using an input-oriented approach, the DEA evaluates whether a DSO can reach the same outputs with fewer inputs. Using an output-oriented approach, DEA evaluates whether a DSO can produce more outputs with the same amount of inputs. In both cases, a DSO is fully efficient if there is no need either to minimize inputs or maximize outputs [20].

DEA models can assume different returns to scale properties. Constant returns to scale (CRS) or variable
returns to scale (VRS) are the most common. The choice of the orientation must rely on the data and the objectives of the research [21]. Furthermore, DEA has a major advantage, which is the non-parametric estimate of the frontier, i.e., without specifying the parametric equation of the production or cost function. Further details about DEA are found in Cook et al. [20], Bogetoft and Otto [8] and elsewhere.

2.4. Stochastic Frontier Analysis

DEA and SFA (Stochastic Frontier Analysis) have been used for both managerial and economic research, mainly in the last decade. SFA is more widely used in Economics. Lampe and Hilgers [22] claim that DEA research activity is not as fast to adopt new concepts as SFA. SFA is a stochastic frontier model, first proposed by Aigner et al. [10] and Meeusen and van Den Broeck [23], which has the advantage of distinguishing two types of errors: inefficiency and noise. The structure of the compound error for a production frontier is given by $\epsilon_i = v_i - u_i$, where $v_i$ and $u_i$ are independent random variables, $v_i$ is normally distributed, $v_i \sim N(0, \sigma_v^2)$, and $u_i$ follows a one-sided distribution, such as a half-normal distribution, $u_i \sim N^{+}(0, \sigma_u^2)$. For a cost frontier, $\epsilon_i = v_i + u_i$.

In order to apply SFA, the parametric equation of the efficiency frontier must be specified. For a production frontier, the Cobb-Douglas function $C(y_1, \ldots, y_m) = \beta_0\, y_1^{\beta_1} \cdots y_m^{\beta_m}$ [24] is widely applied, where $y_j$ are the outputs and $\beta_0, \ldots, \beta_m$ are the parameters of the production function. In the case of a cost function, the Translog function [25] can be applied. However, even using the Translog function, the assumptions of monotonicity and convexity of the cost function can be violated, creating perverse incentives to produce fewer outputs to improve the efficiency [15]. Further details about cost frontier models are shown in Section 2.6.

2.5. Stochastic data envelopment analysis

Let $\mathbf{x}_j = (x_{1j}, \ldots, x_{Kj})$ be the vector of inputs of dimension $K$ for the $j$th decision making unit (DMU) and $y_j$ the respective output, i.e., a
single output. The SDEA model that estimates the production function is given by Eq. (3), as originally proposed by Banker [11]:

$$
\begin{aligned}
\min\quad & \sum_{i=1}^{N} \left[ \tau e_{1i} + (1 - \tau) e_{2i} \right] \\
\text{subject to}\quad & y_i = \alpha_i + \beta_{1i} x_{1i} + \cdots + \beta_{Ki} x_{Ki} + e_{1i} - e_{2i}, \quad i = 1, \ldots, N, \\
& \alpha_i + \beta_{1i} x_{1i} + \cdots + \beta_{Ki} x_{Ki} \leq \alpha_j + \beta_{1j} x_{1i} + \cdots + \beta_{Kj} x_{Ki}, \quad \forall\, i, j = 1, \ldots, N, \\
& \beta_{ki} \geq 0, \quad k = 1, \ldots, K, \quad i = 1, \ldots, N, \\
& e_{1i},\ e_{2i} \geq 0, \quad i = 1, \ldots, N,
\end{aligned}
\tag{3}
$$

where $\alpha_i + \beta_{1i} x_{1i} + \cdots + \beta_{Ki} x_{Ki}$ comprises the piecewise linear production frontier for each DMU $i$. Similar to SFA models, an SDEA compound error structure can be written as $e_i = e_{1i} - e_{2i}$. Thus, if $e_i > 0$, then the $i$th DMU has an output above the production frontier. Likewise, if $e_i < 0$, then the $i$th DMU has an output below the production frontier. If $e_i = 0$, the DMU output is located at the efficiency frontier. $\tau$ is the parameter, previously selected by the analyst, which controls the proportion of points crossing the production frontier.

Returns to scale properties in the SDEA model are implemented by restricting the value of $\alpha_i$. Constant returns to scale are implemented assuming $\alpha_i = 0$. Non-decreasing returns to scale are implemented using $\alpha_i \leq 0$, and variable returns to scale are implemented using $\alpha_i \in \mathbb{R}$.

The SDEA model shown in Eq. (3) is similar to a quantile multiple linear regression model where convexity and monotonicity are assumed [16]. The solution of Eq. (3) assumes that approximately $100\tau$ percent of the data will be below the production frontier. If $\tau$ is equal to 1, the model becomes deterministic and the compound error expresses technical inefficiency only. It can be shown that both mathematical representations of the error, $e_i = e_{1i} - e_{2i}$ (SDEA) and $e_i = v_i - u_i$ (SFA), are equivalent. Therefore, the optimal value of $\tau$ can be chosen assuming an SFA stochastic compound error structure, as presented by Jradi and Ruggiero [16]. Eq. (4) shows the probability density function (pdf) of the compound error for the production function:

$$
f(\epsilon) = \frac{2}{\sigma}\, \phi\!\left( \frac{\epsilon}{\sigma} \right) \Phi\!\left( -\frac{\epsilon \lambda}{\sigma} \right)
\tag{4}
$$

where $\sigma = \sqrt{\sigma_u^2 + \sigma_v^2}$, $\lambda = \sigma_u / \sigma_v$, $\phi$ is the probability density function of a standard normal random variable and $\Phi$ is the respective cumulative distribution function.

Jradi and Ruggiero [16] propose an algorithm to estimate the
optimal value of $\tau$ by searching over a grid of values. For each value of $\tau = 0.5 + 0.001(k - 1)$, $k = 1, \ldots, 491$, the frontier and the errors are estimated using Eq. (3). Using the estimated residuals, $e_i = e_{1i} - e_{2i}$, and the statistical properties of the first and second moments of the $\epsilon$ random variable, values for $\sigma^2$ and $\lambda$ are computed. The optimal estimate of $\tau$ is the value that achieves the maximum likelihood value based on Eq. (4). Finally, one may argue that the stochastic estimate of the production SDEA model is the solution of Eq. (5), shown below:

$$
\begin{aligned}
\max_{\alpha_i,\,\beta_i,\,\sigma,\,\lambda}\quad & \sum_{i=1}^{N} \left[ \log \phi\!\left( \frac{e_i}{\sigma} \right) + \log \Phi\!\left( -\frac{\lambda e_i}{\sigma} \right) - \log \sigma \right] \\
\text{subject to}\quad & y_i = \alpha_i + \beta_{1i} x_{1i} + \cdots + \beta_{Ki} x_{Ki} + e_i, \quad i = 1, \ldots, N, \\
& \alpha_i + \beta_{1i} x_{1i} + \cdots + \beta_{Ki} x_{Ki} \leq \alpha_j + \beta_{1j} x_{1i} + \cdots + \beta_{Kj} x_{Ki}, \quad \forall\, i, j = 1, \ldots, N, \\
& \beta_{ki} \geq 0, \quad k = 1, \ldots, K, \quad i = 1, \ldots, N.
\end{aligned}
\tag{5}
$$

Eq. (5) was first presented by Banker and Maindiratta [12]. However, the solution of Eq. (5), which comprises a non-linear maximization problem subject to linear constraints, is still an open topic for research.

2.6. Comparison of deterministic and stochastic cost frontier models

As mentioned, the present work proposes a new SDEA algorithm to estimate the cost efficiencies of the Brazilian DSOs. The main frontier models for cost regulation are based on Eq. (6):

$$
\ln x = \ln C(y_1, \ldots, y_k) + \delta z + u + v
\tag{6}
$$

where $x$ is the observed cost (inputs), $C$ is the function that characterizes the efficient cost frontier, $y_1, \ldots, y_k$ are the outputs, $z$ represents exogenous components, $\delta$ is the coefficient associated with the exogenous component, $u$ is the random variable representing inefficiency and $v$ is the random variable representing statistical noise. In deterministic models, deviations from the cost frontier are associated with inefficiency only, i.e., $v = 0$ and $u \geq 0$. As mentioned, the DEA model assumes only inefficiency components and a non-parametric frontier equation.

Stochastic models [26] assume both noise ($v$) and inefficiency ($u$) components. Consequently, the probability distributions for the $v$ and $u$ components must be specified. In general, the simplest model consists of assuming a truncated normal distribution (half-normal) for the inefficiency component, $u \sim N^{+}(0, \sigma_u^2)$, and a normal
distribution for the noise component, $v \sim N(0, \sigma_v^2)$. The compound error is written as $\epsilon = u + v$. Using the probability distributions of $u$, $f_u(u)$, and $v$, $f_v(v)$, the probability distribution of the compound error $\epsilon$ is written as

$$
f_\epsilon(\epsilon) = \int_{0}^{\infty} f_v(\epsilon - u)\, f_u(u)\, du
\tag{7}
$$

The solution is given by $f_\epsilon(\epsilon) = \frac{2}{\sigma}\, \phi\!\left( \frac{\epsilon}{\sigma} \right) \Phi\!\left( \frac{\lambda \epsilon}{\sigma} \right)$, where $\sigma^2 = \sigma_u^2 + \sigma_v^2$ and $\lambda = \sigma_u / \sigma_v$. Other probability distributions can also be defined [10,23,27]. Given the density function of the compound error, the parameters of the cost function $\ln x = \ln C(y_1, \ldots, y_k)$, as well as the parameters $\lambda$ and $\sigma^2$, can be estimated by maximizing the likelihood function [28]. Nonetheless, according to Sartori [29], the maximum likelihood estimate for parameter $\lambda$ can be infinite with a non-zero probability for smaller samples. Consequently, the SFA model can wrongly identify all companies as fully efficient.

The procedure above provides the estimates of the frontier, or the cost function parameters. The estimates of the efficiency scores, i.e., the ratio between efficient cost and observed cost, imply the estimates of the inefficiency components $u$. In this case, the conditional density of $u \mid \epsilon$ can be calculated as shown in Eq. (8):

$$
f_{u \mid \epsilon}(u \mid \epsilon) = \frac{f(u, \epsilon)}{f_\epsilon(\epsilon)} = \frac{f_u(u)\, f_v(\epsilon - u)}{f_\epsilon(\epsilon)}
\tag{8}
$$

Given the conditional density distribution, Jondrow et al. [24] and Bogetoft and Otto [8] present three possible equations to estimate the efficiency score:

$$
\theta^{1}_{i} = e^{-E(u \mid \epsilon_i)}
\tag{9}
$$

$$
\theta^{2}_{i} = E\!\left( e^{-u} \mid \epsilon_i \right)
\tag{10}
$$

$$
\theta^{3}_{i} = e^{-M(u \mid \epsilon_i)}
\tag{11}
$$

where $E(u \mid \epsilon) = \int_{0}^{\infty} u\, f_{u \mid \epsilon}(u \mid \epsilon)\, du$, $E(e^{-u} \mid \epsilon_i) = \int_{0}^{\infty} e^{-u} f_{u \mid \epsilon}(u \mid \epsilon)\, du$ and $M(u \mid \epsilon_i)$ is the conditional mode. In general, Eq. (10) is the most commonly applied [8]. However, Eq. (10) comprises an estimate for the mean value of the efficiency scores. Consequently, it is unlikely that any company will reach an efficiency score equal to one (100%). Thus, compound error models, in general, do not estimate fully efficient DSOs. Kuosmanen et al. [15] mention that companies regulated by compound error models and the conditional mean estimator are not able to reach the efficiency frontier, even if their efficiency is adjusted according to the
amount indicated by the model.

The StoNED model [13] also applies a compound error structure, but the estimate of the cost function is accomplished in the first stage using non-parametric convex least squares, as shown in Eq. (12):

$$
\begin{aligned}
\min_{\beta,\,\alpha,\,u}\quad & \sum_{i=1}^{N} \left( \ln x_i - \ln \varphi_i \right)^2 \\
\text{subject to}\quad & \varphi_i = \alpha_i + \beta_{1i} y_{1i} + \cdots + \beta_{Ki} y_{Ki}, \quad i = 1, \ldots, N, \\
& \alpha_i + \beta_{1i} y_{1i} + \cdots + \beta_{Ki} y_{Ki} \geq \alpha_j + \beta_{1j} y_{1i} + \cdots + \beta_{Kj} y_{Ki}, \quad \forall\, j, i = 1, \ldots, N, \\
& \beta_{ki} \geq 0, \quad i = 1, \ldots, N.
\end{aligned}
\tag{12}
$$

StoNED first applies non-parametric least squares to estimate the parameters of the frontier and the compound error. In the case of the StoNED, the compound error estimate is defined by the residual of the model, i.e., $\epsilon_i = \ln x_i - \ln \varphi_i$. In the original proposal, the normal and half-normal distributions are also used for the noise and the inefficiency components, respectively: $\epsilon_i = u_i + v_i$. The compound error parameters, $\lambda$ and $\sigma^2$, are estimated in a second stage using the method of moments [13]. The StoNED resembles a COLS estimate, in which the parameters of the frontier are estimated using least squares. However, the StoNED uses a non-parametric frontier equation. From the residuals of the non-parametric OLS estimate, the bias correction of the frontier and the compound error parameters are estimated. The final estimate of the efficiency scores using StoNED is similar to that of the SFA model, in which the conditional distribution is required.

The SDEA resembles the StoNED and DEA methods by assuming a non-parametric form for the efficiency frontier. It is worth noticing that SDEA, StoNED and SFA allow DSOs to cross the efficiency frontier. According to Tone [30], other methods also allow points to cross the frontier, such as order-$m$ and order-$\alpha$ frontiers [31] and chance-constrained programming [32]. Thus, these methodologies assume directly, as in the case of SFA and StoNED, or indirectly, in the case of SDEA, a compound error structure. As mentioned by Sartori [29], da Silva et al. [2] and Azzalini [33], compound error models such as SFA, StoNED and SDEA present serious problems of convergence, particularly related to the parameter $\lambda$. Those problems can be minimized using a large database or a smaller
number of variables as compared to the number of DSOs. Nonetheless, this is not the case of the Brazilian electricity distribution system operators (DSOs). da Silva et al. [2] identified convergence problems in the compound error model using data from the 4TRC. Alternatives to adjust compound error models are shown in the literature, such as the Bayesian approach described by Bayes and Branco [34]. However, the Bayesian approach requires the use of MCMC (Markov Chain Monte Carlo) methods, which are sensitive to the initial conditions of the algorithm.

In short, compound error models such as SFA and StoNED may present convergence problems when estimating their parameters, especially the efficiency scores. In addition, different stochastic assumptions for the noise and inefficiency components may generate different results. Consequently, estimated operational cost efficiencies may be unreliable. Thus, international regulation agencies prefer DEA [35].

2.7. Proposed SDEA algorithm

The proposed SDEA estimation algorithm is based on the maximization of the compound error likelihood, subject to an SDEA piecewise linear cost frontier model, as shown in Eq. (13):

$$
\begin{aligned}
\max_{\alpha_i,\,\beta_i,\,\sigma,\,\lambda}\quad & \sum_{i=1}^{N} \left[ \log \phi\!\left( \frac{e_i}{\sigma} \right) + \log \Phi\!\left( \frac{\lambda e_i}{\sigma} \right) - \log \sigma \right] \\
\text{subject to}\quad & C(y_{1i}, \ldots, y_{Ki}) = \alpha_i + \beta_{1i} y_{1i} + \cdots + \beta_{Ki} y_{Ki}, \\
& e_i = \log x_i - \log C(y_{1i}, \ldots, y_{Ki}), \quad i = 1, \ldots, N, \\
& \alpha_i + \beta_{1i} y_{1i} + \cdots + \beta_{Ki} y_{Ki} \geq \alpha_j + \beta_{1j} y_{1i} + \cdots + \beta_{Kj} y_{Ki}, \quad \forall\, i, j = 1, \ldots, N, \\
& \beta_{ki},\ \alpha_i \geq 0, \quad i = 1, \ldots, N, \quad k = 1, \ldots, K, \\
& \sigma,\ \lambda \geq 0.
\end{aligned}
\tag{13}
$$

Eq. (13) comprises the original SDEA problem shown in Eq. (5), but assuming a cost frontier formulation as described in Eq. (12). Following Jradi and Ruggiero [16], our proposed algorithm also applies the proportion of points crossing the frontier as a proxy for the estimation of $\lambda$. Nevertheless, a different representation is proposed, as follows. From Eq. (4), the probability of points crossing a production frontier can be written as a function of the $\lambda$ parameter, as shown in Eq. (14):

$$
P(\epsilon > 0) = \int_{0}^{\infty} \frac{2}{\sigma}\, \phi\!\left( \frac{\epsilon}{\sigma} \right) \Phi\!\left( -\frac{\epsilon \lambda}{\sigma} \right) d\epsilon = 0.5 - \frac{1}{\pi} \arctan(\lambda)
\tag{14}
$$

For the cost frontier, the compound error is written as $v + u$. Thus, the probability of points crossing the cost frontier can be written as a function of the $\lambda$ parameter, as follows. Let
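$\epsilon_1 = v - u$ and $\epsilon_2 = v + u$, where $\epsilon_1$ comprises the production function compound error and $\epsilon_2$ comprises the cost frontier compound error. The pdfs of $\epsilon_1$ and $\epsilon_2$ can be written as follows:

$$
f_{\epsilon_1}(x) = \frac{2}{\sigma}\, \phi\!\left( \frac{x}{\sigma} \right) \Phi\!\left( -\frac{x \lambda}{\sigma} \right)
\tag{15}
$$

$$
f_{\epsilon_2}(x) = \frac{2}{\sigma}\, \phi\!\left( \frac{x}{\sigma} \right) \Phi\!\left( \frac{x \lambda}{\sigma} \right)
\tag{16}
$$

It can be shown that $\phi(x/\sigma) = \phi(-x/\sigma)$. Consequently, $f_{\epsilon_1}(x) = f_{\epsilon_2}(-x)$, or $f_{\epsilon_1}(-x) = f_{\epsilon_2}(x)$. Thus, it can be shown that

$$
\begin{aligned}
P(\epsilon_1 > 0) &= P(\epsilon_2 < 0), \\
P(\epsilon_1 > 0) &= 1 - P(\epsilon_2 > 0), \\
P(\epsilon_2 > 0) &= 1 - P(\epsilon_1 > 0).
\end{aligned}
\tag{17}
$$

From Eq. (14), the probability of points crossing the cost frontier can be written as follows:

$$
P(\epsilon_2 < 0) = 1 - \left[ 0.5 + \frac{1}{\pi} \arctan(\lambda) \right] = 0.5 - \frac{1}{\pi} \arctan(\lambda)
\tag{18}
$$

Finally, from Eq. (18), the $\lambda$ parameter can be written as a function of $P(\epsilon_2 < 0)$, as shown in Eq. (19):

$$
\begin{aligned}
P(\epsilon_2 < 0) &= 1 - \left[ 0.5 + \frac{1}{\pi} \arctan(\lambda) \right], \\
\frac{1}{\pi} \arctan(\lambda) &= 0.5 - P(\epsilon_2 < 0), \\
\lambda &= \tan\!\left( \pi \left[ 0.5 - P(\epsilon_2 < 0) \right] \right).
\end{aligned}
\tag{19}
$$

Eq. (19) shows that, using the SDEA cost frontier model, it is possible to estimate the $\lambda$ parameter by defining the probability of points crossing the frontier, $P(\epsilon_2 < 0)$. Given a sample of size $N$, we propose to estimate the $\lambda$ parameter by gradually allowing points to cross the efficiency frontier, thus approximating the value of $P(\epsilon_2 < 0)$ by the observed proportion of points crossing the frontier, $P(\epsilon_2 < 0) \approx k/N$, as shown in Eq. (20):

$$
\lambda = \tan\!\left( \pi \left[ 0.5 - k/N \right] \right)
\tag{20}
$$

where $k = 0, 1, \ldots, N/2$.

In order to estimate the cost efficiency frontier, i.e., the piecewise linear regression parameters, the SDEA linear model using $\tau = 1$ is applied, as shown in Eq. (21):

$$
\begin{aligned}
\min\quad & \sum_{i=1}^{N} e_i \\
\text{subject to}\quad & y_i = \alpha_i + \beta_{1i} x_{1i} + \cdots + \beta_{Ki} x_{Ki} - e_i, \quad i = 1, \ldots, N, \\
& \alpha_i + \beta_{1i} x_{1i} + \cdots + \beta_{Ki} x_{Ki} \leq \alpha_j + \beta_{1j} x_{1i} + \cdots + \beta_{Kj} x_{Ki}, \quad \forall\, i, j = 1, \ldots, N, \\
& \beta_{ki} \geq 0, \quad k = 1, \ldots, K, \quad i = 1, \ldots, N, \\
& e_i \geq 0, \quad i = 1, \ldots, N.
\end{aligned}
\tag{21}
$$

The proposed algorithm starts by assuming that no points are crossing the efficiency frontier; thus, the frontier is estimated using Eq. (21) and $P(\epsilon_2 < 0) = 0$. Next, the points located on the frontier are the candidates to cross the frontier. Each of these points is temporarily removed from the data and a new frontier is estimated, also using Eq. (21). Using the complete data, the residuals of the cost frontier model are calculated as $e_i = \log x_i - \log C(y_{1i}, \ldots, y_{Ki})$, as shown in Eq. (13). Assuming that $\lambda = \tan(\pi[0.5 - 1/N])$, the $\sigma^2$ estimate is calculated using a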
univariate maximum likelihood search, applying, for instance, the golden-section search algorithm [36]. Therefore, one maximum likelihood value is computed for each point on the frontier. The point with the maximum value is selected as the candidate to cross the frontier. The procedure is repeated until a maximum number of points defined by the analyst, say $N/2$, crosses the frontier. Briefly, the proposed algorithm comprises the following steps:

1. Set $k = 1$.
2. Compute $\lambda = \tan(\pi[0.5 - k/N])$.
3. Solve the SDEA model, Eq. (21).
4. Select the points located on the frontier, i.e., $e_i = 0$.
5. For each point located on the frontier:
   (a) Remove temporarily the point and solve the SDEA model, Eq. (21).
   (b) Using the complete dataset, calculate the residuals $e_i = \log x_i - \log C(y_{1i}, \ldots, y_{Ki})$.
   (c) Using the residuals, calculate $\sigma^2 = \arg\max_{\sigma} LL(\sigma)$, where $LL(\sigma) = \sum_{i=1}^{N} \left[ \log \phi\!\left( \frac{e_i}{\sigma} \right) + \log \Phi\!\left( \frac{\lambda e_i}{\sigma} \right) - \log \sigma \right]$ is the likelihood function.
   (d) Select the point located on the frontier that achieved the maximum likelihood value and remove it from the data set.
   (e) Set $k = k + 1$ and return to Step 2.
6. Repeat steps 2 to 5 until a maximum number of points, defined by the analyst, crosses the frontier.
7. The final solution comprises the value of $k$ and the respective efficiency frontier which achieved the maximum likelihood function value.

Fig. 2 illustrates the proposed algorithm. Using Eq. (21), the cost frontier and the points located on the frontier are estimated, as shown in Fig. 2(a). The points are gradually evaluated and the candidate with the maximum likelihood value crosses the frontier, as shown in Fig. 2(b). The procedure is repeated until a maximum number of points crosses the frontier, and the likelihood function is evaluated at each step, as shown in Fig. 2(c). The final solution comprises the proportion of points crossing the frontier and the estimated cost frontier with the maximum likelihood value, as shown in Fig. 2(d). It is worth mentioning that, by evaluating each point located on the frontier and gradually increasing the number of crossing points, the proposed algorithm avoids the estimate of
overlapping frontiers or crossing quantiles [37].

2.8. Simulation study

In this section, a simulation study is presented to evaluate the statistical properties of the proposed SDEA algorithm. The simulation model is based on the statistical distribution and the statistical correlation between the observed cost (input) and the weighted market (output) of the Brazilian DSOs. Data is generated for $N = 61$ observations (DSOs). The output is $\log(y_i) \sim \text{Uniform}(\min = 8, \max = 16)$ and $\sigma^2 = \sigma_u^2 + \sigma_v^2 = 0.7$. Different values for $\lambda$ are evaluated using different numbers of points crossing the frontier. The selected numbers of points crossing the frontier are $k = 1, 3, 5, 8, 15$. Thus, based on Eq. 20, the respective values of $\lambda$ are $\lambda = 19.4, 6.42, 3.8, 2.29, 1.02$. Consequently, $\sigma_u^2 = \sigma^2\lambda^2/(1 + \lambda^2)$ and $\sigma_v^2 = \sigma^2/(1 + \lambda^2)$. Therefore, $\epsilon = v + u$, with $v \sim N(0, \sigma_v)$ and $u \sim |N(0, \sigma_u)|$. The simulated costs are generated as $x_i = 0.07 \cdot y_i \cdot e^{\epsilon_i}$, denoting a constant returns to scale model.

2.9. The use of SDEA to approximate DEA results with weight restrictions

SDEA can be used to estimate the DEA-NDRS cost frontier. To illustrate this approach, an SDEA model using one input ($x$) and one output ($y$) is presented in Eq. 22:

$$\begin{aligned}
\min\ & \sum_{i=1}^{N} e_i \\
\text{s.t.}\ & x_i = \alpha_i + \beta_i y_i + e_i, && i = 1, \ldots, N \\
& \alpha_i + \beta_i y_i \ge \alpha_j + \beta_j y_i, && i, j = 1, \ldots, N \\
& \beta_i,\ e_i,\ \alpha_i \ge 0, && i = 1, \ldots, N
\end{aligned} \tag{22}$$

As mentioned, for the non-decreasing returns to scale property, $\alpha_i \ge 0$. For each DSO $i$, the efficient cost is given by $\hat{x}_i = \alpha_i + \beta_i y_i$ and the cost efficiency (input oriented) is given by $\hat{x}_i / x_i$. It is worth noticing that Eq. 22 comprises a specific SDEA formulation in which no DSO crosses the frontier, i.e., $\tau = 0$. Consequently, both DEA and SDEA results are identical.

If the DEA model applies weight restrictions, as proposed by the Brazilian regulator, one alternative to estimate a similar SDEA model is to use a data set of fully efficient DSOs, as follows. First, the efficiency score $\theta_i$ is calculated using the DEA model with weight restrictions. Second, the DEA cost efficiency $\theta_i$ is included in the SDEA optimization model, as shown in Eq. 23:

$$\begin{aligned}
\min\ & \sum_{i=1}^{N} e_i \\
\text{s.t.}\ & \theta_i x_i = \alpha_i^{*} + \beta_i^{*} y_i + e_i, && i = 1, \ldots, N \\
& \alpha_i^{*} + \beta_i^{*} y_i \ge \alpha_j^{*} + \beta_j^{*} y_i, && i, j = 1, \ldots, N \\
& \beta_i^{*},\ e_i,\ \alpha_i^{*} \ge 0, && i = 1, \ldots, N
\end{aligned} \tag{23}$$

Briefly, by
including DEA efficiencies, which were estimated using weight restrictions, in Eq. 23, the SDEA model is estimated using all DSOs at the frontier. The SDEA cost frontier with weight restriction is given by $\hat{x}_i = \alpha_i^{*} + \beta_i^{*} y_i$. Thus, the estimated SDEA models using Eqs. 22 and 23 allow the analyst to compare the piecewise linear equations of the efficiency costs with and without weight restrictions for each DSO. If $p$ outputs are available, then the piecewise cost frontier is written as

$$\hat{x}_i = \alpha_i + \beta_{1i}y_{1i} + \cdots + \beta_{pi}y_{pi} \tag{24}$$

Eq. 24 can also be written as $\hat{x}_i = \alpha_i + \mathbf{y}_i^{T}\boldsymbol{\beta}_i$, where $\mathbf{y}_i$ is a vector of outputs, $\mathbf{y}_i = (y_{1i}, y_{2i}, \ldots, y_{pi})$.

Fig. 2. Proposed algorithm to estimate the optimal number of points crossing the cost frontier and the maximum likelihood parameters.

3. Results

Table 2 presents the simulated results for the estimated number of points crossing the efficiency frontier. Three SDEA models using different returns to scale were evaluated. As mentioned, the simulated data sets were generated assuming constant returns to scale (CRS). Therefore, it is expected that the SDEA-CRS model achieves better performance. Using the SDEA-CRS model, the median values indicate that the model was able to detect the correct number of points crossing the frontier in most cases, except for simulations with the largest number of points crossing the frontier ($k = 15$). In these cases, the estimated number of points was slightly greater. Furthermore, the mean value indicates an estimation bias, i.e., a slightly larger number of points crossing the frontier. Differences between mean and median values indicate that the empirical distribution is asymmetric. In fact, the distribution of the estimated values is truncated at $k = 29$. It is worth noticing that the real value of the number of points crossing the frontier is within the quartile intervals, i.e., between the first and third quartiles, for all simulated values of $k$. Using the SDEA-NDRS model, in general, the median values were slightly larger
as compared to the SDEA-CRS model, and the quartile intervals were also larger, as shown in Table 2. Nevertheless, simulated results using SDEA-CRS and SDEA-NDRS are closer. Using the SDEA-VRS model, an interesting behavior is found. For $k = 1$, the SDEA-VRS presented similar results to SDEA-CRS. For $k = 3, 5, 8$ and $15$, the SDEA-VRS model underestimated the number of points crossing the efficiency frontier. It is worth mentioning that the SDEA-VRS model is more flexible than SDEA-CRS and SDEA-NDRS. In general, the SDEA-VRS requires more piecewise linear functions. Consequently, more points are estimated on the frontier and, therefore, fewer points cross the frontier.

Table 2. Summary statistics for the estimated number of points crossing the efficiency frontier using the simulated data.

| RTS | k | λ | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
| CRS | 1 | 19.4 | 1 | 1 | 1 | 1.4 | 1 | 22 |
| CRS | 3 | 6.42 | 1 | 1 | 3 | 3.1 | 4 | 28 |
| CRS | 5 | 3.8 | 1 | 3 | 5 | 5.5 | 6 | 29 |
| CRS | 8 | 2.29 | 1 | 6 | 8 | 9.8 | 12 | 29 |
| CRS | 15 | 1.02 | 1 | 11 | 17 | 17.5 | 25 | 29 |
| NDRS | 1 | 19.4 | 1 | 1 | 1 | 1.7 | 2 | 24 |
| NDRS | 3 | 6.42 | 1 | 1 | 2 | 3.4 | 4 | 29 |
| NDRS | 5 | 3.8 | 1 | 3 | 5 | 6.5 | 8 | 29 |
| NDRS | 8 | 2.29 | 1 | 6 | 9 | 11.4 | 15 | 29 |
| NDRS | 15 | 1.02 | 1 | 11 | 17 | 17.7 | 25 | 29 |
| VRS | 1 | 19.4 | 1 | 1 | 1 | 1.4 | 1 | 29 |
| VRS | 3 | 6.42 | 1 | 1 | 1 | 2.8 | 3 | 29 |
| VRS | 5 | 3.8 | 1 | 1 | 3 | 5 | 6 | 29 |
| VRS | 8 | 2.29 | 1 | 2 | 7 | 9 | 13 | 29 |
| VRS | 15 | 1.02 | 1 | 7 | 14 | 14.6 | 22 | 29 |

Table 4 shows the simulated results for the estimated $\sigma^2$ parameter. In general, both the median and mean values are slightly underestimated. Nevertheless, a small bias is expected for small samples. In general, results using SDEA-CRS, SDEA-NDRS and SDEA-VRS are very close.

Table 3. Summary statistics for the λ parameter using the simulated data.

| RTS | λ | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
| CRS | 19.4 | 0.47 | 19.40 | 19.40 | 16.86 | 19.40 | 19.40 |
| CRS | 6.42 | 0.13 | 4.79 | 6.42 | 9.72 | 19.40 | 19.40 |
| CRS | 3.8 | 0.08 | 3.13 | 3.80 | 5.63 | 6.42 | 19.40 |
| CRS | 2.29 | 0.08 | 1.41 | 2.29 | 2.96 | 3.13 | 19.40 |
| CRS | 1.02 | 0.08 | 0.29 | 0.83 | 1.32 | 1.57 | 19.40 |
| NDRS | 19.4 | 0.13 | 9.67 | 19.40 | 15.74 | 19.40 | 19.40 |
| NDRS | 6.42 | 0.08 | 4.79 | 9.67 | 10.39 | 19.40 | 19.40 |
| NDRS | 3.8 | 0.08 | 2.29 | 3.80 | 5.94 | 6.42 | 19.40 |
| NDRS | 2.29 | 0.08 | 1.03 | 2.00 | 3.07 | 3.13 | 19.40 |
| NDRS | 1.02 | 0.08 | 0.29 | 0.83 | 1.28 | 1.57 | 19.40 |
| VRS | 19.4 | 0.08 | 19.40 | 19.40 | 17.73 | 19.40 | 19.40 |
| VRS | 6.42 | 0.08 | 6.42 | 19.40 | 13.16 | 19.40 | 19.40 |
| VRS | 3.8 | 0.08 | 3.13 | 6.42 | 10.05 | 19.40 | 19.40 |
| VRS | 2.29 | 0.08 | 1.26 | 2.65 | 6.16 | 9.67 | 19.40 |
| VRS | 1.02 | 0.08 | 0.47 | 1.14 | 3.21 | 2.65 | 19.40 |

Table 4. Summary statistics for the σ² parameter using the simulated data (σ² = 0.7).

| RTS | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
| CRS | 0.33 | 0.63 | 0.68 | 0.72 | 0.72 | 0.89 |
| CRS | 0.36 | 0.63 | 0.68 | 0.68 | 0.73 | 0.96 |
| CRS | 0.32 | 0.62 | 0.69 | 0.68 | 0.75 | 0.96 |
| CRS | 0.33 | 0.60 | 0.68 | 0.67 | 0.76 | 1.09 |
| CRS | 0.43 | 0.58 | 0.66 | 0.69 | 0.79 | 1.22 |
| NDRS | 0.29 | 0.61 | 0.66 | 0.66 | 0.71 | 0.90 |
| NDRS | 0.36 | 0.62 | 0.68 | 0.67 | 0.74 | 0.95 |
| NDRS | 0.34 | 0.58 | 0.67 | 0.66 | 0.74 | 0.97 |
| NDRS | 0.33 | 0.55 | 0.65 | 0.65 | 0.75 | 1.13 |
| NDRS | 0.41 | 0.58 | 0.65 | 0.68 | 0.76 | 1.25 |
| VRS | 0.34 | 0.59 | 0.64 | 0.64 | 0.69 | 0.89 |
| VRS | 0.34 | 0.61 | 0.67 | 0.66 | 0.72 | 0.93 |
| VRS | 0.31 | 0.61 | 0.69 | 0.67 | 0.75 | 0.99 |
| VRS | 0.34 | 0.56 | 0.69 | 0.68 | 0.79 | 1.12 |
| VRS | 0.42 | 0.60 | 0.70 | 0.74 | 0.86 | 1.52 |

Fig. 3. Pairwise Spearman correlation plot of the efficiency costs estimated by the Brazilian regulator and the proposed SDEA models.

Table 5. Summary statistics for the estimated number of points crossing the efficiency frontier using the simulated data (n = 122).

| RTS | k | λ | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
| CRS | 2 | 19.4 | 1 | 1 | 2 | 2.02 | 3 | 15 |
| CRS | 6 | 6.42 | 1 | 4 | 5 | 5.60 | 7 | 18 |
| CRS | 10 | 3.8 | 1 | 8 | 10 | 10.29 | 12 | 59 |
| CRS | 16 | 2.29 | 2 | 13 | 16 | 17.68 | 20 | 60 |
| CRS | 30 | 1.02 | 7 | 24 | 32 | 35.54 | 49 | 60 |
| NDRS | 2 | 19.4 | 1 | 1 | 1 | 2.1 | 3 | 21 |
| NDRS | 6 | 6.42 | 1 | 3 | 5 | 6.07 | 8 | 60 |
| NDRS | 10 | 3.8 | 1 | 7 | 10 | 11.44 | 14 | 60 |
| NDRS | 16 | 2.29 | 1 | 13 | 17 | 19 | 23 | 60 |
| NDRS | 30 | 1.02 | 6 | 25 | 37 | 37.57 | 51 | 60 |

Table 6. Summary statistics for the λ parameter using the simulated data (n = 122).

| RTS | λ | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
| CRS | 19.4 | 2.45 | 12.91 | 19.40 | 26.03 | 38.82 | 38.82 |
| CRS | 6.42 | 2.00 | 5.48 | 7.72 | 9.55 | 9.67 | 38.82 |
| CRS | 3.8 | 0.05 | 3.13 | 3.79 | 4.48 | 4.78 | 38.82 |
| CRS | 2.29 | 0.02 | 1.76 | 2.28 | 2.44 | 2.87 | 19.39 |
| CRS | 1.02 | 0.02 | 0.31 | 0.92 | 0.97 | 1.40 | 5.48 |
| NDRS | 19.4 | 1.66 | 12.91 | 38.82 | 26.71 | 38.82 | 38.82 |
| NDRS | 6.42 | 0.02 | 4.78 | 7.72 | 10.66 | 12.91 | 38.82 |
| NDRS | 3.8 | 0.02 | 2.65 | 3.79 | 4.87 | 5.48 | 38.82 |
| NDRS | 2.29 | 0.02 | 1.48 | 2.13 | 2.53 | 2.87 | 38.82 |
| NDRS | 1.02 | 0.02 | 0.26 | 0.71 | 0.91 | 1.33 | 6.42 |

Table 7. Summary statistics for the σ² parameter using the simulated data (n = 122, σ² = 0.7).

| RTS | Min. | 1st Qu. | Median | Mean | 3rd Qu. | Max. |
| CRS | 0.54 | 0.65 | 0.69 | 0.69 | 0.72 | 0.85 |
| CRS | 0.52 | 0.65 | 0.69 | 0.69 | 0.73 | 0.86 |
| CRS | 0.41 | 0.64 | 0.69 | 0.69 | 0.73 | 0.87 |
| CRS | 0.42 | 0.62 | 0.68 | 0.68 | 0.74 | 0.93 |
| CRS | 0.45 | 0.60 | 0.67 | 0.68 | 0.75 | 1.05 |
| NDRS | 0.50 | 0.65 | 0.68 | 0.68 | 0.71 | 0.83 |
| NDRS | 0.38 | 0.65 | 0.69 | 0.69 | 0.73 | 0.92 |
| NDRS | 0.37 | 0.62 | 0.68 | 0.68 | 0.73 | 0.97 |
| NDRS | 0.40 | 0.60 | 0.67 | 0.66 | 0.73 | 1.03 |
| NDRS | 0.48 | 0.58 | 0.64 | 0.66 | 0.74 | 1.13 |

Table 3 presents the simulated results for the estimated $\lambda$ parameter. As shown in Eq. 20, there is a nonlinear correlation between the number of points crossing the efficiency frontier ($k$) and the respective $\lambda$ parameter. In general, the larger the number of points crossing the frontier, the lower the value of $\lambda$. In our proposed algorithm, the value of $\lambda$ depends on the proportion of points crossing the efficiency frontier and, therefore, is sensitive to the sample size ($n$). Consequently, a finite grid of $\lambda$ values is evaluated. Using SDEA-CRS, for fewer points crossing the frontier, the median value of the estimated $\lambda$ is closer to the real value and the mean value is slightly larger. Using SDEA-NDRS and SDEA-VRS, the median values are closer to the real values, but the mean values are overestimated. As mentioned, the $\lambda$ estimate using maximum likelihood is problematic and usually requires large data sets. On the contrary, the estimate of the number of points crossing the efficiency frontier has shown promising results.

To illustrate the effect of the sample size, Tables 5-7 show the simulation study results for the estimates of $\lambda$ and $\sigma^2$ using a large data set with $n = 122$ observations. The simulated values of $\lambda$ are identical to the simulated values using $n = 61$. Consequently, the larger the sample size, the larger the number of points crossing the frontier using the same $\lambda$ value. As expected, the quantiles using a larger sample size are narrower as compared to a smaller sample size. Furthermore, the mean and median values are closer to the simulated true parameters using a larger sample size. Table 4 shows that, using a larger sample
Table 8 Estimated SDEANDRS models using the Brazilian database with different output combinations Number of outputs Outputs 𝜆 𝜎2 Number of crossing points Number of piecewise linear functions LogLikelihood 1 Weighted power 008 036 29 3 2468 Number of
consumers 479 054 4 4 1617 High voltage network 265 120 7 3 7453 Underground network 008 148 29 3 11070 Aerial network 380 079 5 3 4217 2 Number of consumers and Weighted power 479 052 4 5 1462 Number of consumers and High voltage network 479 052 4 7 1450 Number of consumers and Aerial network 479 045 4 10 914 High voltage network and Aerial network 479 081 4 9 4133 Weighted power and Underground network 379 065 5 8 2865 Weighted power and High voltage network 379 046 5 7 1050 Underground network and High voltage network 176 087 10 9 6205 Underground network and Aerial network 313 067 6 6 3493 3 Weighted power Aerial network and High voltage network 380 046 5 9 1010 4 Weighted power Aerial network High voltage network and Underground network 380 045 5 21 1037 Weighted power Aerial network High voltage network and Number of consumers 1939 045 1 14 405 5 Weighted power Aerial network Weighted power Underground network and High voltage network 1940 045 1 25 400 7 Full Model 1940 036 1 57 387 642 035 3 55 158 313 034 6 48 008 140 031 12 41 270 7 DEANDRS ANEEL without weight restrictions 0 60 DEANDRS ANEEL with weight restrictions 0 61 size the mean and median values of the estimated 𝜎2 parameters are very close to the true value of 𝜎2 07 Using the Brazilian dataset Table 8 presents SDEANDRS models using different combinations of the outputs Results with minimum likelihood function values are indicated in boldface for models with different numbers of outputs Initially models using one output are presented Results show that using weighted power and underground network the estimate of the number of points crossing the frontier is 29 ie close to half of the sample size Similar results were described in the simulated study in which the estimate of the number of points crossing the frontier was 29 see Table 2 even though the real number of points was lower It is worth mentioning that most DSOs have underground network equal to zero On the contrary using number of consumers 
high voltage network, and aerial network, the estimated number of points crossing the frontier is 4, 7 and 5, respectively. By combining two outputs, four models presented 4 points crossing the frontier, one model with 6 points crossing the frontier, one model with 14 points crossing the frontier, and two models with 24 and 29 points crossing the frontier. The latter two models have underground network as one of the outputs. By combining three outputs (weighted power, aerial network and high voltage network), results show that the estimated number of points crossing the frontier is 5. By combining four output variables, two models were adjusted to indicate 5 points and 1 point crossing the frontier, respectively. By combining five outputs, the estimated number of points crossing the frontier is 1. This preliminary analysis indicates that the more outputs included in the model, the lower the estimate of the number of points crossing the frontier. One may argue that the more outputs included in the model, the more points are estimated on the frontier and the lower the estimate of the number of points crossing the frontier. A similar behavior was found in the SDEA-VRS simulation study.

In addition to the number of points crossing the frontier, Table 8 shows the number of piecewise linear functions and the maximum likelihood value estimated for each model. In general, the more outputs included in the model, the larger the number of piecewise linear functions. One may argue that a large number of piecewise linear functions indicates a complex frontier driven by outliers and specific outputs. Surprisingly, the greater the number of points crossing the frontier, the lower the estimated number of piecewise linear functions, indicating that allowing points to cross the efficiency frontier reduces the complexity of the frontier. Thus, effects of outliers and the presence of additional outputs are reduced. Additionally, Table 8 shows the SDEA-NDRS results using all Brazilian output variables and different numbers
of points crossing the frontier. If one point crosses the frontier, then 57 piecewise linear functions are estimated, whereas if 12 points cross the frontier, then 41 piecewise linear functions are estimated. Finally, Table 8 also shows the number of piecewise linear functions using the DEA-NDRS Brazilian models with and without weight restrictions, with no points crossing the frontier. In this case, the number of piecewise linear functions was estimated using an SDEA approximation, as described in Section 2.9. Results show that the current DEA-NDRS ANEEL model has 61 piecewise linear functions. This is the largest number of piecewise linear functions among the evaluated models, which indicates that the regulator model is highly complex. As mentioned, the ANEEL model includes output variables with a large proportion of null observations, such as the underground network extension. Interestingly, the SDEA estimate using the complete set of outputs and one DMU crossing the efficiency frontier has 57 piecewise linear functions and the largest value of the likelihood function. Thus, a slightly simpler model, as compared to the current ANEEL model, can be achieved by letting one DMU cross the efficiency frontier. On the contrary, the maximum likelihood SDEA solution may also indicate a saturated model, i.e., a model with too many output variables.

Fig. 4. Comparison between cost efficiencies using the Brazilian DEA-NDRS model with weight restrictions, the procedure using DEA-NDRS, Bootstrap and Eq. 2, and the SDEA model using two outputs and one crossing point.

Fig. 5. Comparison between cost efficiencies using the Brazilian DEA-NDRS model with weight restrictions, the procedure using DEA-NDRS, Bootstrap and Eq. 2, and the SDEA model using seven outputs and twelve crossing points.

Fig. 3 shows the Spearman correlation comparing the two models proposed by the regulator (DEA-NDRS and ad hoc) and the proposed SDEA models with different number of outputs and
different number of points crossing the efficiency frontier. For example, SDEA 21 comprises an SDEA model with two outputs and one crossing point. As expected, the DEA-NDRS and the ad hoc (DEA-NDRS, Bootstrap and Eq. 2) models share the largest correlation, of 99%. In sequence, the SDEA model with two outputs and one crossing point (SDEA 21) has a correlation of 91% with the ad hoc model. Thus, the SDEA 21 generates a very similar ranking of the DSOs as compared to the Brazilian regulator proposal, but using a fraction of the outputs. Nevertheless, the more points crossing the frontier, the lower the correlation, as shown in models SDEA 212 (two outputs, twelve crossing points) and SDEA 26 (two outputs, six crossing points). Similarly, the greater the number of outputs in the SDEA model, the lower the Spearman correlation, as shown in SDEA 41 and SDEA 46. Finally, the correlations using the complete number of outputs (7) and using varying numbers of crossing points are the lowest. Using seven outputs, only the SDEA model with twelve points crossing the frontier (SDEA 712) shows the greatest correlation, of 80%. As mentioned, the ad hoc benchmarking model proposed by the Brazilian regulator applies weight restrictions in order to manage the large number of outputs. As shown in Table 8, the ad hoc model has the greatest number of piecewise linear functions; therefore, it is the most complex model as compared to the proposed SDEA models.

Fig. 4 compares the cost efficiencies of the DEA-NDRS and ad hoc models proposed by ANEEL and the SDEA model with two outputs and one crossing point. The latter model achieved the largest Spearman correlation with the ad hoc model. Results of DEA-NDRS ANEEL and SDEA 21 are very similar. A major difference is shown for the DSO with the largest DEA-NDRS cost efficiency. As shown in Fig. 4, the ad hoc model generates greater cost efficiencies, with 11 DSOs having efficiencies greater than 100%. Fig. 5 compares the final benchmarking model proposed by ANEEL (ad hoc) and the proposed SDEA model using all the original seven outputs and with twelve points crossing the efficiency
frontier. In general, the ANEEL model achieves greater efficiency scores for most DSOs as compared to the SDEA 712. For two DSOs, the SDEA 712 generates cost efficiencies much greater than the ANEEL super-efficiencies, i.e., efficiencies greater than 100%, estimated using the ANEEL ad hoc model. One may claim that these two largest efficiency costs represent outliers. It is worth mentioning that the proposed SDEA model does not include weight restrictions and most of the DSOs crossing the frontier do not achieve large cost efficiencies as compared to the ANEEL ad hoc proposal.

Finally, Table 9 shows the original operational costs and the regulated costs estimated by the ad hoc ANEEL and the proposed SDEA 712 models. Highlighted cells show regulated OPEX greater than observed OPEX and efficiency costs greater than 100%. Results show that, using the ad hoc model, the total regulated costs comprise 93% of the total original costs, whereas using the proposed SDEA 712 model, the total regulated costs comprise 89% of the total original costs. The difference comprises a total value of R$ 769,114,635.17, or US$ 236,650,656.97, considering an exchange rate of R$ 3.25/US$ 1 on December 30, 2016, according to the Federal Reserve website. Therefore, the proposed SDEA 712 model rewards a few DSOs with a lower value of the total regulated costs as compared to the current Brazilian DEA ad hoc model.

Table 9. Comparison of observed operational expenditures (OPEX) and regulated OPEX using the ad hoc DEA-NDRS model proposed by the Brazilian regulator and the proposed SDEA model with seven outputs and twelve points crossing the efficiency frontier.

| DSO | OPEX (R$) | ad hoc (%) | SDEA 712 (%) | ad hoc OPEX (R$) | SDEA 712 OPEX (R$) |
| NOVA PALMA | 5762588.20 | 119 | 100 | 6857479.96 | 5762590.10 |
| RGE | 304029938.00 | 118 | 114 | 358755326.84 | 346274757.35 |
| MUXFELDT | 2304748.20 | 118 | 100 | 2719602.88 | 2304749.96 |
| PIRATININGA | 300814764.80 | 118 | 100 | 354961422.46 | 300814982.71 |
| COELCE | 571992827.50 | 116 | 134 | 663511679.90 | 764280253.18 |
| CPFL PAULISTA | 793720878.10 | 116 | 102 | 920716218.60 | 812695095.61 |
| JAGUARI | 12324818.70 | 116 | 100 | 14296789.69 | 12324828.53 |
| ELEKTRO | 522991305.90 | 116 | 100 | 606669914.84 | 522991611.85 |
| ETO | 250567173.30 | 110 | 134 | 275623890.63 | 334911539.74 |
| JOAO CESA | 2294867.00 | 108 | 100 | 2478456.36 | 2294867.52 |
| EMT | 532233923.40 | 106 | 109 | 564167958.80 | 581588013.58 |
| COSERN | 294446203.60 | 100 | 103 | 294446203.60 | 304032526.77 |
| BANDEIRANTE | 352211426.30 | 100 | 101 | 352211426.30 | 355037237.00 |
| SANTA MARIA | 34789487.60 | 100 | 100 | 34789487.60 | 34789505.29 |
| EBO | 44780721.70 | 100 | 100 | 44780721.70 | 44780741.40 |
| EMS | 345073715.70 | 100 | 100 | 345073715.70 | 345073844.27 |
| CEMAR | 497342577.90 | 100 | 100 | 497342577.90 | 497342725.16 |
| EPB | 304633122.60 | 100 | 97 | 304633122.60 | 294600010.23 |
| ELETROPAULO | 1457826768.00 | 100 | 94 | 1457826768.00 | 1371785519.80 |
| CPEE | 17364001.60 | 100 | 93 | 17364001.60 | 16141696.20 |
| RGE SUL | 325979415.40 | 100 | 93 | 325979415.40 | 303018731.96 |
| CSPE | 21269422.30 | 100 | 92 | 21269422.30 | 19609508.12 |
| ESCELSA | 336224538.90 | 100 | 92 | 336224538.90 | 309024265.67 |
| MOCOCA | 11970286.20 | 100 | 90 | 11970286.20 | 10821569.86 |
| SANTA CRUZ | 52034961.60 | 100 | 84 | 52034961.60 | 43561323.45 |
| VALE PARANAPANEMA | 47981646.70 | 100 | 83 | 47981646.70 | 39844285.17 |
| LIGHT | 922882323.90 | 97 | 77 | 895195854.18 | 712183553.55 |
| CELESC | 837578560.50 | 96 | 110 | 804075418.08 | 920405774.50 |
| EMG | 126325482.90 | 96 | 100 | 121272463.58 | 126325496.90 |
| CFLO | 15795965.10 | 95 | 113 | 15006166.85 | 17844047.16 |
| COELBA | 1341275532.00 | 95 | 90 | 1274211755.40 | 1208668438.24 |
| CAIUÁ | 66782170.50 | 95 | 73 | 63443061.98 | 48982546.32 |
| NACIONAL | 32718396.70 | 95 | 72 | 31082476.87 | 23528016.88 |
| COPEL | 1326796300.00 | 94 | 86 | 1247188522.00 | 1138725514.49 |
| CEMIG | 2260483577.00 | 92 | 86 | 2079644890.84 | 1934428329.91 |
| CHESP | 15617197.50 | 91 | 77 | 14211649.73 | 12008674.41 |
| ESE | 185887399.00 | 87 | 90 | 161722037.13 | 167949494.89 |
| CEPISA | 393252856.10 | 87 | 83 | 342129984.81 | 325748487.68 |
| CELPE | 861117202.40 | 87 | 83 | 749171966.09 | 712486061.39 |
| CELG | 948403942.60 | 87 | 82 | 825111430.06 | 780551510.86 |
| BRAGANTINA | 47991957.30 | 84 | 69 | 40313244.13 | 32898743.28 |
| CELPA | 678124176.50 | 81 | 77 | 549280582.97 | 522941361.18 |
| AMPLA | 658257317.10 | 80 | 74 | 526605853.68 | 489546050.76 |
| SULGIPE | 43027634.40 | 78 | 85 | 33561554.83 | 36706714.44 |
| COOPERALIANÇA | 13182982.90 | 77 | 85 | 10150896.83 | 11140716.51 |
| CEB | 381490986.00 | 76 | 100 | 289933149.36 | 381491276.67 |
| ELETROCAR | 16210774.90 | 76 | 63 | 12320188.92 | 10226617.49 |
| FORCEL | 4285406.80 | 75 | 100 | 3214055.10 | 4285406.80 |
| HIDROPAN | 7568700.80 | 75 | 91 | 5676525.60 | 6924650.25 |
| IGUAÇU | 17606347.80 | 72 | 64 | 12676570.42 | 11224894.21 |
| DEMEI | 11389358.80 | 70 | 55 | 7972551.16 | 6278634.63 |
| COCEL | 20538217.80 | 68 | 76 | 13965988.10 | 15562344.68 |
| CEAL | 324088033.60 | 68 | 66 | 220379862.85 | 213502896.13 |
| ENF | 35742431.20 | 66 | 68 | 23590004.59 | 24426343.38 |
| ELETROACRE | 114888672.40 | 65 | 59 | 74677637.06 | 67973348.18 |
| URUSSANGA | 6204356.50 | 63 | 88 | 3908744.60 | 5451279.94 |
| CERON | 319505329.10 | 63 | 61 | 201288357.33 | 194683135.06 |
| CEEE | 643515960.10 | 57 | 52 | 366804097.26 | 333008065.07 |
| AMAZONAS | 468120687.50 | 44 | 43 | 205973102.50 | 200331286.43 |
| DMED | 46844597.20 | 39 | 48 | 18269392.91 | 22596279.83 |
| BOA VISTA | 91656721.10 | 37 | 29 | 33912986.81 | 26772653.93 |
| TOTAL | 20728123685.20 | | | 19192630061.65 | 18423515426.48 |

4. Conclusion

The traditional DEA and SFA benchmarking models do not estimate an efficiency cost greater than 100%. The SFA model applies a compound error structure in which the so-called noise component allows points to cross the efficiency frontier. Nevertheless, the original SFA cost efficiency estimates are below 100%. On the contrary, the Brazilian regulator has proposed an ad hoc procedure to allow DSOs to achieve cost efficiencies greater than 100%. As shown in the present paper, the Brazilian ad hoc procedure mimics an SDEA model in which the extent to which a DSO crosses the frontier is counted as an operating cost prize for over-efficiency. In practice, the Brazilian regulator assumes that the noise component is meaningful and should not be counted as pure random error. On the contrary,
it suggests that the compound error structure is the sum of inefficiency and a symmetric random variable component that allows DSOs to cross the frontier. As mentioned, for those DSOs crossing the frontier, a reward is given. Thus, one may suggest that the random noise component should be reevaluated as a random reward component. It is worth mentioning that the Brazilian cost efficiency model is in effect and has been applied since 2015 to estimate efficient operating costs for the Brazilian DSOs.

In summary, this work has successfully proposed an algorithm to estimate an SDEA model using maximum likelihood. The proposed model can replace the complex ad hoc procedure the Brazilian regulator is currently using to estimate efficient costs. Properties of the adjusted SDEA model, such as the estimated number of points crossing the frontier and the number of piecewise linear functions, can be used as complexity measures. By allowing points to cross the frontier, the effects of potential outliers are minimized, thus reducing the complexity of the model.

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The authors thank CNPq, Brazil (Grant number PQ-303119/2019-5) and CAPES, Brazil (PROBRAL, Process number 88881.370796/2019-01) for financial support.

References

[1] M.A. Costa, A.L.M. Lopes, G.B.B. de Pinho Matos, Statistical evaluation of Data Envelopment Analysis versus COLS Cobb-Douglas benchmarking models for the 2011 Brazilian tariff revision, Socio-Econ. Plan. Sci. 49 (2015) 47-60.
[2] A.V. da Silva, M.A. Costa, A.L.M. Lopes, G.M. do Carmo, A close look at second stage data envelopment analysis using compound error models and the tobit model, Socio-Econ. Plan. Sci. 65 (2019) 111-126.
[3] A.L.M. Lopes, B. de Almeida Vilela, M.A. Costa, E.A. Lanzer, Critical evaluation of the performance assessment model of Brazilian electricity distribution companies, Rev. Gestão Tecnol. 16 (3) (2016) 5-30.
[4] V.V.
Podinovski, Production trade-offs and weight restrictions in data envelopment analysis, J. Oper. Res. Soc. 55 (12) (2004) 1311-1322.
[5] M.A. Costa, L.B. Mineti, V.D. Mayrink, A.L.M. Lopes, Bayesian detection of clusters in efficiency score maps: An application to Brazilian energy regulation, Appl. Math. Model. 68 (2019) 66-81.
[6] G.D.R. Gil, M.A. Costa, A.L.M. Lopes, V.D. Mayrink, Spatial statistical methods applied to the 2015 Brazilian energy distribution benchmarking model: Accounting for unobserved determinants of inefficiencies, Energy Econ. 64 (2017) 373-383.
[7] L. Simar, P.W. Wilson, Sensitivity analysis of efficiency scores: How to bootstrap in nonparametric frontier models, Manage. Sci. 44 (1) (1998) 49-61.
[8] P. Bogetoft, L. Otto, Benchmarking with DEA, SFA, and R, vol. 157, Springer Science & Business Media, 2010.
[9] ANEEL, NT 66/2015-SRM/SGT/ANEEL: Metodologia de Custos Operacionais, 2015, URL www.aneel.gov.br.
[10] D. Aigner, C.K. Lovell, P. Schmidt, Formulation and estimation of stochastic frontier production function models, J. Econometrics 6 (1) (1977) 21-37.
[11] R.D. Banker, Stochastic Data Envelopment Analysis, Carnegie-Mellon University, Pittsburgh, 1986.
[12] R.D. Banker, A. Maindiratta, Maximum likelihood estimation of monotone and concave production frontiers, J. Prod. Anal. 3 (4) (1992) 401-415.
[13] T. Kuosmanen, M. Kortelainen, Stochastic non-smooth envelopment of data: semi-parametric frontier estimation subject to shape constraints, J. Prod. Anal. 38 (1) (2012) 11-28.
[14] W.H. Greene, Maximum likelihood estimation of econometric frontier functions, J. Econometrics 13 (1) (1980) 27-56.
[15] T. Kuosmanen, A. Saastamoinen, T. Sipiläinen, What is the best practice for benchmark regulation of electricity distribution? Comparison of DEA, SFA and StoNED methods, Energy Policy 61 (2013) 740-750.
[16] S. Jradi, J. Ruggiero, Stochastic data envelopment analysis: A quantile regression approach to estimate the production frontier, European J. Oper. Res. 278 (2) (2019) 385-393.
[17] ANEEL, NT 66/2017-SRM/ANEEL: Abertura de Audiência Pública para atualização dos parâmetros relacionados à definição dos Custos Operacionais Regulatórios, 2017, URL www.aneel.gov.br.
[18] A. Charnes, W.W. Cooper, E. Rhodes, Measuring the efficiency of decision making units, European J. Oper. Res. 2 (6) (1978) 429-444.
[19] R.D. Banker, A. Charnes, W.W. Cooper, Some models for estimating technical and scale inefficiencies in data envelopment analysis, Manage. Sci. 30 (9) (1984) 1078-1092.
[20] W.D. Cook, K. Tone, J. Zhu, Data envelopment analysis: Prior to choosing a model, Omega 44 (2014) 1-4.
[21] A.E. LaPlante, J. Paradi, Evaluation of bank branch growth potential using data envelopment analysis, Omega 52 (2015) 33-41.
[22] H.W. Lampe, D. Hilgers, Trajectories of efficiency measurement: A bibliometric analysis of DEA and SFA, European J. Oper. Res. 240 (1) (2015) 1-21.
[23] W. Meeusen, J. van Den Broeck, Efficiency estimation from Cobb-Douglas production functions with composed error, Internat. Econom. Rev. (1977) 435-444.
[24] J. Jondrow, C.K. Lovell, I.S. Materov, P. Schmidt, On the estimation of technical inefficiency in the stochastic frontier production function model, J. Econometrics 19 (2-3) (1982) 233-238.
[25] L.R. Christensen, D.W. Jorgenson, L.J. Lau, Transcendental logarithmic utility functions, Am. Econ. Rev. 65 (3) (1975) 367-383.
[26] S.C. Kumbhakar, C.K. Lovell, Stochastic Frontier Analysis, Cambridge University Press, 2003.
[27] R.E. Stevenson, Likelihood functions for generalized stochastic frontier estimation, J. Econometrics 13 (1) (1980) 57-66.
[28] G. Casella, R.L. Berger, Statistical Inference, vol. 2, Duxbury, Pacific Grove, CA, 2002.
[29] N. Sartori, Bias prevention of maximum likelihood estimates for scalar skew normal and skew t distributions, J. Statist. Plann. Inference 136 (12) (2006) 4259-4275.
[30] K. Tone, Advances in DEA Theory and Applications: With Extensions to Forecasting Models, John Wiley & Sons, 2017.
[31] C. Daraio, L. Simar, Conditional nonparametric frontier models for convex and nonconvex technologies: a unifying approach, J. Prod. Anal. 28 (1-2) (2007) 13-32.
[32] K.C. Land, C.K. Lovell, S. Thore, Chance-constrained data envelopment analysis, Manag. Decis. Econ. 14 (6) (1993) 541-554.
[33] A. Azzalini, The Skew-Normal and Related Families, vol. 3, Cambridge University Press, 2013.
[34] C.L. Bayes, M.D. Branco, Bayesian inference for the skewness
parameter of the scalar skew-normal distribution, Braz. J. Prob. Stat. (2007) 141-163.
[35] R. de Barros Mesquita, Regulação de custos de distribuição de energia elétrica: uma análise comparativa das abordagens de benchmarking utilizadas em países europeus e latino-americanos, Universidade Federal de Minas Gerais, 2017.
[36] J. Kiefer, Sequential minimax search for a maximum, Proc. Amer. Math. Soc. 4 (3) (1953) 502-506.
[37] Y. Wang, S. Wang, C. Dang, W. Ge, Nonparametric quantile frontier estimation under shape restriction, European J. Oper. Res. 232 (3) (2014) 671-678.

Stochastic Data Envelopment Analysis applied to the 2015 Brazilian energy distribution benchmarking model

Analysis of the article

Marcelo Azevedo Costa

Full Professor in the Department of Production Engineering at UFMG and member of the Graduate Program in Production Engineering (research line: Stochastic Modeling and Simulation) at the same institution. He holds a degree in Electrical Engineering from the Universidade Federal de Minas Gerais (1999) and a doctorate in Electrical Engineering from the Universidade Federal de Minas Gerais (2002) in the area of Computational Intelligence. He completed a postdoctoral fellowship at Harvard Medical School and Harvard Pilgrim Health Care (2007) in the area of Spatial Statistics and Epidemiological Surveillance, and a postdoctoral fellowship at Linköping University, Sweden, in the area of Statistical Analysis, Diagnosis and Fault Detection in Industrial Environments. He is currently a research professor at the Laboratório de Apoio a Decisão e Confiabilidade (LADEC/UFMG), working on projects on regulation models for the electricity sector, a collaborating professor at the Laboratório de Gestão de Serviços Ambientais (LAGESA/UFMG), and a member of the management committee and deputy coordinator of the Centro de Sensoriamento Remoto (CSR/UFMG). He has given short courses and talks at companies such as CEMIG, Eletropaulo, COPASA and Itaú. He has publications in leading international journals such as Socio-Economic Planning Sciences, IEEE Transactions on Power Delivery, Statistical Methods in Medical Research and PLOS One, among others. He is the author of the book Tópicos em Ciência dos Dados: Introdução dos Modelos Paramétricos e suas aplicações utilizando o R. He reviews for international and national journals and has published book chapters in English. He coordinates an R&D project with CEMIG and works as a researcher in R&D projects. He supervises undergraduate, specialization, master's and doctoral students on the following topics: applied statistics, statistical models applied to the electricity sector, network analysis, spatial statistics, time series analysis, and the theory and applications of artificial neural networks.

Context of the work

The Brazilian electricity market is regulated by ANEEL because of the exclusivity held by the companies. In total, the work analyzed 61 DSOs (Distribution System Operators). Tariffs are reviewed at each cycle. Efficient companies may raise the price of energy as a bonus for their efficiency.

How can efficiency be measured? The simplest formula is: Efficiency = output / input. For example: one car does 12 km/L and another does 4 km/L; which one is more efficient? How can efficiency be measured with more than one input? How can efficiency be measured across DSOs that are so different (e.g., São Paulo and Amazonas)?

Current model

The model currently used by ANEEL is DEA (Data Envelopment Analysis). It uses non-decreasing returns to scale, with operating costs as the input variable and the number of consumers, weighted energy consumption, high-level network extension, low-level network extension, underground network extension, non-technical losses and duration of energy interruption as the output variables.

DEA

The goal of DEA is to determine how efficiently a unit uses its inputs to produce its outputs in comparison with the other units in the set. This analysis is carried out by building an envelope around the most efficient units, thereby defining the standard of maximum attainable efficiency. DEA is widely used in performance evaluation and benchmarking in sectors such as industry, health, education and public services. It offers a non-parametric approach, which means that it does not require the specification of an underlying production function, making it flexible and applicable in many situations.

[Figure: Illustration of Production Function; axes: Input, Output]

Table 1. Trade-offs, i.e., weight restrictions, between input and output variables imposed by ANEEL in the 2015 DEA-NDRS model.

| Trade-off | Lower bound of the weight ratio | Upper bound of the weight ratio |
| Input versus Network Distribution | 580 | 2200 |
| Underground Network versus Network Distribution | 1.00 | 2.00 |
| High Level Network versus Network Distribution | 0.40 | 1.00 |
| Input versus Total number of consumers | 30 | 145 |
| Input versus Delivered MWh | 1 | 60 |
| Input versus Non-Technical Losses | 10 | 150 |
| Input versus Interrupted services | | 2 |

ANEEL procedure

After applying DEA, ANEEL carries out ad hoc procedures using a reference efficiency. The Bootstrap is also used: it creates several samples, with replacement, from the existing data in order to simulate the distribution of the estimator. The final result of all these procedures is that some DSOs have an efficiency value greater than 100%. ANEEL states that this process serves to correct possible flaws of DEA. Some researchers may argue that it is an act of benevolence by the agency to raise the efficiency scores.

$$\theta_i^{final} = \min\!\left(\max\!\left(1, \frac{\theta_i^{inf}}{\theta_{ref}}\right), \frac{\theta_i^{sup}}{\theta_{ref}}\right)$$

Alternatives to the current procedure

SFA (Stochastic Frontier Analysis) techniques allow DSOs to cross the efficiency frontier. Some authors suggest an SDEA model that allows a given number of DSOs to cross the frontier, a number that must be chosen by the analyst. The optimal number of DSOs in this method would be very complex to determine.

Original SDEA

$$\begin{aligned}
\min\ & \sum_{i=1}^{n} \left[\tau e_{1i} + (1 - \tau) e_{2i}\right] \\
\text{subject to}\ & y_i = \alpha_i + \beta_{1i}x_{1i} + \cdots + \beta_{Ki}x_{Ki} + e_{1i} - e_{2i}, && i = 1, \ldots, N \\
& \alpha_i + \beta_{1i}x_{1i} + \cdots + \beta_{Ki}x_{Ki} \le \alpha_j + \beta_{1j}x_{1i} + \cdots + \beta_{Kj}x_{Ki}, && i, j = 1, \ldots, N \\
& \beta_{ki} \ge 0, && k = 1, \ldots, K;\ i = 1, \ldots, N \\
& e_{1i},\ e_{2i} \ge 0, && i = 1, \ldots, N
\end{aligned}$$

Proposed SDEA

$\min \sum_{i=1}^{n} e_i$, subject to
$y_i = \alpha_i + \beta_{1i}x_{1i} + \cdots + \beta_{Ki}x_{Ki} - e_i$, $i = 1, \ldots, N$;
$\alpha_i + \beta_{1i}x_{1i} + \cdots + \beta_{Ki}x_{Ki} \le \alpha_j + \beta_{1j}x_{1i} + \cdots + \beta_{Kj}x_{Ki}$, $i, j = 1, \ldots, N$;
$\beta_{ki} \ge 0$, $k = 1, \ldots, K$
i 1 N ei 0 i 1 N Algoritmo para estimar quantas DSOs ultrapassam a fronteira Exemplo prático a Step 1 The cost frontier and the points located on the frontier are estimated b Step 2 The points crossing the frontier are selected gradually using likelihood maximization c Step 3 The optimal proportion of points is chosen based on the maximum likelihood value d Final cost frontier solution Resultados SDEA 76 098 095 068 072 062 062 085 076 075 SDEA 71 094 07 075 062 061 065 076 076 SDEA 712 074 077 07 07 072 079 08 SDEA 46 097 094 094 093 098 086 SDEA 41 09 091 092 086 086 SDEA 26 099 097 09 089 SDEA 212 098 09 089 SDEA 21 091 091 adhoc 099 Comparação das Eficiências ANEEL DEANDRS ANEEL adhoc SDEA with 7 outputs and 12 crossing points cost efficiencies 12 10 08 06 04 02 00 D43 D61 D35 D04 D16 D33 D25 D29 D60 D23 D12 D55 D06 D13 D11 D56 D31 D46 D09 D10 D05 DSO index 100 Comparison of observed operational expenditures OPEX and regulated OPEX using the adhoc DEANDRS model proposed by the Brazilian regulator and the proposed SDEA model with seven outputs and twelve points crossing the efficiency frontier DSO OPEX adhoc SDEA 712 adhoc OPEX SDEA 712 OPEX NOVA PALMA R 576258820 119 100 R 685747996 R 576259010 RGE R 30402993800 118 114 R 35875532684 R 34627475735 MUXFELDT R 230474820 118 100 R 271960288 R 230474996 PIRATININGA R 30081476480 118 100 R 35496142246 R 30081498271 COELCE R 57199282750 116 134 R 66351167990 R 76428025318 CPFL PAULISTA R 79372087810 116 102 R 92071621860 R 81269509561 JAGUARI R 1232481870 116 100 R 1429678969 R 1232482853 ELEKTRO R 52299130590 116 100 R 60666991484 R 52299161185 ETO R 25056717330 110 134 R 27562389063 R 33491153974 JOAO CESA R 229486700 108 100 R 247845636 R 229486752 EMT R 53223392340 106 109 R 56416795880 R 58158801358 COSERN R 29444620360 100 103 R 29444620360 R 30403252677 BANDEIRANTE R 35221142630 100 101 R 35221142630 R 35503723700 SANTA MARIA R 3478948760 100 100 R 3478948760 R 3478950529 EBO R 4478072170 100 100 R 4478072170 
R 4478074140 EMS R 34507371570 100 100 R 34507371570 R 34507384427 CEMAR R 49734257790 100 100 R 49734257790 R 49734272516 EPB R 30463312260 100 97 R 30463312260 R 29460001023 ELETROPAULO R 145782676800 100 94 R 145782676800 R 137178551980 Vantagem do método proposto Transparência na eficiência respaldo estatístico e conhecer como a eficiência é formada a partir da equação de regressão linear y b x1b1 x2b2 Referência COSTA Marcelo Azevedo SALVADOR Cláudio Vítor Maquiné DA SILVA Aline Veronese Stochastic data envelopment analysis applied to the 2015 Brazilian energy distribution benchmarking model Decision Analytics Journal v 3 p 100061 2022
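
The ad hoc score formula in the slides and the regulated OPEX column of the comparison table can be checked with a few lines of Python. This is only a sketch under stated assumptions: the score formula is read as min{max{1, θinf/θref}, θsup/θref}; the reference efficiency of 0.79 is the 79% quoted in the paper's abstract; the bootstrap bounds are hypothetical; and the NOVA PALMA row is read as an observed OPEX of R$ 5,762,588.20 at 119%. The function names are illustrative, not ANEEL's.

```python
def final_efficiency(theta_inf, theta_sup, theta_ref):
    """Ad hoc score, read as min(max(1, inf/ref), sup/ref)."""
    lower = theta_inf / theta_ref
    upper = theta_sup / theta_ref
    return min(max(1.0, lower), upper)


def regulated_opex(observed_opex, efficiency):
    """Regulated OPEX as observed OPEX scaled by the efficiency score."""
    return observed_opex * efficiency


THETA_REF = 0.79  # reference efficiency quoted in the paper's abstract (79%)

# Hypothetical bootstrap bounds (not taken from the paper), one per case.
whole_interval_above = final_efficiency(0.85, 0.95, THETA_REF)  # score > 1
interval_straddles = final_efficiency(0.70, 0.90, THETA_REF)    # score == 1
whole_interval_below = final_efficiency(0.60, 0.75, THETA_REF)  # score < 1

# NOVA PALMA row, assuming the OPEX reads R$ 5,762,588.20 at 119%.
nova_palma = round(regulated_opex(5_762_588.20, 1.19), 2)

print(whole_interval_above, interval_straddles, whole_interval_below)
print(nova_palma)
```

Under this reading, a DSO scores above 100% only when its whole bootstrap interval lies above the reference efficiency, and below 100% only when the whole interval lies below it. The NOVA PALMA product agrees with the ad hoc OPEX reported in the table.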
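
The slides describe the bootstrap as drawing many samples with replacement from the existing data to simulate the distribution of an estimator. The sketch below illustrates that idea with a generic percentile bootstrap on the mean of a handful of hypothetical scores. It is not ANEEL's bootstrap for DEA scores, whose details are not given here; the data, function name and parameters are all assumptions made for illustration.

```python
import random
import statistics


def bootstrap_interval(data, estimator, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for estimator(data)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(data)
    # Each resample draws n values with replacement from the original data.
    estimates = sorted(
        estimator(rng.choices(data, k=n)) for _ in range(n_resamples)
    )
    lo_index = int((alpha / 2) * n_resamples)
    hi_index = int((1 - alpha / 2) * n_resamples) - 1
    return estimates[lo_index], estimates[hi_index]


# Hypothetical efficiency scores, for illustration only.
scores = [0.62, 0.71, 0.75, 0.79, 0.83, 0.88, 0.90, 0.94, 0.97, 1.00]
theta_inf, theta_sup = bootstrap_interval(scores, statistics.mean)

print(theta_inf, statistics.mean(scores), theta_sup)
```

The two returned values play the role of the lower and upper bounds (θinf, θsup) that enter the ad hoc score formula.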