With the advent of Long-Term Evolution (LTE) networks and the spread of a highly varied range of services, mobile operators are increasingly aware of the need to strengthen their maintenance and operational tasks in order to ensure a high-quality, positive user experience. Furthermore, the co-existence of multiple Radio Access Technologies (RATs), the increase in traffic demand and the need to provide a great variety of services are steering the cellular network toward a new scenario where management tasks are becoming increasingly complex. As a result, mobile operators are focusing their efforts on maintaining their networks without increasing either operational expenditures (OPEX) or capital expenditures (CAPEX). In this context, it is becoming necessary to effectively automate the management tasks through the concept of Self-Organizing Networks (SON). In particular, SON functions cover three different areas: Self-Configuration, Self-Optimization and Self-Healing. Self-Configuration automates the deployment of new network elements and their parameter configuration. Self-Optimization is in charge of modifying the configuration of the parameters in order to enhance user experience. Finally, Self-Healing aims to reduce the impact that failures and service degradation have on the end user. To that end, Self-Healing (SH) systems monitor the network elements through several alarms, measurements and indicators in order to detect cells in outage and degraded cells, then diagnose the cause of their problem and, finally, execute the compensation or recovery actions. Even though mobile networks are becoming more prone to failures due to their huge increase in complexity, the automation of the troubleshooting tasks through the SH functionality has not been fully realized. Traditionally, both the research and the development of SON have been related to Self-Configuration and Self-Optimization.
This has been mainly due to the challenges that need to be faced when SH systems are studied and implemented, which is especially relevant in the case of fault diagnosis. However, mobile operators are paying increasingly more attention to self-healing systems, which entails creating options to face those challenges and thus allow the development of SH functions.
On the one hand, diagnosis currently continues to be done manually, since it requires considerable hard-earned experience to effectively identify the fault cause. In particular, troubleshooting experts thoroughly analyze the performance of the degraded network elements by means of measurements and indicators in order to identify the cause of the detected anomalies and symptoms. Therefore, automating the diagnosis tasks means knowing which specific performance indicators have to be analyzed and how to map the identified symptoms to the associated fault cause. This knowledge is acquired over time and is operator-specific, since it depends on each operator's policies and network features. Furthermore, troubleshooting experts typically solve the failures in a network without either documenting the troubleshooting process or recording the analyzed indicators along with the label of the identified fault cause. In addition, because there is no specific regulation on documentation, the few documented faults are neither properly defined nor described in a standard way (e.g. the same fault cause may be given different labels), making it even more difficult to automate the extraction of the expert knowledge. As a result, this lack of documentation and of historical fault reports makes the automation of the diagnosis process more challenging.
On the other hand, when the exact root cause cannot be remotely identified through the statistical information gathered at cell level, drive tests are scheduled to gather further information.
These drive tests aim to monitor mobile network performance by using vehicles to measure the radio interface quality in the field along a predefined route. In particular, the troubleshooting experts use specialized test equipment in order to manually collect user-level measurements. Consequently, drive tests entail a hefty expense for mobile operators, since they involve a considerable investment in time and costly resources (such as personnel, vehicles and complex test equipment). In this context, the Third Generation Partnership Project (3GPP) has standardized the automatic collection of field measurements (e.g. signaling messages, radio measurements and location information) through the mobile traces feature and its extended functionality, the Minimization of Drive Tests (MDT). In particular, those features allow the network performance to be monitored automatically and in detail, reaching areas that cannot be covered by drive testing (e.g. indoor or private zones). Thus, mobile traces are regarded as an important enabler for SON, since they spare operators from relying on those expensive drive tests while, at the same time, providing greater detail than the traditional cell-level indicators. As a result, enhancing the SH functionalities through the mobile traces increases the potential cost savings and the granularity of the analysis.
Hence, in this thesis, several solutions are proposed to overcome the limitations that prevent the development of SH, with special emphasis on the diagnosis phase. To that end, the lack of historical labeled databases has been addressed in two main ways. First, unsupervised techniques have been used to automatically design diagnosis systems from real data without requiring either documentation or historical reports about fault cases.
Second, a group of significant faults has been modeled and implemented in a dynamic system-level simulator in order to generate an artificial labeled database, which is extremely important for evaluating the proposed solutions and comparing them with state-of-the-art algorithms. Then, the diagnosis of those faults that cannot be identified through the statistical performance indicators gathered at cell level is automated by analyzing the mobile traces, thus avoiding costly drive tests. In particular, in this thesis, the mobile traces have been used to automatically identify the cause of each unexpected user disconnection, to geo-localize RF problems that affect the cell performance and to identify the impact of a fault depending on the availability of legacy systems (e.g. Third Generation, 3G). Finally, the proposed techniques have been validated using real and simulated LTE data by analyzing their performance and comparing it with that of reference mechanisms.