Title (Serbian)

Model predikcije otkaza magnetnih diskova zasnovan na detekciji anomalija (A magnetic disk failure prediction model based on anomaly detection)

Author

Đurašević, Slađana, 1985-

Contributors

Luković, Vanja, 1976-
Đorđević, Borislav, 1964-
Protić, Jelica, 1962-
Milošević, Marjan, 1979-
Jovanović, Željko, 1982-

Description (Serbian)

-

Description (English)

Through its activities, humanity generates enormous quantities of digital data, the volume of which grows rapidly from year to year. Most of this data is now stored in specialized data center facilities that house the computer systems used to share applications and data. These facilities rely primarily on magnetic disks, which hold about 90% of all stored data, while the remaining 10% is kept on semiconductor (solid-state) disks. The main advantages of magnetic disks are their large capacity and low cost per unit of stored data, which make them ideal for general-purpose storage and backups. Their electromechanical design makes magnetic disks more prone to failure than the other components of a computer system. A magnetic disk failure most often results in the loss of user data, whose economic value significantly exceeds the price of the disk itself.

With SMART (Self-Monitoring, Analysis and Reporting Technology), a computer system can warn the user when any of the disk's operating parameters deviates from a predefined threshold value. Machine learning methods exploit the dependencies among multiple SMART attributes to improve the disk failure prediction rate. This doctoral dissertation presents a disk failure prediction model based on anomaly detection. The operation of a disk can be represented in a multidimensional space as a series of points whose positions are defined by the values of its SMART attributes. Data points describing regular disk operation tend to cluster around a point defined as the mean value, or center of mass, while data points from faulty disks are usually scattered farther from this center of mass. The Mahalanobis distance measures how far a data point in the multidimensional space lies from the center of mass, expressed in standard deviations, which removes the influence of scaling and of correlation between attributes. Using an adjustable decision boundary, the anomaly detection model was trained to predict failures with the highest possible detection rate while minimizing the number of false detections.

The implemented model was tested on the CMRR dataset, with recursive attribute elimination used to create an optimal set of the seven most significant attributes. On this dataset, ten random tests yielded an average failure detection rate of 96.11% with no false failure detections. The proposed model predicted more than 80% of failures 24 hours before they actually occurred, which allows a backup copy of the data to be made in time and justifies the model's use in practice. The anomaly detection model was also tested on semiconductor disks, where it likewise achieved a high level of predictive performance.
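The distance described above can be written out as a formula. The following is the standard textbook definition of the Mahalanobis distance, not a formula quoted from the dissertation; the symbols x (vector of SMART attribute values), μ (center of mass of healthy-disk samples), Σ (their covariance matrix), and τ (the adjustable decision boundary) are notation assumed here for illustration.

```latex
% Mahalanobis distance of a sample x from the centre of mass \mu,
% with \Sigma the covariance matrix of the healthy-disk samples.
D_M(\mathbf{x}) = \sqrt{(\mathbf{x} - \boldsymbol{\mu})^{\top} \, \Sigma^{-1} \, (\mathbf{x} - \boldsymbol{\mu})}

% Assumed decision rule: flag the sample as anomalous when the distance
% exceeds an adjustable threshold \tau.
D_M(\mathbf{x}) > \tau
```

When Σ is diagonal, this reduces to a Euclidean distance in which each attribute is divided by its own standard deviation, which is why the result can be read as a number of standard deviations.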

Description (English)

Humanity generates enormous amounts of digital data through its activities, the size of which is increasing rapidly from year to year. Most of this data is now stored in specialized data center facilities that house computer systems used to share applications and data. Magnetic disks are primarily used for data storage, holding about 90% of the total amount of data, while the remaining 10% is stored on semiconductor disks. The main advantages of magnetic disks are their large capacity and low cost of stored data, which make them ideal for general data storage and data backup. The electromechanical design of magnetic disks makes them more susceptible to failure than other components of a computer system. The failure of a magnetic disk most often leads to the permanent loss of user data, the economic value of which significantly exceeds the cost of the magnetic disk itself.

Using SMART (Self-Monitoring, Analysis and Reporting Technology) technology, a computer system can alert the user if any of the disk's operating parameters deviates from a predefined threshold value. Machine learning methods take advantage of the dependencies between multiple SMART attributes to improve the disk failure prediction rate. This doctoral dissertation presents a disk failure prediction model based on the anomaly detection method. The operation of a disk can be represented in a multidimensional space as a series of points whose positions are defined by the values of its SMART attributes. Data points describing regular disk operation will tend to cluster around a certain point defined as the mean value or center of mass, while data points from failed disks will usually be scattered farther from this center of mass. Using the Mahalanobis distance, it is possible to measure the distance of data points in a multidimensional space from the point representing the center of mass, expressed in standard deviations, thereby removing the influence of scaling and correlation between attributes. Using an adjustable decision boundary, an anomaly detection model was trained to predict failures with the highest possible detection rate, while minimizing the number of false detections.

The implemented model was tested on the CMRR dataset by creating an optimal set of the seven most significant attributes using the recursive attribute elimination method. On this dataset, an average failure detection rate of 96.11% was achieved in ten random tests, with no false detections of failures. The proposed model predicted more than 80% of failures 24 hours before their actual occurrence, which allows for timely data backup and justifies its application in practice. The anomaly detection model was also tested on semiconductor disks, where it also achieved a high level of predictive performance.
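The approach described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names, the three synthetic attributes, and the threshold value of 4.0 are all assumptions made for illustration, and the data is randomly generated rather than taken from the CMRR dataset. The sketch fits the center of mass and covariance on healthy samples, then flags any sample whose Mahalanobis distance exceeds the adjustable threshold.

```python
import numpy as np


def fit_healthy_profile(healthy):
    """Estimate the centre of mass and inverse covariance of healthy samples."""
    mean = healthy.mean(axis=0)
    cov = np.cov(healthy, rowvar=False)  # rows are samples, columns are attributes
    inv_cov = np.linalg.pinv(cov)        # pseudo-inverse tolerates a near-singular covariance
    return mean, inv_cov


def mahalanobis_distance(samples, mean, inv_cov):
    """Distance of each row in `samples` from `mean`, in standard deviations."""
    diff = np.atleast_2d(samples) - mean
    squared = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
    return np.sqrt(np.maximum(squared, 0.0))  # clip tiny negative rounding errors


def is_anomalous(samples, mean, inv_cov, threshold):
    """Flag samples that lie beyond the adjustable decision boundary."""
    return mahalanobis_distance(samples, mean, inv_cov) > threshold


# Synthetic stand-in for SMART attribute readings of healthy disks (assumed data).
rng = np.random.default_rng(0)
healthy = rng.normal(loc=[100.0, 30.0, 5.0], scale=[2.0, 1.0, 0.5], size=(500, 3))

mean, inv_cov = fit_healthy_profile(healthy)
threshold = 4.0  # hypothetical boundary; in practice tuned against false detections

typical = np.array([100.0, 30.0, 5.0])   # close to the centre of mass
drifting = np.array([120.0, 30.0, 5.0])  # first attribute far from its usual range
flags = is_anomalous(np.vstack([typical, drifting]), mean, inv_cov, threshold)
print(flags)
```

Raising the threshold trades missed failures for fewer false detections, which is the role the abstract assigns to the adjustable decision boundary.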

Language

Serbian

Date

2025

License

Creative Commons license
This work is licensed under the terms of the
Creative Commons CC BY-SA 3.0 AT - Creative Commons Attribution - ShareAlike 3.0 Austria License.

http://creativecommons.org/licenses/by-sa/3.0/at/legalcode

Identifiers