dc.contributor.author: Subasi, Omer
dc.contributor.author: Yalcin, Gulay
dc.contributor.author: Zyulkyarov, Ferad
dc.contributor.author: Unsal, Osman
dc.contributor.author: Labarta, Jesus
dc.date.accessioned: 2019-07-08T08:54:01Z
dc.date.available: 2019-07-08T08:54:01Z
dc.date.issued: 2017 [en_US]
dc.identifier.citation: 2017 17TH IEEE/ACM INTERNATIONAL SYMPOSIUM ON CLUSTER, CLOUD AND GRID COMPUTING (CCGRID) Book Group Author(s): IEEE Book Series: IEEE-ACM International Symposium on Cluster Cloud and Grid Computing Pages: 452-457 DOI: 10.1109/CCGRID.2017.40 [en_US]
dc.identifier.isbn: 978-1-5090-6611-7
dc.identifier.issn: 2376-4414
dc.identifier.other: Accession Number: WOS:000426912900048
dc.identifier.other: DOI: 10.1109/CCGRID.2017.40
dc.identifier.uri: http://acikerisim.agu.edu.tr/xmlui/handle/20.500.12573/72
dc.description: This work is supported in part by the European Union Mont-blanc 2 Project (www.montblanc-project.eu), grant agreement no. 610402 and the FEDER funds under contract TIN2015-65316-P. [en_US]
dc.description.abstract: Fail-stop errors and Silent Data Corruptions (SDCs) are the most common failure modes for High Performance Computing (HPC) applications. There are studies that address fail-stop errors and studies that address SDCs. However, few studies address both types of errors together. In this paper, we propose a software-based selective replication technique for HPC applications for both fail-stop errors and SDCs. Since complete replication of applications can be costly in terms of resources, we develop a runtime-based technique for selective replication. Selective replication provides an opportunity to meet HPC reliability targets while decreasing resource costs. Our technique is low-overhead, automatic and completely transparent to the user. [en_US]
dc.description.sponsorship: European Union Mont-blanc 2 Project - 610402; FEDER funds - TIN2015-65316-P [en_US]
dc.language.iso: eng [en_US]
dc.publisher: IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA [en_US]
dc.relation.ispartofseries: IEEE-ACM International Symposium on Cluster Cloud and Grid Computing;452-457
dc.rights: info:eu-repo/semantics/openAccess [en_US]
dc.title: Designing and Modelling Selective Replication for Fault-tolerant HPC Applications [en_US]
dc.type: other [en_US]
dc.contributor.department: AGÜ, Faculty of Engineering, Department of Computer Engineering [en_US]
dc.contributor.institutionauthor:
dc.identifier.doi: 10.1109/CCGRID.2017.40
dc.relation.publicationcategory: Other [en_US]

