dc.contributor.author | Subasi, Omer | |
dc.contributor.author | Yalcin, Gulay | |
dc.contributor.author | Zyulkyarov, Ferad | |
dc.contributor.author | Unsal, Osman | |
dc.contributor.author | Labarta, Jesus | |
dc.date.accessioned | 2020-02-06T06:59:08Z | |
dc.date.available | 2020-02-06T06:59:08Z | |
dc.date.issued | 2016 | en_US |
dc.identifier.issn | 1552-5244 | |
dc.identifier.other | 10.1109/CLUSTER.2016.54 | |
dc.identifier.uri | https://hdl.handle.net/20.500.12573/141 | |
dc.description.abstract | In this paper we propose a runtime-based selective task replication technique for task-parallel high performance computing applications. Our selective task replication technique is automatic and does not require modification or recompilation of the OS, compiler, or application code. Our heuristic, which we call App_FIT, selects tasks to replicate such that the specified reliability target for an application is achieved. In our experimental evaluation, we show that the App_FIT selective replication heuristic is low-overhead and highly scalable. In addition, results indicate that complete task replication is overkill for achieving reliability targets. We show that with App_FIT, we can tolerate pessimistic exascale error rates with only 53% of the tasks being replicated. | en_US |
dc.language.iso | eng | en_US |
dc.publisher | IEEE, 345 E 47TH ST, NEW YORK, NY 10017 USA | en_US |
dc.relation.ispartofseries | Book Series: IEEE International Conference on Cluster Computing; | |
dc.relation.ispartofseries | Pages: 498-505; | |
dc.rights | info:eu-repo/semantics/openAccess | en_US |
dc.title | A runtime heuristic to selectively replicate tasks for application-specific reliability targets | en_US |
dc.type | other | en_US |
dc.contributor.department | AGÜ, Faculty of Engineering, Department of Computer Engineering | en_US |
dc.contributor.institutionauthor | | |
dc.identifier.doi | 10.1109/CLUSTER.2016.54 | |
dc.relation.publicationcategory | Other | en_US |