Advancing Serverless ML Training Architectures via Comparative Approach (under review)
Amine Barrak, Ranim Trabelsi, Fabio Petrillo, Fehmi Jaafar
IEEE Transactions on Parallel and Distributed Systems. Pre-print
2022
Serverless on machine learning: A systematic mapping study
Amine Barrak, Fabio Petrillo, Fehmi Jaafar
IEEE Access. Pre-print
@article{DBLP:journals/access/BarrakPJ22,
author = {Amine Barrak and
F{\'{a}}bio Petrillo and
Fehmi Jaafar},
title = {Serverless on Machine Learning: {A} Systematic Mapping Study},
journal = {{IEEE} Access},
volume = {10},
pages = {99337--99352},
year = {2022},
url = {https://doi.org/10.1109/ACCESS.2022.3206366},
doi = {10.1109/ACCESS.2022.3206366},
timestamp = {Tue, 18 Oct 2022 22:17:11 +0200},
biburl = {https://dblp.org/rec/journals/access/BarrakPJ22.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Machine Learning Operations (MLOps) is an approach to managing the entire lifecycle of a machine learning model. It has evolved over the last years and has started attracting interest from both researchers and industry practitioners. It supports the development of machine learning (ML) pipelines through the typical phases of data collection, data pre-processing, building datasets, model training, hyper-parameter refinement, testing, and deployment to production. This complex pipeline workflow is a tedious process of iterative experimentation. Moreover, cloud computing services provide advanced features for managing ML stages and deploying them efficiently to production. Specifically, serverless computing has been applied in different stages of the machine learning pipeline. However, to the best of our knowledge, little is known about the suitability of serverless computing and the benefits it can provide to the ML pipeline. In this paper, we provide a systematic mapping study of machine learning systems applied on serverless architecture that includes 53 relevant studies. During this study, we focused on (1) exploring the evolution trend and the main venues; (2) determining the researchers’ focus and interest in using serverless on machine learning; (3) discussing the solutions that serverless computing provides to machine learning. Our results show that serverless usage is growing and that several venues are interested in the topic. In addition, we found that the most widely used serverless provider is AWS Lambda, and that the most common application is the deployment of the ML model. Additionally, several challenges were explored, such as reducing cost, resource scalability, and reducing latency. We also discuss the potential challenges of adopting ML on serverless, such as respecting service level agreements, the cold start problem, security, and privacy. Finally, our contribution provides foundations for future research and applications that involve machine learning in serverless computing.
2021
Why do builds fail?—A conceptual replication study
Amine Barrak, Ellis E Eghan, Bram Adams, Foutse Khomh
Journal of Systems and Software. Pre-print
@article{DBLP:journals/jss/BarrakEAK21,
author = {Amine Barrak and
Ellis E. Eghan and
Bram Adams and
Foutse Khomh},
title = {Why do builds fail? - {A} conceptual replication study},
journal = {J. Syst. Softw.},
volume = {177},
pages = {110939},
year = {2021},
url = {https://doi.org/10.1016/j.jss.2021.110939},
doi = {10.1016/J.JSS.2021.110939},
timestamp = {Sat, 09 Apr 2022 12:28:16 +0200},
biburl = {https://dblp.org/rec/journals/jss/BarrakEAK21.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Previous studies have investigated a wide range of factors potentially explaining software build breakages, focusing primarily on build-triggering code changes or previous CI outcomes. However, code quality factors such as the presence of code/test smells have not been yet evaluated in the context of CI, even though such factors have been linked to problems of comprehension and technical debt, and hence might introduce bugs and build breakages. This paper performs a conceptual replication study on 27,675 Travis CI builds of 15 GitHub projects, considering the features reported by Rausch et al. and Zolfagharinia et al., as well as those related to code/test smells. Using a multivariate model constructed from nine dimensions of features, results indicate a precision (recall) ranging between 58.3% and 79.0% (52.4% and 69.6%) in balanced project datasets, and between 2.5% and 37.5% (2.5% and 12.4%) in imbalanced project datasets. Models trained on our balanced project datasets were later used to perform cross-project prediction on the imbalanced projects, achieving an average improvement of 9.3% (16.2%) in precision (recall). Statistically, the results confirm that features from the build history, author, code complexity, and code/test smell dimensions are the most important predictors of build failures.
Conferences
2024
Securing AWS Lambda: Advanced Strategies and Best Practices
Amine Barrak, G. Fofe, E. Kouam, L. Mackowiak, Fehmi Jaafar
The 11th IEEE International Conference on Cyber Security and Cloud Computing (CSCloud 2024).
2023
SPIRT: A Fault-Tolerant and Reliable Peer-to-Peer Serverless ML Training Architecture
Amine Barrak, Mayssa Jaziri, Ranim Trabelsi, Fehmi Jaafar, Fabio Petrillo
2023 IEEE 23rd International Conference on Software Quality, Reliability and Security (QRS). Pre-print
@inproceedings{DBLP:conf/qrs/BarrakJTJP23,
author = {Amine Barrak and
Mayssa Jaziri and
Ranim Trabelsi and
Fehmi Jaafar and
F{\'{a}}bio Petrillo},
title = {{SPIRT}: {A} Fault-Tolerant and Reliable Peer-to-Peer Serverless {ML}
Training Architecture},
booktitle = {23rd {IEEE} International Conference on Software Quality, Reliability,
and Security, {QRS} 2023, Chiang Mai, Thailand, October 22-26, 2023},
pages = {650--661},
publisher = {IEEE},
year = {2023},
url = {https://doi.org/10.1109/QRS60937.2023.00069},
doi = {10.1109/QRS60937.2023.00069},
timestamp = {Tue, 23 Jan 2024 09:45:31 +0100},
biburl = {https://dblp.org/rec/conf/qrs/BarrakJTJP23.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
The advent of serverless computing has ushered in notable advancements in distributed machine learning, particularly within parameter-server-based architectures. Yet, the integration of serverless features within peer-to-peer (P2P) distributed networks remains largely uncharted. In this paper, we introduce SPIRT, a fault-tolerant, reliable, scalable and secure serverless P2P ML training architecture designed to bridge this gap. Capitalizing on the robustness and reliability inherent to P2P systems, we emphasize intra-peer scalability through concurrent gradient computation to mitigate the communication overhead caused by increased peer interactions. SPIRT employs RedisAI for in-database operations and achieves an 82% reduction in model update times. The architecture showcases resilience against peer failures and adeptly manages the integration of new peers. Furthermore, SPIRT ensures secure communication between peers, enhancing the reliability of distributed machine learning tasks. Even in the face of Byzantine attacks, the system’s robust aggregation algorithms maintain high levels of accuracy. These findings illuminate the promising potential of serverless architectures in P2P distributed machine learning, offering a significant stride towards the development of more efficient, scalable, and resilient applications.
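To make the in-database aggregation idea concrete, here is a minimal sketch, not the paper's implementation: it uses plain Redis (via redis-py) and client-side NumPy averaging instead of SPIRT's RedisAI pipeline and Byzantine-robust aggregation, and the key names and three-peer example are hypothetical.

# Illustrative sketch only: peers push serialized gradients into a Redis hash,
# and any peer reads them back and averages them. SPIRT itself performs
# aggregation in-database with RedisAI; this is a simplified approximation.
import pickle

import numpy as np
import redis

r = redis.Redis(host="localhost", port=6379)

def push_gradient(peer_id: str, grad: np.ndarray, round_id: int = 1) -> None:
    # Each peer stores its gradient under a per-round hash, keyed by peer id.
    r.hset(f"round:{round_id}:gradients", peer_id, pickle.dumps(grad))

def average_gradients(round_id: int = 1) -> np.ndarray:
    # Fetch every peer's gradient for the round and average them element-wise.
    blobs = r.hvals(f"round:{round_id}:gradients")
    return np.mean([pickle.loads(b) for b in blobs], axis=0)

# Example: three peers contribute, then the averaged update is read back.
for peer, g in [("peer-a", np.ones(4)), ("peer-b", 2 * np.ones(4)), ("peer-c", 3 * np.ones(4))]:
    push_gradient(peer, g)
print(average_gradients())  # -> [2. 2. 2. 2.]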
Exploring the Impact of Serverless Computing on Peer To Peer Training Machine Learning
Amine Barrak, Ranim Trabelsi, Fehmi Jaafar, Fabio Petrillo
The 11th IEEE International Conference on Cloud Engineering (IC2E 2023). Pre-print
@inproceedings{DBLP:conf/ic2e/BarrakTJP23,
author = {Amine Barrak and
Ranim Trabelsi and
Fehmi Jaafar and
F{\'{a}}bio Petrillo},
title = {Exploring the Impact of Serverless Computing on Peer To Peer Training
Machine Learning},
booktitle = {{IEEE} International Conference on Cloud Engineering, {IC2E} 2023,
Boston, MA, USA, September 25-29, 2023},
pages = {141--152},
publisher = {IEEE},
year = {2023},
url = {https://doi.org/10.1109/IC2E59103.2023.00024},
doi = {10.1109/IC2E59103.2023.00024},
timestamp = {Tue, 21 Nov 2023 22:37:14 +0100},
biburl = {https://dblp.org/rec/conf/ic2e/BarrakTJP23.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
The increasing demand for computational power in big data and machine learning has driven the development of distributed training methodologies. Among these, peer-to-peer (P2P) networks provide advantages such as enhanced scalability and fault tolerance. However, they also encounter challenges related to resource consumption, costs, and communication overhead as the number of participating peers grows. In this paper, we introduce a novel architecture that combines serverless computing with P2P networks for distributed training and present a method for efficient parallel gradient computation under resource constraints.
Our findings show a significant enhancement in gradient computation time, with up to a 97.34% improvement compared to conventional P2P distributed training methods. As for costs, our examination confirmed that the serverless architecture could incur higher expenses, reaching up to 5.4 times more than instance-based architectures. It is essential to consider that these higher costs are associated with marked improvements in computation time, particularly under resource-constrained scenarios. Despite the cost-time trade-off, the serverless approach still holds promise due to its pay-as-you-go model. Utilizing dynamic resource allocation, it enables faster training times and optimized resource utilization, making it a promising candidate for a wide range of machine learning applications.
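As a rough illustration of the fan-out idea, a minimal sketch follows; it is not the paper's code. It assumes a hypothetical AWS Lambda function named "compute-gradient" that returns the gradient of its data shard as JSON, and it averages the shard gradients client-side.

# Hypothetical sketch: offload per-shard gradient computation to AWS Lambda
# invocations running in parallel, then combine the results. The function name
# and payload format are invented for illustration.
import json
from concurrent.futures import ThreadPoolExecutor

import boto3
import numpy as np

lambda_client = boto3.client("lambda")

def gradient_for_shard(shard_id: int) -> np.ndarray:
    # Synchronously invoke one function that computes the gradient of one shard.
    response = lambda_client.invoke(
        FunctionName="compute-gradient",  # hypothetical function name
        Payload=json.dumps({"shard_id": shard_id}).encode(),
    )
    return np.array(json.loads(response["Payload"].read())["gradient"])

def parallel_gradient(num_shards: int) -> np.ndarray:
    # Fan out one invocation per shard, then average the shard gradients.
    with ThreadPoolExecutor(max_workers=num_shards) as pool:
        grads = list(pool.map(gradient_for_shard, range(num_shards)))
    return np.mean(grads, axis=0)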
Architecting Peer-to-Peer Serverless Distributed Machine Learning Training for Improved Fault Tolerance
Amine Barrak, Fabio Petrillo, Fehmi Jaafar
arXiv preprint, CoRR abs/2302.13995, 2023.
@article{DBLP:journals/corr/abs-2302-13995,
author = {Amine Barrak and
F{\'{a}}bio Petrillo and
Fehmi Jaafar},
title = {Architecting Peer-to-Peer Serverless Distributed Machine Learning
Training for Improved Fault Tolerance},
journal = {CoRR},
volume = {abs/2302.13995},
year = {2023},
url = {https://doi.org/10.48550/arXiv.2302.13995},
doi = {10.48550/ARXIV.2302.13995},
eprinttype = {arXiv},
eprint = {2302.13995},
timestamp = {Tue, 28 Feb 2023 14:02:05 +0100},
biburl = {https://dblp.org/rec/journals/corr/abs-2302-13995.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Distributed machine learning refers to the practice of training a model on multiple computers or devices, referred to as nodes. Additionally, serverless computing is a new paradigm for cloud computing that uses functions as its computational unit. Serverless computing can be effective for distributed learning systems by enabling automated resource scaling, less manual intervention, and cost reduction. By distributing the workload, distributed machine learning can speed up the training process and allow more complex models to be trained. Several topologies of distributed machine learning have been established (centralized, parameter server, peer-to-peer). However, the parameter server architecture may have limitations in terms of fault tolerance, including a single point of failure and complex recovery processes. Training machine learning models in a peer-to-peer (P2P) architecture, by contrast, can offer fault-tolerance benefits by eliminating the single point of failure. In a P2P architecture, each node or worker can act as both a server and a client, which allows for more decentralized decision making and eliminates the need for a central coordinator. In this position paper, we propose exploring the use of serverless computing in distributed machine learning training and comparing the performance of the P2P architecture with the parameter server architecture, focusing on cost reduction and fault tolerance.
2021
Identification of compromised IoT devices: Combined approach based on energy consumption and network traffic analysis
Fehmi Jaafar, Darine Ameyed, Amine Barrak, Mohamed Cheriet
2021 IEEE 21st International Conference on Software Quality, Reliability and Security (QRS). Pre-print Presentation
@inproceedings{DBLP:conf/qrs/JaafarABC21,
author = {Fehmi Jaafar and
Darine Ameyed and
Amine Barrak and
Mohamed Cheriet},
title = {Identification of Compromised IoT Devices: Combined Approach Based
on Energy Consumption and Network Traffic Analysis},
booktitle = {21st {IEEE} International Conference on Software Quality, Reliability
and Security, {QRS} 2021, Hainan, China, December 6-10, 2021},
pages = {514--523},
publisher = {IEEE},
year = {2021},
url = {https://doi.org/10.1109/QRS54544.2021.00062},
doi = {10.1109/QRS54544.2021.00062},
timestamp = {Wed, 16 Mar 2022 22:32:22 +0100},
biburl = {https://dblp.org/rec/conf/qrs/JaafarABC21.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
In the burgeoning age of digitalization, the Internet of Things (IoT) forms a core part of the digital ecosystem. Unfortunately, as the deployment of connected devices increases tremendously, so do cyber-attacks. The consequences of cyber-attacks can be devastating, as attackers gain access to sensitive data and even damage critical infrastructure. This urges the development and integration of proactive and intelligent security breach detection mechanisms at different levels of IoT platforms, including the devices themselves. Several empirical observations have indicated a change in the energy consumption and network behaviour of compromised devices. Thus, we propose in this paper a machine learning based approach to identify compromised IoT devices using their energy consumption footprint and network traffic. We base our study on data collected from real experiments using different commercially available IoT devices infected with authentic IoT botnets. Our results show that machine learning algorithms can correctly classify attacks, reaching 98.40% precision for Mirai, over 99.91% for Ufonet, and respectively 97.63% and 99.93% performance. Overall, our exploratory study is one of the very first of its kind to explore energy consumption combined with network behavior analysis to detect compromised IoT devices, and its outcomes will be a starting point for further research on this topic.
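A minimal sketch of the general idea, not the paper's pipeline: train a classifier on feature vectors that concatenate energy-consumption and network-traffic statistics. The feature choices and synthetic data below are placeholders; the study itself uses measurements from real devices infected with real botnets.

# Hypothetical illustration: classify devices as benign or compromised from
# combined energy and network features using a random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Columns: mean power draw, power variance, packets/s, bytes/s, distinct destination IPs.
X_benign = rng.normal(loc=[2.0, 0.1, 50, 4e4, 3], scale=[0.2, 0.05, 10, 5e3, 1], size=(500, 5))
X_infected = rng.normal(loc=[2.8, 0.4, 400, 3e5, 40], scale=[0.3, 0.1, 50, 3e4, 8], size=(500, 5))
X = np.vstack([X_benign, X_infected])
y = np.array([0] * 500 + [1] * 500)  # 0 = benign, 1 = compromised

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print("precision:", precision_score(y_test, clf.predict(X_test)))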
On the Co-evolution of ML pipelines and source code - empirical study of DVC projects
Amine Barrak, Ellis E Eghan, Bram Adams
2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER). Pre-print Presentation Video
@inproceedings{DBLP:conf/wcre/BarrakEA21,
author = {Amine Barrak and
Ellis E. Eghan and
Bram Adams},
title = {On the Co-evolution of {ML} Pipelines and Source Code - Empirical
Study of {DVC} Projects},
booktitle = {28th {IEEE} International Conference on Software Analysis, Evolution
and Reengineering, {SANER} 2021, Honolulu, HI, USA, March 9-12, 2021},
pages = {422--433},
publisher = {IEEE},
year = {2021},
url = {https://doi.org/10.1109/SANER50967.2021.00046},
doi = {10.1109/SANER50967.2021.00046},
timestamp = {Sat, 09 Apr 2022 12:37:46 +0200},
biburl = {https://dblp.org/rec/conf/wcre/BarrakEA21.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
The growing popularity of machine learning (ML) applications has led to the introduction of software engineering tools such as Data Version Control (DVC), MLflow and Pachyderm that enable versioning ML data, models, pipelines and model evaluation metrics. Since these versioned ML artifacts need to be synchronized not only with each other, but also with the source and test code of the software applications into which the models are integrated, prior findings on co-evolution and coupling between software artifacts might need to be revisited. Hence, in order to understand the degree of coupling between ML-related and other software artifacts, as well as the adoption of ML versioning features, this paper empirically studies the usage of DVC in 391 GitHub projects, 25 of which in detail. Our results show that more than half of the DVC files in a project are changed at least once every one-tenth of the project's lifetime. Furthermore, we observe a tight coupling between DVC files and other artifacts, with 1/4 of the pull requests changing source code and 1/2 of the pull requests changing tests also requiring a change to DVC files. As additional evidence of the observed complexity associated with adopting ML-related software engineering tools like DVC, an average of 78% of the studied projects showed a non-constant trend in pipeline complexity.
2018
Just-in-time detection of protection-impacting changes on WordPress and MediaWiki
Amine Barrak, Marc-André Laverdière, Foutse Khomh, Le An, Ettore Merlo
Proceedings of the 28th Annual International Conference on Computer Science and Software Engineering (CASCON 2018). Pre-print Thesis Presentation Bulletin
@inproceedings{DBLP:conf/cascon/BarrakLKAM08,
author = {Amine Barrak and
Marc{-}Andr{\'{e}} Laverdi{\`{e}}re and
Foutse Khomh and
Le An and
Ettore Merlo},
editor = {Iosif{-}Viorel Onut and
Andrew Jaramillo and
Guy{-}Vincent Jourdan and
Dorina C. Petriu and
Wang Chen},
title = {Just-in-time detection of protection-impacting changes on WordPress
and MediaWiki},
booktitle = {Proceedings of the 28th Annual International Conference on Computer
Science and Software Engineering, {CASCON} 2018, Markham, Ontario,
Canada, October 29-31, 2018},
pages = {178--188},
publisher = {ACM},
year = {2018},
url = {https://dl.acm.org/citation.cfm?id=3291310},
timestamp = {Tue, 19 Feb 2019 07:16:26 +0100},
biburl = {https://dblp.org/rec/conf/cascon/BarrakLKAM08.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Access control mechanisms based on roles and privileges restrict the access of users to security-sensitive resources in a multi-user software system. Unintentional privilege protection changes may occur during the evolution of a system and may introduce security vulnerabilities, threatening users' confidential data and causing other severe problems. In this paper, we use the Pattern Traversal Flow Analysis technique to identify definite protection differences in the WordPress and MediaWiki systems. We analyse the evolution of privilege protections across 211 and 193 releases of WordPress and MediaWiki, respectively, and observe that around 60% of commits affect privilege protections in both projects. We refer to these commits as protection-impacting change (PIC) commits. To help developers identify PIC commits just-in-time, we extract a series of metrics from commit logs and source code, and build statistical models. The evaluation of these models revealed that they can achieve a precision of up to 73.8% and a recall of up to 98.8% for WordPress, and a precision of up to 77.2% and a recall of up to 97.8% for MediaWiki. Among the metrics examined, commit churn, bug fixing, author experience, and code complexity between two releases are the most important predictors in the models. We performed a qualitative analysis of false positives and false negatives and observe that PIC commit detectors should ignore documentation-only commits and process code changes without the comments.
Software organizations can use our proposed approach and models to identify unintentional privilege protection changes as soon as they are introduced, in order to prevent the introduction of vulnerabilities in their systems.
The state of practice on virtual reality (VR) applications: An exploratory study on GitHub and Stack Overflow
Naoures Ghrairi, Segla Kpodjedo, Amine Barrak, Fabio Petrillo, Foutse Khomh
2018 IEEE International Conference on Software Quality, Reliability and Security (QRS). Pre-print Presentation
@inproceedings{DBLP:conf/qrs/GhrairiKBPK18,
author = {Naoures Ghrairi and
Segla Kpodjedo and
Amine Barrak and
F{\'{a}}bio Petrillo and
Foutse Khomh},
title = {The State of Practice on Virtual Reality {(VR)} Applications: An Exploratory
Study on Github and Stack Overflow},
booktitle = {2018 {IEEE} International Conference on Software Quality, Reliability
and Security, {QRS} 2018, Lisbon, Portugal, July 16-20, 2018},
pages = {356--366},
publisher = {IEEE},
year = {2018},
url = {https://doi.org/10.1109/QRS.2018.00048},
doi = {10.1109/QRS.2018.00048},
timestamp = {Wed, 16 Oct 2019 14:14:57 +0200},
biburl = {https://dblp.org/rec/conf/qrs/GhrairiKBPK18.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Virtual Reality (VR) is a computer technology that holds the promise of revolutionizing the way we live. The release in 2016 of new-generation headsets from Facebook-owned Oculus and HTC has renewed the interest in that technology. Thousands of VR applications have been developed over the past years, but most software developers lack formal training on this technology. In this paper, we propose descriptive information on the state of practice of VR applications' development to understand the level of maturity of this new technology from the perspective of Software Engineering (SE). To do so, we focused on the analysis of 320 VR open source projects from Github to determine which are the most popular languages and engines used in VR projects, and evaluate the quality of the projects from a software metric perspective. To get further insights on VR development, we also manually analyzed nearly 300 questions from Stack Overflow. Our results show that (1) VR projects on GitHub are currently mostly small to medium projects, and (2) the most popular languages are JavaScript and C#. Unity is the most used game engine during VR development and the most discussed topic on Stack Overflow. Overall, our exploratory study is one of the very first of its kind for VR projects and provides material that is hopefully a starting point for further research on challenges and opportunities for VR software development.
Posters and Short Papers
2024
Best Practices for Scalable and Efficient Distributed Machine Learning with Serverless Architectures
Amine Barrak
The 38th IEEE International Parallel & Distributed Processing Symposium (IPDPS 2024). Pre-print
Incorporating Serverless Computing into P2P Networks for ML Training: In-Database Tasks and Their Scalability Implications (Student Abstract)
Amine Barrak
The 38th Annual AAAI Conference on Artificial Intelligence. Pre-print
@inproceedings{DBLP:conf/aaai/Barrak24a,
author = {Amine Barrak},
editor = {Michael J. Wooldridge and
Jennifer G. Dy and
Sriraam Natarajan},
title = {Incorporating Serverless Computing into {P2P} Networks for {ML} Training:
In-Database Tasks and Their Scalability Implications (Student Abstract)},
booktitle = {Thirty-Eighth {AAAI} Conference on Artificial Intelligence, {AAAI}
2024, Thirty-Sixth Conference on Innovative Applications of Artificial
Intelligence, {IAAI} 2024, Fourteenth Symposium on Educational Advances
in Artificial Intelligence, {EAAI} 2024, February 20-27, 2024, Vancouver,
Canada},
pages = {23439--23440},
publisher = {{AAAI} Press},
year = {2024},
url = {https://doi.org/10.1609/aaai.v38i21.30419},
doi = {10.1609/AAAI.V38I21.30419},
timestamp = {Tue, 02 Apr 2024 16:32:10 +0200},
biburl = {https://dblp.org/rec/conf/aaai/Barrak24a.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
Distributed ML addresses challenges arising from increasing data and model complexity. Peer-to-peer (P2P) networks in distributed ML offer scalability and fault tolerance. However, they also encounter challenges related to resource consumption and communication overhead as the number of participating peers grows. This research introduces a novel architecture that combines serverless computing with P2P networks for distributed training. Serverless computing enhances this model with parallel processing and cost-effective scalability, suitable for resource-intensive tasks. Preliminary results show that peers can offload expensive computational tasks to serverless platforms. However, their inherent statelessness necessitates strong communication methods, suggesting a pivotal role for databases. To this end, we have enhanced an in-memory database to support ML training tasks.
The Promise of Serverless Computing within Peer-to-Peer Architectures for Distributed ML Training
Amine Barrak
The 38th Annual AAAI Conference on Artificial Intelligence - Doctoral Consortium. Pre-print
My thesis focuses on the integration of serverless computing with Peer-to-Peer (P2P) architectures in distributed Machine Learning (ML). This research aims to harness the decentralized, resilient nature of P2P systems, combined with the scalability and automation of serverless platforms. We explore using databases not just for communication but also for in-database model updates and gradient averaging, addressing the challenges of statelessness in serverless environments.
2022
On securing the communication in IoT infrastructure using elliptic curve cryptography
Hugo Bourreau, Emeric Guichet, Amine Barrak, Benoît Simon, Fehmi Jaafar
2022 IEEE 22nd International Conference on Software Quality, Reliability, and Security Companion (QRS-C). Pre-print
@inproceedings{DBLP:conf/qrs/BourreauGBSJ22,
author = {Hugo Bourreau and
Emeric Guichet and
Amine Barrak and
Beno{\^{\i}}t Simon and
Fehmi Jaafar},
title = {On Securing the Communication in IoT Infrastructure using Elliptic
Curve Cryptography},
booktitle = {22nd {IEEE} International Conference on Software Quality, Reliability,
and Security, {QRS} 2022 - Companion, Guangzhou, China, December 5-9,
2022},
pages = {758--759},
publisher = {IEEE},
year = {2022},
url = {https://doi.org/10.1109/QRS-C57518.2022.00121},
doi = {10.1109/QRS-C57518.2022.00121},
timestamp = {Sat, 22 Apr 2023 17:02:06 +0200},
biburl = {https://dblp.org/rec/conf/qrs/BourreauGBSJ22.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
The Internet of Things (IoT) is widely present nowadays, from businesses to connected homes, and more. IoT is considered a part of the Internet of the future and will comprise billions of intelligent communicating devices. These devices transmit data from sensors to entities such as servers so that suitable responses can be performed. Securing these data against cyberattacks is increasingly difficult because of the sensitive information they contain. In addition, studies have shown that, most of the time, data transiting through IoT devices is not encrypted. Thus, anyone has the ability to listen to or modify the information. Encrypting communications therefore seems mandatory to secure networks and the data transiting from sensors to servers. In this paper, we propose an approach to secure the transmission and the storage of data in IoT using Elliptic Curve Cryptography (ECC). The proposed method offers a high level of security at a reasonable computational cost. Indeed, we present an adequate architecture that ensures the use of a state-of-the-art cryptographic algorithm to encrypt sensitive data in IoT.
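As a rough illustration of the kind of ECC-based protection the paper targets, here is a minimal sketch, not the paper's architecture, assuming the Python `cryptography` package: the device and the server derive a shared AES key via ECDH on curve SECP256R1, then the device encrypts a sensor reading with AES-GCM. Curve choice, key length, and the sample payload are assumptions for illustration.

# Illustrative sketch only: ECDH key agreement plus AES-GCM encryption of an
# IoT sensor reading. In practice public keys would be exchanged over the
# network and authenticated; here both key pairs live in one process.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each party generates an EC key pair and shares its public key.
device_key = ec.generate_private_key(ec.SECP256R1())
server_key = ec.generate_private_key(ec.SECP256R1())

# Both sides compute the same shared secret and derive a 128-bit AES key.
shared = device_key.exchange(ec.ECDH(), server_key.public_key())
aes_key = HKDF(algorithm=hashes.SHA256(), length=16, salt=None,
               info=b"iot-session").derive(shared)

# The device encrypts a sensor reading; the server decrypts it.
nonce = os.urandom(12)
ciphertext = AESGCM(aes_key).encrypt(nonce, b'{"temp": 21.5}', None)
plaintext = AESGCM(aes_key).decrypt(nonce, ciphertext, None)
assert plaintext == b'{"temp": 21.5}'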