The Project
Description - Objectives
HAR.S.H. aims to address significant challenges in the processing of large-scale time series collections arising from real-world applications. Large-scale time series collections are found in almost every scientific field. HAR.S.H. will design and implement an extensive collection of algorithms, data structures, and mechanisms to address the scalability problem in large-scale time series analysis, using modern and emerging hardware technologies. The algorithms, data structures, and mechanisms developed will form a powerful library, thus ensuring their easy and effective use by a wide range of applications. Specifically, HAR.S.H. aims to:
- design and develop a new generation of algorithms and data structures that enable efficient parallel/distributed similarity search in large time series collections,
- leverage modern hardware technologies by studying their impact on the performance and scalability of such software,
- enable analysis on multimodal data, including text, images, and video, through integrations using deep learning models.
Pilot Applications
HAR.S.H. will demonstrate the value of the technology it develops through the following three pilot applications:
Pilot Application 1 – Similar document and file search. This application focuses on finding similar documents within large-scale document databases.
Pilot Application 2 – Photo analysis for enhanced travel profiles. This application focuses on analyzing photographic content to enrich travel profiles for personalized travel recommendation systems. The main goal is to maximize visitor satisfaction with a travel destination.
Pilot Application 3 – Public opinion analysis. This application aims to manage the complexity and cost involved in monitoring and categorizing content from social media, providing valuable insights into public opinion across various contexts.
To meet the needs of the aforementioned applications, HAR.S.H. will innovate in the following areas:
- Compact and Descriptive Representation of Multimodal Data. HAR.S.H. aims to develop efficient techniques for processing multimodal data. The project will explore different data sources, in particular images, videos, and natural language text, which can be integrated in an end-to-end manner.
- Robust Mechanisms for High-Performance Processing of Large-Scale Time Series Collections. HAR.S.H. will increase performance and robustness in answering similarity search queries over large-scale data collections by 1) leveraging the full computing power of modern platforms, 2) developing hardware-aware processing mechanisms to minimize costs and enable fast parallel and distributed processing, and 3) devising robust techniques that tolerate thread failures and enable fast recovery of computation after total system failures. HAR.S.H. will focus primarily on emerging memory, synchronization, and communication technologies and will study how the use of such technologies can affect data stream processing.
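As a rough illustration of the parallel similarity search mentioned above, the sketch below splits a small time series collection across worker threads and returns the nearest neighbor of a query under z-normalized Euclidean distance. It is not the HAR.S.H. implementation: the function names (`z_normalize`, `scan_chunk`, `parallel_nearest`) are hypothetical, and a brute-force scan stands in for whatever index structures the project develops.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def z_normalize(series):
    """Rescale a series to zero mean and unit standard deviation."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    if std == 0.0:
        return [0.0] * n  # constant series carries no shape information
    return [(x - mean) / std for x in series]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def scan_chunk(query, collection, start, end):
    """Return (distance, index) of the best match in collection[start:end]."""
    best = (math.inf, -1)
    for i in range(start, end):
        d = euclidean(query, z_normalize(collection[i]))
        if d < best[0]:
            best = (d, i)
    return best


def parallel_nearest(query, collection, workers=4):
    """Partition the collection into chunks and scan them concurrently."""
    q = z_normalize(query)
    n = len(collection)
    chunk = max(1, math.ceil(n / workers))
    bounds = [(s, min(s + chunk, n)) for s in range(0, n, chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(lambda b: scan_chunk(q, collection, *b), bounds)
        # Merge per-chunk results; an empty collection yields (inf, -1).
        return min(partial, default=(math.inf, -1))


if __name__ == "__main__":
    collection = [[1, 2, 3, 4], [4, 3, 2, 1], [1, 3, 2, 4], [2, 2, 2, 2]]
    dist, idx = parallel_nearest([10, 20, 30, 40], collection, workers=2)
    print(idx, dist)
```

The sketch only shows the partition-and-merge pattern. In CPython, threads do not speed up pure-Python arithmetic, so a real system would rely on native code, processes, or distributed workers, and would typically prune candidates with an index rather than scan every series.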
Technical Work Packages
The HAR.S.H. project will be implemented through the following four technical Work Packages:
Work Package 2 (WP2) will enable the transformation of diverse datasets, including images, video, and text, into unified embeddings using algorithms and deep learning models.
Work Package 3 (WP3) will provide low-level software components for time series processing on modern computing platforms, taking into account both the computational power of the platform itself and modern underlying hardware technologies.
Work Package 4 (WP4) will define the project's datasets and the needs of the pilot (and other contemporary) applications, and will design the HAR.S.H. user interface.
Work Package 5 (WP5) will integrate the software produced into the HAR.S.H. platform and use it to develop the pilot applications.

HAR.S.H. project diagram
Gender Equality
The proposed research is gender-neutral and, therefore, does not touch upon dimensions that are sensitive to gender-related issues. The research team, including all members of the working groups across the project's various beneficiaries, recognizes the importance of gender issues, especially in technological fields related to science and engineering. The project members are committed to striving for gender neutrality in all aspects of this and future projects.
The Principal Investigator (PI), Prof. Panagiota Fatourou, is a woman who is actively involved in promoting gender equality in computer science. She is the founder and first chair of the Greek Chapter of ACM for Women in Computing and has contributed significantly to gender equality initiatives, including as chair of the evaluation committee for the Minerva Informatics Equality Award in 2018. She has also served as a member of the Management Committee of the COST action EUGAIN (European Network For Gender Balance in Informatics) and as a member of the advisory committee for the European Commission-funded project RESET (Redesigning Equality and Scientific Excellence Together).
P. Fatourou has co-organized and chaired two Summits on Gender Equality in Computing (GEC 2019, GEC 2020) and has served as scientific director and chair of the organizing committee of the 1st Summer School for Women in Science, Technology, Engineering, and Mathematics (WISTEM 2019). She also contributed to the organization of ACM-W Europe Celebration of Women in Computing (womENcourage) in 2015, 2016, and 2017.
P. Fatourou has published the following related work:
- Panagiota Fatourou, Yota Papageorgiou, Vasiliki Petousi, “Women are needed in STEM: European policies and incentives”, Communications of the ACM (CACM), pp. 52-57, Vol. 62, No. 4, 2019.
- Panagiota Fatourou, Chris Hankin, and Bran Knowles, “Gender Bias in Automated Decision Making Systems”, Endorsed by the ACM Europe Technology Policy Committee, pp. 1-28, March 2021.
- The EU Mutual Learning Programme in Gender Equality – Artificial Intelligence and Gender Biases in Recruitment and Selection Processes. 12-13 November 2020. Comments paper – Greece, by P. Fatourou.
Prof. Fatourou will make every effort to support women in science and will continue contributing to achieving gender neutrality in the field of computer science.
Similarity Search in Large-Scale Time Series Collections with Hardware Awareness
HAR.S.H.: Hardware-Aware extReme-scale Similarity search
- Project Code: ΥΠ3ΤΑ-0560901
- Project Start Date: April 15, 2025
- Project End Date: May 31, 2026
- Funding Body: Ministry of Education, Religious Affairs and Sports, Greece 2.0, National Recovery and Resilience Plan
- Host Institution: University of Crete (UoC), Specialized Research Fund Account

