Research engineer positions: Nano 2017 ESPRIT
Stream processing & pattern mining with hardware support
To apply, contact Vincent Leroy
Data production grows continuously. The development of the Internet of Things and of sensors produces terabits of activity traces. Pattern mining algorithms are a cornerstone of data mining: they detect frequently occurring patterns (sets, sequences, sub-graphs) in data. These patterns can later be used in many applications, including classification, recommendation and prediction. Most existing approaches to pattern mining focus on batch processing, that is, offline computation. However, more recent work considers stream (online) processing. Stream processing has the advantage of reading data only once, which limits complexity at the cost of approximate results. It also allows continuous analysis, so results can be obtained with low latency, for example to detect anomalies in real time.
The goal of the Nano 2017 ESPRIT project is to propose a hardware solution for pattern mining in high-throughput data streams. This solution, which could be delivered as a hardware support card, will be able to test simultaneously for the presence of a large number of patterns in the data. The benefits of such a solution are (i) to process faster streams than purely software approaches can, and (ii) to use fewer servers to process data streams, thus reducing energy consumption.
DESIRED SKILLS AND EXPERIENCE
- A strong desire to implement systems that use the latest scientific results
- A good command of English
- Ability to work as part of a team
- Sufficient educational background to understand the science and mathematics involved in machine learning and data mining algorithms
- Experience working in a Linux/Unix environment
- Experience in C/C++ or Java
- Experience with at least one of the following: Python, Torch/Lua, Matlab
- Practical experience with machine learning or deep learning is a plus