Research Group:

Eric Hoang: Project “ROXANNE”, PhD Topic: “Representation learning for Attributed Network Analysis”.
Hubert Truchan: Project “KI-Trainer”
Monty-Maximilian Zühlke: Coordination “Zukunftslabor Gesellschaft und Arbeit”
Raneen Younis: Project “Haisem-Lab”
Ambreen Zaina: Project “Zukunftslabor Gesellschaft und Arbeit”

Former PhD supervision

Nam Khanh Tran (Graduated 2019) PhD Title: “Representation and Contextualization for Document Understanding”
Markus Rokicki (L3S)
Qiong Bu (Southampton)
Neal Reeves (Southampton)

Supervised students who started a PhD

Jaspreet Singh (L3S)
Tetiana Tolmachova (L3S)
Philipp Kemkes (L3S)
Tristan Wehrmaker (L3S)


WS 18/19, WS 19/20 Foundations of Human Computation and Crowdsourcing. Own full lecture with Dr. Ujwal Gadiraju at Leibniz University Hannover. Content overview.

16 May 2019 Advanced Methods of Information Retrieval. IR evaluation methods using Human Computation (invited lecture) link

2018-2019 Applied Machine Learning Academy: Introduction to Machine Learning – 1-day course + lab. Introduction to Data Science – 1-day course + lab. Machine learning for non-computer scientists. For participation please check ama-academy.eu

2018 Seminar Foundations of Human Computation. Lecture Series on Human Computation link

2010-2015 TA Web Science Course by Prof. Dr. techn. Wolfgang Nejdl link

2014-2019 Guest Lectures for Lecture on Advanced Methods of Information Retrieval by Dr. Elena Demidova link


2nd KEYSTONE Summer School on Keyword Search in Big Linked Data, 2016. Sergej Zerr: Collective Intelligence: Crowdsourcing ground truth data for large-scale evaluation in Information Retrieval (slides)

Numerous labs/seminars/projects at the Leibniz University Hannover

Supervised Master Theses

2020 Mitra Safaei: Acoustic Based Methods for Process Supervision in Production Environments

In industrial environments, diagnosing whether a machine is working well or is defective is important because it prevents downtime of the production line and reduces maintenance costs. There are several ways to address this problem, such as preventive maintenance (PM) and monitoring systems, but these techniques are not always sufficient and can be costly. Predicting the lifetime of a machine part based on acoustic signals and sounds in industrial environments is a new approach to overcoming these problems, but it is very challenging to separate features and noise from the real acoustic signals of a specific machine. Therefore, in this research we designed and developed an algorithm to distinguish between processes based on acoustic signals. The Acoustic Based Methods (ABM) approach used a dataset from the Institute for Manufacturing Technology and Machine Tools (IFW) for a Gildemeister machine (CTX420 linear). This machine performs four processes and was chosen to provide a complex scenario for training and testing the proposed algorithm. ABM uses a deep neural network to predict which process the machine is performing. A new Convolutional Neural Network (CNN) was proposed together with the Continuous Wavelet Transformation (CWT), which converts the acoustic signals into CWT input images. The proposed CNN used 70% of the CWT images as training input, extracting features from them to classify the machining process. The results show that the accuracy of identifying the machining process from the acoustic signals was over 97%, around 5% better than an AlexNet baseline.
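The signal-to-image step used in this pipeline can be sketched as follows. This is a minimal Morlet-based scalogram in NumPy, not the thesis code; the sampling rate, scales, and the toy "machine sound" below are invented for illustration, and the resulting image would then be fed to the CNN classifier.

```python
import numpy as np

def cwt_scalogram(signal, scales, fs, w0=6.0):
    """Continuous Wavelet Transform of a 1-D signal with a Morlet
    wavelet, returning a (len(scales), len(signal)) magnitude image."""
    out = np.empty((len(scales), len(signal)))
    for i, s in enumerate(scales):
        # Discretized Morlet wavelet at scale s; the Gaussian envelope
        # has decayed strongly at +/- 4 scale widths, so we truncate there.
        t = np.arange(-4 * s, 4 * s + 1)
        wavelet = np.exp(1j * w0 * t / s) * np.exp(-(t / s) ** 2 / 2)
        wavelet /= np.sqrt(s)
        out[i] = np.abs(np.convolve(signal, wavelet, mode="same"))
    return out

# Toy stand-in for one second of machine sound at 1 kHz.
fs = 1000
time = np.arange(0, 1.0, 1 / fs)
signal = np.sin(2 * np.pi * 50 * time)
scalogram = cwt_scalogram(signal, scales=np.arange(1, 32), fs=fs)
print(scalogram.shape)  # one row per scale, one column per sample
```

Each row of the scalogram captures energy at one scale (roughly, one frequency band), so different machining processes leave visibly different textures in the image, which is what the CNN learns to classify.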

2019 Md Shakawath Hossain: Investigating General and Machine Specific Aspects for Collaborative Learning of Manufacturing Process Supervision Model

In the recent past, food quality monitoring industries have been facing high customer demand to complete production tasks in a short time with minimal cost, but with high accuracy and efficiency. Many software systems and sensors drive these industries, and whenever an upgrade of this software or these sensors is needed, they have to learn from the beginning to cope with the present conditions. This does not help the process to be efficient and optimal. However, this situation can be handled if we study the production process smartly with empirical process supervision. In industry, early process supervision means monitoring the workflow from the very beginning while reusing previously gained common knowledge. This knowledge becomes valuable when an upgrade occurs: each machine does not have to be learned again from scratch, and the learned knowledge can be applied immediately. In this research work, we address this situation and propose a process supervision paradigm that uses dimensionality reduction techniques to learn a collaborative common information base, which can be reused for process optimization.

The main purpose of this research work is to build an automated system that can learn machine-specific as well as general machine aspects in a collaborative production environment in order to supervise the production process optimally. To achieve this goal, we applied our methodology to a real-life production dataset, fitted a multi-class support vector machine to each machine exclusively, and accumulated the resulting information. We then utilized this machine-specific information to learn about the plant collectively from various probable combinations of machines. Principal Component Analysis, Independent Component Analysis and Factor Analysis are used to reduce the dimension of the accumulated models to a latent feature space before feeding them into the classification model. This latent generalized information base was then used for further experiments. At first, we trained the accumulated machine models without supervision to learn the collaborative models and tested them on a new, unseen machine. In the next step, we transferred a fraction of the unseen machine's data to the previously trained collaborative models, which we report as early process supervision by knowledge transfer. This early supervision of the new, unseen machine improved the performance of our generalized collaborative model by an average of ~20-30%, without utilizing the entire test machine data. Since we do not have to wait for the entire new machine's data, this saves valuable time in industry, and engineers can easily decide when to cut off model training while maintaining an optimal production rate and accuracy. Eventually, we achieved an average production gain of 20% to 34%. This paradigm improves the process early, saves valuable time and boosts the revenue of the industry.
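The dimensionality reduction step described above can be sketched with a PCA projection via the SVD. This is a minimal stand-in, assuming the "accumulated machine models" can be represented as one feature vector per machine; the machine count, feature dimension and data below are invented.

```python
import numpy as np

def pca_reduce(X, k):
    """Project the rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                       # center each feature
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                          # (n_samples, k) latent features

# Hypothetical accumulated per-machine information: 8 machines, 50-dim features.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 50))
Z = pca_reduce(X, k=3)
print(Z.shape)  # (8, 3): the latent representation fed to the classifier
```

ICA and Factor Analysis would slot into the same place as alternative projections; the point is that downstream classifiers operate on the compact latent space rather than on the raw accumulated models.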

2019 Hubert Truchan: Tool wear monitoring during milling using machine learning methods

Tool wear must be monitored continuously, because a worn tool leads to reduced workpiece quality and to higher production costs and times. Tool wear is non-linear and depends on many influencing variables. Data-driven methods are suitable for the indirect determination of tool wear and of the remaining useful lifetime (RUL) of tools. In contrast to model-based approaches, these methods are more flexible if enough training data is available. However, feature extraction and selection is a very labor-intensive process, and the prediction accuracy of data-driven methods depends on the quality of the extracted features. The challenge is therefore to extract representative features from the raw signal.
The aim of this work is to develop a model for automatic feature extraction and selection based on Deep Learning Methods.

2019 Siddhartha Giri: Optical inspection of tool wear based on machine learning methods

Research on tool wear detection has attracted great attention in the manufacturing community in recent years. Due to its enormous potential for flexibility in the production system, which has become a crucial factor in retaining competitiveness in quality management combined with higher accuracy, optical tool wear detection has gained much popularity. It possesses the advantages of spotting the condition of the tool and predicting the tool life with a high level of autonomy. Our effort in this thesis is to apply machine learning techniques to determine the state of the tool, and to adjust the most appropriate parameters while not disturbing the production process. We study a dataset of a twist drill tool which was acquired on a real milling machine. The thesis presents an experimental study of image processing routines and a machine learning approach to classify the state of the tool with high accuracy. The results show that prevalent tool flank wear can be inspected robustly in a real-time production environment, and that manufacturing autonomy is furthermore possible and can be improved.

2018 Stefan Schuler: Applying the Concept of Crowdsourcing to Clustering Problems

The work results in a basic framework for crowd-based visual clustering. This concept describes the visual analysis of data by means of crowd workers.
The data is transformed using dimensionality reduction methods, leading to a number of 'data views'. These different perspectives of the dataset are shown to the workers, who are then tasked to find and annotate groups of data points that seem to belong together.
Finally, we define a distance metric based upon the workers' annotations and use it to create a simple clustering model. An on-site study with 12 participants was conducted, leading to a total of 360 completed tasks. We used three different datasets and five projection methods in order to gauge the applicability of the framework under varying circumstances.
The results of this study demonstrate that the approach leads to meaningful data annotations, enabling us to build clustering models which perform better than classical machine learning algorithms.
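One way the annotation-based distance metric could look is sketched below: items grouped together by many workers are considered close. This is an assumption about the metric's form, not the thesis implementation, and the worker annotations are invented.

```python
from itertools import combinations
from collections import Counter

def crowd_distance(annotations, n_items):
    """Distance between two items = 1 - fraction of workers that put
    them in the same group, derived from per-worker group annotations."""
    together = Counter()
    for groups in annotations:              # one worker's annotation
        for group in groups:                # a set of item ids grouped together
            for a, b in combinations(sorted(group), 2):
                together[(a, b)] += 1
    n_workers = len(annotations)
    return {
        (a, b): 1.0 - together[(a, b)] / n_workers
        for a in range(n_items)
        for b in range(a + 1, n_items)
    }

# Three hypothetical workers annotate groups over 4 items.
workers = [
    [{0, 1}, {2, 3}],
    [{0, 1, 2}, {3}],
    [{0, 1}, {2}, {3}],
]
d = crowd_distance(workers, n_items=4)
print(d[(0, 1)], d[(2, 3)])  # items 0,1 always co-grouped -> distance 0.0
```

A standard distance-based clustering algorithm (e.g. hierarchical clustering) can then consume this matrix directly.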

2017 Tianyu Qi: A Study of Relation Evolution in Social Networks extracted from Web Archives

Networking and communication are natural human activities, and discovering patterns of social interaction within a population has wide-ranging applications including disease modeling, cultural and information transmission, intelligence and surveillance. This work is a first attempt to study the temporal evolution of relationships in online social networks constructed from web archives and the Internet. To this end, we exploit extracted co-occurring entity pairs and their connecting textual patterns gathered from the web pages stored in a web archive. While more research is still required to reach the goal, due to the limited quality of our data set and time constraints, in this thesis we took a first step towards it by studying clustering approaches for modelling social relations using pattern grouping.

2015 Markus Rokicki: Reward Mechanisms for Crowdsourcing

Many data processing tasks such as semantic annotation of images, translation of texts in foreign languages, and labeling of training data for machine learning models require human input, and, on a large scale, can only be accurately solved using crowd-based online work. In order to improve both the cost and the time efficiency of crowdsourcing, we examine alternative reward mechanisms compared to the "Pay-per-HIT" scheme commonly used in platforms such as Amazon Mechanical Turk. To this end, in this presentation we explore a wide range of monetary reward schemes that are inspired by the success of competitions, lotteries, and games of luck, showing that frameworks where crowd workers compete against each other can drastically reduce crowdsourcing costs. Furthermore, we investigate how team mechanisms can be leveraged to further improve the cost efficiency of crowdsourcing competitions. To this end, we introduce strategies for team-based crowdsourcing, ranging from team formation processes where workers are randomly assigned to competing teams, over strategies involving self-organization where workers actively participate in team building, to combinations of team and individual competitions.
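The cost structures being compared can be illustrated with a toy calculation (the task counts and reward values below are invented, not figures from the thesis): under pay-per-HIT the total cost grows with the number of tasks, while under a competition scheme the prize pool is fixed regardless of how much work is submitted.

```python
def pay_per_hit_cost(n_tasks, reward_cents):
    """Baseline: every completed task is paid individually."""
    return n_tasks * reward_cents

def competition_cost(prizes_cents):
    """Competition: only the top-ranked workers are paid, so the total
    cost equals the fixed prize pool, independent of task volume."""
    return sum(prizes_cents)

n_tasks = 1000
print(pay_per_hit_cost(n_tasks, reward_cents=5))      # 5000 cents
print(competition_cost([1000, 500, 250]))             # 1750 cents
```

At this (hypothetical) volume the competition is far cheaper; the research question is of course whether work quality and throughput survive when only the top workers are rewarded.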

2015 Philipp Kemkes: Scalable Web Search Based Techniques for Efficient Mining of Large Social Graphs from Unstructured Web Data

Social network analysis is leveraged in a variety of applications such as identifying influential entities, detecting communities with special interests, and determining the flow of information and innovations. However, existing approaches for extracting social networks from unstructured Web content do not scale well and are only feasible for small graphs. In this presentation, novel methodologies are introduced for query-based search engine mining, enabling efficient extraction of social networks from large amounts of Web data. To this end, we leverage patterns in phrase queries for retrieving entity connections, and employ a bootstrapping approach for iteratively expanding the pattern set. Our experimental evaluation in different domains demonstrates that our algorithms provide high-quality results and allow for a scalable and efficient construction of social graphs.
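The pattern-based extraction of entity connections can be sketched as follows. The seed patterns, entity names and text are made up for illustration; in the actual approach the patterns are issued as phrase queries to a search engine and the pattern set grows by bootstrapping from confirmed pairs.

```python
import re

# Seed textual patterns connecting two entity slots {a} and {b}.
PATTERNS = [r"{a} met with {b}", r"{a} and {b} co-authored"]

def find_connections(text, entities):
    """Return the set of (a, b) entity pairs matched by any pattern."""
    pairs = set()
    for a in entities:
        for b in entities:
            if a == b:
                continue
            for p in PATTERNS:
                if re.search(p.format(a=re.escape(a), b=re.escape(b)), text):
                    pairs.add((a, b))
    return pairs

text = "Alice met with Bob last week, and Bob and Carol co-authored a paper."
edges = find_connections(text, ["Alice", "Bob", "Carol"])
print(edges)  # edges of the extracted social graph
```

Each matched pair becomes an edge in the social graph; bootstrapping then mines the contexts of known pairs for new connecting phrases.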

2012 Armin Doroudian: Infrastructure for Creating and Managing Mobile Social Networks between People in Shared Situations

Recent Web technology developments have changed our daily life. In today's digital world, new social media virtually connect more and more people. The recent popularity of mobile devices based on the Android platform makes it possible for users to go beyond the home computer and be instantly connected through social media, without spatial restrictions. The goal addressed in this Master's thesis is that dynamic mobile social groups can be created on the fly and turned into offline connections, drawing on experiences from social networking and on the capabilities of mobile devices. This talk is an in-depth presentation of recent developments in the area of mobile social networks and manual trust negotiation. My vision is to carry social networking to its next level, out of the virtual space into reality.

2009 Tristan Wehrmaker: Interoperabilität im Web 2.0: Evaluierung und Implementierung von Mechanismen für die Integration von Social Media Diensten (Interoperability in Web 2.0: Evaluation and Implementation of Mechanisms for the Integration of Social Media Services) (link)

In Web 2.0, social media services are very popular. Many of these services have APIs that give other applications access to their data and functionality. Whenever data is to be published or discovered, the user has to interact with many different web user interfaces. In this thesis we present an analysis of such Web 2.0 services and their APIs. To this end, a conceptual framework is provided that is used to characterize Web 2.0 services. Based on the results of the analysis, a platform is implemented which integrates 10 different Web 2.0 services and hence saves the user from accessing each web user interface individually. This platform can easily be extended with additional Web 2.0 services. Furthermore, we introduce a client application, realized as a Firefox extension, that allows the user to publish her data with just a few clicks.

2008 Francisco Javier: Advanced Policy-based Access Control in RDBMS

Current database access control mechanisms are very restricted: in the best case, it is possible to grant access to tables or columns for users or roles. That works very well in closed environments, such as within an organization or a company. In an open environment, however, these access control models are not very expressive, and access control that depends on the objects' data is only partially possible in some vendors' implementations, where the expressiveness is still not powerful enough. The goal of this thesis is to address current requirements in database access control. The idea is to enhance the access control mechanism by specifying a set of more expressive access control policies that can (at least partially) be enforced at the database level, increasing expressive power in order to represent more complex and sophisticated real-world scenarios such as those described above, where new features like context, credentials and more expressive content-dependent conditions can be used in the access control mechanism of relational database management systems.

Supervised Bachelor Theses

2019 John Robertus: Visual Localization of Production Defects Using Deep Neural Networks

The continuous advancement of Artificial Intelligence (AI) opens new opportunities to revolutionize today's industry. In this presentation we report our work on systematically examining the performance of a large number of state-of-the-art neural network architectures for visually localizing surface errors that occur during industrial production. The results of our experiments, carried out on two large real-world data sets, show high applicability of existing transfer learning models, with accuracy above 95% (using the winner, the Xception model). The proposed approaches can be easily adapted and employed in modern industrial production environments.

2015 Paul-Hendrik Tieman: Survey and Implementation of Support Vector Machines for the Optimization of Smart Home Energy Management Systems

Currently our energy supply system is based, to a large extent, on fossil fuels and nuclear power. There is a movement towards renewable energy, which could slow down extreme climate change and reduce nuclear waste. Unfortunately, the dynamics of power generation from renewable sources are often not controllable, as they depend on the current weather situation (wind velocity and solar radiation). In the ideal case, electrical power should be stored at minimal loss or, even better, consumed at the same time and in the same amount as it is generated. In this thesis we investigate ways, based on Support Vector Machines, to forecast the electrical load of a household, which is a promising direction to help optimize energy consumption towards renewable sources.
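The forecasting setup can be sketched as a supervised lag-feature problem: the previous 24 hourly load values predict the next one. The synthetic load series below is invented, and a plain least-squares fit stands in for the SVM regressor (e.g. scikit-learn's SVR) that the thesis investigates; the feature construction is the same either way.

```python
import numpy as np

def make_lag_features(load, n_lags):
    """Turn a load series into (X, y) pairs: the previous n_lags
    values are the features, the next value is the target."""
    X = np.array([load[i:i + n_lags] for i in range(len(load) - n_lags)])
    y = np.array(load[n_lags:])
    return X, y

# Synthetic daily-periodic household load over two weeks (hourly).
hours = np.arange(24 * 14)
load = 1.0 + 0.5 * np.sin(2 * np.pi * hours / 24)
X, y = make_lag_features(load, n_lags=24)

# Linear least-squares stand-in for the SVR regressor.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
pred = X @ w
print(X.shape, float(np.mean((pred - y) ** 2)))  # near-zero error on this toy series
```

A real household load is far noisier than this sinusoid, which is exactly where a kernel method like the SVR can outperform the linear stand-in.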

2014 Imène Tighane: Event Detection and Visualization in Social Web Archives

Over the last decades the World Wide Web became a central medium for social interaction. Blogging platforms and social networking sites allow users to publish and share vast amounts of information in the form of pictures, opinions, interests, activities and events happening around them, in real time and with the people in their network. Archives of those websites can provide us with valuable information about the perspective of "real people" on different issues. The vision addressed in this thesis is that such information can be extracted on the fly, and events can be detected and analysed with the help of existing burst detection methods, in order to facilitate the exploration of the web. To this end, we investigate and implement a software tool that allows an efficient exploration of those archives and helps users to better understand this metadata thanks to our event labeling methods.
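A minimal burst-detection heuristic of the kind referred to above could look like this: flag a time step when its activity count far exceeds the recent baseline. The window, threshold and mention counts are invented; established methods (e.g. Kleinberg's burst model) are more principled, but the idea is the same.

```python
def detect_bursts(counts, window=3, threshold=2.0):
    """Flag time steps whose count exceeds `threshold` times the mean
    of the preceding `window` steps."""
    bursts = []
    for t in range(window, len(counts)):
        baseline = sum(counts[t - window:t]) / window
        if baseline > 0 and counts[t] > threshold * baseline:
            bursts.append(t)
    return bursts

# Hypothetical daily mention counts of a topic in an archived stream;
# the spike on day 5 corresponds to a real-world event.
counts = [4, 5, 4, 5, 4, 30, 6, 5]
print(detect_bursts(counts))  # [5]
```

Detected burst positions can then be aligned with the archived pages from those days to label the underlying event.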

2014 Ankit Sharma: Search and collaboration in LearnWeb

LearnWeb2 is a collaborative search and sharing system which brings together different online services under one umbrella. The LearnWeb2 system currently provides basic features for collaboration and sharing, like groups, forums, a notification system, etc. The system also incorporates a Web2.0 adapter which connects LearnWeb2 to Web2.0 services like Flickr, YouTube, Bing and Slideshare. These services provide text and multimedia search capabilities to LearnWeb2. Being a collaborative search and sharing system, LearnWeb2 must possess features that enhance collaborative interaction and search. The system required a well-designed user interface which adheres to user interface guidelines and which can promote collaboration among the users. A user interface with a three-column layout was developed to enable the user to access his/her groups. The three-column layout reduces the number of page refreshes and presents the information in three logical columns without increasing the complexity of the system. A provision for suggesting queries to users while searching proved to be an added benefit. The LearnWeb2 query logs were used for suggesting the queries, as they reflect the actual behavior of the users. Since the LearnWeb2 query logs contain only a small number of queries, they were combined with Google suggestions to compute the query suggestions for a given query. Disambiguating named entities in search and representing them in the form of a fact sheet can enable users to gather information about the subject without actually visiting the search results. A Google-style fact sheet was developed to fulfill this requirement, populated using the structured information available from DBpedia. Very often, users of the system are required to present their collection of resources to the community. Hence, a tool that enables users to prepare presentations on the fly using the resources from their groups was added.
Impress.js, a JavaScript-based solution for creating presentations in web browsers, was found to be well suited for the task and was used to develop the presentation module. All the improvements and add-ons played a major role in enhancing the search as well as the collaborative features provided by LearnWeb2.

2014 Rahil Ahora: Security and access control in LearnWeb

Collaborative systems allow their users to communicate and share information with each other. Security is an important requirement to ensure the availability, confidentiality and integrity of the elements of collaborative systems. Access control is one such requirement, helping to decide and manage the accessibility of resources to users. An access control model for a collaborative system must be scalable and, at the same time, provide fine-grained control over the elements to its users. The absence of a generic access control model for collaborative systems makes it difficult to design a system that can meet these requirements. LearnWeb2 is a collaborative search and sharing system which provides basic features like user groups, forums, a notification system, etc. The system also incorporates a Web2.0 adapter which connects LearnWeb2 to Web2.0 services like Flickr, YouTube, Bing and Slideshare. The system required an efficient, scalable and fine-grained access control model that can satisfy its security needs. This goal was achieved by using the traditional Role Based Access Control (RBAC) technique as the base of the model and using the notion of contexts to provide flexibility to the mechanism. The secondary goal of this thesis was to prototype an anonymizing text editor. An important concern when publishing data in collaborative systems is that it must not violate personal privacy. Consider a text file as a shared resource in a collaborative system: some portions of the file can be sensitive and need to be protected, depending on the user that is viewing the file. The file may contain personal information like the name of a person, a location, an organization, etc. that should not be accessible to all users of the system. The problem is to detect and anonymize this sensitive information in the text document before publishing it.
The anonymizing editor helps the data publisher to semi-automatically recognize potentially sensitive information chunks and their associated context in the text, and provides means to assign an access level to each chunk. As a solution, a prototype editor has been developed using the Stanford NLP parser to identify the sensitive information and its related context within a text document.

2014 Sanya Chawla: Exploring TED as Linked Open Data for Education

The Semantic Web presents general standards for making the data available on the web in a structured format that promotes Web-scale interoperability. This structured data is now referred to as Linked Data, a concept mainly responsible for interoperability among the huge amount of data available on the web. LearnWeb is a collaborative system which empowers its users to collect and share resources from various Web2.0 tools, such as YouTube, Bing, Flickr, etc., within a single environment. One of its largest user groups is YELL, a professional online community of English language teachers at the University of Udine. This thesis describes a system for converting the data available on TED into Linked Data, as well as its integration within the LearnWeb search functionality. The system consists of a 5-step pipeline: (1) crawl TED for new videos, (2) decide on an appropriate vocabulary (schema), (3) convert videos to RDF in accordance with our schema, (4) upload the RDF files to the server and finally (5) index the files. The search API used by LearnWeb was updated to query this data using SPARQL and generate more relevant result snippets for our users. As a result of this thesis, users can now query all TED videos and their transcripts through the LearnWeb search GUI or the SPARQL endpoint. This dataset will also be published in the LinkedUp Linked Open Data Catalogue for the general public.

2010 Mounika Mydukur: Improving and extending the collaborative search system “LearnWeb”

This work enhanced LearnWeb with collaborative features by integrating the GroupMe module, adding options to add/delete comments on shared resources, and improving the user interface in close cooperation with the users (approx. 60) of the system.

2011 Olexandr Druzhynin: InterWebJ: Flexible and Extensible Web 2.0 Integration Service in Java

Web 2.0 media services are steadily gaining popularity. Examples of such services include social networking sites, blogs, wikis, image and video sharing sites, as well as hosted services, web applications, mashups and folksonomies. However, most of these services support only specific media types, and users have to log in to and use several Web 2.0 tools in parallel in order to get an overview of the resources distributed over different Web2.0 services, interacting with many different web interfaces. InterWeb, recently created at the L3S Research Center, is a web service that provides a seamless interface to access a number of popular Web2.0 platforms. Unfortunately, it had significant scalability and security limitations, as well as performance disadvantages due to the underlying development framework. In this thesis a new, Java-based implementation of InterWeb is presented. The new framework overcomes the described problems and improves the extensibility, performance and maintainability of InterWeb.

2011 Philipp Kemkes: A Collaborative Web 2.0 Resource Exploration System

The success of Web 2.0 and of specific platforms such as YouTube, Flickr, and Delicious demonstrates that people are willing to share knowledge and resources with their social community and beyond. However, most of these platforms only support specific media types, and users trying to assemble learning resources have to log in to and use several Web 2.0 tools at once to access all relevant resources. LearnWeb2.0 is an integrated environment we implemented for sharing Web 2.0 resources, which better supports users in sharing, discovering, and managing resources spread across different Web 2.0 platforms. LearnWeb2.0 integrates ten resource sharing systems and provides advanced features for organizing and sharing distributed resources in a collaborative environment, along with basic search and upload functionality. The aim of Philipp Kemkes' thesis was to develop an efficient and well-maintainable Java-based web application with a user-friendly GUI, to replace the existing PHP-based application.

2010 Jaspreet Singh

Competence development can be seen as the process by which individuals develop the specific expertise needed for skilled performance in a particular field. LearnWeb2.0 supports users in sharing, discovering, and managing resources spread across different Web 2.0 platforms; it integrates ten popular resource sharing systems as well as social networking systems, and provides advanced features for organizing and sharing distributed resources in a collaborative environment. These were the problems that I, with the help of my team, had to tackle from the usability perspective in order to give a LearnWeb2.0 user a vastly improved experience.

Supervised Student Research Projects

2020 Rezaul Abedin: Data analysis approaches supporting efficient identification of principal components in large time-series data
2019 Sarmast Bilawal Khuhro: Anomaly Detection in Production Data
2019 Sofiane Laridi: Long-Short Term Recurrent Neural Network for Anomaly Detection
2019 Mitra Safaei: Deep Learning Approach for Predictive Maintenance
2019 Hamid Reza Karimian: Time Series Handling for Industry 4.0
2018 Yehor Trembovetskyi: Information Retrieval Approaches for Production Data
2018 Fawad Abbasi: Predictive Maintenance in a Manufacturing Environment