Last update on topic selection procedure: 02.10.2024 - Selection procedure for 2024W is available and starts 03.10.2024.
Open Topics for Bachelor Theses
If you are looking for a Bachelor Thesis topic, please register for the Bachelor Thesis course, either 051065 LP Softwarepraktikum mit Bachelorarbeit (old curriculum) or Group 1 of 051080 LP Softwarepraktikum mit Bachelorarbeit (new curriculum since 2022W). Please look up the list of dedicated topics offered below in Section (A) in the current semester. All those listed topics are available for bachelor theses unless there is a corresponding restriction stated in the topic description. Of course, the topic will be limited in effort and scientific claims to meet the requirements and effort (12 ECTS or 15 ECTS) of a Bachelor Thesis. If you are interested and need to clarify details, do not hesitate to contact us; send an e-mail to Prof. Wolfgang Klas, Prof. Gerald Quirchmayr, or contact a member of the research group.
» Topics for Bachelor Thesis - see the listing in Section (A) below.
Please, make sure you follow the "Instructions: How to get a topic for my SPBA Bachelor Project" given here.
Before contacting us, PLEASE read the » Recommendations & Guidelines for Bachelor Thesis available here.
Open Topics for Master Theses and Practical Courses (PR, P1, P2)
The following lists some of the open topics in the area of Multimedia Information Systems. If you are interested or have an idea for a project, do not hesitate to contact us; send an e-mail to Prof. Wolfgang Klas or contact a member of the research group. For P1 or P2 projects, please make sure you follow the "Instructions: How to get a topic for my P1 or P2 Project" given here.
In general, topics in the area of Multimedia Information Systems technologies include:
- analyze, manage, store, create and compose, semantically enrich & play back multimedia content;
- semantically smart multimedia systems;
- security.
Possible application domains include:
- Detecting conflicting information and checking facts on the Web
- Content Authoring and Management Systems
- Multimedia Web Content Management
- Robotic and IoT Applications
- Blockchain Technologies and Applications
- Interactive Display Systems
- Game-based Learning
- Service Oriented Architecture (SOA) and Cloud Based Services
Section (A) below lists topics that can be chosen in the course of a PR Praktikum, but are in principle also available for a master thesis (usually expanded and more advanced).
Section (B) below lists topics that are intended to be chosen for a master thesis.
(A) Topics for Practical Courses (SPBA, PR P1, PR P2)
CL/GQ01 Identification and classification of cyber-attack patterns from system behaviour
The traditional way of identifying cyber-attacks is signature-based analysis of network traffic. This approach works only when malware signatures are known and usually fails for zero-day exploits, so it needs to be complemented with an analysis of system behaviour. Previously impractical, AI approaches to this task have become deployable thanks to cheaper and faster hardware; they are especially promising for detecting APT attacks. The goal of this project is to develop such AI-based approaches using the MITRE ATT&CK framework (https://attack.mitre.org/) as the primary information source.
The different AI and statistics-oriented techniques to be tested are:
CL/GQ 1a) Rule-based Exception Monitoring
CL/GQ 1b) Case-based Reasoning
CL/GQ 1c) Fuzzy Logic and Rough Sets
CL/GQ 1d) Statistics for Outlier Detection
Each approach is to be taken up by one student, who applies it to build a detection model and implements a prototype. The prototype will then be tested on real network traffic collected with tools such as Suricata (https://suricata.io/download/) and Snort (https://www.snort.org/downloads).
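To give an impression of what approach 1d) could start from, the following minimal sketch flags outliers with the modified z-score (based on the median and the median absolute deviation). The function name, the threshold of 3.5, and the sample connection counts are assumptions chosen for illustration; they are not part of the topic description, and a real detection model would work on features extracted from Suricata or Snort output.

```python
from statistics import median


def modified_zscore_outliers(values, threshold=3.5):
    """Return the indices of values whose modified z-score exceeds the threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single extreme value does not distort the baseline.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All values (or at least half of them) are identical: no usable scale.
        return []
    return [
        i for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


# Hypothetical per-minute connection counts of one host (illustrative data only).
connections_per_minute = [12, 15, 11, 14, 13, 12, 16, 14, 13, 240]
print(modified_zscore_outliers(connections_per_minute))  # -> [9]
```

The median-based variant is used here because, with only a few samples, a plain mean/standard-deviation z-score is dominated by the outlier itself and may never exceed a threshold of 3.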
The suggested structure for the paper accompanying the project is:
1. Introduction/Topic description/Motivation
2. State of the art in literature and practice
3. Modelling method and approach used
4. Development of the detection model
5. Prototype (documentation, source code, etc.)
6. Test
7. Discussion of the results
8. Outlook and conclusion
- Tags:
- Contact: Gerald Quirchmayr, Christian Luidold
CL/GQ02 - Threat Intelligence for Cyber Security Decision Making
Cyber Security Decision Making is becoming a core aspect of cyber defense efforts. Advanced decision models and processes, such as the OODA Loop, depend heavily on the available information. The major task of this project is to develop and implement an approach to support the OBSERVE (information collection) and ORIENT (information analysis) phases of this type of model.
The topic can be split into two parts:
CL/GQ 2a: Develop a support approach for the OBSERVE phase based on readily available sources, such as CVEs, NVD, and MISP. The import interface should be based on the STIX/TAXII standard.
CL/GQ 2b: Develop a support approach for the ORIENT phase exploring the potential of “emerging patterns” and “weak signals” in network defense. The goal is to monitor internal network traffic and map it on the information collected in the OBSERVE phase.
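As a starting point for the import interface in part 2a, the sketch below reads a STIX 2.1 bundle (plain JSON) and collects the CVE identifiers of its vulnerability objects using only the Python standard library. The function name and the sample bundle are assumptions for illustration; fetching the bundle from a TAXII server is not shown and would need a TAXII client.

```python
import json


def extract_cve_ids(bundle_json):
    """Collect CVE identifiers from the vulnerability objects of a STIX bundle."""
    bundle = json.loads(bundle_json)
    cve_ids = []
    for obj in bundle.get("objects", []):
        if obj.get("type") != "vulnerability":
            continue
        # In STIX, CVE identifiers are carried as external references.
        for ref in obj.get("external_references", []):
            if ref.get("source_name") == "cve" and "external_id" in ref:
                cve_ids.append(ref["external_id"])
    return cve_ids


# Illustrative bundle; the identifiers below are sample values, not real intelligence.
sample_bundle = json.dumps({
    "type": "bundle",
    "id": "bundle--5d0092c5-5f74-4287-9642-33f4c354e56d",
    "objects": [
        {
            "type": "vulnerability",
            "spec_version": "2.1",
            "id": "vulnerability--0c7b5b88-8ff7-4a4d-aa9d-feb398cd0061",
            "created": "2016-05-12T08:17:27.000Z",
            "modified": "2016-05-12T08:17:27.000Z",
            "name": "CVE-2016-1234",
            "external_references": [
                {"source_name": "cve", "external_id": "CVE-2016-1234"}
            ],
        },
        {
            "type": "indicator",
            "spec_version": "2.1",
            "id": "indicator--8e2e2d2b-17d4-4cbf-938f-98ee46b3cd3f",
        },
    ],
})

print(extract_cve_ids(sample_bundle))  # -> ['CVE-2016-1234']
```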
The suggested structure for the paper accompanying the project is:
- Introduction/Topic description/Motivation
- State of the art in literature and practice
- Modelling method and approach used
- Development of the model
- Prototype (documentation, source code, etc.)
- Test
- Discussion of the results
- Outlook and conclusion
- Tags:
- Contact: Gerald Quirchmayr, Christian Luidold
CL/GQ03 - Information Security Policy Repository
Different types of information security policies make it difficult to access the relevant passages in the individual policies and combine them into an actionable recommendation. A repository in which the policies are stored and their sections can be accessed in relation to their relevance in a given situation, ranging from editing and auditing policies to applying them to incident handling, would therefore be very helpful. To be effective, the developed repository needs to support the information security policy life cycle.
The goal of the project is to develop such a policy repository/store model and implement a prototype. For searching the policy store, AI technologies as well as traditional search mechanisms building on keywords and ontologies should be considered.
The following sources can serve as starting point for this project:
M. Alam and M. U. Bokhari, "Information Security Policy Architecture," International Conference on Computational Intelligence and Multimedia Applications (ICCIMA 2007), Sivakasi, India, 2007, pp. 120-122, doi: 10.1109/ICCIMA.2007.275.
Kenneth J. Knapp, R. Franklin Morris, Thomas E. Marshall, Terry Anthony Byrd, Information security policy: An organizational-level process model, Computers & Security, Volume 28, Issue 7, 2009, Pages 493-508, ISSN 0167-4048, https://doi.org/10.1016/j.cose.2009.07.001.
Nader Sohrabi Safa, Rossouw Von Solms, Steven Furnell, Information security policy compliance model in organizations, Computers & Security, Volume 56, 2016, Pages 70-82, ISSN 0167-4048, https://doi.org/10.1016/j.cose.2015.10.006.
Hanna Paananen, Michael Lapke, Mikko Siponen, State of the art in information security policy development, Computers & Security, Volume 88, 2020, 101608, ISSN 0167-4048, https://doi.org/10.1016/j.cose.2019.101608.
- Tags:
- Contact: Gerald Quirchmayr, Christian Luidold
PK01 - Enhancing E-Commerce with AI-Powered Shopping Support
The unstoppable growth of online shopping brings both opportunities and challenges. Navigating digital marketplaces requires more innovative solutions. This AI project aims to meet this demand by creating an intelligent virtual shopping assistant. By integrating AI, machine learning, and natural language understanding, this project improves the online shopping experience, providing personalized recommendations and seamless assistance.
This project envisions an AI-driven shopping assistant that goes beyond static interfaces. It is powered by state-of-the-art machine learning, understands user preferences, refines suggestions, and facilitates product discovery. Natural language understanding allows users to interact in conversation, while computer vision enables effortless image-based product search.
- Technologies: Machine Learning, Natural Language Processing, Computer Vision, Large Language Models, Web Application
- Tags:
- Contact: Peter Kalchgruber
PK02 - Sound Experience Kit For Children
Children need music. "Neurobiological and anthropological reasons call for high-priority attention to the human need for music as a rhythmically organized sound experience and an expressive tool for communication." [1] This project aims to design and develop a sound toy that allows children to experience music tailored to the needs of a specific age group.
This project goes beyond software development and requires the hardware's assembly, configuration, and programming. The hardware consists of an Arduino or Raspberry Pi, sensors (e.g., RFID, buttons, lights, accelerometers, and others), displays, and actuators if necessary. You must integrate all input and output devices into a secure, user-friendly design. Adapt the difficulty level and usability of the prototype to the age group for which it is intended.
[1] Gruhn W. Children need music. International Journal of Music Education. 2005;23(2):99-101. doi:10.1177/0255761405052400
- Technologies: Arduino/Raspberry Pi, lots of handicraft, sound production
- Tags:
- Contact: Peter Kalchgruber
PK03 - Streamlining the Exam Process: Developing a Web App for Efficient Exam Creation and Evaluation
Paper-based multiple-choice exams are commonly used in various evaluation contexts (e.g., school, evaluation of Master theses). When the user pool is limited, generating a survey using basic software, such as Microsoft Word, for the printout and then, after completing the tests, transferring the collected data into a calculation program like Excel or R for evaluation purposes is straightforward. However, in cases where the user pool is more extensive, implementing a survey manager can reduce the burden of manually creating a survey and entering data and might facilitate subsequent analysis.
The primary objective of this study is to develop a web app and a background web service that facilitates the efficient creation and evaluation of surveys. The proposed solution aims to reduce the limitations associated with traditional paper-based survey methods by providing a streamlined process from survey design over data acquisition to data evaluation. One critical distinction between this solution and conventional multiple-choice tests is the placement of answer boxes adjacent to their respective answer options. Unlike traditional test formats, which often require separate question and answer sheets, surveys should be designed to allow respondents to check off their answers on a single sheet right next to the question.
- Technologies: Web Application, React, Firebase
- Tags:
- Contact: Peter Kalchgruber
PK04 - Location Intelligence: An Information System to Intelligently Display the Location of Friends Using Ambient Devices
The widespread use of mobile devices has led to a growing trend of sharing personal location information. Upon opening a mobile device's settings, one may see a notification indicating the number of applications that have access to their location, making it no longer confidential. Individuals can still obtain information about their friends' locations if they share it through a specific location-based app. For example, Google provides notifications when a person reaches a particular location, demonstrating the increasing convenience of accessing and sharing location information using modern technology.
This project aims to develop an information system (cf. Weasley Clock) that seamlessly integrates information about friends' or family's locations into an ambient device. Depending on the course chosen (SPBA, P1, or P2), this project may utilize machine learning prediction models to achieve precise predictions as individuals move between specified locations.
- Technologies: Raspberry Pi, Python, Database Management System, Web API, Location-based services, Push notifications API
- Tags:
- Contact: Peter Kalchgruber
PK05 - Intelligent Sports Analytics: A Progressive Web App for Extracting and Evaluating Team Results
Create an information system for the analysis of sports team results. The platform is implemented as a progressive web app (PWA) and should examine different ranking algorithms for team/athlete classification.
This topic aims to create a generic PWA that allows the extraction of sports results from other websites, either by parsing the websites or directly inputting the HTML output into a form. The input data should be analyzed using rules or supported by machine learning concepts. The PWA should offer various analysis options, including the choice of different ranking algorithms (e.g., ELO ranking), trends, predictions, and crowd-based data entries.
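As an example of one of the ranking algorithms mentioned above, the following sketch shows the standard Elo rating update for a single match. The K-factor of 32 and the starting ratings are assumptions for illustration; a project would tune them per sport and compare Elo against other ranking methods.

```python
def expected_score(rating_a, rating_b):
    """Expected score of A against B under the Elo model (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return the new ratings of A and B after one match.

    score_a is 1.0 for a win of A, 0.5 for a draw, and 0.0 for a loss.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    # B's actual and expected scores are the complements of A's.
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


# Two equally rated teams; team A wins.
print(update_elo(1500.0, 1500.0, 1.0))  # -> (1516.0, 1484.0)
```

The update is zero-sum: whatever one team gains, the other loses, so the average rating of the pool stays constant.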
[1] Hochbaum, Dorit S. "Ranking sports teams and the inverse equal paths problem." International Workshop on Internet and Network Economics. Springer, Berlin, Heidelberg, 2006.
[2] Sorensen, Soren P. "An overview of some methods for ranking sports teams." University of Tennessee, Knoxville (2000).
[3] https://www.sofascore.com
- Technologies: Web Application, React, Firebase
- Tags:
- Contact: Peter Kalchgruber
PK06 - Automated Sports Tracker Utilizing Mobile and Wearable Technology
This project aims to create a progressive web app that tracks athletic activities using location data from mobile devices and sports watches (such as Garmin). The system is designed to detect and identify sporting events automatically and will present notifications to the user for confirmation. The app will continually improve its accuracy based on past experiences by leveraging machine learning concepts. The recorded activities will be presented as user-friendly as possible, allowing the user to view and edit individual events as needed.
Depending on the course chosen (SPBA, P1 or P2), this project may include various evaluation options.
- Technologies: Web Application, React, Firebase
- Tags:
- Contact: Peter Kalchgruber
PK07 - Improving Transparency in the Grading Process: Developing a Web Application for Efficient and User-friendly Grade Assessment
If you reflect on your experiences as a student in school, you may recall that the process of grade assessment was not always transparent and sometimes seemed unfair. This topic proposes the development of a web app that enables teachers to grade students quickly and with minimal effort during their lectures, using a simple and user-friendly interface on their mobile devices. In addition, a background system should be developed to manage the data, and a separate overview page should be designed to give students independent access to their current grading status.
The proposed system may feature basic fields for each student, such as schoolwork, homework, and collaboration. Alternatively, more advanced grading concepts, such as an XP-based grading system, may be developed. Data security and protection measures are to be considered at all times. Overall, the proposed study aims to address the lack of transparency in the grading process by providing a comprehensive and efficient grading system. The developed system is expected to be user-friendly, meet the requirements, and improve the grading process for both teachers and students.
[1] https://blog.haschek.at/xp-based-grading-system/
- Technologies: Web Application, React, Firebase
- Tags:
- Contact: Peter Kalchgruber
PK08 - Developing a Web App for the Visually Impaired: Improving Access to Sports Results
Numerous assistive technologies (e.g., ORA cam, screen readers) for blind and visually impaired people have difficulty correctly interpreting and communicating sports results.
This study aims to develop a web application for this target group that can effectively capture the layout of tables at sporting events and related materials. The application must implement a comprehensive approach that is adaptable to different websites and allows for the consumption of sports results. It must also have an intuitive user interface that is accessible to the visually impaired and facilitate the reading aloud of results to the user.
- Technologies: Web Application, Web Service
- Tags:
- Contact: Peter Kalchgruber
PK09 - Leveraging Knowledge Graphs and Linked Data to Enhance User-generated Content
Currently, most users receive assistance in spell-checking when drafting blog posts or social media updates. Still, they do not have access to additional background information or access to actual fact-checking technologies about the topic they are covering. This project proposes a solution that leverages knowledge graphs and publicly available linked data sources to provide users with relevant background data when writing prose text. The project aims to develop a browser extension and web service that supports users with factual information about their texts.
Initially, the written text will be analyzed and structured into facts. Subsequently, information about the entities referenced in the text will be derived and compared with the data in the text, with suggestions provided to the user for potential inclusion.
The proposed system is expected to enable users to create more accurate and informed content by offering them easy access to pertinent factual information.
Heese, R., Luczak-Rösch, M., Paschke, A., Oldakowski, R., & Streibel, O. (2010, February). One Click Annotation. In SFSW.
- Technologies: FactCheck, Javascript
- Tags:
- Contact: Peter Kalchgruber
PK10 - Entity Resolution for Improved Web Data Comparison Based on Semi-structured Data
The FactCheck framework is designed to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects and opportunities in its development and implementation.
In order to compare facts about an entity, such as a person, that are published on different websites, the system must first understand that these two representations of the entity are indeed referring to the same person. This can be challenging as data published on the Web often lack a uniform primary key.
This project aims to find a solution for identifying and linking FactSets from different datasets with entities with varying characteristics. The effectiveness of this approach will be evaluated using a range of data sets.
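To illustrate the problem of missing primary keys, the following minimal sketch decides whether two records describe the same person by comparing normalized names with `difflib.SequenceMatcher` from the Python standard library. The record layout (`name`, `birth_year`), the threshold of 0.85, and the sample records are assumptions for illustration and do not reflect the actual FactSet structure of FactCheck.

```python
import re
from difflib import SequenceMatcher


def normalize(name):
    """Lower-case a name, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())


def name_similarity(a, b):
    """Similarity of two names in [0, 1] after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def same_entity(rec_a, rec_b, threshold=0.85):
    """Heuristic match: similar names and no contradicting birth year."""
    year_a, year_b = rec_a.get("birth_year"), rec_b.get("birth_year")
    if year_a is not None and year_b is not None and year_a != year_b:
        return False
    return name_similarity(rec_a["name"], rec_b["name"]) >= threshold


# Hypothetical records as they might appear on two different websites.
site_a = {"name": "Marie  Curie", "birth_year": 1867}
site_b = {"name": "marie curie.", "birth_year": 1867}
site_c = {"name": "Albert Einstein", "birth_year": 1879}

print(same_entity(site_a, site_b))  # -> True
print(same_entity(site_a, site_c))  # -> False
```

String similarity alone is a weak signal; the project would need to combine several attributes and evaluate the approach on real data sets.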
- Technologies: Entity Recognition, Python
- Tags:
- Contact: Peter Kalchgruber
PK11 - Developing a Web Service for Enhancing Data Understanding through Semantic Annotation
FactCheck is a framework for detecting and resolving conflicting data on the Web. It establishes an entire fact comparison process that consists of data acquisition, data comparison, the presentation of comparison results, and comprehensive analysis functions. FactCheck is a leading research topic of our research group and bears challenges in many aspects.
A fact comparison can only be carried out if at least two data providers (e.g., websites) have information about the same subject. The more data providers supply information about the subject, the more detailed the comparison result can be. However, a lack of understanding of the semantic structure makes the comparison process difficult. Thus, understanding the semantics behind the data is essential to use the correct data for a reasonable comparison. For example, if website A lists inhabitants: 1.000.000 of a city and website B lists population: 1.000.211 of the same city, we must understand that we can consider these two values in the comparison process, as they most likely refer to the same property.
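The example above can be sketched in a few lines: two differently labelled properties are mapped to one canonical property, and the values are parsed into comparable numbers. The synonym table, the function names, and the assumption that "." is used as a thousands separator are illustrative only; the project itself would have to derive such mappings rather than hard-code them.

```python
# Hypothetical, hand-written synonym table; a real service would learn or infer it.
PROPERTY_SYNONYMS = {
    "inhabitants": "population",
    "population": "population",
    "residents": "population",
}


def canonical_property(label):
    """Map a website-specific label to a canonical property name, or None."""
    return PROPERTY_SYNONYMS.get(label.strip().lower())


def parse_count(text):
    """Parse an integer written with '.', ',' or spaces as thousands separators."""
    return int(text.replace(".", "").replace(",", "").replace(" ", ""))


def align(fact_a, fact_b):
    """Return (property, value_a, value_b) if both facts describe the same property."""
    prop_a = canonical_property(fact_a[0])
    prop_b = canonical_property(fact_b[0])
    if prop_a is None or prop_a != prop_b:
        return None
    return prop_a, parse_count(fact_a[1]), parse_count(fact_b[1])


print(align(("inhabitants", "1.000.000"), ("population", "1.000.211")))
# -> ('population', 1000000, 1000211)
```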
Techniques like JSON-LD and schema.org help website providers to allow other services (like search engines or FactCheck plugins) to understand their contents. However, many websites do not use these techniques and therefore impede the use of their data in these services. This topic aims to develop a web service that allows services that rely on structured data to adequately understand unstructured or insufficiently annotated data.
- Technologies: FactCheck, Python
- Tags:
- Contact: Peter Kalchgruber
PK12 - Developing an Adaptive Precision Metric Manager for Fact Comparison
FactCheck is a framework for detecting and resolving conflicting data on the Web. It establishes an entire fact comparison process that consists of data acquisition, data comparison, the presentation of comparison results, and comprehensive analysis functions. FactCheck is a leading research topic of our research group and bears challenges in many aspects.
When comparing FactSets from different websites, it is necessary to specify which precision metric to apply to each fact pair. Different types of facts require different precision metrics; for example, the Bitcoin price of a product must be compared exactly across different FactSets of the same entity, while the number of inhabitants of a country can be compared approximately across different websites.
In FactCheck, the precision metric manager is responsible for selecting the appropriate metric for comparing each fact pair, resulting in either a "conflicting" or "nonconflicting" finding. The manager communicates the comparison result and the applied precision metric to the user or the requesting party. The role of the precision metrics manager is to select the appropriate metrics for different application scenarios, recognizing that different precision metrics may be required. This project aims to develop a precision metrics manager using a self-learning and self-adaptive system based on its use and user feedback to meet these requirements.
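The selection step described above can be pictured with the following minimal sketch: a lookup table assigns a metric to each fact type, and the comparison returns both the finding and the name of the applied metric. All names, the 1% tolerance, and the fact types are assumptions for illustration and are not taken from the FactCheck implementation; the self-learning and self-adaptive part of the project is deliberately not shown.

```python
def exact_match(a, b):
    """Strict metric: values must be identical."""
    return a == b


def within_relative_tolerance(tolerance):
    """Build a metric that accepts values differing by at most `tolerance` (relative)."""
    def metric(a, b):
        largest = max(abs(a), abs(b))
        return largest == 0 or abs(a - b) / largest <= tolerance
    return metric


# Hypothetical assignment of metrics to fact types.
METRICS = {
    "price": ("exact", exact_match),
    "population": ("relative 1%", within_relative_tolerance(0.01)),
}


def compare(fact_type, a, b):
    """Return the finding and the name of the metric that produced it."""
    name, metric = METRICS.get(fact_type, ("exact", exact_match))
    finding = "nonconflicting" if metric(a, b) else "conflicting"
    return finding, name


print(compare("population", 1_000_000, 1_000_211))  # -> ('nonconflicting', 'relative 1%')
print(compare("price", 19.99, 20.0))                # -> ('conflicting', 'exact')
```

An adaptive manager could, for instance, adjust the tolerance per fact type based on user feedback, which is where the actual project work would begin.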
- Technologies: FactCheck, Python, HTML, Javascript
- Tags:
- Contact: Peter Kalchgruber and Daniel Berger
PK13 - FactCheck: Building a Trust System for facts
The Internet allows individuals (including you) to create content on any topic, which inevitably raises concerns about the authenticity, credibility, truth, and trustworthiness of the information. Conversely, individuals who use the Internet may face challenges in determining the accuracy and truthfulness of online content. To address this issue, the project is to establish a trust system for the FactCheck framework that extends existing statistics on the accuracy, correctness, and missing data of fact sets from the FactCheck framework.
The proposed trust system will aim to enhance users' ability to assess the accuracy and reliability of online content by incorporating trustworthiness measures.
- Technologies: FactCheck, Databases, Python
- Tags:
- Contact: Peter Kalchgruber and Marie Aichinger
PK14 - Expanding FactCheck: Acquiring and Integrating New Data Sources
The FactCheck framework is designed to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects as well as opportunities in its development and implementation.
The fact comparison process requires data from at least two providers with information on the same topic. The more data available from providers, the more detailed comparisons are possible. However, obtaining data from multiple providers delays the process. Therefore, data is stored in an extensive database for quick access during comparison tasks. The current FactBase database stores over one million entities, each with ten facts on average. This project aims to expand the FactBase by collecting new facts from previously unexplored data sources by analyzing facts expressed by different concepts on specified websites and integrating them into the FactBase.
- Technologies: FactCheck, Databases, Python
- Tags:
- Contact: Peter Kalchgruber
PK15 - Investigating Reliable and Efficient Database Solutions for FactCheck
The FactCheck framework is designed to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects as well as opportunities in its development and implementation.
To perform a fact comparison, at least two data providers must have information about the same subject. However, retrieving data from multiple providers on the fly can result in a delayed and slow comparison. Therefore, storing data in an extensive database can ensure quick access to data for all incoming comparison requests. However, this data becomes outdated over time and needs to be updated regularly depending on the update interval. In addition, the framework requires fast and customized database solutions for efficient data storage and access to support real-time comparisons and the generation of statistics.
This project aims to compare different database solutions, including cloud-based solutions (e.g., Azure CosmosDB), and evaluate their suitability as reliable and efficient data storage for ongoing FactCheck projects. The evaluation will also include the current Apache CouchDB (NoSQL) database. The project will develop strategies for the most efficient access to data, considering the use cases triggered by FactCheck.
- Technologies: FactCheck, Databases, Python
- Tags:
- Contact: Peter Kalchgruber and Marie Aichinger
PK16 - Smart Student Support Chatbot
Students often seek timely and accurate information about curriculum details or information about lecturers. This project aims to promote student engagement, support informed decision-making, and provide convenient access to valuable academic resources by developing an AI chatbot that is a knowledgeable, educated companion.
This project uses AI and NLP to develop an intelligent chatbot that responds to student queries about curriculum specifics and faculty information. By integrating the chatbot into a user-friendly interface, students can effortlessly access the information they need, enhancing their academic experience and interactions within the university environment.
- Technologies: NLP, ML, Python, Flask, Chatbot Framework
- Tags:
- Contact: Peter Kalchgruber
PK17 - Relation Extraction using DistilBERT for Key-Value Pair Extraction
This project aims to bridge the gap between unstructured text and organized knowledge. Transforming narratives into structured knowledge is becoming more important as the volume of textual data increases in all industries. This project provides an opportunity to understand, implement, and push the limits of transformer models while enriching modern information extraction techniques.
In this project, we explore the potential of DistilBERT, a compact yet robust transformer-based language model. By leveraging the contextual capabilities of DistilBERT, we seek to construct a model that can autonomously uncover relationships hidden in textual narratives and eventually generate structured key-value pairs as an expression of its analytical capabilities.
- Technologies: Python, Hugging Face Transformers Library (for DistilBERT), Data preprocessing libraries (e.g., Pandas, NLTK), Evaluation metrics libraries (e.g., Scikit-learn)
- Tags:
- Contact: Peter Kalchgruber and Adrian Hofer
WK01 - Jupyter Notebooks for Dedicated Interactive Content of Courses
Jupyter Notebooks allow for the creation and sharing of documents that contain live code, equations, visualizations, and narrative text. Jupyter Notebooks are a well-established and well-recognized tool in academia and education in general, as well as in specific fields of research where it is important to provide for reproducibility of scientific results.
The goal of the project is to develop dedicated Jupyter Notebooks for specific course content relevant in the context of our courses (MOD, MCM, MST, MRE, MRS). The approach can be based on the existing framework that we already use for Jupyter Notebooks in some of our courses, but may also further improve the framework or suggest new solutions for it. The programming language to be used needs to meet the requirements of the course content; it will most probably be Python, but the choice is in fact very flexible, as Jupyter Notebooks work with a variety of languages.
Mandatory requirement: The student must have understood the course content and material very well and should have passed the course already.
- Technology: Jupyter Notebook, Python, Jupyter Notebook Hub of the CS-faculty, Markdown, VS Code (or similar IDE)
- Tags:
- Contact: Wolfgang Klas
WK02 - FactCheck - Precision Metrics
FactCheck is a framework for the detection and resolution of conflicting structured data on the Web. The FactCheck framework is the result of ongoing research at our research group. One of the central building blocks is the context-dependent comparison of structured data of various representations of one and the same real-world object or artefact. The comparison is guided by so-called precision metrics, a flexible and sophisticated technique for logically comparing structured data values. Precision metrics consist of logical predicates used to evaluate the comparison of structured data. The goal of the project is to design and implement an appropriate model for the representation of precision metrics, the construction of such precision metrics, and the application of the metrics for evaluating the comparison of data values. Various precision metrics should be defined and compared using a test dataset of 900.000 entities. The results of the project are to be demonstrated by a running demo application.
- Technology: Web Services, Semantic Web technologies, LOD, Microformat, JSON-LD, Docker
- Provided to the students: existing implementation of framework, test dataset
- Tags:
- Contact: Wolfgang Klas and Daniel Berger
WK03 - Demo of Blockchain Application Using Proof-of-Authority (Ethereum)
The goal of this project is the implementation of a demo application which illustrates the concept of proof-of-authority (in place of the very often used "proof-of-work" as, e.g., used in the Bitcoin Blockchain). For example, a possible application could be the implementation of the four-eyes principle (Vier-Augen-Prinzip) for officially approving documents by making use of two signers acting as proof-of-authorities. Many other application scenarios are feasible, e.g., the decision-making principles of a management board of an association or a company, board of managers, board of trustees or directors. The application scenario should be well-chosen in order to illustrate the general principle of proof-of-authority. It may be based on a generic, configurable implementation to show different variations of the proof-of-authority concept, e.g., 1 signer, 2 signers, N signers. The demo application has to be realized such that a short demonstration movie can be recorded, which will be published on the Lab's website.
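The configurable "1 signer, 2 signers, N signers" idea can be pictured with the following plain-Python simulation of an approval that requires a given number of distinct authorized signers. It is only a conceptual sketch: the class and its names are hypothetical, no Ethereum node, smart contract, or cryptographic signature is involved, and the actual project would realize this logic on a proof-of-authority network.

```python
class ApprovalRequest:
    """A document approval that needs `required` distinct authorized signers."""

    def __init__(self, document_hash, authorities, required):
        self.authorities = frozenset(authorities)
        if not 1 <= required <= len(self.authorities):
            raise ValueError("required must be between 1 and the number of authorities")
        self.document_hash = document_hash
        self.required = required
        self.signers = set()

    def sign(self, signer):
        """Record an approval; only known authorities may sign."""
        if signer not in self.authorities:
            raise PermissionError(f"{signer} is not an authorized signer")
        # A set ignores repeated approvals by the same authority.
        self.signers.add(signer)

    @property
    def approved(self):
        return len(self.signers) >= self.required


# Four-eyes principle: two of three hypothetical authorities must approve.
request = ApprovalRequest("sha256-of-document", ["alice", "bob", "carol"], required=2)
request.sign("alice")
request.sign("alice")    # signing twice does not count twice
print(request.approved)  # -> False
request.sign("bob")
print(request.approved)  # -> True
```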
- Technology: Ethereum, Web technologies, Docker
- Tags:
- Contact: Wolfgang Klas
WK04 - Demo of Blockchain Application Using Permissioned Blockchain
The goal of this project is the implementation of a demo application which illustrates the concept of private permissioned blockchains. The possible demo application can be discussed in more detail; examples are the decision-making principles of a management board of an association or a company, board of managers, board of trustees or directors. It should be well-chosen in order to illustrate the general principle. The demo application has to be realized such that a short demonstration movie can be recorded, which will be published on the Lab's website.
- Technology: MultiChain Blockchain Infrastructure, Version >= 2.1, on Linux of Windows, or on Cloud Infrastrcture. Web-Technologies for implementing Web-based application, Docker.
- Provided to the students: Optionally, virtual machine
- Tags:
- Contact: Wolfgang Klas
WK05 - "Studienleistungs & Prüfungspass" Based on MultiChain Blockchain Technology
The goal of this project is to implement an application for a digital "Studienleistungs & Prüfungspass" (study performance & examination pass) based on blockchain technology. The pass will record the individual, required study achievements (like milestones, tests, etc.) during a course, the final grading of a course, and the collection of gradings of courses during the entire study (like a "Sammelzeugnis" currently used by the university). There are various stakeholders in this scenario: the students, the lecturers of courses, and administration (like SPL). The implementation has to be realized based on MultiChain Blockchain technology, which provides managed permissions and allows for millions of assets on a blockchain, structured asset data, multiple key-value, time series or identity databases on a blockchain. MultiChain Blockchain technology is one of the most promising implementations for private, managed blockchain systems.
- Technology: MultiChain Blockchain Infrastructure, Version >= 2.1, on Linux or Windows, or on Cloud Infrastructure. Web-Technologies for implementing Web-based application, Docker.
- Provided to the students: Optionally, virtual machine
- Tags:
- Contact: Wolfgang Klas
WK06 - "Studienleistungs & Prüfungspass" Based on Ethereum Blockchain Technology
The goal of this project is, starting out from a given demo implementation, to implement an application for a digital "Studienleistungs & Prüfungspass" (study performance & examination pass) based on blockchain technology. The pass will record the individual, required study achievements (like milestones, tests, etc.) during a course, the final grading of a course, and the collection of gradings of courses during the entire study (like a "Sammelzeugnis" currently used by the university). There are various stakeholders in this scenario: the students, the lecturers of courses, and administration (like SPL). The implementation has to be realized based on Ethereum Blockchain technology, which provides the concept of Smart Contracts. Ethereum Smart Contract technology is one of the most promising implementations for smart behaviour of blockchain systems. The focus will be on the proper design and implementation of smart contracts to capture most of the functionality of the application.
- Technology: Ethereum Blockchain Infrastructure, on Linux or Windows, or on Cloud Infrastructure, Web-Technologies for implementing Web-based application, Docker.
- Provided to the students: Optionally, virtual machine
- Tags:
- Contact: Wolfgang Klas
WK07 - Securing Images and Videos by Applying Blockchain Technology
The goal of this project is the design and the implementation of a framework based on blockchain technology that allows for the detection of manipulations in images and videos. Images or videos can be manipulated, e.g., persons (or other objects) can be added to or removed from an image, and video frames (or sequences of video frames) can be added to or removed from a video. Such a manipulation should be detected based on the storage of specific image encoding parts in a blockchain, which allows the validity of an image encoding to be re-checked. E.g., essential macroblocks or portions of some macroblocks of a JPEG-encoded image could be stored in a blockchain such that it can be checked whether an image still consists of those macroblocks or includes manipulated macroblocks. The project will first have to select and specify the kind of manipulations to be considered in the scope of the project, then design an approach and a framework, and implement a prototype and a demo application illustrating the approach, based on a specific blockchain platform that best suits the needs of the application.
- Technology: Blockchain Infrastructure (like Ethereum) on Linux, Windows, or on Cloud Infrastructure, JPEG, MPEG, Web-Technologies for implementing Web-based demo application, Docker.
- Provided to the students: Optionally, virtual machine
- Tags:
- Contact: Wolfgang Klas
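The re-checking idea described in WK07 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the encoded image is split into fixed-size byte blocks (a stand-in for real JPEG macroblocks, which would require an actual decoder) and that a plain list of digests stands in for the record stored on the blockchain. The names `block_digests`, `changed_blocks`, and `BLOCK_SIZE` are hypothetical.

```python
import hashlib

BLOCK_SIZE = 16  # hypothetical block size in bytes, standing in for a macroblock


def block_digests(data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return one SHA-256 hex digest per fixed-size block of the input."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(stored: list, data: bytes, block_size: int = BLOCK_SIZE) -> list:
    """Return the indices of blocks whose digest differs from the stored record."""
    current = block_digests(data, block_size)
    changed = [i for i, (a, b) in enumerate(zip(stored, current)) if a != b]
    # Blocks appended to or missing from the end also count as manipulations.
    shorter, longer = sorted((len(stored), len(current)))
    changed.extend(range(shorter, longer))
    return changed


original = bytes(range(64))               # stand-in for encoded image data
ledger_record = block_digests(original)   # what would be written to the blockchain

tampered = bytearray(original)
tampered[20] ^= 0xFF                      # flip bits inside the second block

print(changed_blocks(ledger_record, original))
print(changed_blocks(ledger_record, bytes(tampered)))
```

Which parts of the encoding to record, and how to make the check robust against harmless re-encoding, is exactly the design question the project would have to answer.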
WK08 - FactCheck - Relation Extraction from Text
FactCheck is a framework for detecting and resolving conflicting data on the Web. It establishes an entire fact comparison process that consists of data acquisition, data comparison, the presentation of comparison results, and comprehensive analysis functions. FactCheck is a leading research topic of our research group and bears challenges in many aspects.
A fact comparison by FactCheck can only be carried out if the data that should be compared is available as structured data (e.g., JSON-LD). However, not every data provider (e.g., websites) offers data in structured form. Hence, it is necessary to create structured data automatically. The first step in this direction is the analysis of the capabilities of current state-of-the-art Relation Extraction (RE) models (REBEL, UniRel, DREEAM) to extract comparable facts. To achieve this, available relations encoded in structured data (JSON-LD) should be compared to extracted relation data gathered through one of the RE models.
https://github.com/youmima/dreeam
https://github.com/wtangdev/unirel
https://github.com/Babelscape/rebel
- Technology: Python, PyTorch, SpaCy, JSON-LD
- Tags:
- Contact: Wolfgang Klas and Adrian Hofer
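The comparison described in WK08, structured JSON-LD relations against relations produced by an RE model, can be sketched as follows. This is a simplified illustration: the extracted triples are hard-coded instead of coming from REBEL, UniRel, or DREEAM, and it assumes the model's relation labels have already been mapped to the property names used in the structured data. The helper names `jsonld_to_triples`, `normalise`, and `score` are hypothetical.

```python
import json


def normalise(triple):
    """Lower-case and trim each part so that trivial differences do not matter."""
    return tuple(str(part).strip().lower() for part in triple)


def jsonld_to_triples(doc: dict) -> set:
    """Flatten the top-level literal properties of a JSON-LD object into triples."""
    subject = doc.get("name", doc.get("@id", ""))
    triples = set()
    for key, value in doc.items():
        if key.startswith("@") or key == "name":
            continue  # skip JSON-LD keywords and the subject itself
        values = value if isinstance(value, list) else [value]
        for v in values:
            if isinstance(v, (str, int, float)):
                triples.add(normalise((subject, key, v)))
    return triples


def score(extracted: set, reference: set) -> tuple:
    """Return (precision, recall) of the extracted triples against the reference."""
    matched = extracted & reference
    precision = len(matched) / len(extracted) if extracted else 0.0
    recall = len(matched) / len(reference) if reference else 0.0
    return precision, recall


structured = json.loads("""
{
  "@context": "https://schema.org",
  "@type": "Person",
  "name": "Ada Lovelace",
  "birthPlace": "London",
  "jobTitle": "Mathematician"
}
""")

# Hard-coded stand-in for the output of a relation extraction model.
model_output = [
    ("Ada Lovelace", "birthPlace", "London"),
    ("Ada Lovelace", "spouse", "William King"),
]

reference = jsonld_to_triples(structured)
extracted = {normalise(t) for t in model_output}
precision, recall = score(extracted, reference)
```

Exact matching after normalisation is the simplest possible criterion; a real evaluation would likely need fuzzier matching of entities and relation labels.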
WK09 - FactCheck - IdaFix Browser-Extension UI for a Chatbot
The FactCheck framework is designed to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects and opportunities in its development and implementation.
This project aims to find a solution for a user interface which allows an end user visiting a web page to understand the comparison results on conflicting information as well as to provide feedback on the behaviour of the FactCheck system. The interface should be realized as an interactive chatbot. The starting point for the project is a prototypically implemented browser extension (IdaFix) which illustrates the functionality as well as the internal system API to be used.
- Technology: Web Browser technologies, e.g., JavaScript, HTML & CSS, Browser APIs (e.g., WebExtensions API), Background Scripts, Content Scripts, Popup Scripts, Messaging APIs, JSON-LD
- Tags:
- Contact: Wolfgang Klas and Marie Aichinger
AH01 - Relation Extraction Web Service - Tool for extracting relations
For many applications, it is required to extract structured or semi-structured data from unstructured text or tables on web pages. NLP methods for information extraction have advanced considerably, from rule-based systems with a defined set of extractable relations to transformer-based information extraction models. With large language models, new approaches could be on the rise.
Your task for this project is to implement a relation extraction web service using state-of-the-art information extraction models. This web service should accept other web pages as input and output the extracted triplets. Build this web service with modularity in mind, think of requirements for the web service, design it, and implement it. Think of how the used model could be easily exchanged for new, upcoming state-of-the-art models. Create a web application that is consuming the web service and acts as a front end for the web service.
- Technology: Web Application, Web Service, Python, Hugging Face Transformers library, SpaCy library
- Tags:
- Contact: Adrian Hofer
AH04 - Entity Linking Web Service - Tool for linking entities to knowledge-bases
FactCheck is a framework for detecting and resolving conflicting data on the Web. It establishes an entire fact comparison process that consists of data acquisition, data comparison, the presentation of comparison results, and comprehensive analysis functions. FactCheck is a leading research topic of our research group and bears challenges in many aspects. To enhance data acquisition, your task is to link the extracted information to existing knowledge bases.
Use an existing approach for named entity recognition on the text of a web page. Link those named entities to Wikidata or any other knowledge base like Wikipedia or DBpedia.
Create a web service to entity link named entities on web pages to knowledge graphs. Visualize the results in an understandable fashion and make them browsable in a web application. Create the web service with modularity in mind. The chosen entity linking approach should be exchangeable for other approaches with as little friction as possible. To achieve this, iterate over a process of requirement analysis and design to come up with a promising implementation.
- Technology: Web Application, Web Service, SpaCy Library, Knowledge base, Python
- Tags:
- Contact: Adrian Hofer
AH06 - Relation Extraction Overlay
FactCheck is a framework for detecting and resolving conflicting data on the Web. It establishes an entire fact comparison process that consists of data acquisition, data comparison, the presentation of comparison results, and comprehensive analysis functions. The source of the compared values needs to be visualized on a web page to present comparison results.
Implement an existing relation extraction approach and use it on the current web page. This service will be the backend of your plugin. Save the location of the found relation on the web page together with the relation in a database. Visualize the strings of the relations on the web page with the help of a browser plugin.
- Technology: Browser-Extension, Python, JavaScript, Hugging Face Transformers library, SpaCy library
- Tags:
- Contact: Adrian Hofer
DB01 - Developing a mapping method to consistently translate between vocabularies
FactCheck is a framework for detecting and resolving conflicting data on the Web. It establishes an entire fact comparison process that consists of data acquisition, data comparison, the presentation of comparison results, and comprehensive analysis functions. FactCheck is a leading research topic of our research group and bears challenges in many aspects.
We define facts as pieces of information that are published by data providers (e.g., as textual content in their website(s)). If two or more websites publish data on the same topic, we humans can compare the data critically. However, this task is quite difficult for a machine, as it does not have an inherent understanding of semantics. Imagine the following example:
You visit website A, which states that Vienna has 1.815.231 inhabitants, while website B states Vienna has approximately 2.000.000 inhabitants. Depending on the context, both numbers can be seen as true or not precise enough. This is a problem, as we cannot tell if the numbers are similar enough or if one of them is too far from the truth, making it a conflict. Now imagine a website C, which states that the population of Vienna is 2.000.000. A new problem emerges: website C offers the same fact as websites A and B, but it uses "population" instead of "inhabitants". As humans, we can tell that we now have two sources that agree, B and C.
However, we cannot make sure that the two websites use the same vocabulary (here, "inhabitants" and "population"). A machine is unable to understand the similarity between these concepts like a human would. Furthermore, websites may use structured data but incorrect properties by mistake. In both cases, our ability to compare facts is inhibited. This topic aims to develop a method that relies on structured data from websites to properly validate and translate between schemata.
- Technology: Python, NLTK, Scikit-learn, Azure Cloud Services
- Tags:
- Contact: Daniel Berger
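The Vienna example from DB01 can be written down as a small Python sketch. It assumes a hand-written synonym table (`CANONICAL`) and a relative tolerance as the agreement criterion; both are illustrative placeholders for whatever mapping method and precision metric the project would actually develop, and the function names are hypothetical.

```python
# Hypothetical synonym table; a real mapping method would derive this from data.
CANONICAL = {
    "inhabitants": "population",
    "population": "population",
}


def normalise(fact):
    """Map the property name of a (property, value) fact to its canonical term."""
    prop, value = fact
    key = prop.lower()
    return CANONICAL.get(key, key), value


def agree(fact_a, fact_b, rel_tol=0.0):
    """Two facts agree if they describe the same property and their values
    differ by at most `rel_tol` relative to the larger value."""
    prop_a, val_a = normalise(fact_a)
    prop_b, val_b = normalise(fact_b)
    if prop_a != prop_b:
        return False
    return abs(val_a - val_b) <= rel_tol * max(abs(val_a), abs(val_b))


site_a = ("inhabitants", 1815231)
site_b = ("inhabitants", 2000000)
site_c = ("population", 2000000)

print(agree(site_b, site_c))                 # same value, different vocabulary
print(agree(site_a, site_b))                 # exact comparison
print(agree(site_a, site_b, rel_tol=0.10))   # within 10 % of each other
```

Whether A and B count as a conflict depends entirely on the chosen tolerance, which is the context dependence the example above points out.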
DB02 - Type-based String comparison methods
FactCheck is a framework for detecting and resolving conflicting data on the Web. It establishes an entire fact comparison process that consists of data acquisition, data comparison, the presentation of comparison results, and comprehensive analysis functions. FactCheck is a leading research topic of our research group and bears challenges in many aspects.
We define facts as pieces of information that are published by data providers (e.g., as textual content in their website(s)). If two or more websites publish data on the same topic, we humans can compare the data critically. However, the comparison of data is not trivial. Imagine the following example:
Website A has information about musicians and states the name "Taylor Swift." Website B also states a name for this particular pop musician, but spells it "Tailor Swift." Website C splits the name property into the first name "T." and the last name "Swift."
As humans, we can compare these strings, identify issues, and acknowledge abbreviations and spelling errors. However, for a machine, these things are a tricky challenge. Furthermore, there are other problems that may be faced in string comparison (name-to-nickname comparison, comparison of longer text, capitalization and spelling errors, homonyms,…).
This topic aims to develop a method that can reliably handle string comparisons based on the different schema types available and proves to be highly accurate in the results.
- Technology: Python, NLTK, Scikit-learn, Azure Cloud Services
- Tags:
- Contact: Daniel Berger
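A minimal Python sketch of the DB02 idea, choosing the comparison method by property type, could look as follows. It uses `difflib.SequenceMatcher` from the standard library as a stand-in similarity measure; the threshold of 0.8, the `COMPARATORS` table, and all function names are assumptions made for illustration only.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def is_initial(token: str) -> bool:
    """True for abbreviations such as 'T.'."""
    return len(token) == 2 and token.endswith(".")


def compare_person_name(a: str, b: str, threshold: float = 0.8) -> bool:
    """Compare names token by token; an initial matches any name with the same
    first letter, other tokens must be similar enough to tolerate typos."""
    tokens_a, tokens_b = a.split(), b.split()
    if len(tokens_a) != len(tokens_b):
        return False
    for ta, tb in zip(tokens_a, tokens_b):
        if is_initial(ta) or is_initial(tb):
            if ta[0].casefold() != tb[0].casefold():
                return False
        elif similarity(ta, tb) < threshold:
            return False
    return True


def compare_exact(a: str, b: str) -> bool:
    """Fallback: case-insensitive equality."""
    return a.casefold() == b.casefold()


# Hypothetical dispatch table from property type to comparison method.
COMPARATORS = {"name": compare_person_name}


def compare(prop_type: str, a: str, b: str) -> bool:
    return COMPARATORS.get(prop_type, compare_exact)(a, b)


print(compare("name", "Taylor Swift", "Tailor Swift"))
print(compare("name", "Taylor Swift", "T. Swift"))
print(compare("name", "Taylor Swift", "Katy Perry"))
```

Website C's split name would first have to be joined into a single string ("T. Swift") before this comparison applies; nicknames, longer texts, and homonyms would each need their own comparator in the table.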
MSA01 - FactCheck - Chatbots as Interfaces for Fact Exploration
The FactCheck framework aims to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects and opportunities in its development and implementation.
In recent years, Chatbots have gained widespread attention and use in various scenarios, including information retrieval. As they allow us to interact with information systems using natural languages, they could serve as an intuitive user interface for our FactCheck Project. Your task is to develop an (NLP/LLM-driven) chatbot for FactCheck that can help users explore our fact data and the statistics gained from them. You may either extend our existing chatbot (React + Azure Bot Service) embedded into the IdaFix browser extension, or develop a new chatbot for either a) one of our existing FactCheck websites/dashboards or b) a (Web) service you design yourself.
- Technologies: Chatbots, Azure AI services, Azure Bot Service, React, Web Service, Python, JavaScript
- Tags:
- Contact: Marie Aichinger
MSA02 - FactCheck - IdaFix: Visualize Facts in Webpages
The FactCheck framework aims to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects and opportunities in its development and implementation.
The visualization of facts and comparison results is a vital part of FactCheck. To address this, the IdaFix extension was developed to help users explore and compare the facts embedded into a webpage. This visualization is currently contained in the extension's popup, and there is no way to have facts highlighted directly on the webpage (where they are primarily consumed). Your task is to extend the current implementation of IdaFix to highlight facts directly on the webpage by leveraging the capabilities of content scripts for WebExtensions, and the use of technologies such as NLP and relation extraction.
Note: This topic is only available for P1.
- Technologies: WebExtensions API, Content Scripts, JavaScript
- Tags:
- Contact: Marie Aichinger
MSA03 - FactCheck - SPARQL for Fact Data
The FactCheck framework aims to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects and opportunities in its development and implementation.
In its current iteration, FactCheck collects fact data from the Web via a Web Extension and dedicated crawlers, and provides results and insights gained from them via a REST-inspired API. Allowing complex semantic queries over our collected data may be beneficial in delivering our results in a semantic-web-friendly way. Your task will be to enable semantic search over our data. First, you will revisit our current document-based data model and redesign it to a triple/graph-based model more closely aligned to OWL/RDF. This may involve finding suitable vocabularies (e.g., RDF-Cube) for our data. Then, you will investigate possible storage solutions (e.g., a triple store such as Apache Jena TDB or RDF4J) for storing the redesigned data. Finally, you will enable semantic search by configuring a SPARQL endpoint and interface.
- Technologies: Web Application, Python, rdflib, SPARQL, Apache Jena, RDF4J
- Tags:
- Contact: Marie Aichinger
(B) Topics of Master Theses
Please check the listing below for possible topics for a master thesis. In principle, you may also choose from the topics listed in Section (A) above. Those topics are available for a master thesis as well, but usually in a more expanded or advanced form.
- FactChecking: Models and Languages of Precision Metrics for comparing facts on the Internet.
- FactChecking: Flexible, configurable framework for crawlers for extracting facts from web pages.
- FactChecking: AI-based text analysis tools for extracting facts from the Internet.
- FactChecking: Multimedia content (images, audio, video) analysis tools (including the use of Azure AI tools and services) for extracting facts from the Internet.
- FactChecking: Analysis of Cloud-based storage systems/services and design of a storage framework for a FactChecking prototype.
- FactChecking: Analysis and extraction of structured information from videos using state-of-the-art AI technology.
- FactChecking: Analysis and extraction of structured information from images using state-of-the-art AI technology.
- FactChecking: Analysis and extraction of structured information from text on the Web (news articles, scientific articles, Wikipedia, movie descriptions, etc.) using state-of-the-art AI technology and methods such as named entity recognition, key phrase recognition, and finding linked entities.
- Blockchain-based collection of semantically-correlated statements available on the Web, given by individual persons over time.
- Blockchain-based distributed media content management (e.g., using Blockchain to track images, video).
- Blockchain technology based on a microservice cloud architecture (e.g., following the approach of Edge/Fog Computing).
- Blockchain technology for providing trust in a FactCheck platform (FactCheck is a framework for the detection and resolution of conflicting structured data on the Web).
- Evaluation of platforms of specific Distributed Ledger Technology / Blockchain Technologies that vary in terms of consensus-model, validation-process, privacy-settings, e.g., technology platforms Cardano, Hashgraph, IOTA, Monero, EOS, NEO ([iteratec]).
- Blockchain-based image manipulation detection by using JPEG-specific image encoding information like macroblocks.
- Blockchain-based video manipulation detection by using MPEG-specific video encoding information like macroblocks and motion encoding.
- Enhancing blockchain technology by fast indexing and search/querying functionality using/integrating elastic-search or graph database technology.
- Enhancing blockchain technology by integrating a data model layer that offers a semantically enriched data model (e.g., XML-based, RDF-based, UML-based) to a blockchain application layer.
- Interactive course content components based on Jupyter Notebooks for a dedicated course (e.g., MRE, MRS, MCM, MST, DMP) offered in the Bachelor's or Master's program.
... additional, new topics will become available in the near future. In the case of Master Thesis topics, you may also contact Prof. Klas, Prof. Quirchmayr, or a researcher of the MIS group to find out more about possible topics.