Open Topics for Bachelor Theses
If you are looking for a Bachelor Thesis topic, please register for the Bachelor Thesis course, Group 1 of 051080 LP Softwarepraktikum mit Bachelorarbeit (new curriculum since 2022W). Please look up the list of dedicated topics for the current semester offered in Section (A) below. All listed topics are available for bachelor theses unless a restriction is stated in the topic description. Of course, the scope and scientific claims of a topic will be limited to meet the requirements (12 ECTS) of a Bachelor Thesis. If you are interested and need to clarify details, do not hesitate to contact us; send an e-mail to Prof. Wolfgang Klas, Prof. Gerald Quirchmayr, or contact a member of the research group.
» Topics for Bachelor Thesis - see the listing in Section (A) below.
Please, make sure you follow the "Instructions: How to get a topic for my SPBA Bachelor Project" given here.
Before contacting us, PLEASE read the » Recommendations & Guidelines for Bachelor Thesis available here.
Open Topics for Master Theses and Practical Courses (PR, P1, P2)
In the following, some of the open topics in the area of Multimedia Information Systems are listed. If you are interested or have an idea for a project, do not hesitate to contact us; send an e-mail to Prof. Wolfgang Klas or contact a member of the research group. In the case of P1 or P2 projects, please make sure you follow the "Instructions: How to get a topic for my P1 or P2 Project" given here.
In general, topics in the area of Multimedia Information Systems technologies include:
- analyzing, managing, storing, creating and composing, semantically enriching, and playing back multimedia content;
- semantically smart multimedia systems;
- security.
Possible application domains include:
- Detecting conflicting information and checking facts on the Web
- Content Authoring and Management Systems
- Multimedia Web Content Management
- Robotic and IoT Applications
- Blockchain Technologies and Applications
- Interactive Display Systems
- Game-based Learning
- Service Oriented Architecture (SOA) and Cloud Based Services
Section (A) below lists topics that can be chosen in the course of a PR Praktikum but are in principle also available for a master thesis (usually expanded and more advanced).
Section (B) below lists topics that are intended to be chosen for a master thesis.
(A) Topics for Practical Courses (SPBA, PR P1, PR P2)
CL/GQ01: Information Security Policy Repository
Different types of information security policies make it difficult to access relevant passages in individual policies and combine them into an actionable recommendation. A repository in which the policies are stored, and their sections can be accessed in relation to their relevance in each situation, ranging from editing and auditing policies to applying them to incident handling, would therefore be very helpful. To be effective, the developed repository needs to support the information security policy life cycle.
The goal of the project is to develop such a policy repository/store model and implement a prototype. For searching the policy store, AI technologies as well as traditional search mechanisms building on keywords and ontologies should be considered.
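As a minimal illustration of the traditional, keyword-based search path, the policy store could maintain an inverted index from keywords to policy sections. The section identifiers and texts below are invented examples, not part of any real policy set:

```python
# Toy inverted index over policy sections: each keyword maps to the set
# of section ids containing it. A real repository would add stemming,
# stop-word handling, and ontology-based query expansion on top.
from collections import defaultdict

sections = {  # hypothetical policy sections
    "AUP-1": "passwords must be rotated and never shared",
    "IR-4":  "incident handling requires immediate reporting",
    "AUP-2": "removable media must be encrypted",
}

index = defaultdict(set)
for sid, text in sections.items():
    for word in text.split():
        index[word].add(sid)

def find(keyword):
    """Return the ids of all sections containing the keyword."""
    return sorted(index.get(keyword, set()))

hits = find("incident")
```

The same index could later serve as the retrieval layer under an AI-based front end.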
The following sources can serve as starting point for this project:
M. Alam and M. U. Bokhari, "Information Security Policy Architecture," International Conference on Computational Intelligence and Multimedia Applications (ICCIMA 2007), Sivakasi, India, 2007, pp. 120-122, doi: 10.1109/ICCIMA.2007.275.
Kenneth J. Knapp, R. Franklin Morris, Thomas E. Marshall, Terry Anthony Byrd, Information security policy: An organizational-level process model, Computers & Security, Volume 28, Issue 7, 2009, Pages 493-508, ISSN 0167-4048, https://doi.org/10.1016/j.cose.2009.07.001.
Nader Sohrabi Safa, Rossouw Von Solms, Steven Furnell, Information security policy compliance model in organizations, Computers & Security, Volume 56, 2016, Pages 70-82, ISSN 0167-4048, https://doi.org/10.1016/j.cose.2015.10.006.
Hanna Paananen, Michael Lapke, Mikko Siponen, State of the art in information security policy development, Computers & Security, Volume 88, 2020, 101608, ISSN 0167-4048, https://doi.org/10.1016/j.cose.2019.101608.
The suggested structure for the paper accompanying the project is:
- Introduction/Topic description/Motivation
- State of the art in literature and practice
- Modelling method and approach used
- Development of the model
- Prototype (documentation, source code, etc.)
- Test
- Discussion of the results
- Outlook and conclusion
- Tags:
- Contact: Gerald Quirchmayr, Christian Luidold
CL/GQ02: Threat Intelligence for Cyber Security Decision Making
Cyber Security Decision Making is becoming a core aspect of cyber defense efforts. Advanced decision models and processes, such as the OODA Loop, depend heavily on the available information. The major task of this project is to develop and implement an approach to support the OBSERVE (information collection) and ORIENT (information analysis) phases of this type of model.
https://www.airuniversity.af.edu/Portals/10/AUPress/Books/B_0151_Boyd_Discourse_Winning_Losing.PDF
The topic can be split into two parts:
CL/GQ02a: Develop a support approach for the OBSERVE phase based on readily available sources, such as CVEs, NVD, and MISP. The import interface should ideally be based on the STIX/TAXII standard.
CL/GQ02b: Develop a support approach for the ORIENT phase exploring the potential of “emerging patterns” and “weak signals” in network defense. The goal is to monitor internal network traffic and map it on the information collected in the OBSERVE phase.
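For CL/GQ02a, a first step could be flattening records from an NVD-style feed into uniform observations. The field names below follow the public NVD JSON layout and the severity cut-offs follow common CVSS bands, but both should be verified against the live API; the sample record is invented:

```python
# Hypothetical sketch: normalize one NVD-2.0-style vulnerability record
# for the OBSERVE phase. Field names are assumptions to be checked
# against the actual NVD responses (or a STIX/TAXII import instead).

def extract_observation(record):
    """Flatten a vulnerability record into a uniform observation dict."""
    cve = record["cve"]
    metrics = cve.get("metrics", {}).get("cvssMetricV31", [])
    score = metrics[0]["cvssData"]["baseScore"] if metrics else None
    return {
        "id": cve["id"],
        "published": cve.get("published"),
        "score": score,
        "severity": ("HIGH" if score and score >= 7.0 else
                     "MEDIUM" if score and score >= 4.0 else
                     "LOW" if score is not None else "UNKNOWN"),
    }

sample = {  # invented example record
    "cve": {
        "id": "CVE-2024-0001",
        "published": "2024-01-02T00:00:00",
        "metrics": {"cvssMetricV31": [{"cvssData": {"baseScore": 8.1}}]},
    }
}
obs = extract_observation(sample)
```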
The following sources can serve as starting point for this project:
https://levelblue.com/blogs/security-essentials/incident-response-methodology-the-ooda-loop
https://cve.mitre.org/; https://nvd.nist.gov/; https://www.misp-project.org/
https://www.oasis-open.org/2021/06/23/stix-v2-1-and-taxii-v2-1-oasis-standards-are-published/
The suggested structure for the paper accompanying the project is:
- Introduction/Topic description/Motivation
- State of the art in literature and practice
- Modelling method and approach used
- Development of the model
- Prototype (documentation, source code, etc.)
- Test
- Discussion of the results
- Outlook and conclusion
- Tags:
- Contact: Gerald Quirchmayr, Christian Luidold
CL/GQ03: A containerized communication model for communication between NIST/CSF phases
The NIST Cyber Security Framework (https://www.nist.gov/cyberframework) has become an established standard for cyber security management. With version 2.0 of this framework introducing a GOVERN function (NIST CSWP 29, The NIST Cybersecurity Framework (CSF) 2.0, February 26, 2024, p. 3), the importance of communication between the functions has increased significantly, as GOVERN addresses an understanding of organizational context; the establishment of cybersecurity strategy and cybersecurity supply chain risk management; roles, responsibilities, and authorities; policy; and the oversight of cybersecurity strategy.
The goal of this project is to develop a communications model with the GOVERN function as a central command and control hub. This model should then be followed by a prototype based on container technology. The focus of the model and prototype is the communication between the GOVERN function and the other functions (see figure).
[Figure: NIST CSWP 29, The NIST Cybersecurity Framework (CSF) 2.0, February 26, 2024, p. 5]
The following sources can serve as starting point for this project:
NIST CSWP 29, The NIST Cybersecurity Framework (CSF) 2.0, February 26, 2024, https://www.nist.gov/cyberframework
Use containers to Build, Share and Run your applications: https://www.docker.com/resources/what-container/
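The hub-and-spoke idea behind the model can be sketched in plain Python before any containers are involved: every message between CSF functions passes through GOVERN, which records it for oversight and forwards it. In the actual prototype each function would run in its own container and exchange messages over a broker; class and method names here are illustrative:

```python
# Minimal hub-and-spoke sketch: GOVERN acts as the routing and audit
# point between the other five CSF 2.0 functions. No container or
# network code yet -- this only illustrates the communication model.
from collections import defaultdict

FUNCTIONS = {"IDENTIFY", "PROTECT", "DETECT", "RESPOND", "RECOVER"}

class GovernHub:
    def __init__(self):
        self.inboxes = defaultdict(list)  # per-function message queues
        self.audit_log = []               # GOVERN's oversight record

    def send(self, source, target, payload):
        if source not in FUNCTIONS or target not in FUNCTIONS:
            raise ValueError("unknown CSF function")
        self.audit_log.append((source, target, payload))
        self.inboxes[target].append((source, payload))

hub = GovernHub()
hub.send("DETECT", "RESPOND", {"alert": "suspicious login"})
```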
The suggested structure for the paper accompanying the project is:
- Introduction/Topic description/Motivation
- State of the art in literature and practice
- Modelling method and approach used
- Development of the model
- Prototype (documentation, source code, etc.)
- Test
- Discussion of the results
- Outlook and conclusion
- Tags:
- Contact: Gerald Quirchmayr, Christian Luidold
PK01 - Multi-Modal AI Shopping Assistant (Vector Search & RAG)
Problem Statement: Traditional e-commerce search engines rely on keyword matching (e.g., SQL "LIKE"), which often fails when users describe intent (e.g., "outfit for a summer wedding") rather than specific product names. While Vector-based AI search promises to solve this, implementing it requires a modern stack combining embeddings, vector databases, and multi-modal understanding (Text & Image).
Project Goal: The goal is to design and implement an intelligent "Semantic Search Engine" for a prototype web shop. The system will use Vector Embeddings to understand the meaning of a search query, allowing users to search using natural language or reference images.
Core Tasks:
- Data Pipeline: Ingest a public product dataset (e.g., Amazon Products or H&M) into a Vector Database (e.g., Qdrant, ChromaDB, or pgvector).
- Embedding Implementation: Implement a pipeline using models like OpenAI CLIP to map product images and text descriptions into a shared vector space for retrieval.
- Search Logic: Implement a backend that supports "Nearest Neighbor" retrieval for both Text-to-Product ("Find me retro red sneakers") and Image-to-Product (visual similarity).
- Frontend Prototype: Build a web interface (Streamlit/React) to demonstrate the search capabilities.
Research Focus: The research focus lies on a comparative evaluation. The student is expected to scientifically benchmark the new Vector Search against a traditional Keyword Search baseline, using standard Information Retrieval metrics such as Precision and Recall to quantify the improvement in result quality.
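The retrieval core behind the tasks above can be sketched in a few lines: cosine similarity over precomputed embedding vectors. In the real system the vectors would come from a CLIP-style model and live in a vector database such as Qdrant or ChromaDB; the hand-made 3-dimensional vectors and product names below exist only to show the nearest-neighbor lookup:

```python
# Toy nearest-neighbor search over fake product embeddings.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

catalog = {  # product -> invented embedding vector
    "red sneakers": [0.9, 0.1, 0.0],
    "blue jacket":  [0.0, 0.8, 0.2],
    "summer dress": [0.1, 0.2, 0.9],
}

def search(query_vec, k=2):
    """Return the k catalog items closest to the query embedding."""
    ranked = sorted(catalog, key=lambda p: cosine(query_vec, catalog[p]),
                    reverse=True)
    return ranked[:k]

# A query embedding that a text or image encoder might produce for
# "retro red sneakers" (invented values):
results = search([1.0, 0.0, 0.1])
```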
- Technologies: Python, Vector Databases (ChromaDB / Qdrant), Embeddings (CLIP / HuggingFace), Streamlit, Information Retrieval Metrics.
- Tags:
- Contact: Peter Kalchgruber
PK02 - Interactive "Tangible" Sound System for Children (TUI)
Problem Statement: While music is essential for child development, modern interaction often happens via touchscreens, which lack tactile feedback and fine motor skill training. "Tangible User Interfaces" (TUIs)—where physical objects control digital data—offer a more developmentally appropriate way for children to interact with technology, yet designing robust, child-safe TUIs requires solving complex challenges in hardware integration and interaction logic.
Project Goal: The goal is to design and develop a Tangible Music Interface. Instead of screens, the system should use physical artifacts (e.g., RFID-tagged figurines, color sensors, or conductive objects) to control sound. The prototype must integrate hardware sensors with a robust software state machine to create an engaging, educational experience.
Core Tasks:
- Interaction Design: Define a physical interaction concept appropriate for a specific age group (e.g., "Placing a block = Play a specific instrument" or "Stacking blocks = Mixing tracks").
- Hardware Integration: Build the physical prototype using microcontrollers (Arduino/ESP32 or Raspberry Pi). Integrate sensors (e.g., RFID/NFC, Accelerometers, or Capacitive Touch) and audio output modules (e.g., DFPlayer Mini or I2S DAC).
- Software Logic (Gamification): Implement the core logic. To ensure sufficient software complexity, the system must include an interactive mode (e.g., a "Simon Says" memory game, a sequence recorder, or a quiz mode) rather than just passive playback.
- Enclosure Engineering: Design and construct a robust, user-friendly housing. Note: Safety and durability are key requirements for child-focused hardware.
Research Focus:
The research focus lies on a Usability & Design Evaluation.
Option A (User Study): Observe a small group of users interacting with the device to measure engagement and intuitiveness.
Option B (Heuristic Eval): Rigorously evaluate the prototype against established Child-Computer Interaction (CCI) guidelines.
- Technologies: Arduino / ESP32 / Raspberry Pi, RFID / NFC Technology, C++ (Firmware) or Python, 3D Printing / Laser Cutting, State Machines.
- Tags:
- Contact: Peter Kalchgruber
PK03 - Automated Grading of Paper Exams using Computer Vision (OMR)
Problem Statement: Paper-based exams are robust against cheating but impose a massive grading burden on educators. Many solutions (like Moodle) require separate Answer Sheets, which are confusing for students and disconnect the answer from the question context. There is a lack of open-source tools that allow flexible, "single-sheet" exam layouts (where answers are ticked directly next to the question) while still supporting fully automated grading via scanning.
Project Goal: The goal is to develop a "Scan-to-Grade" pipeline. The student will build a web application that allows lecturers to (1) Design an exam with auto-generated alignment markers, (2) Bulk-upload scanned PDF submissions, and (3) Automatically grade them using Computer Vision algorithms to detect ticked checkboxes.
Core Tasks:
- Exam Generation Engine: Create a module that generates PDF exams with Fiducial Markers (e.g., ArUco markers or QR codes) at the corners. These markers are essential for the software to "understand" the page geometry later.
- Computer Vision Pipeline (The Core): Implement a Python backend (OpenCV) that:
- Detects the corner markers to De-skew and align the scanned image.
- Applies perspective transformation to flatten the page.
- Extracts the "checkbox regions" based on the digital template coordinates.
- Uses thresholding/pixel density analysis to determine if a box is "Ticked" or "Empty."
- Review Interface: Build a frontend (React) that shows the scanned exam with the detected answers overlaid, allowing the lecturer to manually correct any detection errors.
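The final pipeline step, deciding "Ticked" vs. "Empty" from pixel density, is small enough to sketch directly. Real input would be a binarized OpenCV crop of the checkbox region; a 0/1 grid stands in for it here, and the 15% fill threshold is an assumption to be tuned against real scans:

```python
# Pixel-density check on an already-extracted, binarized checkbox region.

def is_ticked(region, fill_threshold=0.15):
    """region: 2-D list of 0 (white) / 1 (ink) pixels."""
    total = sum(len(row) for row in region)
    ink = sum(sum(row) for row in region)
    return ink / total >= fill_threshold

empty_box = [[0, 0, 0, 0] for _ in range(4)]
ticked_box = [[0, 1, 1, 0],
              [0, 1, 1, 0],
              [1, 1, 1, 1],
              [0, 0, 1, 0]]
```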
Research Focus:
The research focus lies on accuracy evaluation. The student must conduct a stress test using a dataset of ~30 scans with various "messy" marking styles (e.g., faint ticks, crossed-out corrections) to calculate the system's Accuracy, False Positive, and False Negative rates compared to manual grading.
- Technologies: React / TypeScript, Python / FastAPI, OpenCV, PDF Generation.
- Tags:
- Contact: Peter Kalchgruber
PK04 - Ambient Location Awareness System (The "Weasley Clock" IoT Project)
Problem Statement: Staying informed about the safety and whereabouts of close family members often involves intrusive measures, such as constantly checking text messages or using tracking apps on a smartphone. This demands high cognitive attention ("Active Monitoring"). In contrast, "Calm Technology" aims to provide this information peripherally—using ambient light or physical movement to convey status without demanding focus.
Project Goal: The goal is to design and build a physical Ambient Information Display (IoT Device). Inspired by the "Weasley Clock," this device will visualize the location status (e.g., "Home," "Work," "Transit," "Gym") of tracked users using physical actuators (servo motors) or light patterns (LED rings), rather than a digital screen.
Core Tasks:
- Data Ingestion (MQTT): Instead of building a mobile app, set up an MQTT Broker to ingest real-time location data from open-source apps like OwnTracks or Home Assistant installed on the user's phone.
- Geofencing Logic: Implement a backend service (Python) that maps raw GPS coordinates to semantic states (e.g., Lat/Lon 48.208 -> "University").
- Hardware Construction: Build the physical display using a microcontroller (ESP32 or Raspberry Pi), e.g.:
- Use Servo Motors to move a physical hand pointing to the current location.
- Use an RGB LED Ring where specific colors/positions represent locations.
- …
- Privacy Management: Implement a "Privacy Mode" switch on the device that stops data visualization when guests are present.
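The geofencing step above amounts to mapping a raw GPS fix to the nearest configured zone via great-circle distance. A minimal sketch, assuming hand-configured zones (the coordinates and radii below are made-up example values):

```python
# Map a GPS fix to a semantic state using the haversine distance to
# configured circular zones; anything outside all zones is "Transit".
import math

ZONES = {  # name -> (lat, lon, radius in metres); invented values
    "University": (48.2083, 16.3731, 250),
    "Home":       (48.1900, 16.3400, 150),
}

def haversine_m(lat1, lon1, lat2, lon2):
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

def semantic_state(lat, lon):
    for name, (zlat, zlon, radius) in ZONES.items():
        if haversine_m(lat, lon, zlat, zlon) <= radius:
            return name
    return "Transit"

state = semantic_state(48.2084, 16.3730)
```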
Research Focus:
The research focus lies on a User Study on "Calm Technology." The student should deploy the device in a home (e.g., for one week) and evaluate (via diary study or interviews) whether it reduced the urge to check phones (Information Anxiety) or improved the feeling of connectedness compared to standard mobile notifications.
Advanced Option: Implement a Machine Learning model to predict the next destination based on historical movement patterns.
- Technologies: ESP32 / Raspberry Pi, MQTT (Message Queuing Telemetry Transport), Python / Node.js, Servo Motors / NeoPixels, OwnTracks API.
- Tags:
- Contact: Peter Kalchgruber
PK05 - Real-Time Audio Fact-Checking Pipeline (Streaming STT & NLP)
Problem Statement: Misinformation spreads rapidly during live events (e.g., debates, streams), often outpacing post-hoc verification. While Speech-to-Text (STT) exists, the challenge lies in processing infinite audio streams. A system must intelligently "chunk" continuous audio into semantic sentences (Audio Segmentation), transcribe them with low latency, and verify claims against a knowledge base without breaking the live flow.
Project Goal: The goal is to develop a "Near Real-Time" Analysis Pipeline. The system will ingest a live audio stream (Mic or WebSocket), segment it using Voice Activity Detection (VAD), transcribe speech to text, and trigger an automated fact-checking agent (LLM or FactCheck API) to display a "Truth Overlay" on a dashboard.
Core Tasks:
- Audio Stream Processing: Implement a buffering system using WebSockets that handles continuous audio data. Crucially, integrate Voice Activity Detection (VAD) to slice the stream into grammatically valid "chunks" for processing.
- Transcription Pipeline: Integrate a high-speed STT engine to convert audio chunks to text on the fly.
- Claim Verification Agent: Develop a lightweight NLP module that detects "check-worthy" claims and verifies them using an external API (e.g., Google FactCheck Claims API or an LLM with Search access).
- Live Dashboard: Build a React frontend that displays the rolling transcript and highlights "Verified" vs. "Disputed" statements with minimal delay.
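A deliberately simplified stand-in for the VAD step can clarify what "slicing the stream into chunks" means: an energy threshold over fixed-size frames marks speech vs. silence, and consecutive speech frames are merged into one chunk. A production system would use a trained VAD such as Silero; frame size and threshold here are purely illustrative:

```python
# Energy-based segmentation of a sample stream into (start, end) chunks.

def segment(samples, frame=4, threshold=0.1):
    """Return (start, end) sample indices of contiguous 'speech' runs."""
    chunks, start = [], None
    n_frames = len(samples) // frame
    for i in range(n_frames):
        frame_samples = samples[i * frame:(i + 1) * frame]
        energy = sum(s * s for s in frame_samples) / frame
        if energy >= threshold and start is None:
            start = i * frame                       # speech begins
        elif energy < threshold and start is not None:
            chunks.append((start, i * frame))       # speech ends
            start = None
    if start is not None:
        chunks.append((start, n_frames * frame))
    return chunks

# Silence, a short burst of "speech", silence:
signal = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8
chunks = segment(signal)
```

Each resulting chunk would then be handed to the STT engine.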
Research Focus:
The research focus lies on a System Performance Evaluation. The student must benchmark the End-to-End Latency (Time-to-Verification) and analyze the trade-off between Segmentation Length (waiting for a full sentence) vs. Transcription Accuracy (Word Error Rate).
- Technologies: Python (FastAPI/WebSocket), Voice Activity Detection (Silero VAD), Streaming STT (Whisper / Google), LLMs (LangChain), React.
- Tags:
- Contact: Peter Kalchgruber
PK07 - Robust Workout Recognition: Personalized vs. Generalized Models
Problem Statement: While general activity tracking (e.g., running or walking) is well-established, fine-grained recognition of strength training exercises remains a significant challenge due to complex movements and frequent transitions. Current wearables often require manual input to start sets or struggle to distinguish between exercises that have similar movement patterns, such as bench presses versus shoulder presses. Furthermore, high intra- and inter-individual variability in execution style makes it difficult for standard "one-size-fits-all" systems to provide accurate results without specific adaptations.
Project Goal: The goal is to design and implement an end-to-end Human Activity Recognition (HAR) Pipeline that automatically segments and classifies strength training exercises using data from a wrist-worn wearable. The project aims to investigate if simpler, more interpretable machine learning approaches can effectively handle complex workout environments and whether specialized individual training significantly outperforms generalized models.
Core Tasks:
- Data Acquisition & Preprocessing: Implement a data collection system (e.g., using Bangle.js or similar) to record 3D-accelerometer and barometer data at a sampling rate of at least 10Hz. Develop a pipeline to handle noise, normalize timestamps, and extract dynamic movement components by removing gravitational influence via high-pass filtering.
- Feature Engineering: Design and implement a multi-domain feature vector. This must include time-domain statistics (MAD, Zero-Crossings), frequency-domain metrics (band energy, dominant frequency), and orientation-based features (pitch/roll) to distinguish between spatially similar movements.
- Classification & Post-Processing: Train classification models (e.g., Logistic Regression or Random Forest) to predict exercises per window. Crucially, implement a Post-Processing Logic (e.g., State Machine Decoder, Exercise Gate) to merge window predictions into stable, plausible workout segments and rest periods.
- System Integration: Build a prototype application (Web or Mobile) that allows users to upload raw sensor data and receive a summarized workout report, including recognized exercises, sets, and durations.
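Two of the time-domain features named above can be sketched on a single accelerometer axis. "MAD" is read here as mean absolute deviation, one common usage in HAR work (median absolute deviation is the other candidate and should be pinned down in the thesis); the sample window is invented:

```python
# Time-domain features over one window of single-axis accelerometer data.

def mean_abs_dev(window):
    """Mean absolute deviation from the window mean."""
    mean = sum(window) / len(window)
    return sum(abs(x - mean) for x in window) / len(window)

def zero_crossings(window):
    """Number of sign changes between consecutive samples."""
    return sum(1 for a, b in zip(window, window[1:])
               if (a < 0 <= b) or (b < 0 <= a))

window = [0.2, -0.1, 0.3, -0.4, 0.1, 0.0, -0.2]
features = (mean_abs_dev(window), zero_crossings(window))
```

Frequency-domain features (band energy, dominant frequency) would be computed analogously from an FFT of the same window.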
Research Focus:
The research focus lies on a formal System Validation comparing model performances. The student is expected to conduct an evaluation using data from different users to calculate standard metrics (Precision, Recall, F1-Score) for both generalized and personalized approaches. Additionally, the student must analyze the Segmentation Quality—evaluating how accurately the system identifies the start and end of sets compared to a ground-truth log.
- Technologies: Python (Scikit-learn / Pandas), Embedded JavaScript (Bangle.js) or Mobile Sensors, React / TypeScript, Signal Processing, Machine Learning.
- Tags:
- Contact: Peter Kalchgruber
PK08 - Gamified Grading for Schools (Age 10-18): An Offline-First PWA
Problem Statement: In secondary education (High School / Gymnasium), students often receive performance feedback only via delayed report cards. While Gamification (XP/Levels) effectively motivates the 10-18 age group, teachers lack digital tools to apply it efficiently. Furthermore, school network infrastructure is often unreliable; a grading tool that freezes due to poor Wi-Fi disrupts the flow of the lesson and will not be adopted.
Project Goal: The goal is to design and develop a Gamified Classroom Companion specifically tailored for Pupils (10-18 years) and Teachers. The system must be built as an Offline-First Progressive Web App (PWA) that guarantees zero-latency interaction for the teacher, regardless of internet connectivity.
Core Tasks:
- Offline-First Architecture: Implement a robust data layer using Service Workers and IndexedDB. The teacher's app must remain fully functional (creating grades/notes) while offline and automatically synchronize data with the cloud once connectivity is restored.
- Interaction Design (Teacher Focus): Develop a mobile-first interface that allows teachers to grade an entire class efficiently (e.g., "Swipe-to-Grade") to minimize distraction.
- Gamification Logic Engine: Implement a backend that transforms standard school criteria (Homework, Participation) into an XP system with "Achievements" relevant to teenagers (e.g., "Homework Streak").
- Privacy & Data Protection: Implement strict access controls. Since school data involves minors, pupils must strictly only see their own progress.
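The gamification engine reduces to a few pure functions, which keeps it easy to test and to run offline. Point values, the level curve, and the streak length below are placeholder assumptions, not pedagogical recommendations:

```python
# Core gamification logic: events -> XP -> level, plus a streak counter
# for a "Homework Streak" achievement.

XP_RULES = {"homework": 20, "participation": 5, "test": 50}

def total_xp(events):
    return sum(XP_RULES[kind] for kind in events)

def level(xp, per_level=100):
    """Simple linear level curve: one level per 100 XP."""
    return xp // per_level + 1

def streak(events, kind="homework"):
    """Length of the trailing run of `kind` events."""
    n = 0
    for e in reversed(events):
        if e != kind:
            break
        n += 1
    return n

events = ["homework", "participation", "homework", "homework", "homework"]
```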
Research Focus:
The research focus lies on a Usability & Motivation Evaluation.
Metric A (Efficiency): Measure the time required to grade a student under simulated "poor network" conditions (Offline vs. Online).
Metric B (Motivation): Evaluate whether the "XP System" increases perceived motivation among the target group (10-18 years).
- Technologies: React (PWA), Service Workers / IndexedDB (Offline Storage), Firebase (Firestore) or Node.js, Gamification Mechanics.
- Tags:
- Contact: Peter Kalchgruber
PK10 - Entity Resolution for Heterogeneous Web Data (The FactCheck Project)
Problem Statement: The Web contains vast amounts of conflicting information about entities (e.g., Persons, Movies, Events). A core challenge in Web Information Integration is that different sources use different identifiers (e.g., "J. Doe" vs. "John Doe"). Before any "Fact Checking" can occur, the system must first solve the Identity Problem: determining which heterogeneous data points refer to the same real-world object without having a shared Primary Key.
Project Goal: The goal is to design and implement an Entity Resolution Pipeline for the "FactCheck" framework. The student will build a system that ingests semi-structured data (JSON/HTML from different web sources), normalizes it, and applies similarity algorithms to link disparate records to a single "Golden Entity."
Core Tasks:
- Data Normalization: Develop a preprocessing module to clean and standardize noisy web data (e.g., handling date formats, removing special characters from names).
- Blocking & Indexing: Implement a "Blocking Strategy" to efficiently find candidate matches. (Comparing every record to every other record is too slow; the system must intelligently narrow down the search space).
- Similarity Metrics: Implement and compare different matching algorithms.
- Resolution Logic: Design the decision logic (e.g., weighted thresholding) that decides if two records are a "Match," "Potential Match," or "Non-Match."
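The four tasks above fit into one miniature pipeline: normalize, block on the first letter of the surname, score surviving candidates with a string similarity, and classify by two thresholds. `difflib.SequenceMatcher` stands in for the dedicated metrics (Jaro-Winkler, etc.) the project would compare, and the thresholds are illustrative:

```python
# Miniature entity-resolution pipeline for person names.
from difflib import SequenceMatcher

def normalize(name):
    return " ".join(name.lower().replace(".", "").split())

def block_key(name):
    """Blocking: first letter of the last name token."""
    return normalize(name).split()[-1][0]

def similarity(a, b):
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def classify(a, b, hi=0.85, lo=0.6):
    if block_key(a) != block_key(b):
        return "Non-Match"  # pruned by blocking, never fully compared
    s = similarity(a, b)
    return ("Match" if s >= hi else
            "Potential Match" if s >= lo else
            "Non-Match")

result = classify("J. Doe", "John Doe")
```

The "Potential Match" band is exactly where a human reviewer (or a second, more expensive matcher) would be consulted.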
Research Focus:
The research focus lies on a Quantitative Performance Evaluation. The student must evaluate their pipeline against a labeled "Ground Truth" dataset (e.g., standard benchmarks like Cora/Cora-Ref or a custom scraped dataset). The analysis must report Precision, Recall, and F1-Score to scientifically justify the chosen resolution strategy.
- Technologies: Python (Pandas, RecordLinkage, Dedupe), String Metrics (FuzzyWuzzy), Vector Embeddings (Optional), JSON/Semi-structured Data.
- Tags:
- Contact: Peter Kalchgruber
PK11 - Automated AI-Driven Oral Examinations for Comprehensive Assessment
Problem Statement: The integration of Generative AI in education has created a "verification gap." High-quality written submissions and coding projects often fail to prove that a student has truly mastered the underlying theoretical concepts. While traditional oral exams are the most robust way to verify individual knowledge and critical thinking, they are difficult to implement at scale due to faculty time constraints.
Project Goal: Inspired by recent pedagogical research (e.g., at NYU), this project aims to develop a scalable, automated "AI Examiner" prototype. The system will conduct voice-based interviews to probe a student's understanding of the theoretical course content (concepts, models, and methodologies taught during the semester).
Core Functionality: The system should be designed as a modular framework. The student is free to choose the implementation strategy based on their available resources:
- Option A (Local): Use open-source models (Ollama, Whisper) for a free, privacy-first approach. Note: This requires a laptop with capable hardware (e.g., Apple Silicon or NVIDIA GPU) to ensure low latency.
- Option B (Cloud): Use external APIs (OpenAI, Anthropic, ElevenLabs) for maximum performance. Note: If this path is chosen, the student is responsible for possible API usage costs (est. €20-€50).
- Contextual Awareness: Ingest course materials (lecture slides/textbooks) to build a relevant RAG (Retrieval-Augmented Generation) pipeline.
- Voice-First Interaction: Facilitate a real-time dialogue using Speech-to-Text and Text-to-Speech technologies.
- Adaptive Questioning: Use LLMs to "drill down" into specific topics if a student's initial answer is vague, mimicking a human examiner.
- Automated Evaluation: Generate a grading report evaluating correctness, clarity, and depth of knowledge (potentially utilizing multi-model consensus).
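The retrieval half of the RAG pipeline can be illustrated without any model at all: rank course-material chunks by word overlap with the current question or answer. A real pipeline would use embeddings instead; the chunk texts below are invented examples:

```python
# Bag-of-words stand-in for RAG retrieval over lecture-material chunks.

CHUNKS = [  # invented lecture-slide snippets
    "B-trees keep data sorted and allow logarithmic search",
    "Two-phase commit coordinates distributed transactions",
    "Normalization removes redundancy from relational schemas",
]

def tokens(text):
    return set(text.lower().split())

def retrieve(query, k=1):
    """Return the k chunks sharing the most words with the query."""
    ranked = sorted(CHUNKS,
                    key=lambda c: len(tokens(c) & tokens(query)),
                    reverse=True)
    return ranked[:k]

top = retrieve("explain how normalization reduces redundancy")
```

The retrieved chunk would then be placed into the LLM prompt as grounding context for the next examiner question.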
Research Focus:
The research focus lies on an Effectiveness Analysis. The student must compare the AI-led oral exam results against traditional written assessment scores for the same topics to identify if the voice-based approach better detects deep conceptual knowledge vs. surface-level memorization.
- Technologies: LLM Orchestration (e.g., LangChain, AutoGen), LLMs (flexible: local via Ollama/Llama.cpp or via APIs like GPT/Gemini/Claude), STT/TTS (e.g., Whisper, Web Speech API, or ElevenLabs), Backend: Python (FastAPI).
- Tags:
- Contact: Peter Kalchgruber
PK12 - 3D Acoustic Localization & Tracking of Indoor Insect Flight
Problem Statement: Detecting mosquitoes in indoor environments is a critical challenge for health and comfort. While visual tracking is often ineffective due to the insect's size and speed, their unique "wing-beat" frequency (typically 300 Hz – 700 Hz) provides a distinct acoustic signature. However, identifying this signature and tracking its movement in 3D space requires precise microsecond-level signal processing which is difficult to achieve on low-cost hardware.
Project Goal:
The goal of this "hardware/systems" project is to design, build, and calibrate a sensor-based system capable of detecting and localizing a sound source in a room. The student will focus on the challenge of TDOA (Time Difference of Arrival) geometry and signal processing.
Note: Due to the physical difficulty of recording real insects, the primary development and grading will be performed using a "Simulated Source" (Smartphone generating specific frequencies).
Core Tasks:
- Hardware Integration: Construct a time-synchronized microphone array. This involves wiring at least 4 digital MEMS microphones (I2S) to a central processing unit (ESP32 or Raspberry Pi) in a non-planar (3D) physical layout.
- Acoustic Calibration: Develop a routine to calculate the exact X/Y/Z coordinates of the microphones relative to each other (e.g., using a reference "chirp" signal).
- Signal Identification (DSP): Implement Fast Fourier Transform (FFT) or Bandpass Filtering to isolate the target frequency (e.g., 450 Hz) from background noise.
- 3D Localization Algorithm: Implement a TDOA multilateration algorithm (e.g., GCC-PHAT) to calculate the source's coordinates in real-time based on the delay between microphones.
- Visualization: Create a simple real-time dashboard (Python/Web) plotting the (estimated 3D) flight path of the source.
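The core of the TDOA step, estimating the arrival-time difference between two microphones, can be shown in miniature with a plain cross-correlation (a real implementation would use GCC-PHAT on FFTs). The delay in samples, times the speed of sound divided by the sample rate, gives the path-length difference that multilateration consumes; the pulse below is a synthetic stand-in for a wing-beat burst:

```python
# Estimate the sample delay between two signals by brute-force
# cross-correlation over candidate lags.

def best_lag(ref, delayed, max_lag=10):
    """Lag (in samples) at which `delayed` best matches `ref`."""
    def corr(lag):
        return sum(ref[i] * delayed[i + lag]
                   for i in range(len(ref) - max_lag))
    return max(range(max_lag + 1), key=corr)

pulse = [0, 0, 1.0, 0.5, -0.5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
shifted = [0, 0, 0] + pulse[:-3]   # same pulse arriving 3 samples later
lag = best_lag(pulse, shifted)
```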
Research Focus:
The research focus lies on a Localization Accuracy Evaluation. The student must conduct a controlled experiment (e.g., placing the sound source at 10 known grid points) and calculate the Mean Absolute Error (MAE) of the system's estimated X/Y/Z coordinates compared to the actual physical position.
Prerequisites:
- Strong interest in Physics/Math and basic soldering skills.
- Technologies: Digital Signal Processing (DSP), Microcontrollers (ESP32/Raspberry Pi), Python (SciPy, NumPy), Cross-Correlation (GCC-PHAT), TDOA/Multilateration.
- Tags:
- Contact: Peter Kalchgruber
PK13 - Wireless IoT Reflex Game
Problem Statement: Reaction-time training tools are often expensive and proprietary. Creating a custom, open-source alternative allows for flexible gamification but presents significant challenges in synchronizing distributed wireless nodes with low latency. The system must coordinate multiple independent devices to act as a single, cohesive game unit.
Project Goal:
The goal is to design and implement a distributed reflex game consisting of 4-5 wireless "Smart Buttons" and a central controller.
Note: The project focuses on the software and networking challenges. To ensure feasibility, standard hardware components are provided, though the physical implementation remains flexible.
Core Tasks:
- Hardware Integration: Implement the physical interaction nodes. The baseline concept is to retrofit standard 100mm Arcade Buttons with microcontrollers (to minimize hardware effort), but the specific implementation is open to suggestions depending on the student's knowledge and experience (e.g., 3D printing custom enclosures).
- Wireless Low-Latency Network: Implement a robust communication protocol (e.g., ESP-NOW or Wi-Fi UDP) to ensure instant, millisecond-level synchronization between the "Master" controller and the distributed "Pods."
- Control Interface: Develop a user-friendly way to configure the training (e.g., a Web Dashboard hosted on the Master node, or a Mobile App) to select game modes like "Speed Drill," "Multiplayer," or "Memory Sequence."
- Distributed State Management: Develop the game logic (State Machine) to handle concurrency, ensuring the system remains stable regardless of which node is pressed.
- Performance Evaluation & Analytics: Implement a system to measure and visualize user performance, calculating metrics such as Reaction Time (ms), Accuracy, and Consistency.
Research Focus: The research focus lies on a Network Performance & Usability Study. The student must implement a software-based "Round-Trip Time" (RTT) test to measure internal network latency and verify the system's real-time capability (< 100ms). Alternatively, a user study must be conducted to compare the distributed physical setup against standard screen-based reaction trainers.
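The RTT evaluation boils down to descriptive statistics over measured round trips; a minimal Python sketch, using invented sample values, could look like this:

```python
import statistics

# Invented RTT samples in milliseconds from a Master <-> Pod ping-pong test
rtt_samples_ms = [12.4, 15.1, 11.8, 60.2, 13.5, 14.9, 12.2, 18.7]

mean_rtt = statistics.mean(rtt_samples_ms)   # average round trip
jitter = statistics.stdev(rtt_samples_ms)    # variability between rounds
worst_case = max(rtt_samples_ms)

# Real-time criterion from the project description: every round trip below 100 ms
meets_realtime = worst_case < 100.0
print(f"mean={mean_rtt:.1f} ms, jitter={jitter:.1f} ms, "
      f"worst={worst_case:.1f} ms, real-time ok: {meets_realtime}")
```

On the device, the timestamps would be taken before sending and after receiving the echo on the Master node; the analysis itself is independent of the transport (ESP-NOW or Wi-Fi UDP).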
Prerequisites & Resources:
- Experience with C++ / Arduino and basic circuit wiring.
- Hardware: All necessary components (ESP32 units, Arcade Buttons, Power Banks) will be provided by the supervisor.
- Technologies: Embedded C++ (Arduino IDE), ESP-NOW / Wi-Fi Networking, Web Server (AsyncWebServer), Microcontrollers (ESP32), Data Analytics.
- Tags:
- Contact: Peter Kalchgruber
PK14 - AI-Assisted Automated Grading for Open-Ended Programming Assignments in Moodle
Problem Statement: Modern e-learning platforms like Moodle excel at automated Multiple Choice assessments but struggle with open-ended tasks, such as code logic explanations or architectural decisions. While existing plugins (like VPL) can check if code compiles, they cannot assess the quality, style, or conceptual understanding. Manual grading of these qualitative aspects is time-consuming and inconsistent.
Project Goal: The goal is to develop an AI-driven "Grading Assistant" that integrates with Moodle. The system will act as a "Second Reader," automatically fetching student submissions, evaluating them against a rubric using LLMs, and generating a transparent grading report for the lecturer to review.
Core Tasks:
- Moodle Integration: Use the Moodle REST API to programmatically retrieve student submissions (text or code files) and eventually upload the feedback.
- Data Privacy & Anonymization: Implement a pre-processing step to anonymize data (stripping PII/Student IDs) before sending content to an external LLM provider, ensuring GDPR compliance.
- Qualitative Analysis Pipeline: Develop a prompt engineering strategy to evaluate the submission based on a rubric (e.g., "Is the variable naming consistent?", "Does the comment explain the *why*?", "Is the recursion base case valid?").
- Reasoning & Justification: The system must output a structured "Defense of Grade," itemizing exactly why points were deducted. This prevents "Black Box" grading.
- Lecturer Dashboard (Human-in-the-Loop): Create a simple frontend where the lecturer sees the AI-suggested grade alongside the student work and can "Accept" or "Override" it with one click.
Research Focus:
The research focus lies on an Agreement Analysis. The student must compare the AI-generated grades with those given by a human lecturer for the same set of submissions, and calculate the Inter-Rater Reliability (e.g., Cohen's Kappa or Pearson Correlation) to determine whether the AI is consistent and fair.
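The agreement analysis can be sketched compactly; the following Python snippet implements Cohen's Kappa from its definition (the human and AI grades are hypothetical placeholder data):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's Kappa for two raters over the same items (nominal labels)."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: probability both raters pick the same label independently
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical grades (e.g., rubric points) by the lecturer vs. the AI assistant
human = [1, 2, 2, 3, 1, 2, 3, 3, 1, 2]
ai    = [1, 2, 3, 3, 1, 2, 3, 2, 1, 2]
print(f"Cohen's Kappa: {cohens_kappa(human, ai):.2f}")  # -> Cohen's Kappa: 0.70
```

Values above roughly 0.6 are commonly read as substantial agreement; the thesis would interpret the obtained value in the context of the grading rubric.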
Prerequisites & Resources:
- Experience with Python (FastAPI/LangChain) and basic knowledge of REST APIs.
- Technologies: Moodle REST API, Python (LangChain, Pydantic), LLMs (GPT-4o / Claude 3.5), Streamlit or React (for the Dashboard).
- Tags:
- Contact: Peter Kalchgruber
PK16 - Smart Student Support: A RAG-Based Chatbot for University Curricula
Problem Statement: University websites contain vast amounts of information (curricula, ECTS guidelines, opening hours), but finding specific answers is often difficult due to fragmented navigation. Standard keyword searches fail to answer complex questions, such as "Which courses do I need for the Data Science specialization?"
Project Goal: The goal is to develop a Retrieval-Augmented Generation (RAG) Chatbot. Unlike basic scripted bots, this system will ingest official university PDF/HTML documents, index them, and use an LLM to generate accurate, cited answers to student queries in natural language.
Core Tasks:
- Knowledge Ingestion: Build a scraper/parser to extract text from Curriculum PDFs and University Webpages and chunk them for indexing.
- Vector Search Pipeline: Implement a retrieval system (using LangChain and a Vector DB) that finds the most relevant document chunks for a given student question.
- Hallucination Guardrails: Engineer the LLM system prompt to strictly answer only based on the retrieved context and provide citations (e.g., "Source: Curriculum 2024, Page 5").
- Chat Interface: Build a user-friendly web widget (React) that maintains conversation history and context.
Research Focus:
The research focus lies on an Answer Quality Evaluation. The student must curate a dataset of ~30 "Golden Questions" and measure the bot's performance using metrics such as Answer Relevance and Faithfulness (e.g., using frameworks like RAGAS or manual expert review).
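To make the retrieval step concrete, here is a deliberately simplified, dependency-free sketch of the idea behind vector search: chunks and the question are represented as bag-of-words vectors and ranked by cosine similarity. A real implementation would use LangChain with learned embeddings and a vector DB; the chunks below are invented:

```python
import math
import re
from collections import Counter

def tokens(text):
    """Bag-of-words token counts, lowercased, punctuation stripped."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Invented chunks extracted from curriculum documents
chunks = [
    "The Data Science specialization requires the courses Machine Learning and Databases.",
    "The library is open Monday to Friday from 9:00 to 18:00.",
    "A bachelor thesis is worth 12 ECTS credits.",
]

question = "Which courses do I need for the Data Science specialization?"
best = max(chunks, key=lambda c: cosine(tokens(question), tokens(c)))
print("Retrieved context:", best)
```

The retrieved chunk would then be passed to the LLM as context, together with the guardrail prompt and its citation.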
- Technologies: Python (LangChain), Vector Databases (ChromaDB), LLMs (OpenAI/Ollama), React.
- Tags:
- Contact: Peter Kalchgruber
WK01 - Jupyter Notebooks for Dedicated Interactive Content of Courses
Jupyter Notebooks allow for the creation and sharing of documents that contain live code, equations, visualizations, and narrative text. Jupyter Notebooks are a well-established and well-recognized tool in academia and education in general, as well as in specific fields of research where it is important to provide for reproducibility of scientific results.
Goal of the project is to develop dedicated Jupyter Notebooks for specific course content relevant in the context of our courses (MOD, MCM, MST, MRE, MRS). The approach can be based on the existing framework that we already use for Jupyter Notebooks in some of our courses, but may also further improve or suggest new solutions for the framework as such. The selection of the programming language to be used needs to meet the requirements of the course content, most probably Python, but is in fact very flexible, as Jupyter Notebooks work with a variety of languages.
Mandatory requirement: The student must have understood the course content/material very well and should have already passed the course.
- Technology: Jupyter Notebook, Python, Jupyter Notebook Hub of the CS faculty, Markdown, VS Code (or similar IDE)
- Tags:
- Contact: Wolfgang Klas
WK02 - FactCheck - Precision Metrics
FactCheck is a framework for the detection and resolution of conflicting structured data on the Web. The FactCheck framework is the result of ongoing research at our research group. One of the central building blocks is the context-dependent comparison of structured data of various representations of one and the same real-world object or artefact. The comparison is guided by so-called precision metrics, a flexible and sophisticated technique for logically comparing structured data values. Precision metrics consist of logical predicates used to evaluate the comparison of structured data. Goal of the project is to design and implement an appropriate model for the representation of precision metrics, the construction of such precision metrics, as well as the application of the metrics for evaluating the comparison of data values. Various precision metrics should be defined and compared using a test dataset of 900,000 entities. Results of the project are to be demonstrated by a running demo application.
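To give a flavour of what a precision metric could look like, the following Python sketch models one as a combination of logical predicates; the function names and the attribute layout are purely illustrative and not the actual FactCheck model:

```python
from datetime import date

# Hypothetical predicates a precision metric could be built from
def exact_match(a, b) -> bool:
    return a == b

def within_days(a: date, b: date, tolerance: int) -> bool:
    return abs((a - b).days) <= tolerance

def birthdate_metric(entity_a: dict, entity_b: dict) -> dict:
    """A precision metric as a named combination of predicates over one attribute."""
    a, b = entity_a["birthdate"], entity_b["birthdate"]
    return {"strict": exact_match(a, b), "lenient": within_days(a, b, tolerance=1)}

# Two representations of the same person with a one-day discrepancy
result = birthdate_metric({"birthdate": date(1970, 5, 1)},
                          {"birthdate": date(1970, 5, 2)})
print(result)  # -> {'strict': False, 'lenient': True}
```

The project would generalize this idea into a proper representation model for such metrics, including their construction and evaluation over the test dataset.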
- Technology: Web Services, Semantic Web technologies, LOD, Microformat, JSON-LD, AI-Tools, Docker
- Provided to the students: existing implementation of framework, test dataset
- Tags:
- Contact: Wolfgang Klas and Daniel Berger
WK03 - Demo of Blockchain Application Using Ethereum
The goal of this project is the implementation of a demo application which illustrates the concept of a consensus technique such as proof-of-stake or Clique (proof-of-authority), but not the frequently used proof-of-work as employed, e.g., in the Bitcoin blockchain. For example, a possible application could be the implementation of the four-eyes principle (Vier-Augen-Prinzip) for officially approving documents by making use of two signers acting as proof-of-authority validators. Many other application scenarios are feasible, e.g., the decision-taking principles of a management board of an association or a company, a board of managers, or a board of trustees or directors. The application scenario should be well-chosen in order to illustrate the general principle of proof-of-authority. It may be based on a generic, configurable implementation to show different variations of the proof-of-authority concept, e.g., 1 signer, 2 signers, N signers. The demo application has to be realized such that a short demonstration movie can be recorded, which will be published on the Lab's website.
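The approval logic such a smart contract would enforce can be previewed off-chain; this Python sketch (with invented signer names) mimics the configurable N-signer variant, whereas in the real project this logic would live in an Ethereum smart contract:

```python
# Off-chain preview of the approval rules; signer names are invented placeholders
AUTHORITIES = {"alice", "bob", "carol"}   # configured proof-of-authority signers
REQUIRED = 2                              # four-eyes principle: two distinct signers

approvals: dict = {}                      # document id -> set of collected signers

def sign(doc_id: str, signer: str) -> bool:
    """Record a signature; return True once the document counts as approved."""
    if signer not in AUTHORITIES:
        raise ValueError(f"{signer} is not a registered authority")
    approvals.setdefault(doc_id, set()).add(signer)
    return len(approvals[doc_id]) >= REQUIRED

print(sign("doc-42", "alice"))  # -> False (first of two required signatures)
print(sign("doc-42", "bob"))    # -> True  (four-eyes principle satisfied)
```

Setting REQUIRED to 1 or N directly yields the other variations of the proof-of-authority concept mentioned above.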
- Technology: Ethereum, Web technologies, Docker
- Tags:
- Contact: Wolfgang Klas
WK04 - Roon API-based music exploration service available via MCP server
Roon (https://roon.app/en/) is a premium application to manage large, high quality music libraries. It supports browsing, discovery, collecting, and listening to music. One of the unique values of Roon is its rich metadata (see the Metadata Model used by Roon https://help.roonlabs.com/portal/en/kb/articles/metadata-model ) used to organize and describe music. Roon consists of a server and a client, offering access to locally stored music and to high-end/high-res streaming services Qobuz and Tidal. Roon can be run on local hardware (laptops, NUC, NAS systems, etc.) or dedicated pre-configured music server products.
The goal of the project is to implement an application for the exploration of metadata on music managed by Roon. The application should make use of the Roon API (https://github.com/RoonLabs/node-roon-api ). The services to explore music metadata should be made available via an MCP server (https://modelcontextprotocol.io/docs/learn/architecture ) that communicates with an MCP client, which is hosted by a regular Web App (an AI component is not required and would be optional; the functionality is to be illustrated by means of a comprehensive demo application). The various layers and components need to be designed and implemented in a highly modular architecture to enable re-use of those components in other applications (like the FactCheck system). Such a modular architecture also allows the inclusion of sources other than Roon, e.g., metadata from MusicBrainz (https://musicbrainz.org ) or Discogs (https://www.discogs.com ) via their APIs, made available via MCP servers. The latter sources are to be included optionally in the project (depending on SPBA, P1, P2).
- Technology: Roon, Roon API, MCP, Web technology, Python, JavaScript, MusicBrainz API, Discogs API
- Tags:
- Contact: Wolfgang Klas
WK06 - "Studienleistungs & Prüfungspass" Based on Ethereum Blockchain Technology
The goal of this project is - starting out from a given demo implementation - to implement an application for a digital "Studienleistungs- & Prüfungspass" (study performance & examination pass) based on blockchain technology. The pass will record the individual required study achievements (like milestones, tests, etc.) during a course, the final grading of a course, and the collection of gradings of courses during the entire study (like the "Sammelzeugnis" currently used by the university). There are various stakeholders in this scenario: the students, the lecturers of courses, and the administration (like the SPL). The implementation has to be realized based on Ethereum blockchain technology, which provides the concept of Smart Contracts. Ethereum Smart Contract technology is one of the most promising implementations for smart behavior of blockchain systems. The focus will be on the proper design and implementation of smart contracts to capture most of the functionality of the application.
- Technology: Ethereum Blockchain Infrastructure, on Linux or Windows, or on Cloud Infrastructure, Web-Technologies for implementing the Web-based application, Docker.
- Provided to the students: Optionally, virtual machine
- Tags:
- Contact: Wolfgang Klas
WK07 - Securing Images or Videos by Applying Blockchain Technology
The goal of this project is the design and the implementation of a framework based on blockchain technology that allows for the detection of manipulations in images or videos. Images or videos can be manipulated, e.g., individuals (or other objects) can be added to or removed from an image, video frames (or sequences of video frames) can be added to or removed from a video. Such a manipulation should be detected to ensure authenticity of content.
The manipulations should be detected based on the storage of specific image encoding parts in a blockchain which allows to re-check the validity of an image encoding. E.g., essential macroblocks or portions of some macroblocks of a JPEG-encoded image could be stored in a blockchain such that it can be checked whether an image still consists of those macroblocks or includes manipulated macroblocks. The project will first have to select and specify the kind of content type (e.g., images, videos) and the kind of manipulations to be considered in the scope of the project. Then one needs to design an approach and a software framework and implement a prototype and a demo application illustrating the approach, based on a specific blockchain platform that best suits the needs of the application.
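The core detection idea can be sketched without any blockchain: hash fixed-size blocks of the encoded image, treat the digests as the values that would be anchored on-chain, and recompute them later. The block size and byte data below are arbitrary placeholders, not real JPEG macroblocks:

```python
import hashlib

BLOCK_SIZE = 16  # arbitrary demo block size; real macroblocks are pixel-domain units

def block_digests(data: bytes):
    """Hash fixed-size blocks of encoded image data; these digests would be stored on-chain."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]

original = bytes(range(64))              # placeholder for JPEG-encoded image bytes
stored_on_chain = block_digests(original)

# Simulated manipulation: flip one byte inside the third block
tampered = bytearray(original)
tampered[35] ^= 0xFF
current = block_digests(bytes(tampered))

altered = [i for i, (a, b) in enumerate(zip(stored_on_chain, current)) if a != b]
print("Manipulated block indices:", altered)  # -> [2]
```

The project would replace the naive byte blocks with a principled selection of macroblocks (or perceptual hashes thereof) and store the digests on the chosen blockchain platform.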
- Technology: Blockchain Infrastructure (like Ethereum) on Linux, Windows, or on Cloud Infrastructure, JPEG, MPEG, Web-Technologies for implementing Web-based demo application, Docker, Image processing libraries like https://imagemagick.org/, using, e.g., perceptual hash support, e.g.: https://github.com/science-periodicals/phash-imagemagick
- Provided to the students: Optionally, virtual machine
- Tags:
- Contact: Wolfgang Klas
WK08 - Securing Images or Videos by Applying Cryptographic Tools
The goal of this project is the design and the implementation of a framework based on cryptography that allows for the detection of manipulations in images or videos. Images or videos can be manipulated, e.g., individuals (or other objects) can be added to or removed from an image, video frames (or sequences of video frames) can be added to or removed from a video. Such a manipulation should be detected to ensure authenticity of content.
The manipulations should be detected based on cryptographic tools. Devices (software) that record scenes will need to include cryptographic signatures that later allow verifying that the recordings have not been altered. A cryptographic hash of the image, video, etc., signed with the device's private key, will no longer verify if the item is altered in any way. Manufacturers would need to support a service that allows verification through the public keys associated with the device that produced the image/video. The project will first have to select and specify the kind of content type (e.g., images, videos) to be considered in the scope of the project. Then one has to design a software framework and implement a prototype and a demo application illustrating the approach for a service that allows verification through the public keys associated with the device (software) that produced the image/video.
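A simplified sketch of the sign-at-capture/verify-later flow, using an HMAC with a device secret as a stand-in for a real public-key signature (the actual project would use asymmetric keys and a PKI as described above):

```python
import hashlib
import hmac

DEVICE_KEY = b"per-device secret"  # placeholder; stands in for the device's private key

def sign_recording(data: bytes) -> str:
    """Device side: attach a keyed signature at capture time."""
    return hmac.new(DEVICE_KEY, data, hashlib.sha256).hexdigest()

def verify_recording(data: bytes, signature: str) -> bool:
    """Verifier side: recompute the signature and compare in constant time."""
    return hmac.compare_digest(sign_recording(data), signature)

frame = b"\xff\xd8 placeholder jpeg bytes"
sig = sign_recording(frame)
print(verify_recording(frame, sig))         # -> True  (untouched recording)
print(verify_recording(frame + b"x", sig))  # -> False (any alteration is detected)
```

With asymmetric keys, signing would stay on the device while verification needs only the manufacturer-published public key, which is exactly the service the project is to prototype.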
- Technology: Public/Private Key infrastructure, Linux, Windows, or Cloud Infrastructure, JPEG, MPEG, Web-Technologies for implementing Web-based demo application, Docker, Image processing libraries like https://imagemagick.org/, using, e.g., perceptual hash support, e.g.: https://github.com/science-periodicals/phash-imagemagick
- Provided to the students: Optionally, virtual machine
- Tags:
- Contact: Wolfgang Klas
WK09 - FactCheck - IdaFix Browser-Extension UI for a Chatbot
The FactCheck framework is designed to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects and opportunities in its development and implementation.
This project aims to find a solution for a user interface which allows an end user visiting a web page to understand the comparison results on conflicting information, as well as to provide user feedback on the FactCheck system behaviour. The interface should be realized as an interactive chatbot. The starting point for the project is a prototypically implemented browser extension (IdaFix) which illustrates the functionality as well as the internal system API to be used.
- Technology: Web Browser technologies, e.g., JavaScript, HTML & CSS, Browser APIs (e.g., WebExtensions API), Background Scripts, Content Scripts, Popup Scripts, Messaging APIs, JSON-LD
- Tags:
- Contact: Wolfgang Klas and Marie Aichinger
AH01 - Structured Data Extraction from Unstructured Web Content
In many real-world applications, extracting structured data from unstructured text on web pages is a critical task. While numerous Natural Language Processing (NLP) approaches claim to handle this challenge, implementing a robust and scalable solution remains a key area of exploration. This project focuses on leveraging state-of-the-art information extraction models to build a relation extraction web service capable of transforming unstructured text into structured data.
- Develop a Relation Extraction Web Service:
  - Implement a web service that processes web pages as input and outputs extracted triplets (subject, predicate, object) based on a pre-defined schema.
  - Use cutting-edge NLP and information extraction techniques to ensure accuracy and scalability.
- Create a User-Friendly Web Application:
  - Design a front-end web application that interacts with the web service.
  - Provide an intuitive interface for users to input web pages and view the extracted structured data.
In Praktikum 1 (P1), you will extract structured data from Wikipedia articles and compare the extracted data to Wikidata. This practicum will focus on understanding the relationship between unstructured and structured data and evaluating the accuracy of the extraction process. For Praktikum 2 (P2), you will build an extraction module for FactCheck that:
- Scrapes web page content.
- Extracts information about individuals mentioned on the page.
- Groups the extracted triplets by individual.
- Uses the FactServer's compare endpoint to validate and compare the extracted information with existing data.
In this project you will:
- Gain hands-on experience with cutting-edge NLP and information extraction models.
- Learn how to bridge the gap between unstructured and structured data.
- Work on real-world applications like Wikipedia data analysis and conflict detection systems.
- Develop skills in web service development, front-end design, and system integration.
- Technology: Web Application, Web Service, Python, JavaScript, Hugging Face Transformers Library, SpaCy Library, PyTorch, Schema.org
- Tags:
- Contact: Adrian Hofer
AH02 - Unified Entity Linking Web Service
FactCheck is a framework for detecting and resolving conflicting data on the Web. It establishes an entire fact comparison process that consists of data acquisition, data comparison, the presentation of comparison results, and comprehensive analysis functions. FactCheck is a leading research topic of our research group and bears challenges in many aspects. To enhance data acquisition, your task is to link the extracted information to existing knowledge bases.
Named Entity Recognition (NER) and Entity Linking (EL) are critical components of natural language processing (NLP) applications, enabling machines to identify and link entities (e.g., people, places, organizations) in text to structured knowledge bases. However, the vast array of available entity linking APIs—such as DBpedia Spotlight, WAT, and Stanford NLP—often yield inconsistent results due to differences in their underlying algorithms and datasets. This inconsistency poses a challenge for developers and researchers seeking reliable and unified entity linking solutions.
This project aims to address this challenge by creating a web service that integrates multiple entity linking tools, allowing users to configure and combine them flexibly. The service will be showcased through an intuitive web application, enabling users to link named entities in web pages or text using customizable configurations.
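A very simple combination strategy for inconsistent linker outputs is a majority vote across the configured tools; the linker results below are invented for illustration, and a real configuration would of course query the actual APIs:

```python
from collections import Counter

# Invented outputs of three entity linkers for the same mention
linker_results = {
    "dbpedia_spotlight": "dbpedia:Paris",
    "wat": "dbpedia:Paris",
    "stanford_nlp": "dbpedia:Paris_Hilton",
}

def majority_link(results: dict) -> str:
    """Combination strategy: majority vote over the links proposed by the configured tools."""
    return Counter(results.values()).most_common(1)[0][0]

print(majority_link(linker_results))  # -> dbpedia:Paris
```

The web service would let users choose and weight such strategies (majority vote, confidence-weighted, tool priority) per configuration.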
- Technology: Web Application, Web Service, Python, JavaScript, Hugging Face Transformers Library, SpaCy Library, PyTorch, Schema.org
- Tags:
- Contact: Adrian Hofer
AH03 - Intelligent Web Scraping
The web is vast and diverse, making the task of scraping web pages both challenging and intricate. Yet, extracting meaningful content from web pages is crucial for applications like conflict detection, where identifying discrepancies between sources requires precise and structured data. This project focuses on developing advanced web scraping techniques that can segment web pages into meaningful sections and extract user-relevant content with configurable granularity and depth.
For Praktikum 1 (P1), build a news aggregator for a selected subset of web pages. This will involve designing a system to collect, organize, and display news content from multiple sources in a user-friendly format. In Praktikum 2 (P2), you will develop a web scraping tool that integrates with the FactCheck server. This tool will extract content from web pages and connect it to a fact-checking system to identify and analyze potential conflicts between sources.
In this project you will:
- Gain hands-on experience with web scraping techniques and tools.
- Learn how to handle the complexities of heterogeneous web content.
- Enhance your skills in data processing, conflict detection, and system integration.
This project is ideal for students interested in web technologies and building impactful tools for information analysis.
- Technology: Web Service, Web Application, Scraping, Crawling, Conflict Detection
- Tags:
- Contact: Adrian Hofer
AH04 - Model Context Protocol for Information Extraction
The Model Context Protocol (MCP) is an open protocol that enables seamless integration between LLM applications and external data sources and tools. MCP is a framework for managing interactions between AI models and their context, making it particularly useful for tasks like information extraction. We want to investigate the capability of that tool for an application in our research context.
Develop a versatile system that processes unstructured data from various sources, such as documents, real-time streams, or conversational transcripts, to extract and organize key information into structured, actionable formats. The system should leverage context-aware capabilities to summarize content, identify relationships, track evolving details, and provide users with interactive tools for querying and visualizing insights, while maintaining contextual consistency across multiple inputs.
- Technology: Web Service, Web Application, MCP, LLM
- Tags:
- Contact: Adrian Hofer
AH05 - Abstract Meaning Representation
Understanding how computers interpret and represent human language is a key challenge in natural language processing. By working with AMR and RDF, this project introduces students to two important frameworks for semantic representation. Using the py_amr2fred library, the project demonstrates how website text can be systematically converted into JSON-LD, offering a practical example of bridging unstructured text with structured semantic data. The resulting application not only highlights the conversion pipeline but also provides valuable insights into the role of semantic technologies in knowledge representation.
- Technology: Web Application, Web Service, Python, amr2fred, rdflib
- Tags:
- Contact: Adrian Hofer
DB 01 - FactCheck - Customizable Comparison Function Logic (target group P1/P2)
FactCheck is a framework for detecting and resolving conflicting data on the Web. It establishes an entire fact comparison process that consists of data acquisition, data comparison, the presentation of comparison results, and comprehensive analysis functions. FactCheck is a leading research topic of our research group and bears challenges in many aspects.
We define facts as pieces of information that are published by data providers (e.g., as textual content in their website(s)). If two or more websites publish data on the same topic, we humans can compare the data critically. However, this task is quite difficult for a machine, as it does not have an inherent understanding of semantics.
A comparison between data points may appear simple. However, multiple functions may handle such a comparison at a time. Depending on the function design (and parameters), the yielded result may differ strongly. E.g., a comparison function focused on date comparison may return a boolean value (true/false) if the compared dates are the same, or the number of days in between the two dates.
The goal of this project is to develop a customizable comparison logic that allows experts to express how facts should be compared through formulas. These formulas shall orchestrate comparison functions using several operations (boolean, algebraic), and the system should validate the created formulas along multiple dimensions (syntax, semantics, safety). The final formulas should then be able to be stored in a database and executed on demand.
To ensure a user-friendly creation process, the expert user shall be informed about potential issues during the formula generation, and issues during the final validation steps shall be logged by the system, ensuring transparency and traceability of the process. Using this customizable comparison function logic, the project aims to empower experts with the tools needed to handle complex fact comparison scenarios effectively.
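The date example above can be sketched directly: two hypothetical comparison functions (one boolean, one algebraic) are orchestrated by a formula combining them with boolean operations. Function names and thresholds are illustrative only:

```python
from datetime import date

# Two hypothetical comparison functions an expert could register
def same_date(a: date, b: date) -> bool:      # boolean result
    return a == b

def days_between(a: date, b: date) -> int:    # algebraic result
    return abs((a - b).days)

# A formula orchestrating both: "dates match, or differ by at most 2 days"
def formula(a: date, b: date) -> bool:
    return same_date(a, b) or days_between(a, b) <= 2

print(formula(date(2024, 1, 1), date(2024, 1, 3)))  # -> True
print(formula(date(2024, 1, 1), date(2024, 1, 9)))  # -> False
```

The project's task is to let experts express such formulas declaratively, validate them (syntax, semantics, safety), store them in a database, and execute them on demand.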
- Technology: Python, MongoDB, Docker, Web Application, Web Services
- Tags:
- Contact: Daniel Berger
DB 03 - FactCheck - Dynamic Code Execution for Fact Comparison (target group P1/P2)
FactCheck is a framework for detecting and resolving conflicting data on the Web. It establishes an entire fact comparison process that consists of data acquisition, data comparison, the presentation of comparison results, and comprehensive analysis functions. FactCheck is a leading research topic of our research group and bears challenges in many aspects.
We define facts as pieces of information that are published by data providers (e.g., as textual content in their website(s)). If two or more websites publish data on the same topic, we humans can compare the data critically. However, this task is quite difficult for a machine, as it does not have an inherent understanding of text and its semantics.
A comparison between two data points may appear simple. However, we often encounter objects that contain multiple attributes and relationships. To be able to compare these objects, a more complex structure for comparison shall be created. Experts may also use individualized code for their comparison strategies.
The goal of this project is to develop a customizable execution framework that allows experts to write, execute, and manage custom code for fact comparison. This framework should support the integration of execution results and logging information into the system, ensuring transparency and traceability. By enabling dynamic code execution, the project aims to empower experts with the tools needed to handle complex fact comparison scenarios effectively.
- Technology: Python, Flask, Azure Cloud Services, Schema.org, REST, RabbitMQ
- Tags:
- Contact: Daniel Berger
DB 04 - Game-Based Learning (target group SPBA)
Game-based learning leverages the principles of serious gaming to create engaging and effective educational experiences. This project involves designing and developing a game-based learning system tailored to the content and learning goals of a specific lecture or course. The system should focus on how the content is presented, the mechanics used to engage learners, and how the achievement of learning goals is assessed.
Key aspects to explore include defining the learning objectives, designing game mechanics that align with these objectives, and ensuring the content is delivered in an interactive and meaningful way. The project should evaluate whether the learning goals are met and how the game enhances the educational experience.
The goal of this project is to research modern game-based learning techniques and to translate selected lecture material into a gamified experience. The application should test skills in selected topics, provide assistance in case a module is failed, and reward the user for successfully completing tasks.
- Technology: Game Engines (Godot, Unreal, Unity, etc.), Web Technologies, Mobile Technologies
- Tags:
- Contact: Daniel Berger
DB 05 - Multimedia Management and Playback system (target group SPBA)
Organizing and managing large collections of multimedia content, such as images, videos, and audio files, can be a complex task. This project involves developing a multimedia management system that focuses on features like automatic tagging, sorting, and metadata extraction based on media content. The system should also allow users to insert, edit, and manage metadata to enhance organization and searchability.
In addition to management, the system should support the display and playback of media objects, providing users with a seamless way to interact with their collections. Graph-based structures can be explored to represent relationships between media objects, enabling features like recommending the next media object to play based on these relationships. The system should be intuitive, user-friendly, and adaptable to various use cases.
- Technology: UML, Azure Cloud Services, RDF / Knowledge Graphs
- Tags:
- Contact: Daniel Berger
DB 06 - Mobile App: Route finder for large buildings (target group SPBA)
Navigating large or complex buildings can be challenging, especially for individuals with specific accessibility needs. This project aims to develop a mobile application and framework to create digital maps that assist users in navigating unfamiliar buildings. Administrators will be able to map out buildings using 2D/3D tools, while users can select start and end points to receive navigation guidance. The system should consider user preferences, such as avoiding stairs or inaccessible routes, and provide a seamless navigation experience.
The application can be adapted for various use cases, such as shopping centers, hospitals, universities, or airports, and should focus on improving accessibility and ease of use for all visitors.
- Technology: UML, Mobile Technologies, Web Technologies, Firebase
- Tags:
- Contact: Daniel Berger
DB 07 - UML Model Analyzer (target group SPBA)
Analyzing and syntactically validating UML models from non-editable formats is a highly complex and advanced challenge. This project involves developing a tool that can read UML models (e.g., Use Case, Class Diagram, ER Diagram, Sequence Diagram) from fixed formats such as JPEG, digitally reconstruct the model, and check its validity. The focus is on exploring methods to interpret and analyze these models, reconstruct their structure, and assess their correctness.
Given the advanced nature of the task, the project encourages a thorough exploration of potential approaches, evaluating their feasibility, and pushing the boundaries of what is achievable. While the problem is challenging, the goal is to make meaningful progress and critically assess the strengths and limitations of the chosen methods.
- Technology: UML, Azure Cloud Services, Computer Vision
- Tags:
- Contact: Daniel Berger
MSA02 - FactCheck - A Comparison of Frontend Frameworks for Web Extensions
Note: Previous experience with at least one frontend framework is recommended.
The FactCheck framework aims to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects and opportunities in its development and implementation.
Frontend frameworks, such as Angular, React, or Vue, have become essential for building responsive, modular web applications. Increasingly, they are also used in Web Extensions (browser add-ons that add new functionality or enrich existing functionality). Your task is to reimplement our browser extension IdaFix, which currently relies on an older frontend framework, using modern frontend frameworks. As part of your work, you will...
- redesign the user interface of IdaFix
- research about, and compare, various frontend frameworks
- reimplement (parts of) IdaFix using two or more frontend frameworks of your choice
- compare the implementations, and reflect on their similarities and differences in a written report
- Technologies: AngularJS, HTML, CSS, JavaScript, TypeScript, Manifest V3, Angular, React, Vue
- Tags:
- Contact: Marie Aichinger
MSA03 - FactCheck - Semantic Search for Fact Data
Recommended prerequisite: Multimedia and Semantic Technologies (MST)
The FactCheck framework aims to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects and opportunities in its development and implementation.
Currently, FactCheck collects information from the Web via IdaFix and dedicated crawlers, and provides the resulting insights via a Web API that serves, for the most part, JSON. Supporting complex, semantically rich queries over the collected data would help deliver our results in a semantic-web-friendly way. Your task will be to enrich our existing FactCheck prototype(s) with semantic web technologies. As part of your work, you will...
- revisit our current document-based data model, and redesign it as a triple/graph-based model more closely aligned with semantic web standards like RDF or OWL; this may additionally involve...
- finding suitable vocabularies (e.g., RDF-Cube)
- writing a script to automate the conversion from the document-based model to your new triple-based one
- enriching our existing data set with data collected from LOD collections (e.g., DBpedia)
- investigate a suitable storage solution (e.g., a triple store such as Apache Jena or RDF4J) for storing the redesigned data
- enable semantic search by configuring a suitable SPARQL endpoint and interface (e.g., Virtuoso, YASGUI), and optionally also hosting a customized version of DBpedia Lookup
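The document-to-triple conversion step can be sketched in a few lines. The base URI and field names below are hypothetical placeholders, not FactCheck's actual schema; a real conversion script would additionally emit properly datatyped literals via rdflib and map keys to the vocabulary you select:

```python
def doc_to_triples(doc, base="http://factcheck.example/"):
    """Flatten one document-based fact record into subject-predicate-object
    triples, ready to be loaded into rdflib or a triple store."""
    subject = base + "fact/" + str(doc["id"])
    triples = []
    for key, value in doc.items():
        if key == "id":
            continue  # the id becomes the subject URI, not a property
        if isinstance(value, list):
            # multi-valued fields naturally become one triple per value
            triples.extend((subject, base + "prop/" + key, str(v)) for v in value)
        else:
            triples.append((subject, base + "prop/" + key, str(value)))
    return triples
```

For example, a record `{"id": 42, "entity": "Vienna", "sources": ["a.org", "b.org"]}` yields three triples sharing the subject `http://factcheck.example/fact/42`.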
- Technologies: Python, rdflib, SPARQL, Docker, Java
- Tags:
- Contact: Marie Aichinger
MSA04 - FactCheck - Design and Deploy a Scalable Statistical Framework for Fact Data
The FactCheck framework aims to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects and opportunities in its development and implementation.
A core aspect of FactCheck is the generation of statistical insights and metrics from the crawled fact data. Your task is to reimplement our existing statistics API as a scalable stand-alone application using an (ideally Python-based) technology stack of your choice (e.g., PySpark, Pandas, NumPy). The key steps will involve...
- Fact Data Exploration: Explore our existing fact database and familiarize yourself with our data model.
- Fact Data Extraction: Extract several thousand fact records from our existing database to use as your starting dataset, and adapt the data schema if needed.
- Metrics: Develop new or refine existing metrics from the fact data.
- API Reimplementation: Rebuild the statistics API from the ground up. Optionally, you may also create an interface that showcases its abilities.
- Deploy and Test: Deploy and test your newly developed solution alongside our server using Docker.
If needed, a suitable virtual machine will be provided to you. Depending on your strengths and interests, you may focus on...
- selected data science aspects - e.g., orchestration, generation of statistics, data wrangling
- the safe and dynamic execution of semantically rich statistics
- creating a frontend (e.g., a dashboard, a Jupyter or Marimo notebook) to visualize your metrics
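As a flavor of what such a metric might look like, here is an illustrative (not an existing FactCheck) statistic: the share of sources that agree on the most commonly reported value for one fact. A production version would compute this vectorized over the full dataset with Pandas or PySpark:

```python
from collections import Counter

def agreement_ratio(observations):
    """Share of sources reporting the most common value for one fact.

    observations: iterable of (source, value) pairs for the same fact."""
    counts = Counter(value for _, value in observations)
    if not counts:
        return 0.0  # no observations -> no agreement to measure
    return counts.most_common(1)[0][1] / sum(counts.values())
```

For instance, three sources reporting "8.9M", "8.9M", and "9.1M" for the same fact give an agreement ratio of 2/3.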
- Technologies: Python, CouchDB, PySpark, Pandas, NumPy, Jupyter Notebooks, marimo, Docker
- Tags:
- Contact: Marie Aichinger
MSA05 - FactCheck - Serious Games for Information Comparison
The FactCheck framework aims to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects and opportunities in its development and implementation.
Serious games (and, more broadly, gamification) refer to applications with a primary purpose beyond entertainment - such as teaching new skills, crowd-sourcing data, or engaging users with a system in new and innovative ways. Your task is to explore the use of serious games and gamification elements for FactCheck. First, you will identify which aspect(s) of FactCheck you would like to gamify; you will then design and implement a prototypical serious game using a technology stack of your choice (e.g., a progressive Web app) that allows the game to run on the Web and communicate with our APIs and databases.
Aspects you may gamify include…
- Entity Resolution and Linking: given our existing fact data and entity resolution results, have users verify existing entity linking results, or perform the linking themselves using an interactive interface
- Fact Data Exploration: given our existing fact data, provide a gamified interface for users to explore and compare facts
- Feedback on Comparison Results: given our comparison API, have users compare data from various websites, and allow them to give feedback tailored towards improving our comparison processes
Alternatively, you may design a game that, using FactCheck concepts and APIs, teaches players about critical thinking, media literacy, or statistical literacy.
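To make the idea concrete, a game loop could be as simple as the following hypothetical mechanic (the scoring rules are invented for illustration; in the actual prototype, the conflicting values and the ground truth would come from the FactCheck comparison API):

```python
class FactQuiz:
    """Round-based quiz: the player picks which of several conflicting
    reported values they believe is correct; streaks earn bonus points."""

    def __init__(self):
        self.score = 0
        self.streak = 0

    def answer(self, picked, correct):
        """Score one round and return the points gained."""
        if picked == correct:
            self.streak += 1
            gained = 10 * self.streak  # growing bonus rewards consistency
        else:
            self.streak = 0  # a wrong pick resets the streak
            gained = 0
        self.score += gained
        return gained
```

Even a mechanic this small already generates useful crowd-sourced signals: each answer is implicit user feedback on which reported value people find credible.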
- Technologies: JavaScript, TypeScript, Angular, React, WebGL, Docker
- Tags:
- Contact: Marie Aichinger
MSA06 - FactCheck - Observability/Telemetry Framework
The FactCheck framework aims to address the issue of conflicting data on the Web by providing a systematic approach to detect and resolve such discrepancies. It encompasses the entire fact comparison process, including data acquisition, comparison, presentation of results, and advanced analysis features. As a pioneering research initiative of our research group, FactCheck presents several challenging aspects and opportunities in its development and implementation.
Observability (O11y) describes the ability to understand the internal state of a system using only its outputs (e.g., logs, or metrics such as CPU usage or average response time). As the FactCheck framework grows and becomes more distributed, the ability to debug and troubleshoot from persisted logs alone becomes increasingly difficult and time-consuming. Your task is to implement a robust, scalable O11y framework using Grafana tools (e.g., Alloy, Loki) and other technologies (e.g., Prometheus). In your project, you will...
- gain an overview of the FactCheck prototype, and choose one component (for P1 projects) or at least two components from which you will collect telemetry data
- learn about key O11y concepts, and familiarize yourself with potential technologies to be used
- leverage existing telemetry data (e.g., logs), and/or implement new telemetry data for your chosen component(s) using zero-code and/or code-based instrumentation
- analyze and visualize the collected data by means of Grafana's dashboard creator
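The essence of code-based instrumentation can be sketched without any SDK. The decorator below records per-call latency, mimicking what a Prometheus histogram or an OpenTelemetry metric instrument would collect; the metric name and the `crawl` function are hypothetical, and the real project would use the OpenTelemetry SDK and export to the Grafana stack instead of an in-memory dict:

```python
import time
from collections import defaultdict

METRICS = defaultdict(list)  # metric name -> recorded observations

def timed(metric_name):
    """Record the wall-clock latency of each call under metric_name."""
    def decorate(fn):
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:  # record latency even when the call raises
                METRICS[metric_name].append(time.perf_counter() - start)
        return wrapper
    return decorate

@timed("crawl_duration_seconds")
def crawl(url):
    time.sleep(0.01)  # stand-in for real crawling work
    return "fetched " + url

crawl("https://example.org")
```

Zero-code instrumentation achieves the same effect without touching the application source, by attaching agents or auto-instrumentation libraries at startup.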
- Technologies: Python, JavaScript, OpenTelemetry, OpenMetrics, Grafana, Prometheus, Docker
- Tags:
- Contact: Marie Aichinger
(B) Topics of Master Theses
Please check the listing below for possible topics for a master's thesis. In principle, you may also choose from the topics listed in Section (A) above. Those topics are available for a master's thesis as well, but usually in a more expanded or advanced form.
- FactChecking: Models and Languages of Precision Metrics for comparing facts on the Internet.
- FactChecking: Flexible, configurable framework for crawlers for extracting facts from web pages.
- FactChecking: AI-based text analysis tools for extracting facts from the Internet.
- FactChecking: Multimedia content (images, audio, video) analysis tools (including the use of Azure AI tools and services) for extracting facts from the Internet.
- FactChecking: Analysis of cloud-based storage systems/services and design of a storage framework for a FactChecking prototype.
- FactChecking: Analysis and extraction of structured information from videos using state-of-the-art AI technology
- FactChecking: Analysis and extraction of structured information from images using state-of-the-art AI technology
- FactChecking: Analysis and extraction of structured information from text on the Web (news articles, scientific articles, Wikipedia, movie descriptions, etc.) using state-of-the-art AI technology and methods such as named entity recognition, key phrase recognition, and finding linked entities.
- FactChecking: Analysis and extraction of structured information on music-related information (e.g., album, artist, composer, release date, genre) using dedicated music cloud services or tools/APIs (e.g., MusicBrainz, Roon), following the schemas available from, e.g., https://musicbrainz.org , https://www.discogs.com , or the Metadata Model used by Roon ( https://help.roonlabs.com/portal/en/kb/articles/metadata-model ).
- Interactive advanced course content components based on Jupyter Notebooks for a dedicated course (e.g., MRE, MRS, MCM, MST, DMP) offered in the Bachelor's or Master's program.
... additional, new topics will become available in the near future. For Master Thesis topics, you may also contact Prof. Klas or a researcher of the MIS group to find out more about possible topics.
