Thesis Topic Opportunity (Fall 2024)

Neo4j

  • Malmö
  • Permanent
  • Full-time
  • 2 months ago
Job Overview:
Are you at the end of your studies and want to immerse yourself in graph technology? We are now looking for students who want to do their Master's Thesis alongside us at Neo4j!
As part of Neo4j engineering in Malmö, you will work with a diverse team of talented colleagues worldwide. You will receive advice and continuous support from us - we are experts in graph technology and positioned to help you perform to the best of your ability.

Past Thesis Topics:

Predicting Loss of Fault Tolerance in a Cloud Graph Database:
With the development of cloud computing, applications that are hosted in the cloud and used over the internet have become increasingly popular. In order to keep the system operational and prevent loss of data in case of failure, many systems adopt fault tolerance. Fault tolerance is defined as a system's ability to continue operating without loss of functionality when one or more of its components fail. Therefore, it becomes essential to be able to detect and predict when a system is at risk of losing fault tolerance. Every anomalous behaviour in a system is a potential cause of an incident that can lead to the system losing this quality. Detecting the anomalies that can contribute to a loss of fault tolerance makes it possible to prevent such incidents.

Random Generation of Semantically Valid Cypher Queries:
Database management systems (DBMS) are integral tools at the center of many software applications, which means that these applications are deeply dependent on the correctness of their DBMS. In recent years, graph DBMSs have seen a significant rise in popularity, but they have not received the same amount of academic attention when it comes to testing as their relational counterparts. The most popular graph DBMS is Neo4j, which has its own query language called Cypher. In this thesis, we present a tool that generates random, semantically correct Cypher queries. This query generator has a versatile set of use cases and is built to be configurable; in this thesis we have focused on using it for random testing of the Neo4j DBMS. Random testing of a DBMS means generating random but correct queries, executing them on the database, and then checking whether the output is incorrect, which can be accomplished in a few different ways.

Finding Candidate Node Pairs for Link Prediction at Scale:
There are methods for inferring whether a pair of people are likely to become friends or whether two kinds of drugs are likely to interact if consumed simultaneously. These methods solve the problem of link prediction, i.e. they answer the question "Is a link (friendship, interaction) likely to form between two particular nodes (people, drugs)?". Generalizing the problem to graphs translates it to predicting whether particular node pairs are likely to form links. As predicting links between all possible node pairs is computationally infeasible for larger graphs, methods for narrowing down the search space are required to solve the problem efficiently. We propose a novel algorithm, DAPPR, for resolving this issue and compare it against an existing solution, LinkWaldo, along with breadth-first search and a variant of KNN. The algorithms are evaluated by their ability to find hidden edges on real-world graphs, and it is shown that DAPPR outperforms all compared algorithms.

Preserving Availability in a Consensus Module Using Back Pressure:
In distributed systems, the consensus algorithm Raft is used to replicate a globally ordered log of entries. However, members that fall behind in replicating the log entries can cause system write unavailability. One reason for this write unavailability is that Raft needs a majority of members to replicate a log entry before it is accepted into the system.

Row vs. column data layout in a graph database query engine:
This thesis aims to examine whether any performance improvement can be gained by changing the memory layout from row-wise to column-wise inside the Neo4j query engine. To test this, a column-wise representation was created, along with new implementations of a few operators that better leverage the potential of the new memory layout, for example by using SIMD. This change means that the query execution strategy moves from the current approach, which relies upon fusing and compilation, to a vectorized approach.

Categorization of Cypher Queries to Improve Benchmark Coverage for Graph Databases:
Benchmarks are often used to find regressions and so prevent performance from dropping over time. To make benchmarks relevant for a product, they should mirror the users' needs and uses of functionality. To achieve this, user data can be used as a foundation when creating new benchmarks, thus improving the coverage. This thesis was carried out at Neo4j, which develops the most frequently used graph database. Using data from their database-as-a-service (AuraDB), we focused on finding a way to improve the coverage of the benchmark suite run by them.

We tackle challenges in:
  • Concurrency and parallelism
  • Distributed systems and fault tolerance
  • Language design and type systems
  • Performance tuning and benchmarking
  • Cloud architecture and service design
  • Site Reliability Engineering and cloud automation
  • Continuous Integration and Continuous Delivery
  • Graph algorithms and machine learning
Please send us a description in English of:
  • Your area of study
  • Your thesis idea and the area of engineering that it corresponds to
  • If you are not completely sure, that is okay - please let us know if you would like to find out more information
  • If you are applying as a group, please apply separately and indicate who you are applying together with in your Cover Letter.
