CS Seminars: “CS 590/690 Seminars”, 3:30PM March 30 2026 (EN)

1. RPDockDiff: A Diffusion-Based Framework for RNA–Protein Rigid Docking and Structure Refinement

Helyasadat Hashemi Aghdam
Master's Student

(Supervisor: Assoc. Prof. Ercüment Çiçek) Computer Engineering Department
Bilkent University

Abstract: RNA–protein interactions are important for many biological processes, such as gene regulation and cellular function. However, predicting how RNA and proteins bind to each other is still a challenging problem, especially in rigid docking. In this project, we present a diffusion-based approach for RNA–protein rigid docking that learns translation and rotation between molecules. Our model uses a diffusion process to gradually improve the position and orientation of RNA and protein structures. Starting from an initial complex, the model refines the binding pose step by step by applying learned transformations. In addition to docking, our method can also be used to refine RNA–protein complex structures predicted by tools such as AlphaFold3. We also address an important data limitation in this field by creating a large RNA–protein dataset. Since most structures in the PDB are ribosomal (around 95%), many models become biased toward this type of complex. To avoid this problem, we construct a more balanced dataset that includes both ribosomal and non-ribosomal complexes, which helps our model generalize better and reduces bias. Overall, this work shows that diffusion-based methods can be effective for RNA–protein docking and structure refinement.
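As a rough illustration of the rigid-body updates mentioned in the abstract, the sketch below applies a sequence of rotation-plus-translation steps to a set of 3D coordinates. It is not the RPDockDiff model: the function names (`rotation_matrix`, `apply_rigid`, `refine`) are hypothetical, and the steps here are fixed by hand, whereas the abstract describes transformations that are learned by a diffusion model.

```python
import math

def rotation_matrix(axis, angle):
    """Rodrigues' formula: 3x3 rotation about `axis` by `angle` radians."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]

def apply_rigid(coords, rot, trans):
    """Rotate every point about the centroid of `coords`, then translate."""
    n = len(coords)
    centroid = [sum(p[i] for p in coords) / n for i in range(3)]
    moved = []
    for p in coords:
        d = [p[i] - centroid[i] for i in range(3)]
        moved.append(tuple(
            sum(rot[r][k] * d[k] for k in range(3)) + centroid[r] + trans[r]
            for r in range(3)
        ))
    return moved

def refine(coords, steps):
    """Apply a list of (axis, angle, translation) steps one after another.

    Assumption: in a real system each step would come from a trained model;
    here the steps are supplied by the caller.
    """
    for axis, angle, trans in steps:
        coords = apply_rigid(coords, rotation_matrix(axis, angle), trans)
    return coords

# Toy "molecule" of three points and two hand-picked refinement steps.
toy_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
toy_steps = [
    ((0.0, 0.0, 1.0), math.pi / 2, (1.0, 0.0, 0.0)),
    ((1.0, 1.0, 0.0), 0.3, (0.0, -0.5, 0.2)),
]
refined = refine(toy_coords, toy_steps)
```

Because every step is a rotation followed by a translation, the distances between points within the moved molecule stay the same; only its pose relative to the partner changes, which is what "rigid" docking refers to.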

DATE: March 30, Monday @ 15:30 Place: EA 502

2. KENT: Fast Pathogen Screening with NVIDIA Jetson Cluster in Resource-Constrained Areas

Arda İçöz
Master's Student

(Supervisor: Assoc. Prof. Can Alkan) Computer Engineering Department
Bilkent University

Abstract: Metagenomic sequencing enables broad pathogen detection without requiring prior assumptions about the organisms present, but its analysis pipelines usually depend on desktop or server-class computing. This limits practical use in disaster-stricken and other resource-constrained environments, where portable sequencing may be available (e.g., with MinION) but reliable access to cloud or laboratory infrastructure is not. We present a portable edge-computing framework, KENT, for metagenomic pathogen screening based on a cluster of NVIDIA Jetson Nano GPUs. KENT adapts CuCLARK for low-memory embedded execution and extends it with a custom software stack for distributed read classification, abundance estimation, split-run abundance merging, and report generation. Using an MPI-based controller/responder architecture, KENT supports local, offline analysis across multiple nodes. We evaluate the system using long-read wastewater metagenomic samples with a custom-built pathogen database, and using the CAMI2 Marine long-read dataset and its corresponding reference set. In the wastewater setting, KENT achieves an average 3.1× speedup over the CPU-based CLARK-l baseline while preserving close agreement with CLARK-l abundance estimates, with an average absolute difference of approximately 0.12% in classified-read abundance. On the CAMI2 Marine dataset, KENT again remains highly consistent with CLARK-l, with a mean absolute difference of approximately 0.01%, and achieves an average speedup of about 2.68× across matched runs. Overall, the results indicate that GPU-accelerated metagenomic classification with KENT can be adapted to low-power embedded platforms while preserving baseline abundance estimates, supporting portable and offline pathogen screening in resource-constrained settings.
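To make the ideas of split-run abundance merging and the mean absolute difference metric concrete, here is a minimal sketch. It is not KENT's implementation: the function names, the taxon labels, and the assumption that each run reports a simple taxon-to-read-count mapping are all hypothetical choices made for illustration.

```python
from collections import Counter

def merge_split_runs(per_run_counts):
    """Sum classified-read counts per taxon across several partial runs."""
    total = Counter()
    for counts in per_run_counts:
        total.update(counts)  # Counter.update adds counts rather than replacing
    return dict(total)

def abundance_percent(counts):
    """Convert read counts to percentages of all classified reads."""
    classified = sum(counts.values())
    if classified == 0:
        return {}
    return {taxon: 100.0 * n / classified for taxon, n in counts.items()}

def mean_abs_diff(a, b):
    """Mean absolute difference (in percentage points) between two profiles.

    Taxa missing from one profile are treated as 0% in that profile.
    """
    taxa = set(a) | set(b)
    if not taxa:
        return 0.0
    return sum(abs(a.get(t, 0.0) - b.get(t, 0.0)) for t in taxa) / len(taxa)

# Two hypothetical partial runs, e.g. from two nodes or two input chunks.
run_a = {"taxon_1": 30, "taxon_2": 10}
run_b = {"taxon_1": 20, "taxon_3": 40}
merged = merge_split_runs([run_a, run_b])
profile = abundance_percent(merged)
```

Merging at the level of raw counts, and normalizing only once at the end, avoids weighting a small run the same as a large one, which would happen if per-run percentages were averaged directly.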

DATE: March 30, Monday @ 15:50 Place: EA 502

3. Conpresso: Compressing and querying genome collections

Ali Erdem Karaçay
Master's Student

(Supervisor: Assoc. Prof. Can Alkan) Computer Engineering Department
Bilkent University

Abstract: Motivation: The decreasing costs of sequencing technologies have led to an exponential increase in available genomic data, fueling large-scale initiatives such as the Human Pangenome Reference Consortium and AllTheBacteria. However, this accessibility has introduced a critical computational bottleneck: data storage. Although general-purpose compressors reduce file sizes dramatically, they do not exploit the unique structural characteristics of genomic sequences. Consequently, efficient storage and querying of massive genomic datasets have become a paramount research challenge. In this paper, we explore how locally consistent parsing (LCP) within a dictionary-based architecture captures these genomic redundancies. Our approach not only achieves highly efficient compression but also uniquely enables querying the compressed data without decompression. Results: Our LCP-based compressor achieves up to 5× greater compression than general-purpose tools and up to 2× greater compression than specialized genomic tools in specific use cases. Using frequency-mapped encoding, our architecture maintains high performance and minimal memory overhead. Crucially, the tool operates entirely de novo, requiring no reference genome, and allows sub-linear time sequence searches directly on the compressed archive.

DATE: March 30, Monday @ 16:30 Place: EA 502

4. OrSE: An Orthogonal Layout Algorithm Based on the Spring Embedder with Compound Graph Support

Mohammad Mahdi Khosravi
Master's Student

(Supervisor: Prof. Dr. Uğur Doğrusöz) Computer Engineering Department
Bilkent University

Abstract: Graph visualization is essential for understanding networks and relational data across diverse domains. While most existing orthogonal layout algorithms are based on the three-phase topology-shape-metrics (TSM) approach, this work introduces OrSE (Orthogonal Spring Embedder), a fundamentally different algorithm that leverages a force-directed spring embedder to achieve orthogonal layouts. OrSE progressively orthogonalizes edges while nodes are positioned by spring forces, integrating these processes within a unified framework. This approach enables more natural node placement and flexible edge alignment. Additionally, OrSE extends support to compound graphs, allowing for the orthogonal layout of hierarchical and clustered structures.

DATE: March 30, Monday @ 16:50 Place: EA 502
