publications
publications by category in reverse chronological order. generated by jekyll-scholar.
2025
- ArXiv preprint
GreenhouseSplat: A Dataset of Photorealistic Greenhouse Simulations for Mobile Robotics
Diram Tabaa and Gianni Di Caro
Submitted, Oct 2025
Simulating greenhouse environments is critical for developing and evaluating robotic systems for agriculture, yet existing approaches rely on simplistic or synthetic assets that limit simulation-to-real transfer. Recent advances in radiance field methods, such as Gaussian splatting, enable photorealistic reconstruction but have so far been restricted to individual plants or controlled laboratory conditions. In this work, we introduce GreenhouseSplat, a framework and dataset for generating photorealistic greenhouse assets directly from inexpensive RGB images. The resulting assets are integrated into a ROS-based simulation with support for camera and LiDAR rendering, enabling tasks such as localization with fiducial markers. We provide a dataset of 82 cucumber plants across multiple row configurations and demonstrate its utility for robotics evaluation. GreenhouseSplat represents the first step toward greenhouse-scale radiance-field simulation and offers a foundation for future research in agricultural robotics.
- ArXiv preprint
Fiducial Marker Splatting for High-Fidelity Robotics Simulations
Diram Tabaa and Gianni Di Caro
Submitted, Aug 2025
High-fidelity 3D simulation is critical for training mobile robots, but its traditional reliance on mesh-based representations often struggles in complex environments, such as densely packed greenhouses featuring occlusions and repetitive structures. Recent neural rendering methods, like Gaussian Splatting (GS), achieve remarkable visual realism but lack flexibility to incorporate fiducial markers, which are essential for robotic localization and control. We propose a hybrid framework that combines the photorealism of GS with structured marker representations. Our core contribution is a novel algorithm for efficiently generating GS-based fiducial markers (e.g., AprilTags) within cluttered scenes. Experiments show that our approach outperforms traditional image-fitting techniques in both efficiency and pose-estimation accuracy. We further demonstrate the framework’s potential in a greenhouse simulation. This agricultural setting serves as a challenging testbed, as its combination of dense foliage, similar-looking elements, and occlusions pushes the limits of perception, thereby highlighting the framework’s value for real-world applications.
2024
- thesis
SampleLapNet: A Learnable Laplacian Approach for Task-Agnostic Point Cloud Downsampling
Diram Tabaa
Carnegie Mellon University (Senior Honors Thesis), Sep 2024
Advancements in 3D sensing technologies have led to an increased reliance on point cloud data for diverse applications ranging from autonomous navigation to environmental modeling. However, the sheer volume of data collected by these technologies poses significant challenges for real-time processing and analysis. This thesis introduces SampleLapNet, a novel neural network architecture designed to address the challenges of point cloud downsampling in a task-agnostic manner. By leveraging the Laplacian operator as a geometric measure of point importance, SampleLapNet learns to predict and preserve critical geometric features during the downsampling process, thereby ensuring minimal loss of relevant information. The architecture combines the robustness of transformer models with the efficiency of Laplacian-based importance scoring to facilitate efficient preprocessing that enhances subsequent point cloud analyses. We demonstrate the effectiveness of SampleLapNet through extensive experiments on benchmark datasets, showing significant improvements in downsampling efficiency without compromising the performance of downstream tasks such as semantic segmentation. This work not only proposes a method to reduce computational demands but also provides insights into the geometric processing of 3D data, suggesting pathways for future innovations in point cloud processing.
2022
- IEEE TIFS
Video Source Characterization Using Encoding and Encapsulation Characteristics
Enes Altinisik, Hüsrev Taha Sencar, and Diram Tabaa
IEEE Transactions on Information Forensics and Security, Sep 2022
We introduce the use of video coding settings for source identification and propose a new approach that incorporates encoding and encapsulation aspects of a video. To this end, a joint representation of the overall file metadata is developed and used in conjunction with a two-level hierarchical classification method. At the first level, our method groups videos into metaclasses considering several abstractions that represent high-level structural properties of file metadata. This is followed by a more nuanced classification of classes that comprise each metaclass. The method is evaluated on more than 20K videos obtained by combining four public video datasets. Tests show that a balanced accuracy of 91% is achieved in correctly identifying the class of a video among 119 video classes. This corresponds to an improvement of 6.5% over the conventional approach based on video file encapsulation characteristics. Analysis performed on a large, unlabeled video set also confirmed the aptness of our approach. To further demonstrate the versatility of encoding parameters, we consider attribution of partial video files where file metadata is not available. Our results show that, even in this limited setting that is intrinsic to forensic file recovery, an identification accuracy of 57% can be achieved through the use of a subset of encoding parameters estimated from coded video data.