HUF 2024
from Monday, September 9, 2024 (9:00 AM) to Thursday, September 12, 2024 (3:55 PM)
Monday, September 9, 2024
9:00 AM
Bus Transfer
9:00 AM - 9:45 AM
10:00 AM
Registration
10:00 AM - 10:30 AM
Room: Aula
10:30 AM
Welcome
Achim Streit (KIT-SCC)
10:30 AM - 10:45 AM
Room: Aula
10:45 AM
Introduction to HUF 2024
Doris Ressmann (Karlsruhe Institute of Technology)
10:45 AM - 11:00 AM
Room: Aula
11:00 AM
Support Update
Jonathan Procknow
11:00 AM - 12:00 PM
Room: Aula
12:00 PM
Lunch break
12:00 PM - 1:30 PM
Room: 126
1:30 PM
KIT's Site report
Dorin Lobontu
1:30 PM - 2:00 PM
Room: Aula
2:00 PM
IN2P3 Site Status
Pierre-Emmanuel BRINETTE (IN2P3 / CNRS)
2:00 PM - 2:20 PM
Room: Aula
Updates on HPSS at IN2P3 Computing Center
- Infrastructure and HPSS upgrade
- ARM architecture support
2:20 PM
Coffee Break
2:20 PM - 2:50 PM
Room: 126
2:50 PM
Transitioning HPSS Monitoring from Nagios to VictoriaMetrics
Basil Lalli (NERSC - LBNL)
2:50 PM - 3:10 PM
Room: Aula
Last year at NERSC we retired our long-standing Nagios-based HPSS monitoring deployment in favor of VictoriaMetrics, Loki and Alertmanager. We would like to share our experience and the lessons learned along the way.
- Motivation for making this transition
  - Limitations of Nagios-style monitoring
  - How does VictoriaMetrics address these?
- General overview of our monitoring deployment
  - 3rd party exporters
  - Custom exporters/"plugins"
- Demonstration of some of the dashboards we use and alerts we generate
- Future areas of improvement
  - Standardizing our HPSS-specific data collection
  - Service discovery
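As a rough illustration of what a custom exporter might emit, the sketch below renders samples in the Prometheus text exposition format, which VictoriaMetrics can ingest. It is not taken from the NERSC deployment: the function names and the metric name `hpss_free_slots` are hypothetical, and a real exporter would also serve this text over HTTP.

```python
def escape_label(value):
    """Escape backslashes, double quotes and newlines in a label value."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def render_metrics(samples):
    """Render (name, labels, value) samples as Prometheus exposition text.

    `samples` is an iterable of tuples; `labels` is a dict and may be empty.
    Metric names used by callers are illustrative, not real HPSS metrics.
    """
    lines = []
    for name, labels, value in samples:
        if labels:
            label_text = ",".join(
                f'{key}="{escape_label(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{label_text}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical sample: free slots in one tape library.
    print(render_metrics([("hpss_free_slots", {"library": "lib1"}, 42)]), end="")
```

Keeping the rendering separate from the data collection makes it easier to standardize site-specific collectors behind one output format.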
3:10 PM
HPSS monitoring at KIT
Preslav Konstantinov (KIT)
3:10 PM - 3:40 PM
Room: Aula
3:40 PM
Introduction to GridKa Tour
Andreas Petzold (KIT)
3:40 PM - 4:00 PM
Room: Aula
4:00 PM
GridKa Tour 1
4:00 PM - 4:30 PM
Room: Aula
4:30 PM
GridKa Tour 2
4:30 PM - 5:00 PM
Room: Aula
5:00 PM
Flammkuchen Event (Tarte Flambée)
5:00 PM - 8:00 PM
Room: Aula
8:00 PM
Bus Transfer
8:00 PM - 8:45 PM
Room: Aula
Tuesday, September 10, 2024
9:00 AM
Bus Transfer
9:00 AM - 9:45 AM
10:00 AM
HPSS Release Roadmap (Restricted)
Michael Meseke
10:00 AM - 10:45 AM
Room: Aula
10:45 AM
Implementing a Virtualized HPSS Deployment for Testing and Development
Forrest Greenwood (HPSS Subscriber)
10:45 AM - 11:00 AM
Room: Aula
As part of our efforts to upgrade our site to HPSS 10.3, Indiana University recently began development of a virtualized HPSS environment that we can use to quickly iterate on testing and development initiatives without tying up limited bare-metal hardware resources. This virtual-machine environment is patterned on the implementation created by IBM's HPSS Support team for use at the recent HPSS 10.3 Training held in May 2024. Topics to be discussed include provisioning the VM, installing and configuring a virtual tape library using the mhVTL open-source software package, installing and configuring both DB2 and HPSS, and possibly a sampling of the sorts of issues we intend to test using this environment. If technical affordances permit, this presentation could potentially include a live demonstration of the VM running on an external flash drive. Otherwise, we would be happy to present using the traditional static PowerPoint.
11:00 AM
Coffee Break
11:00 AM - 11:30 AM
Room: 126
11:30 AM
HPSS S3 Scalability With Rubin LFA S3 Store Use Case
Guangwei Che
11:30 AM - 11:45 AM
Room: Aula
SLAC National Accelerator Laboratory, Technology and Innovation Department, Scientific Computing Systems
With growing demand for HPSS S3 support from the SLAC science user community, we have been eagerly testing the HPSS S3 interface since the pre-GA release in July 2023. From the initial fragile and immature version to today's more robust and resilient state, we have worked directly with the HPSS S3 development team to troubleshoot and triage many challenging issues with scalability, data I/O performance, and the handling of large, deeply nested data structures containing very small files in the Rubin LFA Ceph S3 store use case. In this presentation we will tell the story of our journey in bringing HPSS S3's capability to the next level.
11:45 AM
Testing HPSS S3 Interface at MPCDF
Elena Summer (Max Planck Computing and Data Facility (MPCDF))
11:45 AM - 12:00 PM
Room: Aula
Starting with version 10.3, HPSS has an S3 interface. At MPCDF we have installed it on our test system to try it out in several usage scenarios, including cloud sync (using the Ceph Cloud Sync module as well as rclone), generating presigned URLs, and simply using different S3 clients. Our tests include putting, getting, and removing S3 objects as well as retrieving S3 objects from tape. This talk covers the test scenarios, their setup and results, and the issues we encountered along with their fixes.
12:00 PM
Lunch break
12:00 PM - 1:05 PM
Room: 126
1:05 PM
BOF Monitoring
1:05 PM - 1:45 PM
Room: Aula
1:45 PM
NERSC Site Report
Owen James (NERSC/LBNL)
1:45 PM - 2:15 PM
Room: Aula
NERSC Site Report, HUF24, Karlsruhe Institute of Technology, Campus Nord, 09-12 September 2024
Topics to discuss:
- NERSC stats: PBs, etc.
- Upgrade from HPSS 7.4.3 to 9.3
  - New RHEL8 core servers
  - New FS7300 metadata arrays
  - Update existing movers to RHEL8
  - Updated PAM auth module to work with NERSC Auth
- Install 4th TS4500 tape library
  - 16 frames, 1188+ slots
  - Testing 10.0.1 firmware with SSL for REST over Ethernet
  - TS1160 drives with JE media, while we figure out TS1170/JF
  - Total theoretical capacity: 950 PB on JE, 2.37 EB on JF media
- Issues deploying TS1170 in our air-cooled environment
- Monitoring update (brief, specific talk to follow)
- REST over Ethernet testing
2:15 PM
Spectra Logic
Matt Starr
2:15 PM - 2:35 PM
Room: Aula
2:35 PM
Coffee Break
2:35 PM - 3:05 PM
Room: 126
3:05 PM
HPSS Object Storage Class Deep Dive
Greg Thorsness
3:05 PM - 4:05 PM
Room: Aula
2024 HUF Presentations by IBM
4:05 PM
Have I right-sized my disk cache?
Francis Dequenne (LBL)
4:05 PM - 4:35 PM
Room: Aula
Abstract: For most sites, the HPSS disk cache is a critical component of the HPSS configuration, helping to boost the performance of storing and retrieving data from the archive. However, assessing how big the disk cache should be can be something of a black art, especially in environments that have grown over the years. This talk will present a couple of tools developed at NERSC that allow us to assess the effectiveness of an existing cache and give some insight into the impact of increasing or decreasing the disk cache size.
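One generic way to reason about cache sizing is to replay a recorded access trace through a simulated cache at several candidate sizes and compare hit ratios. The sketch below does this for a simple LRU policy. It is an assumption-laden illustration, not the NERSC tools mentioned in the abstract: the trace format, the function name `lru_hit_ratio`, and the LRU eviction policy are all hypothetical stand-ins, and HPSS purge policies may behave differently.

```python
from collections import OrderedDict


def lru_hit_ratio(accesses, cache_bytes):
    """Replay (file_id, size_bytes) accesses through an LRU cache.

    Returns the fraction of requested bytes served from the cache.
    LRU is an assumed policy chosen for simplicity.
    """
    cache = OrderedDict()  # file_id -> size, least recently used first
    used = 0
    hit_bytes = 0
    total_bytes = 0
    for file_id, size in accesses:
        total_bytes += size
        if file_id in cache:
            hit_bytes += size
            cache.move_to_end(file_id)  # mark as most recently used
            continue
        if size > cache_bytes:
            continue  # file can never fit; counted as a miss
        while used + size > cache_bytes:
            _, evicted_size = cache.popitem(last=False)
            used -= evicted_size
        cache[file_id] = size
        used += size
    return hit_bytes / total_bytes if total_bytes else 0.0


if __name__ == "__main__":
    # Hypothetical trace: five accesses to three one-byte files.
    trace = [("a", 1), ("b", 1), ("a", 1), ("c", 1), ("b", 1)]
    for capacity in (1, 2, 3):
        print(capacity, lru_hit_ratio(trace, capacity))
```

Plotting the hit ratio against capacity for a real trace would show where additional cache stops paying off.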
4:40 PM
Bus Transfer
4:40 PM - 5:25 PM
Room: Aula
8:15 PM
Schlosslichtspiele
8:15 PM - 11:15 PM
Wednesday, September 11, 2024
9:00 AM
Bus Transfer
9:00 AM - 9:45 AM
10:00 AM
Upcoming HPSS Features (Restricted)
Michael Meseke
10:00 AM - 11:00 AM
Room: Aula
2024 HUF Presentations by IBM
11:00 AM
Burning Issues (Restricted)
Jonathan Procknow
11:00 AM - 11:30 AM
Room: Aula
2024 HUF Presentations by IBM
11:30 AM
Staging ~2 Million Files from Tape for a User
Geoff Cleary (LLNL)
11:30 AM - 11:50 AM
Room: Aula
Imagine you turn on your work laptop or arrive at the office and find this message from the customer service team: "We have a user that is trying to retrieve *many* files from the HPSS archive. At the current retrieval rate, we estimate it will take 6 months for the user to retrieve all the files in the dataset. Can you help?" What do you do? How do you proceed? What features does HPSS offer to help with this situation? I'll answer those questions and more as we examine LLNL's approach to retrieving nearly 2 million files across tens of tape volumes with tools like `quaid`, SQLite, and RabbitMQ (along with a bit of custom Python code).
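A common idea behind large tape recalls is to group requests by tape volume and read each volume in on-tape order, so that every tape is mounted once and read sequentially. The sketch below shows that grouping with an in-memory SQLite table. It is not LLNL's implementation and does not use `quaid` or RabbitMQ: the function name, the table layout, and the assumption that each file's volume and position are already known are all hypothetical.

```python
import sqlite3


def plan_recalls(files):
    """Group (path, volume, position) records into per-volume recall lists.

    Returns a dict mapping each tape volume to its file paths, ordered by
    position on tape. The record layout is an illustrative assumption.
    """
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE files (path TEXT, volume TEXT, position INTEGER)"
    )
    db.executemany("INSERT INTO files VALUES (?, ?, ?)", files)
    plan = {}
    query = "SELECT volume, path FROM files ORDER BY volume, position"
    for volume, path in db.execute(query):
        plan.setdefault(volume, []).append(path)
    db.close()
    return plan


if __name__ == "__main__":
    # Hypothetical records: two files on volume V1, one on V2.
    records = [("/a", "V2", 5), ("/b", "V1", 9), ("/c", "V1", 2)]
    for volume, paths in plan_recalls(records).items():
        print(volume, paths)
```

Each per-volume list could then be handed to a separate worker, which is one place a message queue would fit.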
11:50 AM
BOF Client Interfaces
11:50 AM - 12:10 PM
Room: Aula
12:10 PM
Lunch break
12:10 PM - 1:30 PM
Room: 126
1:30 PM
Group Photo
1:30 PM - 1:40 PM
Room: Aula
1:45 PM
Exploring storage technologies for HPSS disk caches
Andreas Petzold
1:45 PM - 2:05 PM
Room: Aula
At KIT we operate HPSS as a tape system for the GridKa WLCG Tier-1 and for the Baden-Württemberg Data Archive service. Performance limitations of the HPSS disk cache systems led us to explore new technology options for the disk cache, based on classic storage systems with SSDs, and storage servers with local NVMe devices. We will present details on the different possible solutions, including benchmarks.
2:05 PM
Managing Data Throughout Its Lifecycle: Lessons Learned and Future Directions
Charles McClary (HPSS Subscriber)
2:05 PM - 2:50 PM
Room: Aula
Abstract: Data lifecycle management poses significant challenges, particularly in academic and research environments where data accumulation is rapid and perpetual. This presentation delves into the complexities surrounding data retention and abandonment, highlighting the prevalent issues of data hoarding and the lack of structured deletion policies. Specifically, it addresses the dilemma wherein users, especially researchers, find little incentive to delete data, leading to a cluttered and often inaccessible data landscape. Furthermore, the departure of users from institutions like Indiana University (IU) exacerbates the problem, as data may be left behind with no clear ownership or accessibility.
Indiana University is tackling these issues gradually. We'll discuss our efforts to address data management and abandonment through:
- New usage constraints: Instituting new quotas with tiered growth guidelines.
- Simplified Archiving and Movement: Providing user-friendly tools to archive and migrate data to appropriate storage tiers.
- Data Management Education: Empowering researchers with best practices for data stewardship.
- Ensuring allocation value: Requiring annual renewal of desired resources.
- The "Digital Will" Concept: Developing a system where departing users can designate data inheritors and define deletion policies.
By examining the successes and pitfalls of these initiatives, this presentation provides valuable insights into effective data lifecycle management strategies. It underscores the importance of fostering a culture of responsible data stewardship while leveraging technological innovations to facilitate seamless data management throughout its lifecycle.
2:50 PM
Coffee Break
2:50 PM - 3:05 PM
Room: 126
3:05 PM
HPSS Core Servers on Commodity Hardware or: How We Learned to Love Databases on ZFS
Herb Wartens
3:05 PM - 3:50 PM
Room: Aula
At LLNL we have been using commodity hardware more and more to serve our parallel filesystems and archival storage clusters. We wanted to explore how to use this same hardware for our HPSS Core Server systems. In order to make the system as reliable as possible, ZFS emerged as the underlying filesystem we wanted to utilize for its reliability and other advanced features. How would traditional databases perform on top of ZFS? Could we design a production-worthy system using this hardware?
4:00 PM
Bus Transfer
4:00 PM - 4:45 PM
Room: Aula
5:00 PM
ZKM Tour
5:00 PM - 6:00 PM
Room: Lorenzstraße 19 Karlsruhe 76135
6:00 PM
Conference Dinner
6:00 PM - 9:00 PM
Room: Lorenzstraße 19 Karlsruhe 76135
Thursday, September 12, 2024
9:00 AM
Bus Transfer
9:00 AM - 9:45 AM
10:00 AM
Generative AI and HPSS
Greg Thorsness
10:00 AM - 10:40 AM
Room: Aula
2024 HUF Presentations by IBM
10:40 AM
MPCDF Site Report
Manuel Panea (Max Planck Computing and Data Facility)
10:40 AM - 11:10 AM
Room: Aula
We will present our activities with HPSS since the last HUF, including our upgrade to HPSS 10.3.
11:10 AM
Coffee Break
11:10 AM - 11:40 AM
Room: 126
11:40 AM
Restful SSM
Fabi Adams
11:40 AM - 12:05 PM
Room: Aula
2024 HUF Presentations by IBM
12:05 PM
SSC Site Report
Tarak Patel
12:05 PM - 12:25 PM
Room: Aula
The SSC Site Report will be a recap of previous HUF presentations and focus on:
- Solution overview
- Review of components throughout upgrades
- HPNLS (High Performance Nearline Storage) architecture
- HPSS and RHEL upgrade
- HPSS monitoring
- User tools and environment
12:25 PM
Lunch break
12:25 PM - 1:55 PM
Room: 126
2:00 PM
JAXA Site Report
Naoyuki FUJITA (Japan Aerospace Exploration Agency (JAXA))
2:00 PM - 2:20 PM
Room: Aula
The recent operation status of JAXA HPSS "J-SPACE", the plans and issues for its replacement in 2025, and monitoring functionality will be reported.
2:20 PM
Closing HUF 2024
Doris Ressmann (Karlsruhe Institute of Technology)
2:20 PM - 2:30 PM
Room: Aula
3:00 PM
Bus Transfer
3:00 PM - 3:45 PM
Room: Aula