# LTCI Cluster
Welcome to the documentation for the LTCI computing cluster at Télécom Paris. The cluster is a shared GPU/CPU platform available to all LTCI researchers free of charge, with no allocation proposal required.
## Getting Started
New to the cluster? Follow these three steps:
- Prerequisites: verify your account and network access
- Access the Cluster: connect via SSH and set up your keys
- Your First Job: submit interactive and batch jobs with Slurm (a quick sketch follows this list)
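As a rough preview of both submission modes, assuming an example partition name taken from the hardware described below (see Your First Job for the exact commands used on this cluster):

```bash
# Interactive session: one GPU for debugging (P100 shown as an example partition)
srun --partition=P100 --gres=gpu:1 --pty bash

# Batch job: hand a script to the scheduler and log out; Slurm runs it for you
sbatch my_job.sh
```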
## How It Works
The cluster is a set of 78 physical machines (nodes) connected by a fast network. Each node has CPUs and memory. Most nodes also have GPUs — different generations ranging from P100 to H100 — while some nodes are CPU-only. Instead of running code on your own workstation, the cluster pools these resources so any researcher can access what they need, from a single GPU for debugging to multiple nodes for large-scale training.
Resources are organized into partitions — groups of nodes with similar hardware (e.g., A100, H100, CPU). When you submit a job, you specify which partition you want to run on.
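To see which partitions exist and how busy they are, Slurm's `sinfo` command works from the login node (the partition name below is an example):

```bash
sinfo                 # one line per partition: availability, time limit, node states
sinfo -p A100 -N -l   # detailed node-by-node view of a single partition
```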
You never interact with the hardware directly. You connect to a login node via SSH, prepare your code and environment, then ask a scheduler (Slurm) to run your work on compute nodes. Slurm picks the right machine within the requested partition, allocates your requested resources, runs your job, and writes the output to shared storage.
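To make that flow concrete, here is a minimal batch-script sketch. The partition name, resource sizes, and file names are placeholders for illustration, not cluster defaults; the partition pages list the real values.

```bash
#!/bin/bash
#SBATCH --job-name=train          # name shown in the queue
#SBATCH --partition=A100          # example partition; choose one matching your hardware needs
#SBATCH --gres=gpu:1              # number of GPUs to allocate on the node
#SBATCH --cpus-per-task=8         # CPU cores, e.g. for data loading
#SBATCH --mem=32G                 # RAM for the job
#SBATCH --time=02:00:00           # wall-time limit (HH:MM:SS)
#SBATCH --output=logs/%x_%j.out   # log file on shared storage, readable from the login node

python train.py                   # placeholder for your own program
```

Submit the script with `sbatch` and monitor it with `squeue -u $USER`; the output file appears in your home directory as the job runs.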
## Who can access the cluster?
The cluster is available to LTCI research staff, including researchers, research engineers, post-doctoral researchers, PhD students, and interns. Students of the school (Télécom Paris) can also access the cluster, with restrictions.
**Research priority:** The cluster is a shared resource dedicated to supporting research at the lab. Research workloads always take priority over student projects.
## Resources and fairshare
The cluster is free to use: no allocation proposals, no billing, no compute-hour budgets. Because the resource is shared, Slurm uses a fairshare algorithm to balance access: if you've been using a lot of GPUs recently, your next jobs wait a bit longer so others get a turn. Priority recovers automatically over ~2 weeks.
| User type | Partitions | Max concurrent jobs | Max wall time |
|---|---|---|---|
| LTCI researchers (PhD, postdoc, intern) | All | No limit | Partition default |
| Students | P100, 3090, CPU | 4 | 36 hours |
See Resources Allocation and Compute Nodes for full details.
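If you want to check your own fairshare standing, Slurm's standard reporting tools can show it; a small sketch (output format and available fields depend on how accounting is configured on the cluster):

```bash
# Fairshare usage for your account (lower fairshare => lower priority for new jobs)
sshare --users=$USER

# Breakdown of a pending job's priority; fairshare is one of the factors
sprio -j <jobid>
```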
## Storage
All nodes share the same NFS filesystem. Your files are visible from any node (login or compute) without copying anything. Code you edit on the login node is immediately available in your jobs, and output written on a compute node appears in your home directory. Home directories have quotas. See Storage for details.
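To keep an eye on your usage, generic tools work from any node; a small sketch (the exact quota-reporting command depends on how the NFS server is configured, so treat these as examples):

```bash
du -sh ~        # total size of your home directory
df -h ~         # free space on the filesystem backing it
```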
## Network
Compute nodes have full internet access. You can download datasets, pull container images, and connect to external services (Weights & Biases, Hugging Face Hub, etc.) directly from your jobs. Package installations should always target your home directory (via virtual environments or conda), not system paths.
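For example, a minimal setup using Python's built-in venv module (the environment path and package names are illustrative, not required):

```bash
# Create an isolated environment under your home directory (visible from every node via NFS)
python3 -m venv ~/envs/myproject
source ~/envs/myproject/bin/activate

# Packages land under ~/envs/myproject, not in system paths
pip install numpy torch
```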
The cluster is only accessible from the Télécom Paris network (on campus or via VPN). See Access the Cluster for connection instructions.
## Contributing
Spot an error or missing topic? Click the icon at the top of any page to propose changes directly on GitLab.