CAEZ: CSI Acquisition at ETH Zurich

CAEZ stands for CSI Acquisition at ETH Zurich. We publish channel-state information (CSI) measurements from our 5G testbed [1] and Wi-Fi testbed [2], denoted CAEZ-5G and CAEZ-WIFI, respectively.

CAEZ-5G

We publish three real-world wideband multi-antenna multi-O-RU (Open RAN Radio Units) CSI datasets from the 5G NR uplink channel, specifically from the Physical Uplink Shared Channel (PUSCH): an indoor lab/office room dataset, an outdoor campus courtyard dataset, and a device classification dataset with six commercial-off-the-shelf (COTS) user equipments (UEs). These datasets enable high-accuracy CSI-based sensing tasks including neural positioning, channel charting in real-world coordinates, and closed-set device classification. For further details, please refer to [1].

ETH Zurich 5G NR Testbed

The CAEZ-5G datasets were collected using a 5G NR testbed at ETH Zurich. The testbed builds upon NVIDIA ARC-OTA (Aerial RAN CoLab Over-the-Air) and is a full-stack software-defined 5G system with COTS UEs and four COTS O-RUs: one O-RU is used for 5G communication, while the other three operate as passive listeners. The system operates in the full Swiss private 5G band, i.e., 100MHz of the 5G NR n78 band centered at 3.45GHz. All components (except for the UEs) are connected via a fiber-optic switch. A Supermicro NVIDIA MGX GH200 server runs the full-stack 5G system, comprising the NVIDIA Aerial L1, OAI L2, and OAI core network. A PTP (Precision Time Protocol) grandmaster clock with a GNSS (Global Navigation Satellite System) time reference synchronizes the fiber-optic network.

CAEZ-5G-INDOOR

The CAEZ-5G-INDOOR dataset contains CSI measurements from an indoor lab/office environment. The measurement area is a 3.5m x 3.5m square between lab desks, with the O-RUs placed at its corners. Four WorldViz PPT (Precision Position Tracking) cameras placed around the measurement area and two cameras placed above it provide ground-truth UE position tracking.

CAEZ-5G Indoor

Measurement Details:

  • Duration: 1h 47min
  • Number of samples: 338,981
  • UE type: Quectel RMU500EK
  • Vehicle: iRobot Create 3 robot platform with Raspberry Pi and Quectel 5G modem
  • Position tracking: Yes (WorldViz PPT)
  • PUSCH transmission: Every 20ms

The robot was controlled using random waypoint navigation. Four WorldViz PPT markers were mounted on the robot to enable tracking of position and rotation. One measurement operator was present in the lab/office during CSI collection and sometimes walked through the measurement area.

This dataset enables high-accuracy neural UE positioning, achieving 0.6cm mean absolute error on the test set. The simulation code is available in the neural positioning repository. For further details, please refer to [1].
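For orientation, a positioning error of the kind reported above could be computed as in the following minimal sketch. It assumes that "mean absolute error" denotes the mean Euclidean distance between estimated and ground-truth UE positions; the exact metric definition is given in [1]. The function name and the sample coordinates are hypothetical and not taken from the dataset.

```python
import math

def mean_distance_error(estimated, ground_truth):
    """Mean Euclidean distance between paired position estimates and
    ground-truth positions (result is in the same unit as the inputs)."""
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(math.dist(p, q) for p, q in zip(estimated, ground_truth))
    return total / len(estimated)

# Hypothetical 2D positions in meters (illustrative values only).
est = [(0.0, 0.0), (1.0, 1.0)]
ref = [(0.003, 0.004), (1.0, 1.01)]
print(f"{100 * mean_distance_error(est, ref):.2f} cm")
```

The same function works for 3D positions, since math.dist accepts points of any matching dimension.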

CAEZ-5G-OUTDOOR

The CAEZ-5G-OUTDOOR dataset contains CSI measurements from an outdoor campus courtyard environment. The measurement area is a 10m x 10m square area in the ETH Zurich electrical engineering campus courtyard, surrounded by multiple buildings, trees, and other obstacles. The O-RUs were placed at the corners of the measurement area and six WorldViz PPT cameras were placed around the measurement area to provide ground-truth UE position tracking.

CAEZ-5G Outdoor

Measurement Details:

  • Duration: 1h 38min
  • Number of samples: 303,189
  • UE type: Samsung Galaxy S23
  • Vehicle: Custom robot platform with robot arm
  • Position tracking: Yes (WorldViz PPT)
  • PUSCH transmission: Every 20ms

A Samsung Galaxy S23 was mounted on a robot arm on top of a custom robot platform. The robot was controlled manually. Four WorldViz PPT markers were mounted on the robot arm (which remained fixed) to enable tracking of the mounted UE's position and orientation. Two measurement operators were present near the measurement area during CSI collection.

This dataset enables high-accuracy neural UE positioning (5.7cm mean absolute error) and channel charting in real-world coordinates (73cm mean absolute error). The simulation code is available in the neural positioning repository and channel charting repository. For further details, please refer to [1].

CAEZ-5G-DEV-CLASS

The CAEZ-5G-DEV-CLASS dataset contains CSI measurements from six different COTS UEs for device classification tasks. The measurement setup is similar to that of CAEZ-5G-INDOOR, but the measurements were carried out on different days and no WorldViz PPT cameras were used. The measurement area is a joint lab/office space of about 4m x 4m between the lab desks. The O-RUs were placed at the corners of the measurement area.

DEV-CLASS

Measurement Details:

  • Duration: 6 x (2min + 30s), i.e., 2min on the first day plus 30s on the second day for each of the six UEs
  • Number of samples: 83,619 (1st day) + 21,805 (2nd day)
  • UE types: Six COTS UEs (see figure above)
  • Vehicle: Rotation table + human operator
  • Position tracking: No
  • PUSCH transmission: Every 10ms

The dataset consists of six consecutive measurements, each taken separately using one of the six UEs shown in the figure above. The measurement protocol for each UE comprises the following four steps: (i) rotation at a predefined fixed location on a rotation table, (ii) random human walk, (iii) additional rotation at the same location, and (iv) another random human walk on the next day.

On the first measurement day, each measurement started when the UE was mounted on the rotation table and taken out of flight mode. As soon as the UE connected to the 5G network, it was left slowly rotating for 30s. Then, the operator removed the UE from the phone mount and carried it randomly through the lab space for 60s. Finally, the UE was put back into the phone mount on the rotation table and left spinning for another 30s.

On the second measurement day (the following day), each measurement consisted only of a 30s random human walk through the lab space. This next-day dataset is used for testing. Note that the power supply of one of the O-RUs stopped working overnight, and the lab environment had changed slightly (e.g., chairs and equipment were moved). Therefore, this next-day evaluation dataset is not only recorded from different UE locations, but also in a slightly modified environment.
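The per-UE protocol described above can be summarized as a small data structure. The following is an illustrative sketch only: the class and field names are hypothetical and do not correspond to any format used in the dataset files, and the durations are the nominal values stated in the text.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Phase:
    day: int          # 1 = first measurement day, 2 = following day
    motion: str       # "rotation" (rotation table) or "walk" (human operator)
    duration_s: int   # nominal duration in seconds

# Nominal per-UE phases, in the order described in the text.
PROTOCOL = (
    Phase(day=1, motion="rotation", duration_s=30),
    Phase(day=1, motion="walk", duration_s=60),
    Phase(day=1, motion="rotation", duration_s=30),
    Phase(day=2, motion="walk", duration_s=30),
)

def seconds_per_day(protocol):
    """Sum the nominal phase durations for each measurement day."""
    totals = {}
    for phase in protocol:
        totals[phase.day] = totals.get(phase.day, 0) + phase.duration_s
    return totals

print(seconds_per_day(PROTOCOL))  # {1: 120, 2: 30}
```

The totals of 120s on the first day and 30s on the second day match the "2min + 30s" per UE listed under Measurement Details.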

This dataset enables highly accurate device classification with location-independent radio frequency fingerprinting (RFFI) features, achieving 99% accuracy (same day) and 95% accuracy (next day). The simulation code is available in the device classification repository. For further details, please refer to [1].

Citation Key

If you use our CAEZ-5G datasets and/or our simulation code (or parts of it), then you must cite this reference:

@inproceedings{wiesmayr2025csi,
  author = {Wiesmayr, Reinhard and Zumegen, Frederik and Taner, Sueda and Dick, Chris and Studer, Christoph},
  title = {{CSI}-Based User Positioning, Channel Charting, and Device Classification with an {NVIDIA 5G} Testbed},
  booktitle = {Asilomar Conf. Signals, Syst., Comput.},
  month = oct,
  year = {2025},
}

CAEZ-WIFI

We publish a real-world IEEE 802.11a Wi-Fi CSI dataset recorded using multiple distributed, multi-antenna, software-defined Wi-Fi sniffers. The dataset enables high-accuracy CSI-based neural positioning and channel charting. Further details can be found in [2] and [3].

ETH Zurich Custom Wi-Fi Sniffer Testbed

The CAEZ-WIFI datasets were collected using a software-defined Wi-Fi testbed with multiple Wi-Fi sniffers acting as receivers for CSI measurements. Each sniffer is equipped with a four-antenna software-defined radio (SDR) and runs a custom PHY-layer software stack capable of decoding Wi-Fi traffic. The Wi-Fi traffic itself is generated by UEs operating in an active Wi-Fi network.

Using MAC addresses extracted from received Wi-Fi frames, each sniffer identifies whether a given CSI estimate corresponds to the channel between a targeted UE and its receive antenna array. The resulting CSI estimates are stored locally on the sniffer’s host PC and later processed to form a combined multi-sniffer CSI dataset. To enable temporal alignment during post-processing, all sniffer host PCs synchronize their clocks via a common Network Time Protocol (NTP) server.
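The temporal alignment mentioned above can be illustrated with nearest-timestamp matching. This is a minimal sketch, not the testbed's actual post-processing: it assumes each sniffer provides an ascending list of timestamps in seconds on the common NTP-synchronized clock, and the function name, tolerance, and sample values are hypothetical.

```python
from bisect import bisect_left

def match_nearest(ref_times, other_times, tol):
    """For each reference timestamp, return the index of the closest entry in
    other_times (sorted ascending), or None if none lies within tol seconds."""
    matches = []
    for t in ref_times:
        i = bisect_left(other_times, t)
        # Only the neighbors around the insertion point can be the closest.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(other_times)]
        best = min(candidates, key=lambda j: abs(other_times[j] - t), default=None)
        if best is not None and abs(other_times[best] - t) <= tol:
            matches.append(best)
        else:
            matches.append(None)
    return matches

# Hypothetical timestamps in seconds from two sniffers.
sniffer_a = [0.000, 0.100, 0.200, 0.300]
sniffer_b = [0.002, 0.105, 0.290]
print(match_nearest(sniffer_a, sniffer_b, tol=0.02))  # [0, 1, None, 2]
```

Entries marked None correspond to frames that the second sniffer did not capture within the tolerance; a combined dataset would need to either drop or mask them.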

CAEZ-WIFI-INDOOR-LSHAPE

The CAEZ-WIFI-INDOOR-LSHAPE dataset contains CSI measurements from an indoor meeting room environment. The measurement area is L-shaped with outer side-lengths of 4m x 5m. The Wi-Fi sniffers were placed outside the measurement area. To create partial NLoS conditions for at least one sniffer at a time with respect to any position inside the measurement area, we placed a wall of RF absorbers on the inner side of the L-shape. Six WorldViz PPT cameras were placed around the measurement area to provide ground-truth UE position tracking.

CAEZ-WIFI Indoor L-Shape

Measurement Details:

  • Duration: 4h 15min
  • Number of samples: 33,411 (sniffer 1), 43,518 (sniffer 2), 40,339 (sniffer 3), 43,914 (sniffer 4)
  • UE type: Wi-Fi USB Adapter Alfa AWUS036ACHM
  • Vehicle: iRobot Create 3 robot platform with Raspberry Pi and Wi-Fi adapter
  • Position tracking: Yes (WorldViz PPT)

The robot was controlled using random waypoint navigation. Four WorldViz PPT markers were mounted on the robot to enable tracking of the UE’s position. Two measurement operators were present in the room during CSI collection and sometimes walked through the measurement area.

This dataset enables high-accuracy neural UE positioning, achieving 6.1cm mean absolute error on a randomly sub-sampled test set. The simulation code is available in the neural positioning repository. Further details can be found in [3], where the CAEZ-WIFI-INDOOR-LSHAPE dataset is referred to as the “Wi-Fi Large Meeting Room” dataset. The implementation available in the neural positioning repository improves upon the supervised baseline presented in [3].

Citation Key

If you use our CAEZ-WIFI datasets and/or our simulation code (or parts of it), then you must cite this reference:

@inproceedings{zumegen2024software,
  author = {Zumegen, Frederik and Studer, Christoph},
  title = {A Software-Defined and Distributed {Wi-Fi} Channel-State Information Acquisition Testbed},
  booktitle = {Proc. Asilomar Conf. Signals, Syst., Comput.},
  month = oct,
  year = {2024},
}

Download and Usage

By downloading any CAEZ dataset, you agree to the terms of the CAEZ Dataset License v1.0.

CAEZ-5G

The CAEZ-5G datasets are provided as compressed tar.zstd archives containing CSI data from the NVIDIA PyAerial pipeline and ground-truth UE position logs from the WorldViz PPT system. For dataset processing, feature extraction, and ground-truth UE position interpolation, please refer to the gen_dataset.py script published in the neural positioning repository. For detailed information about the testbed setup and measurement protocols, please refer to [1].
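To illustrate what ground-truth position interpolation involves, the following minimal sketch linearly interpolates a tracked 2D position at a CSI timestamp. It assumes that the position log and the CSI timestamps share a common time base and that the track is sorted by time; the function name and data layout are hypothetical, and gen_dataset.py remains the reference for the actual processing.

```python
from bisect import bisect_right

def interpolate_position(track_times, track_xy, t):
    """Linearly interpolate a 2D position at time t from a time-sorted track.
    Returns None if t lies outside the tracked interval."""
    if not track_times or t < track_times[0] or t > track_times[-1]:
        return None
    k = bisect_right(track_times, t)
    if k == len(track_times):  # t coincides with the last tracked timestamp
        return track_xy[-1]
    t0, t1 = track_times[k - 1], track_times[k]
    w = (t - t0) / (t1 - t0)   # t1 > t >= t0 holds here, so no division by zero
    (x0, y0), (x1, y1) = track_xy[k - 1], track_xy[k]
    return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))

# Hypothetical track: timestamps in seconds, positions in meters.
times = [0.0, 1.0, 2.0]
positions = [(0.0, 0.0), (1.0, 0.0), (1.0, 2.0)]
print(interpolate_position(times, positions, 1.5))  # (1.0, 1.0)
```

Returning None outside the tracked interval avoids extrapolating positions for CSI samples recorded before tracking started or after it ended.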

CAEZ-WIFI

The CAEZ-WIFI datasets are provided as compressed tar.gz archives containing CSI data from the custom Wi-Fi sniffer pipeline and ground-truth UE position logs from the WorldViz PPT system. For dataset processing, feature extraction, and ground-truth UE position matching, please refer to the make_wifi_dataset.py script published in the neural positioning repository. For detailed information about the testbed setup and measurement protocols, please refer to [2] or [3].
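A tar.gz archive can be unpacked with Python's standard library, as in the following minimal sketch. The function name is hypothetical, and the archive and directory names in the usage comment are placeholders rather than the actual download file names.

```python
import tarfile
from pathlib import Path

def extract_archive(archive_path, dest_dir):
    """Extract a gzip-compressed tar archive into dest_dir and
    return the list of member names it contained."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, mode="r:gz") as tar:
        names = tar.getnames()
        if hasattr(tarfile, "data_filter"):
            # Where supported, reject unsafe members such as absolute paths.
            tar.extractall(dest, filter="data")
        else:
            tar.extractall(dest)
    return names

# Usage (placeholder names):
# extract_archive("downloaded_archive.tar.gz", "data/caez_wifi")
```

After extraction, the make_wifi_dataset.py script mentioned above is the intended entry point for further processing.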

Dataset License

The CAEZ datasets are made available under the CAEZ Dataset License v1.0. This license permits commercial and non-commercial use, modification, and creation of derivative works, while requiring attribution and prohibiting redistribution of the original dataset. By accessing or using the datasets, you agree to the terms of this license and acknowledge that you use the datasets entirely at your own risk.

Acknowledgements

The authors acknowledge NVIDIA for their sponsorship of this research regarding CAEZ-5G.

The authors acknowledge the Channel Charting as a Service (CHASER) Project for sponsorship of the CAEZ-WIFI research.

The authors thank Torben Kölle for website administration support.

References

[1] R. Wiesmayr, F. Zumegen, S. Taner, C. Dick, and C. Studer, "CSI-based user positioning, channel charting, and device classification with an NVIDIA 5G testbed," in Asilomar Conf. Signals, Syst., Comput., Oct. 2025, arXiv preprint https://arxiv.org/abs/2512.10809

[2] F. Zumegen and C. Studer, "A software-defined and distributed Wi-Fi channel-state information acquisition testbed," in Proc. Asilomar Conf. Signals, Syst., Comput., Oct. 2024, arXiv preprint https://arxiv.org/abs/2412.07588

[3] T.-Y. Müller, F. Zumegen, R. Wiesmayr, E. Gönültaş, and C. Studer, "Neural positioning without external reference," Nov. 2025, arXiv preprint https://arxiv.org/abs/2511.16352
