Friendly Intro to 3D Gaussian Splats for Hacking Vision Models

I recently came across a really interesting paper on arXiv.org titled "3DGAA: Realistic and Robust 3D Gaussian-based Adversarial Attack for Autonomous Driving".
But, as with much of the Adversarial Machine Learning research I try to read on arXiv, I was woefully ignorant of some of the prerequisite knowledge needed to digest and understand the paper. So I decided to use the Feynman technique: research the prerequisites, learn them, and explain them in this article.
Some terms we'll run into here:
  • 3D Gaussian Splatting (3DGS) - a modern 3D scene representation that is rapidly gaining popularity because it synthesizes new views efficiently and can render complex scenes in real time, often faster than earlier approaches such as Neural Radiance Fields (NeRFs).
  • Neural Radiance Fields (NeRFs) - neural networks that learn how light and color flow through 3D space, letting you generate realistic new views of a scene from just a handful of 2D images.
Source: Chen & Wang, A Survey on 3D Gaussian Splatting, 2025.
  • 3D Gaussians - smooth, probabilistic blobs in 3D space used to compactly represent shape, color, and density. They're really good for fast, realistic rendering.
  • Gaussian - a smooth, bell-shaped probability distribution defined by a mean (center) and variance (spread). It's the mathematical backbone of “normal” randomness and is often called the bell curve.

  • Degree of Freedom (DoF) - A degree of freedom is one independent variable you’re free to change. It's an axis of variation that doesn’t depend on anything else.
  • Spherical Harmonics (SH) - smooth, wave-like functions defined on a sphere, used to compactly represent how something varies with direction. Think of these as directional light/color building blocks.
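To make the "bell curve" entry concrete, here is a minimal sketch of the 1D Gaussian density in plain Python. The function name `gaussian_pdf` is my own choice for illustration, not something taken from the paper.

```python
import math

def gaussian_pdf(x, mean=0.0, variance=1.0):
    """Density of a 1D Gaussian with the given mean and variance, evaluated at x."""
    normalizer = math.sqrt(2.0 * math.pi * variance)
    return math.exp(-((x - mean) ** 2) / (2.0 * variance)) / normalizer

# The curve peaks at the mean and falls off symmetrically on both sides.
peak = gaussian_pdf(0.0)        # about 0.3989
one_sigma = gaussian_pdf(1.0)   # about 0.2420
```

A 3D Gaussian follows the same idea, with a 3D center in place of the mean and a 3x3 covariance matrix in place of the variance.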

The core of 3DGS is modeling a scene as a collection of thousands of individual 3D Gaussians, or "splats". Each Gaussian is described by a vector with 14 degrees of freedom (DoFs). Think of that vector as the full spec sheet of a splat: its 3D position, size, shape, orientation, color, and opacity, all encoded as 14 numbers for fast rendering.
Ten dimensions of this vector handle the geometry of the splat, i.e., its 3D coordinates, 3D shape, and rotation. Summarized, they are:
Geometry (10 DoFs) — “Where it is and what shape it has”
  • Position (x ∈ ℝ³) — 3 numbers giving the splat’s 3D location (x, y, z).
  • Scale (s ∈ ℝ³) — 3 numbers describing how stretched or wide the Gaussian is along each axis.
  • Rotation (q ∈ ℝ⁴, quaternion) — 4 numbers encoding the splat’s 3D orientation without gimbal lock.
Note: x ∈ ℝ³ means the position vector has 3 real-valued components, e.g. (x, y, z). Read aloud: “x is in R three” or “x is a three-dimensional real vector.”
Note: s ∈ ℝ³ means the scale has 3 independent real values, one per axis. Read aloud: “s is in R three” or “s is a three-dimensional real vector.”
Note: q ∈ ℝ⁴ means the rotation is stored as a quaternion, which is 4 real numbers. Read aloud: “q is in R four” or “q is a four-dimensional real vector.”
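To see how scale and rotation together define the blob's shape, here is a minimal sketch that turns them into a 3x3 covariance matrix using the standard 3DGS construction Σ = R S Sᵀ Rᵀ, where R is the rotation matrix from the quaternion and S is the diagonal scale matrix. I'm assuming a (w, x, y, z) quaternion order, and the function names are my own.

```python
import math

def quat_to_rotmat(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix (normalizes first)."""
    w, x, y, z = q
    n = math.sqrt(w * w + x * x + y * y + z * z)
    w, x, y, z = w / n, x / n, y / n, z / n
    return [
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ]

def covariance(scale, quat):
    """Covariance Sigma = (R S)(R S)^T for per-axis scale S and rotation R."""
    R = quat_to_rotmat(quat)
    # M = R @ diag(scale): scale each column of R by the matching axis scale.
    M = [[R[i][j] * scale[j] for j in range(3)] for i in range(3)]
    # Sigma = M @ M^T
    return [[sum(M[i][k] * M[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

# No rotation: the blob is stretched along the world axes.
axis_aligned = covariance((1.0, 2.0, 3.0), (1.0, 0.0, 0.0, 0.0))

# Rotate 90 degrees about z: the x and y stretches swap places.
half = math.pi / 4
rotated = covariance((1.0, 2.0, 3.0), (math.cos(half), 0.0, 0.0, math.sin(half)))
```

With the identity quaternion the result is a diagonal matrix holding the squared scales; rotating the splat moves that stretch onto different world axes without changing the blob itself.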
Four dimensions of the vector handle the appearance of the splat: since opacity takes one, that leaves three for color.
Appearance (4 DoFs) — “What it looks like”
  • Color (SH coefficients) — compact representation of view-dependent color using spherical harmonics. Instead of storing a full texture or separate colors for every camera angle, each splat stores a tiny set of numbers (spherical harmonic coefficients) that describe how its color changes with viewing direction.
  • Opacity (α ∈ ℝ) — 1 number controlling transparency or how solid the splat appears. (Read: “alpha is in R” or simply “alpha is a real number.”)
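Putting the two groups together, here is a minimal sketch of one splat as a 14-number vector. The field order and the `Splat` class are my own assumptions for illustration (the paper may lay the vector out differently), and I'm treating color as three numbers, which is what the 4-DoF appearance budget leaves after opacity.

```python
from dataclasses import dataclass

@dataclass
class Splat:
    position: tuple   # (x, y, z), 3 numbers
    scale: tuple      # per-axis stretch, 3 numbers
    rotation: tuple   # quaternion, 4 numbers
    color: tuple      # assumed 3 color coefficients
    opacity: float    # 1 number

    def to_vector(self):
        """Flatten to 14 numbers in the assumed order: 3 + 3 + 4 + 3 + 1."""
        return [*self.position, *self.scale, *self.rotation, *self.color, self.opacity]

    @classmethod
    def from_vector(cls, v):
        """Rebuild a Splat from a flat 14-number vector in the same order."""
        if len(v) != 14:
            raise ValueError("expected 14 numbers, got %d" % len(v))
        return cls(tuple(v[0:3]), tuple(v[3:6]), tuple(v[6:10]), tuple(v[10:13]), v[13])

splat = Splat((0.0, 1.0, 2.0), (0.1, 0.2, 0.3), (1.0, 0.0, 0.0, 0.0), (0.8, 0.1, 0.1), 0.9)
vector = splat.to_vector()
```

A whole scene is then just a long list of these vectors, one per splat.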
My next post will discuss the research and attack techniques. The critical insight for attackers, however, is that this rendering process is differentiable: an adversarial loss function can be defined, and gradient descent can be used to iteratively modify the 14 parameters described above until the rendered result tricks a downstream perception model.
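To make "differentiable" concrete, here is a deliberately tiny toy, not the paper's method. The "renderer" is a single multiplication, the "perception model" is a one-neuron sigmoid with made-up weights, and only two of the appearance parameters are optimized. Every name and constant here is hypothetical; a real attack would differentiate through the full rasterizer with an autodiff framework.

```python
import math

W, B = 4.0, -1.0  # hypothetical weights of a one-neuron "detector"

def render(color, opacity):
    """Toy renderer: brightness of one splat blended over a black background."""
    return opacity * color

def detector_score(pixel):
    """Toy perception model: confidence in (0, 1) that an object is present."""
    return 1.0 / (1.0 + math.exp(-(W * pixel + B)))

def clamp01(v):
    return min(1.0, max(0.0, v))

def attack(color, opacity, steps=50, lr=0.5):
    """Gradient descent on the splat parameters to lower the detector score."""
    for _ in range(steps):
        score = detector_score(render(color, opacity))
        # Chain rule: d(score)/d(pixel), then d(pixel)/d(parameter).
        dscore_dpixel = score * (1.0 - score) * W
        grad_color = dscore_dpixel * opacity
        grad_opacity = dscore_dpixel * color
        color = clamp01(color - lr * grad_color)
        opacity = clamp01(opacity - lr * grad_opacity)
    return color, opacity

start = (0.9, 0.9)
initial_score = detector_score(render(*start))
adv_color, adv_opacity = attack(*start)
final_score = detector_score(render(adv_color, adv_opacity))
```

Because every step from parameters to score has a derivative, the loop can push the parameters in whichever direction lowers the detector's confidence. That is the same idea, at toy scale, as optimizing all 14 parameters of every splat.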
References:
Zhang et al., 3DGAA: Realistic and Robust 3D Gaussian-based Adversarial Attack for Autonomous Driving, 2025.