Pandas Indexing & Slicing Syntax Cheat Sheet

How to remember Pandas indexing and slicing syntax
Author

Sjoerd de Haan

Published

September 17, 2025

Introduction

Pandas is a data manipulation library in Python.

I’m not writing about it because I like it. It’s syntax is everything but helpful.

The problem is that Pandas is used a lot in the Python ecosystem.

Something simple like slicing and indexing is horribly complicated. The API is based on the internals, not on the mathematics or logic of the user that the user wants to express. This makes it hard to remember, use and read the syntax.

I have to write this post as a note to myself and others who struggle with the syntax of Pandas indexing and slicing.

Pandas Indexing & Slicing Methods Overview

Pure vs Impure methods

Indexing and slicing can be done based on labels (names) or positions (integers). Pandas provides several methods. The pure methods work with only labels or only positions. The mixed methods work with both, depending on context. This is even harder to memorize.

“Pure” Methods (Consistent behavior):

.loc[] - Label-based Indexing

The .loc[] method returns the carthesian product of rows and columns. It strictly label-based. It always uses the index and column names.

df.loc[row_labels, col_labels]
df.loc['a':'c']                 # INCLUSIVE on both ends
df.loc[df['col'] > 5]          # Boolean indexing
# Always uses index/column names

The labels are the values on the index (rows) and the column names (columns).

Here is a concrete example:

import pandas as pd
df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6]},
                  index=['x', 'y', 'z'])
df.loc['x']          # Returns Series: A=1, B=4
df.loc['x', 'A']     # Returns: 1 (scalar)
df.loc['x':'y']      # Returns DataFrame (2 rows)
df.loc['x':'y', 'A'] # Returns Series: x=1, y=2
df.loc['x', ['A', 'B']] # Returns Series: A=1, B=4

.iloc[] - Position-based Indexing

The .iloc[] method also returns the carthesian product of rows and columns. This method is strictly position-based. It always uses integer positions.

df.iloc[row_pos, col_pos]
df.iloc[0:3]                    # EXCLUSIVE end (standard Python)
df.iloc[-1]                     # Last row
# Always uses integer positions

.at[] / .iat[] - Single scalar value

The methods .at[] and .iat[] are optimized for speed when accessing a single value. .at[] is label-based, .iat[] is position-based.

The syntac is similar to .loc[] and .iloc[], but only for single values:

df.at['row_label', 'col_label']   # Label-based scalar
df.iat[0, 1]                       # Position-based scalar
# Fastest for single values

“Impure” Methods (Mixed/context-dependent behavior):

The following methods have mixed or context-dependent behavior.

df[] - Convenience accessor

df['col']                       # Column (by name)
df[['col1', 'col2']]           # Columns (by names)
df[0:3]                        # Rows (by POSITION!) 
df[df['col'] > 5]              # Rows (by boolean)
df['2020':'2021']              # Rows (by label if index is dates/strings)
# Behavior changes based on input type!

.xs() - Cross-section

Above examples use normal indexes. The Cross section is useful when working with MultiIndexes.

The syntax is:

df.xs('key')                    # Can be label or level in MultiIndex
df.xs(('a', 'b'), level=[0,1])  # Mixed with MultiIndex for rows
df.xs('col_name', axis=1, level='level_name')
# Behavior depends on index structure

Here is an example with MultiIndex rows:

# MultiIndex DataFrame
arrays = [['bar', 'bar', 'baz', 'baz'],
['one', 'two', 'one', 'two']]
index = pd.MultiIndex.from_arrays(arrays, names=['first', 'second'])
df = pd.DataFrame({'A': [1, 2, 3, 4]}, index=index)

# Select cross-section at first level
df.xs('bar', level='first')
#        A
# second
# one    1
# two    2

# Select cross-section at second level
df.xs('one', level='second')
#       A
# first
# bar   1
# baz   3

# Multiple levels
df.xs(('bar', 'two'))  # Returns: A=2 (scalar/Series)

.query() - String evaluation

Query is a Convenience method that evalues sql-like expressions on the DataFrame.

df.query('col > 5')             # Evaluates string
df.query('col > @var')          # Mixes with Python variables
# Mixes string parsing with DataFrame operations

Overview

This is how I remember the different methods / look the up.

Comparison Table:

Method Type Label/Position Use Case Predictable?
.loc[] Pure Label Explicit label-based access ✓ Always
.iloc[] Pure Position Explicit position-based access ✓ Always
.at[] Pure Label Fast single value ✓ Always
.iat[] Pure Position Fast single value ✓ Always
df[] Impure Mixed Quick access ✗ Context-dependent
.xs() Impure Mixed MultiIndex slicing ✗ Depends on index
.query() Impure Label Complex filters ~ String parsing

Other methods (Special purpose):

Boolean/conditional:

df.where()      # Conditional replacement (keeps shape)
df.mask()       # Inverse of where
df.filter()     # Filter columns/rows by name pattern

Selection helpers:

df.head()/tail()           # First/last n rows
df.sample()                # Random rows
df.nlargest()/nsmallest()  # By values
df.take()                  # By position array

Best Practices:

  1. Use pure methods when explicit behavior needed
  2. Use .loc[] for setting values (avoid chained indexing)
  3. Use df[] only for simple column access
  4. Be careful with df[slice] - it’s position-based, not label-based!

Conclusion

Remembering the Pandas API is hard if you do not use it daily. The consise syntax of impure methods like df[] and .xs() looks great, but is hard to remember and read.

When it comes to indexing and slicing, prefer the pure methods .loc[] and .iloc[] for clarity and predictability.