# Introducing Basic and Advanced Indexing¶

Thus far we have seen that we can access the contents of a NumPy array
by specifying an integer or slice-object as an index for each one of its
dimensions. Indexing into and slicing along the dimensions of an array
are known as basic indexing. NumPy also provides a sophisticated system
of “advanced indexing”, which permits us powerful means for accessing
elements of an array that is flexible beyond specifying integers and
slices along axes. For example, we can use advanced indexing to access
all of the negative-valued elements from `x`

.

```
# demonstrating basic indexing and advanced indexing
>>> import numpy as np
>>> x = np.array([[ -5, 2, 0, -7],
... [ -1, 9, 3, 8],
... [ -3, -3, 4, 6]])
# Access the column-1 of row-0 and row-2.
# This is an example of basic indexing.
# A "view" of the underlying data in `x`
# is produced; no data is copied.
>>> x[::2, 1]
array([ 2, -3])
# An example of advanced indexing.
# Access all negative elements in `x`.
# This produces a copy of the accessed data.
>>> x[x < 0]
array([-5, -7, -1, -3, -3])
```

We will see that, where basic indexing provides us with a *view* of the
data within the array, without making a copy of it, advanced indexing
requires that a copy of the accessed data be made. Here, we will define
basic indexing and understand the nuances of working with views of
arrays. The next section, then, is dedicated to understanding advanced
indexing.

## Basic Indexing¶

We begin this subsection by defining precisely what basic indexing is. Next, we will touch on each component of this definition, and lastly we will delve into the significance of basic indexing in the way it permits us to reference the underlying data of an array without copying it.

**Definition: Basic Indexing**:

Given an \(N\)-dimensional array, `x`

, `x[index]`

invokes
**basic indexing** whenever `index`

is a *tuple* containing any
combination of the following types of objects:

- integers
- slice objects
- Ellipsis objects
- numpy.newaxis objects

Accessing the contents of an array via basic indexing *does not create a
copy of those contents*. Rather, a “view” of the same underlying data is
produced.

### Indexing with Integers and Slice Objects¶

Our discussion of accessing data along multiple dimensions of a NumPy
array
already provided a comprehensive rundown on the use of integers and
slices to access the contents of an array. According to the preceding
definition, *these were all examples of basic indexing*.

To review the material discussed in that section, recall that one can
access an individual element or a “subsection” of an
\(N\)-dimensional array by specifying \(N\) integers or
slice-objects, or a combination of the two. We also saw that, when
supplied fewer-than \(N\) indices, NumPy will automatically
“fill-in” the remaining indices with trailing slices. Keep in mind that
the indices start at 0, such that the 4th column in `x`

corresponds to
column-3.

```
# Accessing the element located
# at row-1, last-column of `x`
>>> x[1, -1]
8
# Access the subarray of `x`
# contained within the first two rows
# and the first three columns
>>> x[:2, :3]
array([[-5, 2, 0],
[-1, 9, 3]])
# NumPy fills in "trailing" slices
# if we don't supply as many indices
# as there are dimensions in that array
>>> x[0] # equivalent to x[0, :]
array([-5, 2, 0, -7])
```

Recall that the familiar
slicing
syntax actually forms `slice`

objects “behind the scenes”.

```
# Reviewing the `slice` object
# equivalent: x[:2, :3]
>>> x[slice(None, 2), slice(None, 3)]
array([[-5, 2, 0],
[-1, 9, 3]])
```

### Using a Tuple as an N-dimensional Index¶

According to its definition, we must supply our array-indices as a tuple
in order to invoke basic indexing. As it turns out, we have been forming
tuples of indices all along! That is, every time that we index into an
array using the syntax `x[i, j, k]`

, we are actually forming a
tuple
containing those indices. That is, `x[i, j, k]`

is equivalent to
`x[(i, j, k)]`

.

`x[i, j, k]`

forms the tuple `(i, j, k)`

and passes that to the
array’s “get-item” mechanism. Thus, `x[0, 3]`

is equivalent to
`x[(0, 3)]`

.

```
# N-dimensional indexing utilizes tuples:
# `x[i, j, k]` is equivalent to `x[(i, j, k)]`
# equivalent: x[1, -1]
>>> x[(1, -1)]
8
# equivalent: x[:2, :3]
>>> x[(slice(None, 2), slice(None, 3))]
array([[-5, 2, 0],
[-1, 9, 3]])
# equivalent: x[0]
>>> x[(0,)]
array([-5, 2, 0, -7])
```

All objects used in this “get-item” syntax are packed into a tuple. For
instance, `x[0, (0, 1)]`

is equivalent to `x[(0, (0, 1))]`

. You may
be surprised to find that this is a valid index. However, see that *it
does not invoke basic indexing*; the index used here is a tuple that
contains an integer *and another tuple*, which is not permitted by the
rules of basic indexing.

Finally, note that the rules of basic indexing specifically call for a
*tuple* of indices. Supplying a list of indices triggers advanced
indexing rather than basic indexing!

```
# basic indexing specifically requires a tuple
>>> x[(1, -1)]
8
# indexing with a list triggers advanced indexing
>>> x[[1, -1]]
array([[-1, 9, 3, 8],
[-3, -3, 4, 6]])
```

### Ellipsis and Newaxis objects¶

Recall from our discussion of broadcasting, that the `numpy.newaxis`

object can be passed as an index to an array, in order to insert a
size-1 dimension into the
array.

```
# inserting size-1 dimensions with `np.newaxis`
>>> x.shape
(3, 4)
>>> x[np.newaxis, :, :, np.newaxis].shape
(1, 3, 4, 1)
# forming the index as an explicit tuple
>>> x[(np.newaxis, slice(None), slice(None), np.newaxis)].shape
(1, 3, 4, 1)
```

We can also use the built-in `Ellipsis`

object in order to insert
slices into our index such that the index has as many entries as the
array has dimensions. In the same way that `:`

can be used to
represent a `slice`

object, `...`

can be used to represent an
`Ellipsis`

object.

```
>>> y = np.array([[[ 0, 1, 2, 3],
... [ 4, 5, 6, 7]],
...
... [[ 8, 9, 10, 11],
... [12, 13, 14, 15]],
...
... [[16, 17, 18, 19],
... [20, 21, 22, 23]]])
# equivalent: `y[:, :, 0]`
>>> y[..., 0]
array([[ 0, 4],
[ 8, 12],
[16, 20]])
# using an explicit tuple
>>> y[(Ellipsis, 0)]
array([[ 0, 4],
[ 8, 12],
[16, 20]])
# equivalent: `y[0, :, 1]`
>>> y[0, ..., 1]
array([1, 5])
```

An index cannot possess more than one `Ellipsis`

entry. This can be
extremely useful when working with arrays of varying dimensionalities.
To access column-0 along all dimensions of an array, `z`

, would look
like `z[:, 0]`

for a 2D array, `z[:, :, 0]`

for a 3D array, and so
on. `z[..., 0]`

succinctly encapsulates all iterations of this.

**Takeaway:**

Basic indexing is triggered whenever a tuple of: integer, `slice`

,
`numpy.newaxis`

, and/or `Ellipsis`

objects, is used as an index for
a NumPy array. An array produced via basic indexing is a *view* of the
same underlying data as the array that was indexed into; no data is
copied through basic indexing.

**Reading Comprehension: Ellipsis**

Given a \(N\)-dimensional array, `x`

, index into `x`

such that
you axis entry-0 of axis-0, the last entry of axis-\(N-1\),
slicing along all intermediate dimensions. \(N\) is at least
\(2\).

**Reading Comprehension: Basic Indexing**

Given a shape-(4, 3) array,

```
>>> arr = np.array([[ 0, 1, 2, 3],
... [ 4, 5, 6, 7],
... [ 8, 9, 10, 11]])
```

which of the following indexing schemes perform basic indexing? That is, in which instances does the index satisfy the rules of basic indexing?

`arr[0]`

`arr[:-1, 0]`

`arr[(2, 3)]`

`arr[[2, 0]]`

`arr[np.array([2, 0])]`

`arr[(0, 1), (2, 3)]`

`arr[slice(None), ...]`

`arr[(np.newaxis, 0, slice(1, 2), np.newaxis)]`

## Producing a View of an Array¶

As stated above, using basic indexing does not return a copy of the data
being accessed, rather it produces a *view* of the underlying data.
NumPy provides the function `numpy.shares_memory`

to determine if two
arrays refer to the same underlying data.

```
>>> z = np.array([[ 3.31, 4.71, 0.4 ],
... [ 0.21, 2.85, 3.21],
... [-3.77, 4.53, -1.15]])
# `subarray` is column-0 of `z`, via
# basic indexing
>>> subarray = z[:, 0]
>>> subarray
array([ 3.31, 0.21, -3.77])
# `subarray` is a view of the array data
# referenced by `z`
>>> np.shares_memory(subarray, z)
True
```

A single number returned by basic indexing *does not* share memory with
the parent array.

```
>>> z[0, 0]
3.31
>>> np.shares_memory(z[0, 0], z)
False
```

The function `numpy.copy`

can be used to create a copy of an array,
such that it no longer shares memory with any other array.

```
# creating a distinct copy of an array
>>> new_subarray = np.copy(subarray)
>>> new_subarray
array([ 3.31, 0.21, -3.77])
>>> np.shares_memory(new_subarray, z)
False
```

Utilizing an array in a mathematical expression involving the arithmetic
operators (`+, -, *, /, //, **`

) returns an entirely distinct array,
that does not share memory with the original array.

```
# mathematical expressions like `subarray + 2`
# produce distinct arrays, not views
>>> np.shares_memory(subarray + 2, subarray)
False
```

Thus updating a variable `subarray`

via `subarray = subarray + 2`

does *not* overwrite the original data referenced by `subarray`

.
Rather, `subarray + 2`

assigns that new array to the variable
`subarray`

. NumPy does provide mechanisms for performing mathematical
operations to directly update the underlying data of an array without
having to create a distinct array. We will discuss these mechanisms in
the next subsection.

**Reading Comprehension: Views**

Given,

```
x = np.array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
```

Which of the following expressions create views of `x`

? That is, in
which cases do `x`

and the created variable reference the same
underlying array data? Check your work by using `np.shares_memory`

.

`a1 = x`

`a2 = x[0, 0]`

`a3 = x[:, 0]`

`a4 = x[:, 0] + np.array([-1, -2, -3])`

`a5 = np.copy(x[:, 0])`

`a6 = x[np.newaxis]`

`a7 = x.reshape(2, 3, 2)`

`a8 = 2 + x`

## Augmenting the Underlying Data of an Array¶

Because basic indexing produces a *view* of an array’s underlying data,
we must take time to understand the ways in which we can *augment* that
underlying data, versus performing operations that produce an array with
distinct data. Here we will see that:

- in-place assignments
- augmented assignments
- NumPy functions with the
`out`

argument

can all be used to augment array data in-place.

### In-Place Assignments¶

The assignment operator, `=`

, can be used to update an array’s data
in-place. Consider the array `a`

, and its view `b`

.

```
>>> a = np.array([0, 1, 2, 3, 4])
>>> b = a[:]
>>> np.shares_memory(a, b)
True
```

Assigning a new array to `a`

simply changes the data that `a`

references, divorcing `a`

and `b`

, and leaving `b`

unchanged.

```
# `a` is now assigned to reference a distinct array
>>> a = np.array([0, -1, -2, -3, -4])
# `b` still references the original data
>>> b
array([0, 1, 2, 3, 4])
>>> np.shares_memory(a, b)
False
```

Performing an assignment on a *view* of `a`

, i.e. `a[:]`

, instructs
NumPy to perform the assignment to replace `a`

’s data in-place.

```
# reinitialize `a` and `b`.
# `b` is again a view of `a`
>>> a = np.array([0, 1, 2, 3, 4])
>>> b = a[:]
# assigning an array to a *view* of `a`
# causes NumPy to update the data in-place
>>> a[:] = np.array([0, -1, -2, -3, -4])
>>> a
array([ 0, -1, -2, -3, -4])
# `b` a view of the same data, thus
# it is affected by this in-place assignment
>>> b
array([ 0, -1, -2, -3, -4])
>>> np.shares_memory(a, b)
True
```

This view-assignment mechanism can be used update a subsection of an array in-place.

```
>>> p = np.array([[ 0, 1, 2, 3],
... [ 4, 5, 6, 7],
... [ 8, 9, 10, 11]])
>>> q = p[0, :]
# Assign row-0, column-0 the value -40
# and row-0, column-2 the value -50
>>> p[0, ::2] = (-40, -50)
# broadcast-assign -1 to a subsection of `p`
>>> p[1:, 2:] = -1
>>> p
array([[-40, 1, -50, 3],
[ 4, 5, -1, -1],
[ 8, 9, -1, -1]])
```

Again, this updates the underlying data, and thus all views of this data reflect this change.

```
# `q` is still a view of row-0 of `p`
>>> q
array([-40, 1, -50, 3])
```

### Augmented Assignments¶

Recall from our discussion of basic mathematical expressions in Python,
that augmented assignment
expressions
provide a nice shorthand notation for updating the value of a variable.
For example, the assignment expression `x = x + 5`

can be rewritten
using the augmented assignment `x += 5`

.

While `x += 5`

is truly only a shorthand in the context of basic
Python objects (integer,s floats, etc.), *augmented assignments on NumPy
arrays behave fundamentally different than their long-form
counterparts*. Specifically, they directly update the underlying data
referenced by the updated array, rather than creating a distinct array,
thus affecting any arrays that are views of that data. We will
demonstrate this here.

```
# Demonstrating that augmented assignments on NumPy
# arrays update the underlying data reference by that
# array.
>>> a = np.array([[ 0, 1, 2, 3],
... [ 4, 5, 6, 7],
... [ 8, 9, 10, 11]])
# `b` and `c` are both views of row-0 of `a`, via basic indexing
>>> b = a[0]
>>> c = a[0]
>>> np.shares_memory(a, b) and np.shares_memory(a, c)
True
# updating `b` using a mathematical expression creates
# a distinct array, which is divorced from `a` and `c`
>>> b = b * -1
>>> b
array([ 0, -1, -2, -3])
>>> np.shares_memory(a, b)
False
# updating `c` using augmented assignment updates the
# underlying data that `c` is a view of
>>> c *= -2
>>> c
array([ 0, -2, -4, -6])
>>> np.shares_memory(a, c)
True
# note that this update is reflected in `a` as well,
# as it still shares memory with `c`
>>> a
array([[ 0, -2, -4, -6],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
```

### Specifying `out`

to Perform NumPy Operations In-Place¶

There is no reason why we should only be able to augment data using
arithmetic operations. Indeed, NumPy’s various mathematical
functions
have an optional keyword argument, `out`

, which can be used to specify
where to “store” the result of the mathematical operation. By default,
the operation will create a distinct array in memory, leaving the input
data unaffected.

```
# Specifying the 'out' argument in a `numpy.exp`
# to augment the data of an array
# `b` is a view of `a`
>>> a = np.array([0., 0.2, 0.4, 0.6, 0.8, 1.])
>>> b = a[:]
>>> np.shares_memory(a, b)
True
# specifying 'out=a' instructs NumPy
# to overwrite the data referenced by `a`
>>> np.exp(a, out=a)
array([ 1., 1.22140276, 1.4918247, 1.8221188, 2.22554093, 2.71828183])
# `b` is still a view of the now-augmented data
>>> b
array([ 1., 1.22140276, 1.4918247, 1.8221188, 2.22554093, 2.71828183])
```

### Benefits and Risks of Augmenting Data In-Place¶

It is critical to understand the relationship between arrays and the
underlying data that they reference. *Operations that augment data
in-place are more efficient than their counterparts that must allocate
memory for a new array.* That is, an expression like `array += 3`

is
more efficient than `array = array + 3`

.

That being said, to *unwittingly* augment the data of an array, and thus
affect all views of that data, is a big mistake; this produces
hard-to-find bugs in the code of novice NumPy users. See that the
following function, `add_3`

, will change the data of the input array.

```
# updating an array in-place within a function
def add_3(x):
x += 3
return x
>>> x = np.array([0, 1, 2])
>>> y = add_3(x)
>>> y
array([3, 4, 5])
# `x` is updated each time `f(x)` is called
>>> x
array([3, 4, 5])
```

This is hugely problematic unless you intended for `add_3`

to affect
the input array. To remedy this, you can simply begin the function by
making a copy of the input array; afterwards you can freely augment this
copied data.

```
def add_3(x):
x = np.copy(x)
x += 3
return x
```

**Reading Comprehension: Augmenting Array Data In-Place**

Given,

```
x = np.array([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]])
y = x[0, :]
```

Which of the following expressions updates the data originally
referenced by `x`

?

```
# 1.
>>> x += 3
```

```
# 2.
>>> y *= 2.4
```

```
# 3.
>>> x = x + 3
```

```
# 4.
>>> y = np.copy(y)
>>> y += 3
```

```
# 5.
>>> np.log(x[1:3], out=x[1:3])
>>> y += 3
```

```
# 6.
>>> y[:] = y + 2
```

```
# 7.
>>> y = y + 2
```

```
# 8.
>>> x[:] = 0
```

```
# 9.
>>> def f(z): z /= 3
>>> f(y)
```

**Takeaway:**

Assignments to views of an array, augmented assignments, and NumPy
functions that provide an `out`

argument, are all methods for
augmenting the data of an array in-place. This will affect any arrays
that are views of that data. Furthermore, these in-place operations are
more efficient than their counterparts that allocate memory for a new
array. That being said, in-place data augmentation must not be used
haphazardly, for this will inevitably lead to treacherous bugs in one’s
code.

## Links to Official Documentation¶

## Reading Comprehension Solutions¶

**Ellipsis: Solution**

Given a \(N\)-dimensional array, `x`

, index into `x`

such that
you axis entry-0 of axis-0, the last entry of axis-\((N-1)\),
slicing along all intermediate dimensions. \(N\) is at least
\(2\).

Using an `Ellipsis`

object in the index allows us to signal NumPy to
insert the slices along the \(N - 2\) intermediate axis of `x`

:

`x[0, ..., -1]`

or `x[0, Ellipsis, -1]`

**Basic Indexing: Solution**

In which instances does the index used satisfy the rules of basic indexing?

`arr[0]`

✔`arr[:-1, 0]`

✔`arr[(2, 3)]`

✔`arr[[2, 0]]`

✘ (index is a`list`

, not a`tuple`

)`arr[np.array([2, 0])]`

✘ (index is a`numpy.ndarray`

, not a`tuple`

)`arr[:, (2, 3)]`

✘ (index contains a tuple; only`int`

,`slice`

,`np.newaxis`

,`Ellipsis`

allowed)`arr[slice(None), ...]`

✔`arr[(np.newaxis, 0, slice(1, 2), np.newaxis)]`

✔

**Views: Solution**

Given,

```
x = np.array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
```

Which of the following expressions create views `x`

? That is, in which
cases do `x`

and the created variable reference the same underlying
array data? Check your work by using `np.shares_memory`

.

`a1 = x`

✔`a2 = x[0, 0]`

✘; when basic indexing returns a single number, that number does not share memory with the parent array.`a3 = x[:, 0]`

✔`a4 = x[:, 0] + np.array([-1, -2, -3])`

✘; arithmetic operations on NumPy arrays create distinct arrays by default.`a5 = np.copy(x[:, 0])`

✘;`numpy.copy`

informs NumPy to create a distinct copy of an array.`a6 = x[np.newaxis]`

✔`a7 = x.reshape(2, 3, 2)`

✔`a8 = 2 + x`

✘; arithmetic operations on NumPy arrays create distinct arrays by default.

**Augmenting Array Data In-Place: Solution**

Given,

```
x = np.array([[ 0., 1., 2., 3.],
[ 4., 5., 6., 7.],
[ 8., 9., 10., 11.]])
y = x[0, :]
```

Which of the following expressions updates the data originally
referenced by `x`

?

```
# 1.
>>> x += 3 ✔
```

```
# 2.
>>> y *= 2.4 ✔
```

```
# 3.
>>> x = x + 3 ✘
```

```
# 4.
>>> y = np.copy(y)
>>> y += 3 ✘
```

```
# 5.
>>> np.log(x[1:3], out=x[1:3]) ✔
```

```
# 6.
>>> y[:] = y + 2 ✔
```

```
# 7.
>>> x = np.square(x) ✘
```

```
# 8.
>>> x[:] = 0 ✔
```

```
# 9.
>>> def f(z): z /= 3
>>> f(y) ✔
```

```
# 10.
>>> np.square(y, out=y) ✔
```