arrays.txt
Arrays
[ I created this file as a .txt file rather than a .py file. So, you will
have to type my sample code into a python interpreter session to
execute the code. I decided to do this for the ease of my
presentation. Anything that follows a python prompt, i.e., >>>, is
an actual piece of Python code that you would want to execute.
]
I found an online tutorial on NumPy:
http://www.scipy.org/Tentative_NumPy_Tutorial
I am going to extract some material from that tutorial to meet our
students' needs for the upcoming workshop. You are certainly welcome
to read that tutorial if you want to know more beyond what I cover
here.
A Quick Tour First:
NumPy is a Python library for working with multidimensional arrays.
The main data type is an array. An array is a set of elements,
all of the same type, indexed by a vector of nonnegative integers.
Arrays can be created in different ways:
>>> from numpy import *
Create an array out of a list:
>>> u = [10, 20, 30, 40, 50]
>>> a = array(u)
>>> a
array([10, 20, 30, 40, 50])
Create an array of 5 integers, from 0 to 4, using the arange
function. Note that 'range' returns a list in Python 2 (and a lazy
range object in Python 3), whereas 'arange' in NumPy returns an array:
>>> c = arange(5)
>>> c
array([0, 1, 2, 3, 4])
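Here is a minimal sketch of a few more ways to call arange. It is written as a script rather than an interpreter session, and it imports NumPy as np instead of using 'from numpy import *'; the variable names are just for illustration.

```python
import numpy as np

# arange accepts start, stop, and step, like Python's built-in range;
# the stop value itself is excluded
evens = np.arange(0, 10, 2)

# unlike range, arange also accepts a non-integer step
quarters = np.arange(0.0, 1.0, 0.25)

# converting a range to an array gives the same values as arange
same = np.array(range(5))

print(evens)
print(quarters)
print(np.array_equal(same, np.arange(5)))
```

With a non-integer step, rounding can make the number of elements hard to predict, which is one reason linspace is often preferred for floating-point ranges.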
Create an array of 3 evenly spaced samples from -pi to pi:
>>> d = linspace(-pi, pi, 3)
>>> d
array([-3.14159265, 0. , 3.14159265])
New arrays can be created by operating with existing arrays:
>>> e = a+c**2 # elementwise operations
>>> e
array([10, 21, 34, 49, 66])
Arrays may have multiple dimensions. Here one is created with 'ones'
('zeros' is also available; try it):
>>> x = ones((3,5))
>>> x
array([[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]])
>>> x.shape # a tuple with the dimensions
(3, 5)
You can change the dimensions of existing arrays:
>>> y = arange(15)
>>> y
array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14])
Assigning a new tuple to the shape attribute reshapes the array in
place. Reshaping does not change the total number of elements, so the
new shape must account for exactly as many elements as the old one
(here 3*5 = 15):
>>> y.shape = 3,5
>>> y
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
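The same reshaping can also be done with the reshape method, which is not used in the session above. The following is a small sketch, assuming NumPy is imported as np; it also shows what happens when the element counts do not match.

```python
import numpy as np

y = np.arange(15)

# reshape returns a reshaped array and leaves y's own shape unchanged
z = y.reshape(3, 5)

# -1 asks NumPy to infer that dimension from the total size (15 / 5 = 3)
w = y.reshape(-1, 5)

# a shape that does not account for all 15 elements is rejected
try:
    y.reshape(4, 4)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True

print(y.shape, z.shape, w.shape, mismatch_rejected)
```

Using reshape is convenient when you want to keep the original array as it is and work with a differently shaped version of the same data.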
It is possible to operate with arrays of different dimensions as long
as their shapes are compatible ('broadcasting'):
>>> 3*a # multiply each element of a by 3
array([ 30, 60, 90, 120, 150])
>>> a+y # add a to each row of y
array([[10, 21, 32, 43, 54],
[15, 26, 37, 48, 59],
[20, 31, 42, 53, 64]])
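To make 'compatible' a little more concrete: NumPy compares the two shapes starting from the last axis, and two axis lengths are compatible when they are equal or when one of them is 1. The sketch below, assuming NumPy is imported as np, shows one case that works along rows, one that works along columns, and one that fails.

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])   # shape (5,)
y = np.arange(15).reshape(3, 5)      # shape (3, 5)

# (5,) against (3, 5): the last axes are both 5, so a is added to each row
row_sum = a + y

# (3, 1) against (3, 5): the length-1 axis is stretched across 5 columns
col = np.array([[100], [200], [300]])
col_sum = col + y

# (3,) against (3, 5): the last axes are 3 and 5, which do not match
try:
    np.array([1, 2, 3]) + y
    incompatible_rejected = False
except ValueError:
    incompatible_rejected = True

print(row_sum.shape, col_sum.shape, incompatible_rejected)
```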
Like Python lists, arrays can be indexed, sliced, and iterated over:
>>> a[3:5] = -7,-3 # modify last two elements of a
>>> a
array([10, 20, 30, -7, -3])
>>> for i in a: # iterate over a
...     print(i)
...
10
20
30
-7
-3
When indexing more than one dimension, indices are separated by
commas:
>>> x[1,2] = 20
>>> x[1,:] # x's second row
array([ 1., 1., 20., 1., 1.])
>>> x[0] = a # change first row of x
>>> x
array([[10., 20., 30., -7., -3.],
[ 1., 1., 20., 1., 1.],
[ 1., 1., 1., 1., 1.]])
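Slices can be used on every axis, not only to pick out a row. The following sketch, assuming NumPy is imported as np, selects a column and a rectangular block from an array like x, and shows that writing through a slice changes the original array.

```python
import numpy as np

x = np.ones((3, 5))
x[1, 2] = 20

col = x[:, 2]          # third column: one element from each row
block = x[0:2, 1:4]    # rows 0 and 1, columns 1 through 3
last_row = x[-1]       # negative indices count from the end, as with lists

# a basic slice is a view on x, so this assignment also changes x[0, 1]
block[0, 0] = 99

print(col)
print(block.shape)
print(x[0, 1])
```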
A Little More in Detail:
NumPy's main object is the homogeneous multidimensional array. This
is a table of elements (usually numbers), all of the same type,
indexed by a tuple of nonnegative integers. Typical examples of
multidimensional arrays include vectors, matrices, images, and
spreadsheets.
By 'multidimensional', we mean that arrays can have several dimensions
or axes. Because the word dimension is ambiguous, we use axis
instead. The number of axes will often be called rank.
For example, the coordinates of a point in 3D space, [1, 2, 1], form
an array of rank 1: it has one axis, and that axis has a length of 3.
As another example, the array:
[[ 1., 0., 0.],
[ 0., 1., 2.]]
is an array of rank 2 (it is 2-dimensional). The first dimension
(axis) has a length of 2, the second dimension has a length of 3.
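You can ask NumPy for these numbers directly. Here is a small sketch, assuming NumPy is imported as np, that builds the two example arrays and reads off the number of axes and the length of each axis; the names point and table are only for illustration.

```python
import numpy as np

point = np.array([1, 2, 1])
table = np.array([[1., 0., 0.],
                  [0., 1., 2.]])

# ndim is the number of axes; shape gives the length of each axis
print(point.ndim, point.shape)
print(table.ndim, table.shape)

# len() reports the length of the first axis only
print(len(table))
```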
The multidimensional array class is called 'ndarray'. Note that this
is not the same as the class array.array in the Python standard
library, which handles only one-dimensional arrays. The more important
attributes of an ndarray object are:
- ndarray.ndim: the number of axes (dimensions) of the array.
- ndarray.shape: the dimensions of the array. This is a tuple of
integers indicating the size of the array in each dimension. For a
matrix with n rows and m columns, shape will be (n,m). The length
of the shape tuple is therefore the rank, or number of dimensions,
ndim.
- ndarray.size: the total number of elements of the array.
- ndarray.dtype: an object describing the type of the elements in the
array. One can create or specify a dtype using standard Python
types. NumPy also provides types of its own, for example: bool_,
int_, int8, int16, int32, int64, float16, float32, float64,
complex64, complex128, object_.
- ndarray.itemsize: the size in bytes of each element of the array.
For example, an array of elements of type float64 has itemsize 8
(=64/8), while one of type int32 has itemsize 4 (=32/8). It is
equivalent to ndarray.dtype.itemsize.
- ndarray.data: the buffer containing the actual elements of the
array. Normally, we won't need to use this attribute because we
will access the elements in an array using indexing facilities.
Let us see an example:
>>> a = array([2,3,4])
>>> a
array([2, 3, 4])
>>> type(a) # a is an object of the ndarray class
<class 'numpy.ndarray'>
The array function transforms sequences of sequences into
two-dimensional arrays, sequences of sequences of sequences into
three-dimensional arrays, and so on. The type of the resulting array
is deduced from the type of the elements in the sequences.
>>> b = array([(1.5,2,3), (4,5,6)]) # This will be an array of floats
>>> b
array([[ 1.5, 2. , 3. ],
[ 4. , 5. , 6. ]])
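The following sketch, assuming NumPy is imported as np, illustrates how the element type is deduced and how the nesting depth of the input sets the number of axes. The variable names are only for illustration.

```python
import numpy as np

ints = np.array([1, 2, 3])      # all integers: an integer dtype
mixed = np.array([1, 2.5, 3])   # a single float makes every element a float
cplx = np.array([1, 2.5, 3j])   # a single complex makes every element complex

# three levels of nesting give an array with three axes
cube = np.array([[[1, 2], [3, 4]],
                 [[5, 6], [7, 8]]])

print(ints.dtype, mixed.dtype, cplx.dtype)
print(cube.ndim, cube.shape)
```

The exact integer dtype (for example int32 or int64) can depend on the platform and the NumPy version, so the sketch does not assume a particular one.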
Once we have an array we can take a look at its attributes:
>>> b.ndim # number of dimensions
2
>>> b.shape # the dimensions
(2, 3)
>>> b.dtype # the type of elements (8 byte floats)
dtype('float64')
>>> type(b) # by contrast, the type of the array object itself
<class 'numpy.ndarray'>
>>> b.itemsize # the size of the type
8
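The attributes are related to each other in simple ways. The sketch below, assuming NumPy is imported as np, checks those relations for the array b. It also uses the nbytes attribute, which is not mentioned in the list above; it gives the total number of bytes taken by the elements.

```python
import numpy as np

b = np.array([(1.5, 2, 3), (4, 5, 6)])

# size is the product of the axis lengths in shape
rows, cols = b.shape
print(b.size == rows * cols)

# the length of the shape tuple is the number of axes
print(len(b.shape) == b.ndim)

# total bytes used by the elements = number of elements * bytes per element
print(b.size * b.itemsize == b.nbytes)
```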
The element type of the array can also be explicitly specified at
creation time. Here I am using complex, but you may try any of the
ones given in the description of ndarray.dtype above:
>>> c = array([[1,2], [3,4]], dtype=complex)
>>> c
array([[ 1.+0.j, 2.+0.j],
[ 3.+0.j, 4.+0.j]])
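Here is a short sketch with a few other element types, assuming NumPy is imported as np. It also uses the astype method, which is not covered above, to make a converted copy of an existing array.

```python
import numpy as np

# request specific element types at creation time
f32 = np.array([1.7, 2.2, 3.9], dtype=np.float32)
i8 = np.array([1, 2, 3], dtype=np.int8)

print(f32.dtype, f32.itemsize)
print(i8.dtype, i8.itemsize)

# astype returns a new array; float-to-integer conversion drops the fraction
as_int = f32.astype(np.int64)
print(as_int)
```

A smaller element type saves memory, at the cost of range or precision, so it is worth choosing deliberately for large arrays.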
Some Basic Operations:
Arithmetic operators on arrays apply elementwise. A new array is
created and filled with the result.
>>> a = array( [20,30,40,50] )
>>> b = arange( 4 )
>>> c = a-b
>>> c
array([20, 29, 38, 47])
>>> b**2
array([0, 1, 4, 9])
>>> 10*sin(a)
array([ 9.12945251, -9.88031624, 7.4511316 , -2.62374854])
>>> a<35
array([True, True, False, False], dtype=bool)
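The sketch below, assuming NumPy is imported as np, repeats a few of these operations in script form to underline two points: the operands are left unchanged, and '*' multiplies element by element rather than computing a matrix product.

```python
import numpy as np

a = np.array([20, 30, 40, 50])
b = np.arange(4)

c = a - b        # a new array; a and b keep their values
prod = a * b     # elementwise product, not a matrix product
mask = a < 35    # a boolean array with one entry per element of a

# count how many elements satisfy the comparison
n_small = np.count_nonzero(mask)

print(c)
print(prod)
print(mask, n_small)
```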
I believe this gives you the basic knowledge needed to attend the
workshop.
There are many more interesting operations dealing with arrays and
matrices. I will stop here and refer you to the rest of the online
tutorial and, beyond that, to the NumPy manual.