NumPy (short for Numerical Python) is an open-source Python library for working with large, multi-dimensional arrays (ndarray) and matrices of numerical data. It provides fast, efficient numerical operations on these arrays and is commonly used in data science and machine learning to perform mathematical operations on large datasets.
By the end of this article, we will have a good understanding of the basics of NumPy. We will learn about the installation, NumPy data types, and why NumPy is used. Finally, we will learn about NumPy arrays in detail, visualizing NumPy data with Matplotlib, and some of NumPy’s limitations.
So, without further ado, let’s get started!
1. Why use NumPy?
NumPy’s core data structure is the array (ndarray). It looks similar to a Python list, but it can store and operate on numerical data much more efficiently.
Here are some reasons to use NumPy:
- NumPy operations are faster than equivalent operations on Python lists, because array elements are stored in a contiguous block of memory and the operations run in compiled code.
- NumPy provides functions that perform common operations on whole arrays without explicit loops.
- Because operations apply to entire arrays at once, NumPy code is usually shorter and easier to read.
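To make the "no explicit loops" point concrete, here is a minimal sketch that computes the same totals with a plain Python list and with NumPy arrays. The names prices and quantities are made up for illustration.

```python
import numpy as np

# Hypothetical sample data, used only for illustration.
prices = [10.0, 20.0, 30.0]
quantities = [1, 2, 3]

# Plain Python: we loop over paired elements ourselves.
totals_list = [p * q for p, q in zip(prices, quantities)]

# NumPy: a single expression multiplies the arrays element by element.
totals_array = np.array(prices) * np.array(quantities)

print(totals_list)   # [10.0, 40.0, 90.0]
print(totals_array)  # [10. 40. 90.]
```

Both versions give the same numbers; the NumPy version simply moves the loop into compiled code.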
2. Installing NumPy
To use NumPy, we first need to install it on the system. There are several methods, but let’s take a look at two simple methods.
2.1. Installing NumPy with ‘pip‘
pip is Python’s package installer, which makes it easy to install Python libraries and frameworks. pip is bundled with Python 3.4 and later; on older versions, we need to install pip before installing NumPy.
Open the command prompt and type the following command:
pip install numpy
Press Enter and NumPy will start installing. Once it finishes, we can use NumPy in Python programs.

After the installation is finished, we have to import NumPy into a Python program. We can use either import numpy or, more commonly, import numpy as np.
import numpy
# or
import numpy as np
2.2. Installing NumPy with Anaconda
Another way to install NumPy is to install Anaconda, a Python distribution that bundles many tools and libraries.
When we install Anaconda, it installs the major scientific libraries, including NumPy, automatically. To install Anaconda, first download the installer from its download page.
Now, launch it, but remember to check the following boxes:

Just click on the Install button and wait for the installation to complete. Once Anaconda is installed, we can use NumPy in the Windows command prompt, VS Code editor, or PowerShell prompt (one of the tools available in the Anaconda Navigator).
Jupyter Notebook, a web-based interactive computing notebook, is a convenient environment for experimenting with NumPy.
To use Jupyter, open Anaconda Navigator in our system and open the Jupyter Notebook. We can see the Jupyter Notebook option in the image below.

Just click on the Launch button and the notebook will open on the localhost page, as we can see below.

We can click on the New button and select Python 3. We are now ready to use the Jupyter Notebook.
3. NumPy Data Types
NumPy provides a wider range of numeric data types than Python. The additional data types in NumPy are designed for numerical calculations. Here is the list of NumPy data types and the characters used to represent them:
3.1. i (integer)
The ‘i’ character represents signed integer types. When no type is specified, the width of the default integer depends on the platform and the NumPy version: it is usually 64 bits on 64-bit Linux and macOS, while on Windows with NumPy versions before 2.0 it is 32 bits.
As a result, the output of the code below can differ between machines. It may print int64 or int32, indicating a 64-bit or 32-bit signed integer type. Here, the .dtype attribute is used to print the data type.
import numpy as np
arr = np.array([1, 2, 3])
print(arr.dtype) # Prints 'int64' (or 'int32' on some platforms)
We can also explicitly specify the data type when creating an array. For example, the following code will create an array of 32-bit integers.
import numpy as np
arr = np.array([1, 2, 3], dtype=np.int32)
print(arr.dtype) # Prints 'int32'
3.2. b (boolean)
The ‘b’ character represents Boolean values, which can be either True or False. Let’s create an array of Boolean values and print its data type.
import numpy as np
arr = np.array([True, False, True])
print(arr.dtype) # Prints 'bool'
3.3. u (unsigned integer)
The ‘u’ character represents unsigned integer types. Unsigned integers can store only non-negative values. As with signed integers, several widths are available (uint8, uint16, uint32, and uint64). For example, the code below creates a NumPy array of 8-bit unsigned integers.
import numpy as np
arr = np.array([1, 2, 3], dtype=np.uint8)
print(arr.dtype) # Prints 'uint8'
If we try to include a negative value, NumPy rejects it. NumPy 2.x raises an OverflowError, while NumPy 1.24 and later 1.x releases emit a DeprecationWarning instead.
import numpy as np
arr = np.array([1, 2, 3, -4], dtype=np.uint8) # NumPy 2.x raises OverflowError here
print(arr.dtype) # NumPy 1.24+: DeprecationWarning: NumPy will stop allowing conversion of out-of-bound Python integers to integer arrays. The conversion of -4 to uint8 will fail in the future.
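One way to avoid this situation is to check the valid range of an integer type before converting. The sketch below uses np.iinfo() for that; the values list is a made-up input.

```python
import numpy as np

# np.iinfo reports the limits of an integer type.
info = np.iinfo(np.uint8)
print(info.min, info.max)  # 0 255

# Hypothetical input that we want to store as uint8.
values = [1, 2, 3, -4]

# Check every value against the limits before creating the array.
fits = all(info.min <= v <= info.max for v in values)
print(fits)  # False, because -4 is below the minimum
```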
3.4. f (float)
The ‘f’ character represents floating-point numbers. NumPy offers 16-bit, 32-bit, and 64-bit floats (float16, float32, and float64). When we create an array from Python floats without specifying a type, NumPy uses float64, as in the code below.
import numpy as np
arr = np.array([1.1, 2.2, 3.3])
print(arr.dtype) # Prints 'float64'
We can also create floating-point arrays of a specific width by specifying the data type. Below, we create an array of 32-bit floating-point numbers.
import numpy as np
arr = np.array([1.1, 2.2, 3.3], dtype="float32")
print(arr.dtype) # Prints 'float32'
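The choice of width is a trade-off between memory and precision. As a small sketch, we can compare the two types using the itemsize and nbytes attributes and np.finfo():

```python
import numpy as np

a64 = np.array([1.1, 2.2, 3.3])                    # default: float64
a32 = np.array([1.1, 2.2, 3.3], dtype=np.float32)  # explicit: float32

# Bytes used per element and by the whole array.
print(a64.itemsize, a32.itemsize)  # 8 4
print(a64.nbytes, a32.nbytes)      # 24 12

# eps is the gap between 1.0 and the next representable number;
# a larger eps means lower precision.
print(np.finfo(np.float32).eps > np.finfo(np.float64).eps)  # True
```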
3.5. c (complex float)
The ‘c’ character represents complex numbers, available as complex64 and complex128. Complex numbers have a real part and an imaginary part. In complex64, each part is stored as a 32-bit floating-point number; in complex128, each part is stored as a 64-bit floating-point number.
We can access the real and imaginary parts using the array attributes .real and .imag. Here is an example:
import numpy as np
array1 = np.array([1 + 2j, 3 - 4j], dtype=np.complex64)
array2 = np.array([1.5 + 2.5j, -3.5 - 4j], dtype=np.complex128)
print(array1)
print(array2)
print("Real part of Array 1:", array1.real)
print("Imaginary part of Array 1:", array1.imag)
The program output:
[1.+2.j 3.-4.j]
[ 1.5+2.5j -3.5-4.j ]
Real part of Array 1: [1. 3.]
Imaginary part of Array 1: [ 2. -4.]
3.6. m (timedelta64)
The ‘m’ character represents a time delta, that is, a duration or interval of time. It also lets us perform arithmetic on time intervals, such as adding durations together.
In the following code, we create four timedelta objects expressed in days, hours, minutes, and seconds.
import numpy as np
time_day = np.timedelta64(4, 'D')
time_hr = np.timedelta64(7, 'h')
time_min = np.timedelta64(60, 'm')
time_sec = np.timedelta64(120, 's')
print(time_day, time_hr, time_min, time_sec) # Prints '4 days 7 hours 60 minutes 120 seconds'
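As a short sketch of the arithmetic mentioned above, we can add durations that use different units. NumPy converts the result to the finer of the two units.

```python
import numpy as np

# Adding days and hours gives a result expressed in hours.
total = np.timedelta64(4, 'D') + np.timedelta64(7, 'h')
print(total)  # 103 hours

# Dividing by a one-hour timedelta gives a plain number of hours.
hours = total / np.timedelta64(1, 'h')
print(hours)  # 103.0
```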
3.7. M (datetime64)
The ‘M’ character represents a date and time. A datetime64 object can hold the year, month, day, hour, minute, second, and even fractions of a second. For example, consider the following datetime:
np.datetime64('2023-08-18T12:30:45.500')
In this case, the datetime holds the date August 18th, 2023 and the time 12:30:45.500. We can then perform various operations with this datetime object, such as comparing it to other datetimes, calculating time intervals, and more.
In the below code, we have calculated the duration by subtracting the start time from the end time.
import numpy as np
start = np.datetime64('2023-08-19T11:00:00')
end = np.datetime64('2023-08-21T16:40:00')
event_duration = end - start
print(event_duration) # Prints '193200 seconds'
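A few more datetime operations can be sketched in the same way: comparing two datetimes, expressing a duration in hours, and shifting a datetime by a timedelta. The reminder variable is a made-up name for illustration.

```python
import numpy as np

start = np.datetime64('2023-08-19T11:00:00')
end = np.datetime64('2023-08-21T16:40:00')

# Datetimes can be compared directly.
print(start < end)  # True

# Dividing the duration by one hour converts it to a number of hours.
duration_hours = (end - start) / np.timedelta64(1, 'h')
print(round(float(duration_hours), 2))  # 53.67

# Subtracting a timedelta from a datetime gives an earlier datetime.
reminder = start - np.timedelta64(30, 'm')
print(reminder)  # 2023-08-19T10:30:00
```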
4. Creating Arrays with NumPy
NumPy arrays are similar to Python lists, except that a list can store elements of different data types, whereas all elements of a NumPy array share a single data type.
There are several ways to create NumPy arrays. We will explore some of these methods below.
4.1. Creating an Empty Array
The np.empty() function allocates memory for the array without initializing its elements, so the array holds whatever values were previously in that memory.
In the following example, the shape (3, 4) specifies that the array should have 3 rows and 4 columns.
import numpy as np
empty_array = np.empty((3, 4)) # Creates a 3x4 empty array
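Because the contents of an empty array are arbitrary, a common pattern is to fill it before reading from it. Here is a minimal sketch; the fill value 0.5 is arbitrary.

```python
import numpy as np

# Allocate a 3x4 array; its contents are arbitrary at this point.
buffer = np.empty((3, 4))
print(buffer.shape)  # (3, 4)

# Overwrite every element before using the values.
buffer.fill(0.5)
print(buffer[0])  # [0.5 0.5 0.5 0.5]
```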
4.2. Creating N-d Arrays
NumPy provides the following built-in functions to create arrays:
- array()
- arange()
- zeros()
- ones()
- linspace()
Let us briefly look at how to use each of them.
4.2.1. numpy.array()
The numpy.array() function creates an array from an array-like object, such as a list, tuple, or range. For example, the following code creates an array from a list:
import numpy as np
list1 = [1, 2, 3, 4]
array1 = np.array(list1)
print(array1) # Prints: [1 2 3 4]
We can also create multi-dimensional arrays by providing nested lists or tuples:
# Create a 2D NumPy array from a nested Python list
matrix = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
my_2d_array = np.array(matrix)
# Create a 3D NumPy array from nested Python lists
cube = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
my_3d_array = np.array(cube)
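To see how the nesting depth maps to dimensions, we can inspect the shape and ndim attributes and index individual elements. This sketch rebuilds the same two arrays so that it runs on its own.

```python
import numpy as np

my_2d_array = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
my_3d_array = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# One level of nesting per dimension.
print(my_2d_array.shape, my_2d_array.ndim)  # (3, 3) 2
print(my_3d_array.shape, my_3d_array.ndim)  # (2, 2, 2) 3

# Index with one position per dimension.
print(my_2d_array[1, 2])     # 6
print(my_3d_array[1, 0, 1])  # 6
```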
4.2.2. numpy.arange()
The numpy.arange() function creates one-dimensional arrays of evenly spaced numbers. It takes three arguments:
- start: The starting value of the array (defaults to 0).
- stop: The ending value of the array, which is not included.
- step: The step size between elements in the array (defaults to 1).
# Create an array of integers from 0 up to (but not including) 10
arr1 = np.arange(10)
print(arr1) # Output: [0 1 2 3 4 5 6 7 8 9]
# Create an array of even integers from 2 to 10 (exclusive)
arr2 = np.arange(2, 10, 2)
print(arr2) # Output: [2 4 6 8]
# Create an array of floating-point numbers from 0.0 to 1.0 (exclusive) with a step of 0.1
arr3 = np.arange(0.0, 1.0, 0.1)
print(arr3) # Output: [0. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9]
We can combine arange() with other NumPy functions to create multi-dimensional arrays. For example, below we use reshape() to convert a one-dimensional array into a 2-D array.
# Create a one-dimensional array of 12 elements
arr1d = np.arange(12)
# Reshape it into a 3x4 two-dimensional array
arr2d = arr1d.reshape(3, 4)
print(arr2d)
# Output:
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]
4.2.3. numpy.zeros()
The numpy.zeros() function creates an array filled with zeros. Its main argument is shape (the size of the array); the elements are floats by default.
array3 = np.zeros(8)
print(array3) # Output: [0. 0. 0. 0. 0. 0. 0. 0.]
To create multi-dimensional arrays, we can specify the shape by passing a tuple containing the desired dimensions.
# Create a 2D array with dimensions 3x4 filled with zeros
zeros_2d = np.zeros((3, 4))
# Create a 3D array with dimensions 2x3x2 filled with zeros
zeros_3d = np.zeros((2, 3, 2))
# Create a 4D array with dimensions 2x3x2x2 filled with zeros
zeros_4d = np.zeros((2, 3, 2, 2))
4.2.4. numpy.ones()
The numpy.ones() function creates an array filled with ones. Its main argument is shape (the size of the array); the elements are floats by default.
array4 = np.ones(8)
print(array4) # Output: [1. 1. 1. 1. 1. 1. 1. 1.]
Similar to zeros(), we can pass a tuple of dimensions to create multi-dimensional arrays.
# Create a 2D array with dimensions 3x4 filled with ones
ones_2d = np.ones((3, 4))
# Create a 3D array with dimensions 2x3x2 filled with ones
ones_3d = np.ones((2, 3, 2))
# Create a 4D array with dimensions 2x3x2x2 filled with ones
ones_4d = np.ones((2, 3, 2, 2))
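Both zeros() and ones() also accept an optional dtype argument when floats are not what we want. A minimal sketch:

```python
import numpy as np

# Integer zeros instead of the default floats.
int_zeros = np.zeros((2, 3), dtype=int)
print(int_zeros.dtype.kind)  # i (a signed integer type)

# Boolean ones, which is an array of True values.
bool_ones = np.ones(4, dtype=bool)
print(bool_ones)  # [ True  True  True  True]
```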
4.2.5. numpy.linspace()
The numpy.linspace() function creates an array of evenly spaced numbers over a specified interval. Its most commonly used arguments are:
- start: The starting value of the array.
- stop: The ending value of the array, included by default.
- num: The number of elements in the array (defaults to 50).
- endpoint: Whether to include the stop value in the array (defaults to True).
array5 = np.linspace(0, 1, 5)
print(array5) # Output: [0.   0.25 0.5  0.75 1.  ]
We can use reshape() to convert an array created with linspace() into an N-dimensional array.
# Create a 2D array with 4 rows and 3 columns, with values evenly spaced from 0 to 1 (inclusive)
arr_1d = np.linspace(0, 1, 12) # 12 values
arr_nd = arr_1d.reshape(4, 3) # Reshape into a 2D array
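The endpoint argument changes both the last value and the spacing, which is easy to overlook. The sketch below compares the two settings and uses retstep=True to return the step size as well.

```python
import numpy as np

# Default: the stop value is included, so the step is 1 / 4.
closed = np.linspace(0, 1, 5)
print(closed)  # [0.   0.25 0.5  0.75 1.  ]

# endpoint=False: the stop value is excluded, so the step is 1 / 5.
half_open = np.linspace(0, 1, 5, endpoint=False)
print(half_open)  # [0.  0.2 0.4 0.6 0.8]

# retstep=True also returns the spacing between values.
values, step = np.linspace(0, 1, 5, retstep=True)
print(step)  # 0.25
```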
4.3. Loading Arrays from Files
NumPy provides functions like numpy.loadtxt(), numpy.genfromtxt(), and numpy.load() for reading data from files. The first two read text and CSV files, and the last reads NumPy binary files.
The numpy.load() function loads arrays from ‘.npy’ files (NumPy binary files) previously saved using numpy.save().
import numpy as np
arr = np.array([1, 2, 3, 4])
np.save('my_array.npy', arr)
array = np.load("my_array.npy")
print(array) # Output: [1 2 3 4]
The numpy.genfromtxt() and numpy.loadtxt() functions load arrays from text or CSV files, such as those saved using numpy.savetxt(). Commonly used arguments include:
- filename: The name of the file to load (the parameter is called fname in both functions).
- delimiter: The string used to separate the values in the file.
- dtype: The data type of the values in the file.
- skip_header: The number of lines to skip at the top of the file (genfromtxt() only; loadtxt() uses skiprows).
- comments: The character that marks comment lines in the file.
import numpy as np
arr = np.array([1, 2, 3, 4])
np.savetxt('my_array.txt', arr)
array = np.genfromtxt("my_array.txt", delimiter=",")
print(array) # Output: [1. 2. 3. 4.] (savetxt() writes floats by default)
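For two-dimensional data, it helps to set the delimiter and format explicitly on both sides. The following sketch round-trips a small table through a CSV file; the file name table.csv and the column names in the header are made up for illustration.

```python
import os
import tempfile

import numpy as np

table = np.array([[1, 2, 3], [4, 5, 6]])

# Write into a temporary directory so no files are left behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "table.csv")  # hypothetical file name
    # fmt="%d" writes integers; the header is written as a "# ..." comment line.
    np.savetxt(path, table, delimiter=",", fmt="%d", header="a,b,c")
    # loadtxt() skips lines starting with "#" by default.
    loaded = np.loadtxt(path, delimiter=",", dtype=int)

print(loaded)
```

The loaded array has the same shape and values as the original table.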
5. Constants and Attributes
Constants are predefined values that can be used without having to define them first. Attributes are properties that can be accessed using the dot notation.
5.1. Constants
NumPy provides several important mathematical constants that can be accessed for various calculations. Here are some of the most commonly used constants:
Constant | Description |
---|---|
np.pi | The mathematical constant pi (π), which is approximately equal to 3.141592653589793 |
np.e | The mathematical constant e, which is approximately equal to 2.718281828459045 |
np.inf | Positive infinity |
np.nan | Not a Number (NaN) |
5.2. Attributes
NumPy arrays have several attributes that provide information about the array’s properties. Here are some of the most important attributes:
Attribute | Description |
---|---|
.shape | Tuple giving the size of the array along each dimension |
.dtype | Data type of the array elements |
.size | Total number of elements in the array |
.ndim | Number of dimensions (axes) of the array |
import numpy as np
print(np.pi) # 3.141592653589793
print(np.e) # 2.718281828459045
print(np.inf) # inf
print(np.nan) # nan
array = np.array([1, 2, 3])
print(array.dtype) # int64 (int32 on some platforms)
print(array.shape) # (3,)
print(array.size) # 3
print(array.ndim) # 1
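The same attributes are more informative on a multi-dimensional array. This sketch also shows two properties of the special constants: they can be detected with np.isinf() and np.isnan(), and NaN never compares equal to itself.

```python
import numpy as np

grid = np.arange(6).reshape(2, 3)
print(grid.shape)  # (2, 3)
print(grid.size)   # 6
print(grid.ndim)   # 2

# Detecting the special floating-point constants.
print(np.isinf(np.inf), np.isnan(np.nan))  # True True

# NaN is not equal to anything, including itself.
print(np.nan == np.nan)  # False
```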
6. Working with NumPy Arrays
NumPy arrays support many kinds of operations: arithmetic such as addition, subtraction, and multiplication; statistical functions; element-wise comparisons; matrix operations; set operations; and much more.
6.1. Add, Subtract, Multiply and Divide Arrays
We can perform arithmetic operations on NumPy arrays using the standard arithmetic operators (+, -, *, /). Each operation is applied element-wise and returns a new array. For integer inputs, +, -, and * return integer arrays, while / always returns a floating-point array.
import numpy as np
array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])
add_res = array1 + array2
subtract_res = array1 - array2
multiply_res = array1 * array2
divide_res = array1 / array2
print(add_res, subtract_res, multiply_res, divide_res)
Here’s the output:
[5 7 9] [-3 -3 -3] [ 4 10 18] [0.25 0.4 0.5 ]
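A few related cases are worth a quick sketch: combining an array with a single number, checking the result type of division, and using floor division and remainder when integer results are needed.

```python
import numpy as np

array1 = np.array([1, 2, 3])
array2 = np.array([4, 5, 6])

# A scalar is applied to every element.
print(array1 * 10)  # [10 20 30]

# True division of integer arrays produces floats.
print((array1 / array2).dtype)  # float64

# Floor division and remainder keep integer results.
print(array2 // array1)  # [4 2 2]
print(array2 % array1)   # [0 1 0]
```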
6.2. Statistical Functions
NumPy provides a number of statistical functions for summarizing data. Here are some of them:
- mean(): Returns the mean of the values in an array.
- median(): Returns the middle value of the array when it is sorted.
- min(): Returns the smallest value in an array.
- max(): Returns the largest value in an array.
- std(): Returns the standard deviation of the values in an array.
- var(): Returns the variance of the values in an array.
import numpy as np
data = np.array([11, 13, 17, 20, 23])
mini = np.min(data)
maxi = np.max(data)
mean = np.mean(data)
median = np.median(data)
std_dev = np.std(data)
variance = np.var(data)
print(mini)
print(maxi)
print(mean)
print(median)
print(std_dev)
print(variance)
Here’s the output:
11
23
16.8
17.0
4.4
19.36
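On multi-dimensional arrays, these functions accept an axis argument that controls the direction of the summary. A minimal sketch with a made-up 2x3 array of scores:

```python
import numpy as np

# Hypothetical data: 2 rows, 3 columns.
scores = np.array([[1, 2, 3],
                   [4, 5, 6]])

# No axis: summarize all elements.
print(np.mean(scores))  # 3.5

# axis=0: one result per column.
print(np.mean(scores, axis=0))  # [2.5 3.5 4.5]

# axis=1: one result per row.
print(np.max(scores, axis=1))  # [3 6]
```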
6.3. Comparing Two Arrays
We can use the comparison operators (==, !=, <, >, <=, >=) to compare two NumPy arrays element-wise. The results of these comparisons will be Boolean arrays.
import numpy as np
arr1 = np.array([2, 3, 4])
arr2 = np.array([4, 3, 2])
more_than = arr1 > arr2
equal_to = arr1 == arr2
print(more_than)
print(equal_to)
Here’s the result:
[False False True]
[False True False]
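The Boolean arrays produced by comparisons are often used as masks or reduced to a single answer. The sketch below shows both uses, along with np.array_equal() for comparing whole arrays.

```python
import numpy as np

arr1 = np.array([2, 3, 4])
arr2 = np.array([4, 3, 2])

# Use the Boolean result as a mask to select elements.
mask = arr1 > arr2
print(arr1[mask])  # [4]

# Reduce the mask to a single True/False value.
print(np.any(mask), np.all(mask))  # True False

# Check whether two arrays match in shape and in every element.
print(np.array_equal(arr1, arr2))  # False
```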
6.4. Manipulating Strings Stored in Arrays
NumPy provides a set of string functions in the numpy.char module that can be used to manipulate strings stored in arrays. Some of these functions are:
- lower(): Converts all the characters in each string to lowercase.
- upper(): Converts all the characters in each string to uppercase.
- str_len(): Returns the length of each string.
import numpy as np
names = np.array(['Alice', 'Bob', 'Charlie'])
uppercase = np.char.upper(names)
lowercase = np.char.lower(names)
length = np.char.str_len(names)
print(uppercase)
print(lowercase)
print(length)
Here’s the result:
['ALICE' 'BOB' 'CHARLIE']
['alice' 'bob' 'charlie']
[5 3 7]
6.5. Matrix Operations
NumPy has many functions for working with matrices stored as 2-D arrays. Some of them are:
- dot(): calculates the dot product of two arrays; for 2-D arrays, this is the matrix product.
- transpose(): transposes a matrix (also available as the .T attribute).
- linalg.inv(): calculates a matrix’s inverse.
import numpy as np
matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[5, 6], [7, 8]])
result_dot = np.dot(matrix1, matrix2)
print("Dot Product:")
print(result_dot)
transpose_matrix = matrix1.T
print("Transposed Matrix:")
print(transpose_matrix)
inverse_matrix = np.linalg.inv(matrix1)
print("Inverse Matrix:")
print(inverse_matrix)
Here’s the result:
Dot Product:
[[19 22]
[43 50]]
Transposed Matrix:
[[1 3]
[2 4]]
Inverse Matrix:
[[-2. 1. ]
[ 1.5 -0.5]]
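As a quick sanity check on these results, the sketch below uses the @ operator, which performs the same matrix product as np.dot() for 2-D arrays, and verifies that a matrix times its inverse gives the identity matrix.

```python
import numpy as np

matrix1 = np.array([[1, 2], [3, 4]])
matrix2 = np.array([[5, 6], [7, 8]])

# The @ operator is matrix multiplication.
print(np.array_equal(matrix1 @ matrix2, np.dot(matrix1, matrix2)))  # True

# A matrix multiplied by its inverse should be (nearly) the identity.
inverse = np.linalg.inv(matrix1)
print(np.allclose(matrix1 @ inverse, np.eye(2)))  # True
```

np.allclose() is used instead of an exact comparison because the inverse is computed in floating point.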
6.6. Set Operations
We can do set operations on arrays as well. These include:
- union1d(): returns the sorted union of two arrays.
- intersect1d(): returns the sorted values common to both arrays.
- setdiff1d(): returns the values in the first array that are not in the second.
import numpy as np
set1 = np.array([1, 2, 3, 4])
set2 = np.array([3, 4, 5, 6])
union = np.union1d(set1, set2)
intersection = np.intersect1d(set1, set2)
difference = np.setdiff1d(set1, set2)
print(union)
print(intersection)
print(difference)
Here’s the result:
[1 2 3 4 5 6]
[3 4]
[1 2]
6.7. Vectorization
Vectorization means applying an operation to every element of an array at once, without writing an explicit Python loop. In the following example, we apply the square root and sine functions to each element of the array.
import numpy as np
arr = np.array([4, 9, 16, 25])
result_sqrt = np.sqrt(arr)
result_sin = np.sin(arr)
print("\nElement-wise Square Root:")
print(result_sqrt)
print("\nElement-wise Sine:")
print(result_sin)
Here’s the result:
Element-wise Square Root:
[2. 3. 4. 5.]
Element-wise Sine:
[-0.7568025 0.41211849 -0.28790332 -0.13235175]
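To show what vectorization replaces, here is a small sketch that computes the square roots with an explicit loop and with np.sqrt(), then applies a whole formula to an array in one expression. The temperature values are made up for illustration.

```python
import math

import numpy as np

arr = np.array([4, 9, 16, 25])

# Explicit loop: call math.sqrt once per element.
looped = np.array([math.sqrt(x) for x in arr])

# Vectorized: one call handles the whole array.
vectorized = np.sqrt(arr)
print(np.allclose(looped, vectorized))  # True

# Arithmetic formulas vectorize the same way.
celsius = np.array([0.0, 25.0, 100.0])
fahrenheit = celsius * 9 / 5 + 32
print(fahrenheit)  # [ 32.  77. 212.]
```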
7. Error Handling
When working with NumPy, we can handle errors using standard Python constructs together with NumPy functions, including:
- try/except: handles errors that occur in the code.
- assert: checks for conditions that should always be true.
- isnan(): checks whether a value is NaN (Not a Number).
import numpy as np
# Assert Statement
x = np.array([1, 2, 3])
y = np.array([1, 2, 4])
try:
    assert len(x) == len(y), "Arrays must have the same length"
except AssertionError as e:
    print("Assertion Error:", e)
else:
    print("Lengths are same")
# isnan() Function
z = np.array([1.0, np.nan, 3.0, np.nan, 5.0])
is_nan = np.isnan(z)
print("\nOriginal Array:")
print(z)
print("\nNaN Mask:")
print(is_nan)
Here’s the result:
Lengths are same
Original Array:
[ 1. nan 3. nan 5.]
NaN Mask:
[False True False True False]
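Two related tools can be sketched here. NaN-aware functions such as np.nanmean() skip missing values, and np.errstate() can turn floating-point warnings, such as division by zero, into exceptions that try/except can catch. The variable names are made up for illustration.

```python
import numpy as np

z = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

# Ignore NaN values when computing the mean.
print(np.nanmean(z))  # 3.0

# Use the NaN mask to keep only valid values.
print(z[~np.isnan(z)])  # [1. 3. 5.]

numerator = np.array([1.0, 2.0])
denominator = np.array([0.0, 2.0])
caught = False
try:
    # Inside this block, division by zero raises instead of warning.
    with np.errstate(divide="raise"):
        result = numerator / denominator
except FloatingPointError as e:
    caught = True
    print("Caught:", e)
```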
8. Data Visualization
Next, we will look at visualizing NumPy data. NumPy itself has no plotting functions, so we use the Matplotlib library, which accepts NumPy arrays directly.
First, we need to install matplotlib and import it. It also helps to open a Jupyter notebook so we can run the code and see the plots inline. Matplotlib can show data in many ways, such as line plots, scatter plots, bar graphs, and histograms. Visualizing data helps us quickly understand large data sets.
8.1. Line Plot
A line plot displays data as a series of points connected by a line. Matplotlib’s plot() function draws a line plot; in its simplest form, it takes two arguments: the x-coordinates and the y-coordinates.
Let’s see an example.
import numpy as np
import matplotlib.pyplot as plt
fruit = np.array(["Apple", "Banana", "Orange", "Grapes", "Mango", "Strawberry"])
weight = np.array([150, 120, 180, 85, 200, 50])
plt.plot(fruit, weight)
plt.show()
Here’s the result:

In this case, we’ve used plot() to plot the data. The x and y coordinates are taken from the fruit and weight arrays.
8.2. Scatter Plot
The scatter plot displays data as a collection of points. Use the scatter() function to plot the data.
import numpy as np
import matplotlib.pyplot as plt
fruit = np.array(["Apple", "Banana", "Orange", "Grapes", "Mango", "Strawberry"])
weight = np.array([150, 120, 180, 85, 200, 50])
plt.scatter(fruit, weight)
plt.show()
Here’s the result:

In this example, we used the scatter() function to plot the data points, passing fruit and weight as the x and y coordinates. In a scatter plot, we can also pass the c and s arguments to set the color and size of the points. For example,
import numpy as np
import matplotlib.pyplot as plt
fruit = np.array(["Apple", "Banana", "Orange", "Grapes", "Mango", "Strawberry"])
weight = np.array([150, 120, 180, 85, 200, 50])
colors = np.array([1, 2, 3, 4, 5, 6])
sizes = np.array([21, 41, 61, 81, 101, 121])
plt.scatter(fruit, weight, c=colors, s=sizes)
plt.show()
Here’s the result:

8.3. Bar Graph
A bar graph shows data as rectangular bars whose heights represent the values. Matplotlib has a bar() function that we can use to plot data in a bar graph.
For example,
import numpy as np
import matplotlib.pyplot as plt
fruit = np.array(["Apple", "Banana", "Orange", "Grapes", "Mango", "Strawberry"])
weight = np.array([150, 120, 180, 85, 200, 50])
plt.bar(fruit, weight)
plt.title('Bar Graph')
plt.show()
Here’s the result:

Here, we have used the bar() function to plot the bar graph, passing the two arrays fruit and weight as its arguments.
8.4. Histogram
Matplotlib’s hist() function creates histograms. Here’s an example.
import numpy as np
import matplotlib.pyplot as plt
weight = np.array([0.6, 1.8, 2.2, 2.5])
plt.hist(weight)
plt.show()
Here’s the result:

9. Advantages of NumPy
Let’s discuss some of the great advantages of NumPy:
- NumPy arrays use less memory. NumPy’s arrays are more compact in size than Python lists.
- The speed is also great. NumPy arrays perform computations faster than Python lists, and NumPy’s linear algebra routines rely on optimized BLAS (Basic Linear Algebra Subprograms) libraries.
- In comparison with Python lists, NumPy is more efficient and faster at performing mathematical calculations on arrays and matrices.
- It is open source and all features can be accessed for free.
- NumPy provides many functions, methods, and attributes that simplify computations on matrices.
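The memory advantage can be checked with a rough sketch. It assumes CPython on a 64-bit system and estimates the list’s footprint by adding the size of the list object to the sizes of the integer objects it refers to.

```python
import sys

import numpy as np

n = 1000
numbers_list = list(range(n))
numbers_array = np.arange(n, dtype=np.int64)

# The array stores raw 8-byte integers in one block.
print(numbers_array.nbytes)  # 8000

# The list stores references, and each integer is a separate object.
list_bytes = sys.getsizeof(numbers_list) + sum(sys.getsizeof(x) for x in numbers_list)
print(list_bytes > numbers_array.nbytes)  # True (assuming 64-bit CPython)
```

The exact list size varies between Python versions, so the sketch compares the two numbers instead of printing a fixed value.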
10. Limitations of NumPy
Apart from advantages, there are also some limitations.
- NumPy can have a steep learning curve, especially for beginners who are not familiar with array programming concepts.
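- NumPy arrays have a fixed size and a fixed per-array overhead for metadata such as shape and data type. Very small arrays can therefore use more memory than an equivalent list, and operations such as appending create a full copy of the array, which can cause memory problems on systems with limited memory.
- NumPy has no general-purpose marker for missing values. NaN can stand in for missing data only in floating-point arrays, so integer, Boolean, and string arrays need a workaround such as a masked array. This can be a problem for data analysis tasks that require missing-value handling.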
- NumPy arrays require all elements to be of the same data type. This can limit their use for handling data structures that contain different types of data.
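The last two limitations can be seen in a short sketch. It shows that adding NaN forces an integer array to become a float array, that a masked array from the numpy.ma module is one possible workaround, and that mixing types makes NumPy convert everything to a single common type. The sample values are made up.

```python
import numpy as np

# NaN is a float, so the whole array becomes float64.
with_gap = np.array([1, 2, np.nan])
print(with_gap.dtype)  # float64

# A masked array keeps the integer type and hides the masked entry.
readings = np.ma.masked_array([10, 20, 30], mask=[False, True, False])
print(readings.dtype.kind)  # i (still an integer type)
print(readings.mean())      # 20.0 (the masked 20 is ignored; 10 and 30 remain)

# Mixed input is converted to one common type, here strings.
mixed = np.array([1, "two", 3.0])
print(mixed.dtype.kind)  # U (Unicode string)
```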
11. Conclusion
In this Python tutorial, we have discussed how to get started with NumPy. We took a look at the data types of NumPy, how to create and work with arrays, and how to visualize NumPy data with Matplotlib, followed by the advantages and limitations of NumPy.
Happy Learning!