In this article, you will learn about sparse data in Python SciPy.
Before moving ahead, you may want to review SciPy Optimizers.
SciPy Sparse Data
What is Sparse Data
Sparse data is data in which most of the elements are zero.
It can be any array, such as:
[1, 0, 2, 0, 0, 3, 0, 10, 0, 20, 0, 15]
In other words, sparse data is data where most of the values are zero.
Dense data, by contrast, is data where most of the values are not zero.
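To make the sparse versus dense distinction concrete, the following minimal sketch measures what fraction of an array is zero. The helper name zero_fraction and the sample array are made up for illustration and are not part of SciPy.

```python
import numpy as np


def zero_fraction(a):
    """Return the fraction of elements in `a` that are zero (hypothetical helper)."""
    a = np.asarray(a)
    return (a.size - np.count_nonzero(a)) / a.size


# Illustrative 4 x 5 array with only three nonzero values.
sample = np.zeros((4, 5), dtype=int)
sample[0, 1] = 7
sample[2, 3] = 4
sample[3, 0] = 9

fraction = zero_fraction(sample)
print(fraction)        # share of zeros in the array
print(fraction > 0.5)  # True means "mostly zero", i.e. sparse by the definition above
```

The 0.5 threshold simply mirrors the informal definition "most of the values are zero"; in practice, sparse formats tend to pay off only when the share of zeros is much higher than that.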
Note: We will discuss sparse data in more detail when we go through partial derivatives in linear algebra.
Get started with Sparse Data
To work with sparse data, SciPy provides a module called scipy.sparse.
It offers several sparse matrix formats; the two we will use are –
Compressed Sparse Column (CSC) – For efficient arithmetic, fast column slicing.
Compressed Sparse Row (CSR) – For fast row slicing, faster matrix vector products.
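The two formats listed above can be compared side by side. The sketch below builds both from the same small dense array (chosen here purely for illustration), slices a row from the CSR matrix and a column from the CSC matrix, and computes a matrix vector product.

```python
import numpy as np
from scipy.sparse import csr_matrix, csc_matrix

# Illustrative dense array; any 2-D array would do.
dense = np.array([[0, 50, 0], [10, 0, 1], [1, 0, 2]])

csr = csr_matrix(dense)  # row-oriented storage
csc = csc_matrix(dense)  # column-oriented storage

# Row slicing is the natural operation for CSR.
row_1 = csr[1, :].toarray()
print(row_1)

# Column slicing is the natural operation for CSC.
col_2 = csc[:, 2].toarray()
print(col_2)

# Matrix vector product with a dense 1-D vector.
v = np.array([1, 2, 3])
product = csr @ v
print(product)
```

Both objects describe the same matrix; only the internal layout differs, which is why one format can be faster than the other for a given access pattern.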
Here, we will focus on the CSR matrix.
CSR Matrix
A CSR matrix can be created by passing an array to the scipy.sparse.csr_matrix() function.
Example – Using an array to create a CSR matrix
import numpy as np
from scipy.sparse import csr_matrix
array = np.array([0, 0, 10, 0, 10, 1, 1, 0, 2])
print(csr_matrix(array))
As a result, it prints each stored (nonzero) value together with its row and column position. Because the input array is one-dimensional, all values land in row 0.
From the above data:
The item in row 0, position 2 has the value 10.
The item in row 0, position 4 has the value 10.
The item in row 0, position 5 has the value 1.
The item in row 0, position 6 has the value 1.
The item in row 0, position 8 has the value 2.
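The positions listed above come from the three arrays that a CSR matrix keeps internally. As a small sketch using the same input array, the data, indices, and indptr attributes can be inspected directly:

```python
import numpy as np
from scipy.sparse import csr_matrix

array = np.array([0, 0, 10, 0, 10, 1, 1, 0, 2])
matrix = csr_matrix(array)

print(matrix.shape)    # a 1-D input becomes a matrix with a single row
print(matrix.data)     # the stored (nonzero) values, in row order
print(matrix.indices)  # the column index of each stored value
print(matrix.indptr)   # row i occupies data[indptr[i]:indptr[i + 1]]
```

Here indices holds the positions 2, 4, 5, 6, and 8 from the list above, and indptr marks where the single row starts and ends within data.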
Sparse Matrix Methods
The data property shows only the stored values (zeros are not included).
Example – Viewing only the non-zero values.
import numpy as np
from scipy.sparse import csr_matrix
array = np.array([[0, 50, 0], [10, 0, 1], [1, 0, 2]])
print(csr_matrix(array).data)
As a result, it returns only the non-zero values, row by row: 50, 10, 1, 1, and 2.
Next, count the non-zero elements with the count_nonzero() method.
Example – Counting the number of elements that are not zero.
import numpy as np
from scipy.sparse import csr_matrix
array = np.array([[0, 10, 0], [0, 50, 1], [1, 0, 2]])
print(csr_matrix(array).count_nonzero())
As shown above, it returns the number of non-zero elements, which is 5.
Next, remove zero entries from the matrix with the eliminate_zeros() method.
Example – Removing zero entries from the matrix.
import numpy as np
from scipy.sparse import csr_matrix
array = np.array([[0, 8, 0], [0, 10, 1], [1, 0, 2]])
matrix = csr_matrix(array)
matrix.eliminate_zeros()
print(matrix)
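As has been noted, it returns the data after eliminating zeros. The method works in place and only removes zeros that are explicitly stored; a CSR matrix built from a dense array never stores zeros, so in this example the printed result is the same as it would be without the call.
Next, the sum_duplicates() method merges duplicate entries, that is, entries stored at the same row and column position, by adding them together. It does not remove values that merely happen to be equal.
Example – Merging duplicate entries by using sum_duplicates().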
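To see eliminate_zeros() actually change something, the matrix needs an explicitly stored zero. The sketch below creates one by overwriting an entry of the data array, which is just one convenient way to produce a stored zero for demonstration purposes.

```python
import numpy as np
from scipy.sparse import csr_matrix

array = np.array([[0, 8, 0], [0, 10, 1], [1, 0, 2]])
matrix = csr_matrix(array)

# Overwrite the first stored value (the 8) with 0.
# The entry is still stored, so it is now an explicit zero.
matrix.data[0] = 0

stored_before = matrix.nnz               # counts stored entries, including explicit zeros
nonzero_count = matrix.count_nonzero()   # counts only entries whose value is not zero

matrix.eliminate_zeros()                 # drops the explicit zero in place
stored_after = matrix.nnz

print(stored_before, nonzero_count, stored_after)
print(matrix.data)
```

This also shows the difference between the nnz attribute, which counts stored entries, and count_nonzero(), which ignores stored zeros.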
import numpy as np
from scipy.sparse import csr_matrix
array = np.array([[0, 0, 0], [0, 0, 1], [1, 0, 2]])
matrix = csr_matrix(array)
matrix.sum_duplicates()
print(matrix)
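As can be seen, it returns the data after merging duplicate entries. This array has no two entries at the same position, so the printed result is unchanged here.
Finally, convert the matrix from CSR to CSC with the tocsc() method.
Example – Converting a matrix from CSR to CSC.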
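For sum_duplicates() to have a visible effect, the matrix must hold two entries at the same position. One way to get there, sketched below, is to build the CSR matrix directly from its data, indices, and indptr arrays; the values are invented for illustration.

```python
import numpy as np
from scipy.sparse import csr_matrix

# One row with three stored entries; the first two both sit in column 0.
data = np.array([1, 2, 3])
indices = np.array([0, 0, 2])   # column index of each stored entry
indptr = np.array([0, 3])       # the single row spans data[0:3]
matrix = csr_matrix((data, indices, indptr), shape=(1, 3))

stored_before = matrix.nnz      # duplicates are counted separately
matrix.sum_duplicates()         # merges same-position entries in place by adding them
stored_after = matrix.nnz

print(stored_before, stored_after)
print(matrix.data)
print(matrix.indices)
```

After the call, the two entries in column 0 are replaced by a single entry holding their sum, while the entry in column 2 is untouched.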
import numpy as np
from scipy.sparse import csr_matrix
array = np.array([[1, 0, 3], [0, 0, 1], [1, 0, 2]])
new_array = csr_matrix(array).tocsc()
print(new_array)
As a result, the matrix is converted from CSR to CSC. The stored values are the same; only the internal layout changes from row-oriented to column-oriented.
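As a quick check of the conversion, the following sketch confirms that the converted matrix reports the CSC format, holds the same values as the original, and can be converted back with tocsr().

```python
import numpy as np
from scipy.sparse import csr_matrix

array = np.array([[1, 0, 3], [0, 0, 1], [1, 0, 2]])
csr = csr_matrix(array)
csc = csr.tocsc()

print(csr.format, csc.format)  # storage format name of each matrix

# Both matrices expand to the same dense array.
same_values = np.array_equal(csr.toarray(), csc.toarray())
print(same_values)

# The stored values are ordered by row in CSR and by column in CSC.
print(csr.data)
print(csc.data)

# Converting back gives a CSR matrix again.
back = csc.tocsr()
print(back.format)
```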