NumPy nanpercentile() Function



The NumPy nanpercentile() function computes the nth percentile of the input array along a specified axis while ignoring NaN (Not a Number) values. If the input contains integers or floats smaller than float64, the output data-type is float64. Otherwise, the output data-type is the same as that of the input. If out is specified, that array is returned instead. This function is particularly useful when dealing with datasets containing missing or invalid data.

In NumPy, the percentile() and nanpercentile() functions both calculate the nth percentile value. The only difference is that the nanpercentile() function excludes NaN values from the computation, whereas percentile() returns NaN as soon as the input contains one.
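The difference between the two functions can be seen with a small sketch. The array below is made up for illustration and is not taken from the examples on this page:

```python
import numpy as np

# illustrative data with one missing value
data = np.array([1.0, 2.0, np.nan, 4.0])

# percentile() propagates NaN: a single NaN makes the result NaN
with_nan = np.percentile(data, 50)

# nanpercentile() drops the NaN first, then works on [1, 2, 4]
without_nan = np.nanpercentile(data, 50)

print("percentile():   ", with_nan)
print("nanpercentile():", without_nan)
```

Here percentile() gives nan, while nanpercentile() gives 2.0, the median of the three remaining values.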

The nanpercentile() function performs interpolation when the desired percentile lies between two data points in the array. By default, it uses linear interpolation to estimate the result.
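As a rough sketch of what linear interpolation means here, the percentile can be reproduced by hand: remove the NaN values, sort the rest, find the fractional position q/100 * (n - 1), and interpolate between the two neighbouring values. The variable names below are illustrative, and the code only mirrors the default 'linear' method; it is not NumPy's internal implementation:

```python
import numpy as np

data = np.array([1, 2, np.nan, 4, 5, np.nan, 7, 8, 9, 10])
q = 25

# drop NaN values and sort what remains
values = np.sort(data[~np.isnan(data)])

# fractional position of the percentile within the sorted values
position = (q / 100) * (len(values) - 1)
lower = int(np.floor(position))
upper = int(np.ceil(position))
fraction = position - lower

# linear interpolation between the two neighbouring values
manual = values[lower] + fraction * (values[upper] - values[lower])

print("Manual result:  ", manual)
print("nanpercentile():", np.nanpercentile(data, q))
```

With 8 valid values the position is 1.75, which lies between the sorted values 2 and 4, so both lines print 3.5.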

Syntax

Following is the syntax of the NumPy nanpercentile() function −

numpy.nanpercentile(a, q, axis=None, out=None, overwrite_input=False, method='linear', keepdims=<no value>, *, weights=None, interpolation=None)

Parameters

Following are the parameters of the NumPy nanpercentile() function −

  • a: Input array or object that can be converted to an array. It can be a NumPy array, list, or scalar value. NaN values are ignored.
  • q: The percentile value or array of percentiles to compute. It should be between 0 and 100.
  • axis (optional): Axis or axes along which the percentiles are computed. If None, the percentile is computed over the entire flattened array.
  • out (optional): Alternate output array to store the result. It must have the same shape as the expected output.
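  • overwrite_input (optional): If True, the input array a may be modified by intermediate calculations to save memory, and its contents are undefined afterwards. Default is False.
  • keepdims (optional): If True, the reduced axes are retained as dimensions of size one in the output, so the result broadcasts correctly against the original array. If it is not set, the reduced axes are dropped.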
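  • weights (optional): An array of weights associated with the values in a. If weights=None, all data in a are assumed to have a weight equal to one. Only method='inverted_cdf' supports weights. This parameter is available in NumPy 2.0 and later.
  • interpolation (optional): Deprecated name for the method keyword argument.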
  • method (optional): Specifies the interpolation method. Options include −
      • linear (default): Linear interpolation between two data points.
      • lower: Use the lower value when the percentile lies between two values.
      • higher: Use the higher value when the percentile lies between two values.
      • midpoint: Use the midpoint of the two values when the percentile lies between them.
      • nearest: Use the nearest value.
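To see how the method options listed above differ, the following sketch computes the same percentile with each of them. It assumes a NumPy version that accepts the method keyword (1.22 or later); the array and the choice of the 40th percentile are illustrative:

```python
import numpy as np

data = np.array([1, np.nan, 5, 7])

# after removing NaN the sorted values are [1, 5, 7];
# the 40th percentile falls at position 0.8, between 1 and 5
results = {}
for method in ["linear", "lower", "higher", "midpoint", "nearest"]:
    results[method] = float(np.nanpercentile(data, 40, method=method))

for method, value in results.items():
    print(method, "->", value)
```

The lower and higher methods return the neighbouring data points 1.0 and 5.0, midpoint returns their average 3.0, nearest picks 5.0 because position 0.8 is closer to index 1, and linear returns about 4.2.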

Return Values

This function returns the computed percentile(s) as a scalar or a NumPy array, depending on the input. The result is based on the specified interpolation method and axis, excluding any NaN values. If multiple percentiles are given, the first axis of the result corresponds to the percentiles. The other axes are the axes that remain after the reduction of a.
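The shape rules described above can be checked with a short sketch. The 3x3 array and the percentile values are chosen only for illustration:

```python
import numpy as np

data = np.array([[1, 2, np.nan],
                 [4, np.nan, 6],
                 [7, 8, 9]])

# a single percentile along axis=1 gives one value per row
single = np.nanpercentile(data, 50, axis=1)

# two percentiles: the first axis of the result indexes the percentiles
multiple = np.nanpercentile(data, [25, 75], axis=1)

# keepdims=True keeps the reduced axis as a dimension of size one
kept = np.nanpercentile(data, 50, axis=1, keepdims=True)

print(single.shape)    # (3,)
print(multiple.shape)  # (2, 3)
print(kept.shape)      # (3, 1)
```

In the two-percentile case, multiple[0] holds the 25th percentile of every row and multiple[1] holds the 75th.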

Example

Following is a basic example to compute the 25th percentile (first quartile) of an array using the NumPy nanpercentile() function, ignoring NaN values −

import numpy as np
# input array with NaN values
data = np.array([1, 2, np.nan, 4, 5, np.nan, 7, 8, 9, 10])
# calculating 25th percentile (first quartile) ignoring NaN
percentile_25 = np.nanpercentile(data, 25)
print("25th Percentile (ignoring NaN):", percentile_25)

Output

Following is the output of the above code −

25th Percentile (ignoring NaN): 3.5

Example: Percentile Along an Axis

The nanpercentile() function can compute percentiles along a specified axis in multi-dimensional arrays. In the following example, we have calculated the 90th percentile along the rows (axis=1) of a 2D array, ignoring NaN values −

import numpy as np
# 2D array with NaN values
data = np.array([[1, 2, np.nan], [4, np.nan, 6], [7, 8, 9]])
# 90th percentile along rows (axis=1), ignoring NaN
percentile_90_rows = np.nanpercentile(data, 90, axis=1)
print("90th Percentile Along Rows (ignoring NaN):", percentile_90_rows)

Output

Following is the output of the above code −

90th Percentile Along Rows (ignoring NaN): [1.9 5.8 8.8]

Example: Usage of 'method' Parameter

In the following example, we have computed the 25th percentile of an array using the 'midpoint' interpolation method, ignoring NaN values −

import numpy as np
# input array with NaN values
data = np.array([1, np.nan, 5, 7])
# 25th percentile using 'midpoint' method, ignoring NaN
percentile_25_midpoint = np.nanpercentile(data, 25, method='midpoint')
print("25th Percentile (Midpoint Method, ignoring NaN):", percentile_25_midpoint)

Output

Following is the output of the above code −

25th Percentile (Midpoint Method, ignoring NaN): 3.0

Example: Multi-Dimensional Arrays

The nanpercentile() function also works on multi-dimensional arrays. In the following example, we have calculated the 75th percentile along the columns (axis=0) of a 2D array, ignoring NaN values −

import numpy as np
# 2D array with NaN values
data = np.array([[1, np.nan, 5], [2, 4, np.nan], [3, 5, 7]])
# 75th percentile along columns (axis=0), ignoring NaN
percentile_75_columns = np.nanpercentile(data, 75, axis=0)
print("75th Percentile Along Columns (ignoring NaN):", percentile_75_columns)

Output

Following is the output of the above code −

75th Percentile Along Columns (ignoring NaN): [2.5  4.75 6.5 ]
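A column-wise result like this can be cross-checked by filtering the NaN values out of each column manually and calling the plain percentile() function on what is left. This is only a verification sketch, not how nanpercentile() is implemented internally:

```python
import numpy as np

data = np.array([[1, np.nan, 5], [2, 4, np.nan], [3, 5, 7]])

# column by column: drop NaN, then use the plain percentile() function
manual = np.array([
    np.percentile(column[~np.isnan(column)], 75)
    for column in data.T
])

result = np.nanpercentile(data, 75, axis=0)

print("Manual per-column result:", manual)
print("nanpercentile() result:  ", result)
```

The second column contains only the values 4 and 5 once NaN is removed, so its 75th percentile is 4 + 0.75 * (5 - 4) = 4.75.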

Example: Graphical Representation

In the following example, we visualize the 50th percentile of an array while ignoring NaN values. NumPy is used to generate the data, and matplotlib is used to plot the results −

import numpy as np
import matplotlib.pyplot as plt
# input data with NaN values
x = np.linspace(0, 10, 100)
x[::10] = np.nan  # introduce NaN values
y = np.nanpercentile(x, 50)
# plotting the result
plt.plot(x, np.full_like(x, y), label="50th Percentile (Median, ignoring NaN)")
plt.title("Nanpercentile Function Visualization")
plt.xlabel("Input")
plt.ylabel("Percentile Value")
plt.legend()
plt.grid()
plt.show()

Output

The plot demonstrates the constant 50th percentile line across the range of values, excluding NaN values −

[Figure: Nanpercentile Function Visualization]