knn

class datacheese.knn.KNN(seed=None)

Bases: object

K-nearest neighbours classification model.

Parameters:

seed (int or None, default None) – Random seed used to shuffle the data.

Examples

>>> import numpy as np
>>> from datacheese.knn import KNN

Generate input data:

>>> X = np.array([[0], [1], [2], [3]], dtype=np.float64)
>>> X
array([[0.],
       [1.],
       [2.],
       [3.]])

Generate target values:

>>> y = np.array([0, 0, 1, 1], dtype=int)
>>> y
array([0, 0, 1, 1])

Fit model using data:

>>> model = KNN()
>>> model.fit(X, y)

Use model to make predictions:

>>> X_test = np.array([[1.1]], dtype=np.float64)
>>> X_test
array([[1.1]])
>>> y_test = np.array([0], dtype=int)
>>> y_test
array([0])
>>> model.predict(X_test, k=3)
array([0])

Compute prediction accuracy:

>>> model.score(X_test, y_test, k=3)
1.0

fit(X, y)

Fit model by processing and storing training data.

Parameters:
  • X (numpy.ndarray) – 2D training features array, of shape n x d, where n is the number of training examples and d is the number of dimensions.

  • y (numpy.ndarray) – 1D training target values array of shape n, where n is the number of training examples.
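
The shape requirements above can be checked before any data is stored. The following is a minimal sketch only: `validate_training_data` is a hypothetical helper, not part of datacheese, and it assumes that "processing" amounts to shape validation and copying.

```python
import numpy as np


def validate_training_data(X, y):
    """Hypothetical helper: check the documented shapes and return copies."""
    X = np.asarray(X, dtype=np.float64)
    y = np.asarray(y)
    if X.ndim != 2:
        raise ValueError(f"X must be 2D (n x d), got {X.ndim}D")
    if y.ndim != 1:
        raise ValueError(f"y must be 1D (n), got {y.ndim}D")
    if X.shape[0] != y.shape[0]:
        raise ValueError(
            f"X has {X.shape[0]} examples but y has {y.shape[0]}"
        )
    # Copy so later changes to the caller's arrays do not affect stored data.
    return X.copy(), y.copy()


X_train, y_train = validate_training_data([[0], [1], [2], [3]], [0, 0, 1, 1])
```

Here `X_train` has shape n x d with n = 4 and d = 1, and `y_train` has shape n = 4.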

predict(X, k)

Use stored training data to predict target values for test data.

Parameters:
  • X (numpy.ndarray) – 2D testing features array, of shape m x d, where m is the number of testing examples and d is the number of dimensions.

  • k (int) – Number of nearest neighbours used to predict each target value.

Returns:

y_pred – Array of predicted target values.

Return type:

numpy.ndarray
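
To illustrate the general technique, the sketch below implements k-nearest-neighbours prediction in plain NumPy. It is not the datacheese implementation: the function name `knn_predict`, the Euclidean distance metric, and the tie-breaking rule are all assumptions made for this example.

```python
import numpy as np


def knn_predict(X_train, y_train, X_test, k):
    """Illustrative k-NN classifier (assumed: Euclidean distance, majority vote)."""
    # Pairwise squared Euclidean distances, shape (m, n), via broadcasting.
    diffs = X_test[:, np.newaxis, :] - X_train[np.newaxis, :, :]
    dists = np.sum(diffs ** 2, axis=2)
    # Indices of the k closest training examples for each test example.
    nearest = np.argsort(dists, axis=1, kind="stable")[:, :k]
    y_pred = np.empty(X_test.shape[0], dtype=y_train.dtype)
    for i, idx in enumerate(nearest):
        labels, counts = np.unique(y_train[idx], return_counts=True)
        # np.argmax returns the first maximum, so ties go to the smallest label.
        y_pred[i] = labels[np.argmax(counts)]
    return y_pred


X_train = np.array([[0], [1], [2], [3]], dtype=np.float64)
y_train = np.array([0, 0, 1, 1], dtype=int)
X_test = np.array([[1.1]], dtype=np.float64)
y_pred = knn_predict(X_train, y_train, X_test, k=3)
```

For the test point 1.1, the three closest training points are 1, 2 and 0, with labels 0, 1 and 0, so the majority vote gives 0.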

score(X, y, k)

Use stored training data to predict target values for test data and compute the prediction accuracy.

Parameters:
  • X (numpy.ndarray) – 2D testing features array, of shape m x d, where m is the number of testing examples and d is the number of dimensions.

  • y (numpy.ndarray) – 1D testing target values array of shape m, where m is the number of testing examples.

  • k (int) – Number of neighbours.

Returns:

accuracy – Prediction accuracy, a value between 0 and 1.

Return type:

float
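
Accuracy is conventionally the fraction of examples whose predicted value equals the true value. The sketch below shows that computation in NumPy; the function name `accuracy` is hypothetical, and that score uses exactly this formula is an assumption based on the name of its return value.

```python
import numpy as np


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true target values."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    if y_true.shape != y_pred.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    # Mean of a boolean array is the proportion of True entries.
    return float(np.mean(y_true == y_pred))


acc = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
```

Three of the four predictions match here, so `acc` is 0.75.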