typo fixes wrt notebook #1836

Merged: 2 commits, Oct 27, 2022
21 changes: 10 additions & 11 deletions examples/03_evaluate/evaluation.ipynb
@@ -58,7 +58,6 @@
"source": [
"# set the environment path to find Recommenders\n",
"import sys\n",

"import pandas as pd\n",
"import pyspark\n",
"from sklearn.preprocessing import minmax_scale\n",
@@ -387,7 +386,7 @@
"* **the recommender is to predict ranking instead of explicit rating**. For example, if the consumer of the recommender cares about the ranked recommended items, rating metrics do not apply directly. Usually a relevancy function such as top-k will be applied to generate the ranked list from predicted ratings in order to evaluate the recommender with other metrics. \n",
"* **the recommender is to generate recommendation scores that have different scales with the original ratings (e.g., the SAR algorithm)**. In this case, the difference between the generated scores and the original scores (or, ratings) is not valid for measuring accuracy of the model.\n",
"\n",
"#### 2.1.2 How-to with the evaluation utilities\n",
"#### 2.1.2 How to work with the evaluation utilities\n",
"\n",
"A few notes about the interface of the Rating evaluator class:\n",
"1. The columns of user, item, and rating (prediction) should be present in the ground-truth DataFrame (prediction DataFrame).\n",
@@ -539,7 +538,7 @@
"source": [
"|Metric|Range|Selection criteria|Limitation|Reference|\n",
"|------|-------------------------------|---------|----------|---------|\n",
"|RMSE|$> 0$|The smaller the better.|May be biased, and less explainable than MSE|[link](https://en.wikipedia.org/wiki/Root-mean-square_deviation)|\n",
"|RMSE|$> 0$|The smaller the better.|May be biased, and less explainable than MAE|[link](https://en.wikipedia.org/wiki/Root-mean-square_deviation)|\n",
"|R2|$\\leq 1$|The closer to $1$ the better.|Depend on variable distributions.|[link](https://en.wikipedia.org/wiki/Coefficient_of_determination)|\n",
"|MAE|$\\geq 0$|The smaller the better.|Dependent on variable scale.|[link](https://en.wikipedia.org/wiki/Mean_absolute_error)|\n",
"|Explained variance|$\\leq 1$|The closer to $1$ the better.|Depend on variable distributions.|[link](https://en.wikipedia.org/wiki/Explained_variation)|"
@@ -556,7 +555,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"\"Beyond-accuray evaluation\" was proposed to evaluate how relevant recommendations are for users. In this case, a recommendation system is a treated as a ranking system. Given a relency definition, recommendation system outputs a list of recommended items to each user, which is ordered by relevance. The evaluation part takes ground-truth data, the actual items that users interact with (e.g., liked, purchased, etc.), and the recommendation data, as inputs, to calculate ranking evaluation metrics. \n",
"\"Beyond-accuray evaluation\" was proposed to evaluate how relevant recommendations are for users. In this case, a recommendation system is a treated as a ranking system. Given relency definition, recommendation system outputs a list of recommended items to each user, which is ordered by relevance. The evaluation part takes ground-truth data, the actual items that users interact with (e.g., liked, purchased, etc.), and the recommendation data, as inputs, to calculate ranking evaluation metrics. \n",
"\n",
"#### 2.2.1 Use cases\n",
"\n",
@@ -576,7 +575,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2.2.1 Relevancy of recommendation\n",
"#### 2.2.3 Relevancy of recommendation\n",
"\n",
"Relevancy of recommendation can be measured in different ways:\n",
"\n",
@@ -641,7 +640,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2.2.1 Precision\n",
"#### 2.2.4 Precision\n",
"\n",
"Precision@k is a metric that evaluates how many items in the recommendation list are relevant (hit) in the ground-truth data. For each user the precision score is normalized by `k` and then the overall precision scores are averaged by the total number of users. \n",
"\n",
@@ -669,7 +668,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2.2.2 Recall\n",
"#### 2.2.5 Recall\n",
"\n",
"Recall@k is a metric that evaluates how many relevant items in the ground-truth data are in the recommendation list. For each user the recall score is normalized by the total number of ground-truth items and then the overall recall scores are averaged by the total number of users. "
]
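A matching Recall@k sketch (same hypothetical data structures as the precision sketch above): hits in the top-k list are now normalized by the number of ground-truth items per user.

```python
def recall_at_k(recommended, relevant, k):
    """Fraction of each user's relevant items found in the top-k list, averaged over users."""
    scores = []
    for user, rec_list in recommended.items():
        rel = relevant.get(user, set())
        if not rel:
            continue  # users with no ground-truth items are skipped
        hits = sum(1 for item in rec_list[:k] if item in rel)
        scores.append(hits / len(rel))
    return sum(scores) / len(scores)

recommended = {1: [10, 12, 11], 2: [11, 10, 12]}
relevant = {1: {10, 11}, 2: {12}}
print(recall_at_k(recommended, relevant, k=2))  # (1/2 + 0/1) / 2 = 0.25
```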
@@ -695,7 +694,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2.2.3 Normalized Discounted Cumulative Gain (NDCG)\n",
"#### 2.2.6 Normalized Discounted Cumulative Gain (NDCG)\n",
"\n",
"NDCG is a metric that evaluates how well the recommender performs in recommending ranked items to users. Therefore both hit of relevant items and correctness in ranking of these items matter to the NDCG evaluation. The total NDCG score is normalized by the total number of users."
]
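A hand-rolled NDCG@k sketch with binary relevance (an assumption for brevity; graded relevance is also common), just to show how the position of each hit is discounted:

```python
import numpy as np

def ndcg_at_k(recommended, relevant, k):
    """NDCG@k with binary gains, averaged over users."""
    scores = []
    for user, rec_list in recommended.items():
        rel = relevant.get(user, set())
        if not rel:
            continue
        # DCG: a hit at rank i (1-based) contributes 1 / log2(i + 1).
        dcg = sum(1.0 / np.log2(i + 2) for i, item in enumerate(rec_list[:k]) if item in rel)
        # IDCG: the best possible DCG, i.e. all relevant items ranked first.
        idcg = sum(1.0 / np.log2(i + 2) for i in range(min(len(rel), k)))
        scores.append(dcg / idcg)
    return float(np.mean(scores))

recommended = {1: [10, 12, 11], 2: [11, 10, 12]}
relevant = {1: {10, 11}, 2: {12}}
print(ndcg_at_k(recommended, relevant, k=2))
```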
@@ -721,7 +720,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2.2.4 Mean Average Precision (MAP)\n",
"#### 2.2.7 Mean Average Precision (MAP)\n",
"\n",
"MAP is a metric that evaluates the average precision for each user in the datasets. It also penalizes ranking correctness of the recommended items. The overall MAP score is normalized by the total number of users."
]
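A MAP@k sketch in one common formulation (same assumed data layout as above): for each user, average the precision at every rank where a relevant item is hit, normalize by the number of relevant items capped at k, then average over users.

```python
def map_at_k(recommended, relevant, k):
    """Mean Average Precision at k (one common formulation), averaged over users."""
    scores = []
    for user, rec_list in recommended.items():
        rel = relevant.get(user, set())
        if not rel:
            continue
        hits, precision_sum = 0, 0.0
        for i, item in enumerate(rec_list[:k]):          # i is the 0-based rank
            if item in rel:
                hits += 1
                precision_sum += hits / (i + 1)          # precision at this rank
        scores.append(precision_sum / min(len(rel), k))  # average precision for this user
    return sum(scores) / len(scores)

recommended = {1: [10, 12, 11], 2: [11, 10, 12]}
relevant = {1: {10, 11}, 2: {12}}
print(map_at_k(recommended, relevant, k=2))  # (0.5 + 0.0) / 2 = 0.25
```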
@@ -747,7 +746,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2.2.5 ROC and AUC\n",
"#### 2.2.8 ROC and AUC\n",
"\n",
"ROC, as well as AUC, is a well known metric that is used for evaluating binary classification problem. It is similar in the case of binary rating typed recommendation algorithm where the \"hit\" accuracy on the relevant items is used for measuring the recommender's performance. \n",
"\n",
@@ -1891,7 +1890,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### 2.2.5 Summary"
"#### 2.3 Summary"
]
},
{