
Commit: Update documentation
Yury Kashnitsky committed Jan 6, 2025
1 parent 8c46f9e commit a6a6f12
Showing 76 changed files with 1,445 additions and 173 deletions.
Binary file modified .DS_Store (not shown).
25 further changed files could not be displayed in the diff viewer.
1 change: 1 addition & 0 deletions _sources/book/topic09/assignment09_time_series.md
@@ -77,6 +77,7 @@ DATA_PATH = "https://raw.githubusercontent.com/Yorko/mlcourse.ai/main/data/"
 df = pd.read_csv(DATA_PATH + "wiki_machine_learning.csv", sep=" ")
 df = df[df["count"] != 0]
 df.head()
+```
2 changes: 1 addition & 1 deletion _sources/book/topic09/assignment09_time_series_solution.md
@@ -119,7 +119,7 @@ train_df = df[:-predictions].copy()

 ```{code-cell} ipython3
 m = Prophet()
-m.fit(train_df);
+m.fit(train_df)
 ```
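A note on the change above: in a Jupyter/MyST cell the trailing semicolon merely suppresses the cell's displayed output, so dropping it is cosmetic. For readers without the notebook at hand, here is a minimal sketch of the surrounding Prophet workflow; the toy data and the hold-out size are illustrative assumptions, not the notebook's values:

```python
import pandas as pd
from prophet import Prophet

# Illustrative stand-in for the notebook's data: Prophet expects a
# 'ds' (datestamp) column and a 'y' (value) column.
df = pd.DataFrame({
    "ds": pd.date_range("2015-01-01", periods=200, freq="D"),
    "y": [100.0 + i + 10 * (i % 7 == 0) for i in range(200)],
})

predictions = 30  # hypothetical hold-out size; the notebook's value may differ
train_df = df[:-predictions].copy()

m = Prophet()
m.fit(train_df)  # without the trailing ';' the cell also echoes the model object

future = m.make_future_dataframe(periods=predictions)
forecast = m.predict(future)  # adds yhat, yhat_lower, yhat_upper per date
print(forecast[["ds", "yhat", "yhat_lower", "yhat_upper"]].tail(3))
```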
5 changes: 2 additions & 3 deletions _sources/book/topic09/topic9_part1_time_series_python.md
@@ -87,7 +87,7 @@ currency = pd.read_csv(
 ```

 ```{code-cell} ipython3
-plt.figure(figsize=(12, 6))
+plt.figure(figsize=(12, 8))
 plt.plot(ads.Ads)
 plt.title("Ads watched (hourly data)")
 plt.grid(True)
@@ -1099,7 +1099,6 @@ def plotSARIMA(series, model, n_steps):
     # forecasting on n_steps forward
     forecast = model.predict(start=data.shape[0], end=data.shape[0] + n_steps)
     forecast = data.arima_model.append(forecast)
-    # calculate error, again having shifted on s+d steps from the beginning
     error = mean_absolute_percentage_error(
         data["actual"][s + d :], data["arima_model"][s + d :]
@@ -1530,7 +1529,7 @@ plotModelResults(
     X_train=X_train_scaled,
     X_test=X_test_scaled,
     plot_intervals=True,
-    plot_anomalies=True,
+    plot_anomalies=True
 )
 ```
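The `plotSARIMA` hunk above calls a `mean_absolute_percentage_error` helper defined elsewhere in that notebook. A common definition consistent with the call shown (a sketch, not necessarily the notebook's exact implementation):

```python
import numpy as np

def mean_absolute_percentage_error(y_true, y_pred):
    """MAPE in percent; assumes y_true contains no zeros."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100
```

The `[s + d :]` slices play the role the deleted comment described: the first `s + d` points (one seasonal period plus the differencing order) have no meaningful in-sample SARIMA prediction, so they are excluded from the error.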
2 changes: 1 addition & 1 deletion _sources/book/topic09/topic9_part2_facebook_prophet.md
@@ -216,7 +216,7 @@ Let's sort the dataframe by time and take a look at what we've got:


 ```{code-cell} ipython3
-df.sort_values(by=["published"]).head(n=3)
+df.sort_values(by=["published"]).head(n=2)
 ```

 Medium's public release date was August 15, 2012. But, as you can see from the data above, there are at least several rows with much earlier publication dates. They have somehow turned up in our dataset, but they are hardly legitimate ones. We will just trim our time series to keep only those rows that fall onto the period from August 15, 2012 to June 25, 2017:
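The trimming step that the quoted paragraph announces could look like the following sketch; the `published` column name comes from the diff above, while the sample rows and the exact bounds and comparison style are illustrative:

```python
import pandas as pd

# Two illustrative rows: a bogus epoch-era timestamp and a legitimate one.
df = pd.DataFrame({
    "published": pd.to_datetime(
        ["1970-01-01 00:00:00.001000+00:00", "2015-03-10 12:00:00+00:00"]
    ),
    "url": ["https://medium.com/bogus-post", "https://medium.com/real-post"],
})

# Keep only the rows that fall within the period covered by the dataset.
start = pd.Timestamp("2012-08-15", tz="UTC")
end = pd.Timestamp("2017-06-25 23:59:59", tz="UTC")
df = df[(df["published"] >= start) & (df["published"] <= end)]
print(df)  # only the 2015 row survives
```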
2 changes: 1 addition & 1 deletion _static/plotly_htmls/Box–Cox_transformation.html (large diff; not rendered by default)
2 changes: 1 addition & 1 deletion _static/plotly_htmls/New_posts_on_Medium.html (large diff; not rendered by default)
2 changes: 1 addition & 1 deletion _static/plotly_htmls/No_transformations.html (large diff; not rendered by default)
2 changes: 1 addition & 1 deletion _static/plotly_htmls/Posts_on_Medium_(daily).html (large diff; not rendered by default)
2 changes: 1 addition & 1 deletion _static/plotly_htmls/Posts_on_Medium_(weekly).html (large diff; not rendered by default)
2 changes: 1 addition & 1 deletion _static/plotly_htmls/assign9_plot.html (large diff; not rendered by default)
Binary file modified book/.DS_Store (not shown).
1 change: 1 addition & 0 deletions book/topic09/assignment09_time_series.md
@@ -77,6 +77,7 @@ DATA_PATH = "https://raw.githubusercontent.com/Yorko/mlcourse.ai/main/data/"
 df = pd.read_csv(DATA_PATH + "wiki_machine_learning.csv", sep=" ")
 df = df[df["count"] != 0]
 df.head()
+```
473 changes: 410 additions & 63 deletions book/topic09/assignment09_time_series_solution.html (large diff; not rendered by default)
2 changes: 1 addition & 1 deletion book/topic09/assignment09_time_series_solution.md
@@ -119,7 +119,7 @@ train_df = df[:-predictions].copy()

 ```{code-cell} ipython3
 m = Prophet()
-m.fit(train_df);
+m.fit(train_df)
 ```
918 changes: 877 additions & 41 deletions book/topic09/topic9_part1_time_series_python.html (large diff; not rendered by default)
5 changes: 2 additions & 3 deletions book/topic09/topic9_part1_time_series_python.md
@@ -87,7 +87,7 @@ currency = pd.read_csv(
 ```

 ```{code-cell} ipython3
-plt.figure(figsize=(12, 6))
+plt.figure(figsize=(12, 8))
 plt.plot(ads.Ads)
 plt.title("Ads watched (hourly data)")
 plt.grid(True)
@@ -1099,7 +1099,6 @@ def plotSARIMA(series, model, n_steps):
     # forecasting on n_steps forward
     forecast = model.predict(start=data.shape[0], end=data.shape[0] + n_steps)
     forecast = data.arima_model.append(forecast)
-    # calculate error, again having shifted on s+d steps from the beginning
     error = mean_absolute_percentage_error(
         data["actual"][s + d :], data["arima_model"][s + d :]
@@ -1530,7 +1529,7 @@ plotModelResults(
     X_train=X_train_scaled,
     X_test=X_test_scaled,
     plot_intervals=True,
-    plot_anomalies=True,
+    plot_anomalies=True
 )
 ```
55 changes: 25 additions & 30 deletions book/topic09/topic9_part2_facebook_prophet.html
@@ -766,7 +766,7 @@ <h3><a class="toc-backref" href="#id12" role="doc-backlink">3.2 Dataset</a><a cl
 <p>Let’s sort the dataframe by time and take a look at what we’ve got:</p>
 <div class="cell docutils container">
 <div class="cell_input docutils container">
-<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">df</span><span class="o">.</span><span class="n">sort_values</span><span class="p">(</span><span class="n">by</span><span class="o">=</span><span class="p">[</span><span class="s2">&quot;published&quot;</span><span class="p">])</span><span class="o">.</span><span class="n">head</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="mi">3</span><span class="p">)</span>
+<div class="highlight-ipython3 notranslate"><div class="highlight"><pre><span></span><span class="n">df</span><span class="o">.</span><span class="n">sort_values</span><span class="p">(</span><span class="n">by</span><span class="o">=</span><span class="p">[</span><span class="s2">&quot;published&quot;</span><span class="p">])</span><span class="o">.</span><span class="n">head</span><span class="p">(</span><span class="n">n</span><span class="o">=</span><span class="mi">2</span><span class="p">)</span>
 </pre></div>
 </div>
 </div>
@@ -804,11 +804,6 @@ <h3><a class="toc-backref" href="#id12" role="doc-backlink">3.2 Dataset</a><a cl
 <td>1970-01-01 00:00:00.001000+00:00</td>
 <td>https://medium.com/@ikaella/melon-rebranding-b...</td>
 </tr>
-<tr>
-<th>37395</th>
-<td>1970-01-18 05:11:46.500000+00:00</td>
-<td>http://www.novosti.rs/%D0%B2%D0%B5%D1%81%D1%82...</td>
-</tr>
 </tbody>
 </table>
 </div></div></div>
@@ -1321,10 +1316,10 @@ <h3><a class="toc-backref" href="#id14" role="doc-backlink">3.4 Making a forecas
 </div>
 </div>
 <div class="cell_output docutils container">
-<div class="output stderr highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>16:18:21 - cmdstanpy - INFO - Chain [1] start processing
+<div class="output stderr highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>16:55:27 - cmdstanpy - INFO - Chain [1] start processing
 </pre></div>
 </div>
-<div class="output stderr highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>16:18:21 - cmdstanpy - INFO - Chain [1] done processing
+<div class="output stderr highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>16:55:27 - cmdstanpy - INFO - Chain [1] done processing
 </pre></div>
 </div>
 </div>
@@ -1429,10 +1424,10 @@ <h3><a class="toc-backref" href="#id14" role="doc-backlink">3.4 Making a forecas
 <th>904</th>
 <td>2017-06-23</td>
 <td>276.682485</td>
-<td>255.508448</td>
-<td>305.337762</td>
-<td>276.267743</td>
-<td>277.057474</td>
+<td>254.201087</td>
+<td>304.016830</td>
+<td>276.267332</td>
+<td>277.098510</td>
 <td>2.316748</td>
 <td>2.316748</td>
 <td>2.316748</td>
@@ -1451,10 +1446,10 @@
 <th>905</th>
 <td>2017-06-24</td>
 <td>277.388317</td>
-<td>217.021728</td>
-<td>269.092764</td>
-<td>276.939924</td>
-<td>277.794113</td>
+<td>218.637265</td>
+<td>267.275922</td>
+<td>276.948748</td>
+<td>277.834694</td>
 <td>-34.992533</td>
 <td>-34.992533</td>
 <td>-34.992533</td>
@@ -1473,10 +1468,10 @@
 <th>906</th>
 <td>2017-06-25</td>
 <td>278.094149</td>
-<td>221.166044</td>
-<td>271.685007</td>
-<td>277.611047</td>
-<td>278.543160</td>
+<td>222.470485</td>
+<td>272.170684</td>
+<td>277.624320</td>
+<td>278.567768</td>
 <td>-31.355580</td>
 <td>-31.355580</td>
 <td>-31.355580</td>
@@ -1505,7 +1500,7 @@ <h3><a class="toc-backref" href="#id14" role="doc-backlink">3.4 Making a forecas
 </div>
 </div>
 <div class="cell_output docutils container">
-<img alt="../../_images/74099c2a0f73e222fdae2229fb78ab58a40e1808d8fc1d483af6e798e5545fcc.png" src="../../_images/74099c2a0f73e222fdae2229fb78ab58a40e1808d8fc1d483af6e798e5545fcc.png" />
+<img alt="../../_images/eed11a86ee6c8c5f919dd459341bcbd8441930b310c3c7f57354bb3ca87c425f.png" src="../../_images/eed11a86ee6c8c5f919dd459341bcbd8441930b310c3c7f57354bb3ca87c425f.png" />
 </div>
 </div>
 <p>This chart doesn’t look very informative. The only definitive conclusion that we can draw here is that the model treated many of the data points as outliers.</p>
@@ -1518,7 +1513,7 @@ <h3><a class="toc-backref" href="#id14" role="doc-backlink">3.4 Making a forecas
 </div>
 </div>
 <div class="cell_output docutils container">
-<img alt="../../_images/df5cccb8c6a20b18ecb21d4482c2fb521dd000f64d384e65b5cb27f7d8e82b7a.png" src="../../_images/df5cccb8c6a20b18ecb21d4482c2fb521dd000f64d384e65b5cb27f7d8e82b7a.png" />
+<img alt="../../_images/a86af3d5e44dd070a80826fd921fb8fa299ff15fba87989baafe5ccb7e454697.png" src="../../_images/a86af3d5e44dd070a80826fd921fb8fa299ff15fba87989baafe5ccb7e454697.png" />
 </div>
 </div>
 <p>As you can see from the trend graph, Prophet did a good job by fitting the accelerated growth of new posts at the end of 2016. The graph of weekly seasonality leads to the conclusion that usually there are less new posts on Saturdays and Sundays than on the other days of the week. In the yearly seasonality graph there is a prominent dip on Christmas Day.</p>
@@ -1598,22 +1593,22 @@ <h3><a class="toc-backref" href="#id15" role="doc-backlink">3.5 Forecast quality
 <tr>
 <th>2017-06-23</th>
 <td>278.999233</td>
-<td>255.508448</td>
-<td>305.337762</td>
+<td>254.201087</td>
+<td>304.016830</td>
 <td>421</td>
 </tr>
 <tr>
 <th>2017-06-24</th>
 <td>242.395784</td>
-<td>217.021728</td>
-<td>269.092764</td>
+<td>218.637265</td>
+<td>267.275922</td>
 <td>277</td>
 </tr>
 <tr>
 <th>2017-06-25</th>
 <td>246.738569</td>
-<td>221.166044</td>
-<td>271.685007</td>
+<td>222.470485</td>
+<td>272.170684</td>
 <td>253</td>
 </tr>
 </tbody>
@@ -1803,10 +1798,10 @@ <h2><a class="toc-backref" href="#id17" role="doc-backlink">4. Box-Cox Transform
 </div>
 </div>
 <div class="cell_output docutils container">
-<div class="output stderr highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>16:18:22 - cmdstanpy - INFO - Chain [1] start processing
+<div class="output stderr highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>16:55:27 - cmdstanpy - INFO - Chain [1] start processing
 </pre></div>
 </div>
-<div class="output stderr highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>16:18:22 - cmdstanpy - INFO - Chain [1] done processing
+<div class="output stderr highlight-myst-ansi notranslate"><div class="highlight"><pre><span></span>16:55:27 - cmdstanpy - INFO - Chain [1] done processing
 </pre></div>
 </div>
 </div>
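The two replaced PNG hashes in the hunks above correspond to Prophet's standard forecast and components figures, regenerated on a rerun. For reference, such figures come from calls like the following (a self-contained sketch with synthetic data, not the notebook's code):

```python
import pandas as pd
from prophet import Prophet

# Synthetic daily series, long enough (>2 years) for yearly seasonality to be fitted.
df = pd.DataFrame({
    "ds": pd.date_range("2015-01-01", periods=800, freq="D"),
    "y": [200 + 0.3 * i + 15 * ((i % 7) in (5, 6)) for i in range(800)],
})

m = Prophet()
m.fit(df)
forecast = m.predict(m.make_future_dataframe(periods=30))

fig1 = m.plot(forecast)             # scatter of observations with fit and uncertainty band
fig2 = m.plot_components(forecast)  # separate panels: trend, weekly and yearly seasonality
```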
2 changes: 1 addition & 1 deletion book/topic09/topic9_part2_facebook_prophet.md
@@ -216,7 +216,7 @@ Let's sort the dataframe by time and take a look at what we've got:


 ```{code-cell} ipython3
-df.sort_values(by=["published"]).head(n=3)
+df.sort_values(by=["published"]).head(n=2)
 ```

 Medium's public release date was August 15, 2012. But, as you can see from the data above, there are at least several rows with much earlier publication dates. They have somehow turned up in our dataset, but they are hardly legitimate ones. We will just trim our time series to keep only those rows that fall onto the period from August 15, 2012 to June 25, 2017: