This repository was archived by the owner on Aug 25, 2024. It is now read-only.

Commit 5384eb2

yashlamba authored and pdxjohnny committed
model: scikit: Simple Linear Regression
Signed-off-by: John Andersen <john.s.andersen@intel.com>
1 parent 85763d6 commit 5384eb2

15 files changed (+605, -1 lines changed)

.travis.yml

+1
@@ -16,6 +16,7 @@ env:
 - PLUGIN=.
 - PLUGIN=model/tensorflow
 - PLUGIN=model/scratch
+- PLUGIN=model/scikit
 - PLUGIN=feature/git
 - PLUGIN=feature/auth
 - CHANGELOG=1

CHANGELOG.md

+1
@@ -21,6 +21,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 `argparse.ArgumentParser` via the `CLI_FORMATTER_CLASS` property.
 - Skeleton for service creation was added
 - Simple Linear Regression model from scratch
+- Scikit Linear Regression model
 - Community link in CONTRIBUTING.md.
 - Explained three main parts of DFFML on docs homepage
 - Documentation on how to use ML models on docs Models plugin page.

docs/plugins/dffml_model.rst

+84
@@ -220,6 +220,90 @@ hash of their feature names.
   - default: /home/user/.cache/dffml/scratch
   - Directory where state should be saved

+- predict: String
+
+  - Label or the value to be predicted
+
+dffml_model_scikit
+------------------
+
+.. code-block:: console
+
+    pip install dffml-model-scikit
+
+
+scikitlr
+~~~~~~~~
+
+*Core*
+
+Linear Regression Model implemented using scikit. Models are saved under the
+``directory`` in subdirectories named after the hash of their feature names.
+
+.. code-block:: console
+
+    $ cat > train.csv << EOF
+    Years,Expertise,Trust,Salary
+    0,1,0.2,10
+    1,3,0.4,20
+    2,5,0.6,30
+    3,7,0.8,40
+    EOF
+    $ cat > test.csv << EOF
+    Years,Expertise,Trust,Salary
+    4,9,1.0,50
+    5,11,1.2,60
+    EOF
+    $ dffml train \
+        -model scikitlr \
+        -features def:Years:int:1 def:Expertise:int:1 def:Trust:float:1 \
+        -model-predict Salary \
+        -sources f=csv \
+        -source-filename train.csv \
+        -source-readonly \
+        -log debug
+    $ dffml accuracy \
+        -model scikitlr \
+        -features def:Years:int:1 def:Expertise:int:1 def:Trust:float:1 \
+        -model-predict Salary \
+        -sources f=csv \
+        -source-filename test.csv \
+        -source-readonly \
+        -log debug
+    1.0
+    $ echo -e 'Years,Expertise,Trust\n6,13,1.4\n' | \
+        dffml predict all \
+        -model scikitlr \
+        -features def:Years:int:1 def:Expertise:int:1 def:Trust:float:1 \
+        -model-predict Salary \
+        -sources f=csv \
+        -source-filename /dev/stdin \
+        -source-readonly \
+        -log debug
+    [
+        {
+            "extra": {},
+            "features": {
+                "Expertise": 13,
+                "Trust": 1.4,
+                "Years": 6
+            },
+            "last_updated": "2019-07-31T08:40:59Z",
+            "prediction": {
+                "confidence": 1.0,
+                "value": 70.0
+            },
+            "src_url": "0"
+        }
+    ]
+
+**Args**
+
+- directory: String
+
+  - default: /home/user/.cache/dffml/scikit
+  - Directory where state should be saved
+
 - predict: String

   - Label or the value to be predicted
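
For intuition about the numbers in the console session above, here is a minimal standard-library sketch. It is not DFFML's or scikit-learn's code. It relies on an observation about this toy data: Expertise and Trust are exact linear functions of Years, so an ordinary least squares fit on Years alone reproduces the same predictions. It also assumes that the `1.0` printed by `dffml accuracy` is a coefficient of determination (R²), which the diff itself does not state. The helper names `fit_line` and `r_squared` are hypothetical.

```python
# Toy data from train.csv and test.csv above, reduced to (Years, Salary) pairs.
train = [(0, 10), (1, 20), (2, 30), (3, 40)]
test = [(4, 50), (5, 60)]


def fit_line(rows):
    """Closed-form ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(rows, slope, intercept):
    """Coefficient of determination of the fitted line on the given rows."""
    mean_y = sum(y for _, y in rows) / len(rows)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in rows)
    ss_tot = sum((y - mean_y) ** 2 for _, y in rows)
    return 1 - ss_res / ss_tot


slope, intercept = fit_line(train)
score = r_squared(test, slope, intercept)
prediction = slope * 6 + intercept  # Years=6, as in the predict example
print(slope, intercept, score, prediction)  # 10.0 10.0 1.0 70.0
```

The fitted line is Salary = 10 * Years + 10, which gives a score of 1.0 on the two test rows and 70.0 for Years=6, in agreement with the values shown in the session.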

model/scikit/.coveragerc

+13
@@ -0,0 +1,13 @@
+[run]
+source =
+    dffml_model_scikit
+    tests
+branch = True
+
+[report]
+exclude_lines =
+    no cov
+    no qa
+    noqa
+    pragma: no cover
+    if __name__ == .__main__.:
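
The last `exclude_lines` entry has `.` where the quote characters would be. coverage.py treats these entries as regular expressions, so the `.` matches either quote style. A small sketch with Python's `re` module illustrates this; the sample lines are made up for illustration, and coverage.py's internal matching may differ in detail.

```python
import re

# Pattern copied from the exclude_lines entry in .coveragerc.
pattern = re.compile(r"if __name__ == .__main__.:")

samples = [
    'if __name__ == "__main__":',  # double quotes
    "if __name__ == '__main__':",  # single quotes
    "if name == main:",            # not a main guard
]

# True where the pattern is found somewhere in the line.
matches = [bool(pattern.search(line)) for line in samples]
print(matches)  # [True, True, False]
```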

model/scikit/.gitignore

+20
@@ -0,0 +1,20 @@
+*.log
+*.pyc
+.cache/
+.coverage
+.idea/
+.vscode/
+*.egg-info/
+build/
+dist/
+docs/build/
+venv/
+wheelhouse/
+*.egss
+.mypy_cache/
+*.swp
+.venv/
+.eggs/
+*.modeldir
+*.db
+htmlcov/

model/scikit/LICENSE

+21
@@ -0,0 +1,21 @@
+Copyright (c) 2019 Intel
+
+MIT License
+
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

model/scikit/MANIFEST.in

+2
@@ -0,0 +1,2 @@
+include README.md
+include LICENSE

model/scikit/README.md

+71
@@ -0,0 +1,71 @@
+# DFFML Models For scikit / sklearn
+
+## About
+
+Models created using scikit.
+
+## Install
+
+```console
+python3.7 -m pip install --user dffml-model-scikit
+```
+
+## Usage
+
+1. Linear Regression Model
+
+For implementing linear regression to a dataset, let us take a simple example:
+
+| Years of Experience  | Expertise  | Trust Factor | Salary |
+| -------------------- | ---------- | ------------ | ------ |
+| 0                    | 01         | 0.2          | 10     |
+| 1                    | 03         | 0.4          | 20     |
+| 2                    | 05         | 0.6          | 30     |
+| 3                    | 07         | 0.8          | 40     |
+| 4                    | 09         | 1.0          | 50     |
+| 5                    | 11         | 1.2          | 60     |
+
+```console
+$ cat > train.csv << EOF
+Years,Expertise,Trust,Salary
+0,1,0.2,10
+1,3,0.4,20
+2,5,0.6,30
+3,7,0.8,40
+EOF
+$ cat > test.csv << EOF
+Years,Expertise,Trust,Salary
+4,9,1.0,50
+5,11,1.2,60
+EOF
+$ dffml train \
+    -model scikitlr \
+    -features def:Years:int:1 def:Expertise:int:1 def:Trust:float:1 \
+    -model-predict Salary \
+    -sources f=csv \
+    -source-filename train.csv \
+    -source-readonly \
+    -log debug
+$ dffml accuracy \
+    -model scikitlr \
+    -features def:Years:int:1 def:Expertise:int:1 def:Trust:float:1 \
+    -model-predict Salary \
+    -sources f=csv \
+    -source-filename test.csv \
+    -source-readonly \
+    -log debug
+$ echo -e 'Years,Expertise,Trust\n6,13,1.4\n' | \
+    dffml predict all \
+    -model scikitlr \
+    -features def:Years:int:1 def:Expertise:int:1 def:Trust:float:1 \
+    -model-predict Salary \
+    -sources f=csv \
+    -source-filename /dev/stdin \
+    -source-readonly \
+    -log debug
+```
+
+## License
+
+Scikit Models are distributed under the terms of the
+[MIT License](LICENSE).
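
The `-features` arguments in the commands above appear to follow a `def:<name>:<dtype>:<length>` pattern. The sketch below splits such a string into its parts. `FeatureSpec` and `parse_feature` are hypothetical names used only for illustration and are not DFFML's API, and reading the last field as a length is an assumption.

```python
from typing import NamedTuple


class FeatureSpec(NamedTuple):
    """Illustrative container for one feature definition (hypothetical)."""
    name: str
    dtype: str
    length: int  # assumed meaning of the final field


def parse_feature(spec: str) -> FeatureSpec:
    """Split 'def:<name>:<dtype>:<length>' into its parts (hypothetical helper)."""
    prefix, name, dtype, length = spec.split(":")
    if prefix != "def":
        raise ValueError(f"unsupported feature spec: {spec!r}")
    return FeatureSpec(name, dtype, int(length))


specs = ["def:Years:int:1", "def:Expertise:int:1", "def:Trust:float:1"]
features = [parse_feature(s) for s in specs]
print([f.name for f in features])  # ['Years', 'Expertise', 'Trust']
```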

model/scikit/dffml_model_scikit/__init__.py

Whitespace-only changes.
