Commit a7df684

2 parents 849fe39 + 3933d64 commit a7df684

68 files changed

+1878
-214
lines changed

CHANGELOG.md

+145
@@ -1,15 +1,160 @@
+## [1.27.0](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.7...v1.27.0) (2024-10-26)
+
+
+### Features
+
+* add conditional node structure to the smart_scraper_graph and implemented a structured way to check condition ([cacd9cd](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/cacd9cde004dace1a7dcc27981245632a78b95f3))
+* add integration with scrape.do ([ae275ec](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/ae275ec5e86c0bb8fdbeadc2e5f69816d1dea635))
+* add model integration gpt4 ([51c55eb](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/51c55eb3a2984ba60572edbcdea4c30620e18d76))
+* implement ScrapeGraph class for only web scraping automation ([612c644](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/612c644623fa6f4fe77a64a5f1a6a4d6cd5f4254))
+* Implement SmartScraperMultiParseMergeFirstGraph class that scrapes a list of URLs and merge the content first and finally generates answers to a given prompt. ([3e3e1b2](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/3e3e1b2f3ae8ed803d03b3b44b199e139baa68d4))
+* refactoring of export functions ([0ea00c0](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/0ea00c078f2811f0d1b356bd84cafde80763c703))
+* refactoring of get_probable_tags node ([f658092](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/f658092dffb20ea111cc00950f617057482788f4))
+* refactoring of ScrapeGraph to SmartScraperLiteGraph ([52b6bf5](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/52b6bf5fb8c570aa8ef026916230c5d52996f887))
+
+
+### Bug Fixes
+
+* fix export function ([c8a000f](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/c8a000f1d943734a921b34e91498b2f29c8c9422))
+* fix the example variable name ([69ff649](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/69ff6495564a5c670b89c0f802ebb1602f0e7cfa))
+* remove variable "max_result" not being used in the code ([e76a68a](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/e76a68a782e5bce48d421cb620d0b7bffa412918))
+
+
+### chore
+
+* fix example ([9cd9a87](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/9cd9a874f91bbbb2990444818e8ab2d0855cc361))
+
+
+### Test
+
+* Add scrape_graph test ([cdb3c11](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/cdb3c1100ee1117afedbc70437317acaf7c7c1d3))
+* Add smart_scraper_multi_parse_merge_first_graph test ([464b8b0](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/464b8b04ea0d51280849173d5eda92d4d4db8612))
+
+
+### CI
+
+* **release:** 1.26.6-beta.1 [skip ci] ([e0fc457](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/e0fc457d1a850f3306d473fbde55dd800133b404))
+* **release:** 1.27.0-beta.1 [skip ci] ([9266a36](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/9266a36b2efdf7027470d59aa14b654d68f7cb51))
+* **release:** 1.27.0-beta.10 [skip ci] ([eee131e](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/eee131e959a36a4471f72610eefbc1764808b6be))
+* **release:** 1.27.0-beta.2 [skip ci] ([d84d295](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/d84d29538985ef8d04badfed547c6fdc73d7774d))
+* **release:** 1.27.0-beta.3 [skip ci] ([f576afa](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/f576afaf0c1dd6d1dbf79fd5e642f6dca9dbe862))
+* **release:** 1.27.0-beta.4 [skip ci] ([3d6bbcd](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/3d6bbcdaa3828ff257adb22f2f7c1a46343de5b5))
+* **release:** 1.27.0-beta.5 [skip ci] ([5002c71](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/5002c713d5a76b2c2e4313f888d9768e3f3142e1))
+* **release:** 1.27.0-beta.6 [skip ci] ([94b9836](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/94b9836ef6cd9c24bb8c04d7049d5477cc8ed807))
+* **release:** 1.27.0-beta.7 [skip ci] ([407f1ce](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/407f1ce4eb22fb284ef0624dd3f7bf7ba432fa5c))
+* **release:** 1.27.0-beta.8 [skip ci] ([4f1ed93](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/4f1ed939e671e46bb546b6b605db87e87c0d66ee))
+* **release:** 1.27.0-beta.9 [skip ci] ([fd57cc7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/fd57cc7c126658960e33b7214c2cc656ea032d8f))
+
+## [1.27.0-beta.10](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.9...v1.27.0-beta.10) (2024-10-25)
+
+
+### Bug Fixes
+
+* fix export function ([c8a000f](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/c8a000f1d943734a921b34e91498b2f29c8c9422))
+
+## [1.27.0-beta.9](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.8...v1.27.0-beta.9) (2024-10-24)
+
+
+### Features
+
+* add model integration gpt4 ([51c55eb](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/51c55eb3a2984ba60572edbcdea4c30620e18d76))
+
+## [1.27.0-beta.8](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.7...v1.27.0-beta.8) (2024-10-24)
+
+
+### Bug Fixes
+
+* removed tokenizer ([a184716](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/a18471688f0b79f06fb7078b01b68eeddc88eae4))
+
+
+### CI
+
+* **release:** 1.26.7 [skip ci] ([ec9ef2b](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/ec9ef2bcda9aa81f66b943829fcdb22fe265976e))
+
+## [1.27.0-beta.7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.6...v1.27.0-beta.7) (2024-10-24)
+
+
+### Features
+
+* refactoring of get_probable_tags node ([f658092](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/f658092dffb20ea111cc00950f617057482788f4))
+
+## [1.27.0-beta.6](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.5...v1.27.0-beta.6) (2024-10-23)
+
+
+### Features
+
+* add integration with scrape.do ([ae275ec](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/ae275ec5e86c0bb8fdbeadc2e5f69816d1dea635))
+
+## [1.27.0-beta.5](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.4...v1.27.0-beta.5) (2024-10-22)
+
+
+### Features
+
+* refactoring of export functions ([0ea00c0](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/0ea00c078f2811f0d1b356bd84cafde80763c703))
+
+## [1.27.0-beta.4](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.3...v1.27.0-beta.4) (2024-10-21)
+
+
+### Features
+
+* refactoring of ScrapeGraph to SmartScraperLiteGraph ([52b6bf5](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/52b6bf5fb8c570aa8ef026916230c5d52996f887))
+
+## [1.27.0-beta.3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.2...v1.27.0-beta.3) (2024-10-20)
+
+
+### Features
+
+* implement ScrapeGraph class for only web scraping automation ([612c644](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/612c644623fa6f4fe77a64a5f1a6a4d6cd5f4254))
+* Implement SmartScraperMultiParseMergeFirstGraph class that scrapes a list of URLs and merge the content first and finally generates answers to a given prompt. ([3e3e1b2](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/3e3e1b2f3ae8ed803d03b3b44b199e139baa68d4))
+=======
 ## [1.26.7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.6...v1.26.7) (2024-10-19)
 
 
 ### Bug Fixes
 
+* fix the example variable name ([69ff649](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/69ff6495564a5c670b89c0f802ebb1602f0e7cfa))
+
+
+### chore
+
+* fix example ([9cd9a87](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/9cd9a874f91bbbb2990444818e8ab2d0855cc361))
+
+
+### Test
+
+* Add scrape_graph test ([cdb3c11](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/cdb3c1100ee1117afedbc70437317acaf7c7c1d3))
+* Add smart_scraper_multi_parse_merge_first_graph test ([464b8b0](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/464b8b04ea0d51280849173d5eda92d4d4db8612))
+
+## [1.27.0-beta.2](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.1...v1.27.0-beta.2) (2024-10-18)
+
+
+### Bug Fixes
+
+* refactoring of gpt2 tokenizer ([44c3f9c](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/44c3f9c98939c44caa86dc582242819a7c6a0f80))
+
+
+### CI
+
+* **release:** 1.26.6 [skip ci] ([a4634c7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/a4634c73312b5c08581a8d670d53b7eebe8dadc1))
+
+## [1.27.0-beta.1](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.6-beta.1...v1.27.0-beta.1) (2024-10-16)
+
+
+### Features
+
+* add conditional node structure to the smart_scraper_graph and implemented a structured way to check condition ([cacd9cd](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/cacd9cde004dace1a7dcc27981245632a78b95f3))
+
+
 * removed tokenizer ([a184716](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/a18471688f0b79f06fb7078b01b68eeddc88eae4))
 
 ## [1.26.6](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.5...v1.26.6) (2024-10-18)
 
+## [1.26.6-beta.1](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.5...v1.26.6-beta.1) (2024-10-14)
 
 ### Bug Fixes
 
+* remove variable "max_result" not being used in the code ([e76a68a](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/e76a68a782e5bce48d421cb620d0b7bffa412918))
+
 * refactoring of gpt2 tokenizer ([44c3f9c](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/44c3f9c98939c44caa86dc582242819a7c6a0f80))
 
 ## [1.26.5](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.4...v1.26.5) (2024-10-13)
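
The changelog above describes SmartScraperMultiParseMergeFirstGraph as scraping a list of URLs, merging the content first, and only then generating an answer to the prompt. The sketch below illustrates that ordering only. `merge_first_pipeline`, `fetch`, and `answer` are hypothetical stand-ins introduced here for illustration; they are not the library's nodes or API, and the real class may differ in how it parses and merges.

```python
from typing import Callable, Dict, List


def merge_first_pipeline(
    urls: List[str],
    prompt: str,
    fetch: Callable[[str], str],
    answer: Callable[[str, str], str],
) -> Dict[str, object]:
    """Fetch every URL, merge the text once, then ask a single question.

    `fetch` and `answer` are hypothetical callables (assumptions for this
    sketch); they are not part of the scrapegraphai API.
    """
    # Step 1: scrape each source independently.
    documents = [fetch(url) for url in urls]
    # Step 2: merge before any answer generation, tagging each chunk with its URL.
    merged = "\n\n".join(
        f"[source: {url}]\n{text}" for url, text in zip(urls, documents)
    )
    # Step 3: make one answer call over the merged content.
    return {"answer": answer(prompt, merged), "sources": list(urls)}


if __name__ == "__main__":
    # Toy in-memory "pages" so the sketch runs without network access.
    pages = {
        "https://example.com/a": "Alice maintains the parser.",
        "https://example.com/b": "Bob maintains the exporter.",
    }

    def fake_answer(question: str, context: str) -> str:
        return f"{question} (context: {len(context)} characters)"

    result = merge_first_pipeline(
        list(pages), "Who maintains what?", lambda url: pages[url], fake_answer
    )
    print(result["answer"])
```

The point of merging first is that the answer step runs once over all sources, rather than once per URL followed by a merge of the partial answers.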
@@ -0,0 +1,32 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import os
+import json
+from dotenv import load_dotenv
+from scrapegraphai.graphs import SmartScraperLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.getenv("ANTHROPIC_API_KEY"),
+        "model": "anthropic/claude-3-haiku-20240307",
+    },
+    "verbose": True,
+    "headless": False,
+}
+
+smart_scraper_lite_graph = SmartScraperLiteGraph(
+    prompt="Who is Marco Perini?",
+    source="https://perinim.github.io/",
+    config=graph_config
+)
+
+result = smart_scraper_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
+
@@ -0,0 +1,35 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import os
+import json
+from dotenv import load_dotenv
+from scrapegraphai.graphs import SmartScraperMultiLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.getenv("ANTHROPIC_API_KEY"),
+        "model": "anthropic/claude-3-haiku-20240307",
+    },
+    "verbose": True,
+    "headless": False,
+}
+
+smart_scraper_multi_lite_graph = SmartScraperMultiLiteGraph(
+    prompt="Who is Marco Perini?",
+    source= [
+        "https://perinim.github.io/",
+        "https://perinim.github.io/cv/"
+    ],
+    config=graph_config
+)
+
+result = smart_scraper_multi_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_multi_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
+
@@ -0,0 +1,31 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import os
+import json
+from dotenv import load_dotenv
+from scrapegraphai.graphs import SmartScraperLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.environ["AZURE_OPENAI_KEY"],
+        "model": "azure_openai/gpt-4o"
+    },
+    "verbose": True,
+    "headless": False
+}
+
+smart_scraper_lite_graph = SmartScraperLiteGraph(
+    prompt="Who is Marco Perini?",
+    source="https://perinim.github.io/",
+    config=graph_config
+)
+
+result = smart_scraper_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
@@ -0,0 +1,35 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import os
+import json
+from dotenv import load_dotenv
+from scrapegraphai.graphs import SmartScraperMultiLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.environ["AZURE_OPENAI_KEY"],
+        "model": "azure_openai/gpt-4o"
+    },
+    "verbose": True,
+    "headless": False
+}
+
+smart_scraper_multi_lite_graph = SmartScraperMultiLiteGraph(
+    prompt="Who is Marco Perini?",
+    source= [
+        "https://perinim.github.io/",
+        "https://perinim.github.io/cv/"
+    ],
+    config=graph_config
+)
+
+result = smart_scraper_multi_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_multi_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
+
@@ -0,0 +1,26 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import json
+from scrapegraphai.graphs import SmartScraperLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+graph_config = {
+    "llm": {
+        "client": "client_name",
+        "model": "bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
+        "temperature": 0.0
+    }
+}
+
+smart_scraper_lite_graph = SmartScraperLiteGraph(
+    prompt="Who is Marco Perini?",
+    source="https://perinim.github.io/",
+    config=graph_config
+)
+
+result = smart_scraper_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
@@ -0,0 +1,29 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import json
+from scrapegraphai.graphs import SmartScraperMultiLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+graph_config = {
+    "llm": {
+        "client": "client_name",
+        "model": "bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
+        "temperature": 0.0
+    }
+}
+
+smart_scraper_multi_lite_graph = SmartScraperMultiLiteGraph(
+    prompt="Who is Marco Perini?",
+    source= [
+        "https://perinim.github.io/",
+        "https://perinim.github.io/cv/"
+    ],
+    config=graph_config
+)
+
+result = smart_scraper_multi_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_multi_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
@@ -0,0 +1,31 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import os
+import json
+from dotenv import load_dotenv
+from scrapegraphai.graphs import SmartScraperLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.getenv("DEEPSEEK_API_KEY"),
+        "model": "deepseek/deepseek-coder-33b-instruct",
+    },
+    "verbose": True,
+    "headless": False,
+}
+
+smart_scraper_lite_graph = SmartScraperLiteGraph(
+    prompt="Who is Marco Perini?",
+    source="https://perinim.github.io/",
+    config=graph_config
+)
+
+result = smart_scraper_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
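
The examples in this commit read API keys in two ways: `os.getenv(...)`, which returns `None` when the variable is missing, and `os.environ[...]`, which raises `KeyError`. A minimal sketch of one way to make that behaviour uniform follows. `require_env` and `build_graph_config` are hypothetical helpers introduced here for illustration and are not part of scrapegraphai; the config keys (`llm`, `api_key`, `model`, `verbose`, `headless`) are the ones used in the examples above.

```python
import os
from typing import Any, Dict, Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return a non-empty environment variable or raise a descriptive error.

    Hypothetical helper: `env` defaults to os.environ and exists only so the
    function can be exercised without touching the real environment.
    """
    source = os.environ if env is None else env
    value = source.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def build_graph_config(
    key_var: str, model: str, env: Optional[Mapping[str, str]] = None
) -> Dict[str, Any]:
    """Build a config dict shaped like the graph_config in the examples."""
    return {
        "llm": {"api_key": require_env(key_var, env), "model": model},
        "verbose": True,
        "headless": False,
    }


if __name__ == "__main__":
    # Demo with an explicit mapping instead of real credentials.
    demo_env = {"DEEPSEEK_API_KEY": "placeholder-key"}
    config = build_graph_config(
        "DEEPSEEK_API_KEY", "deepseek/deepseek-coder-33b-instruct", env=demo_env
    )
    print(config["llm"]["model"])
```

Failing before the graph is constructed keeps a missing key from surfacing later as an opaque authentication error from the provider.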
