Commit b7d5a20

Merge pull request #764 from ScrapeGraphAI/pre/beta
2 parents 4cd5ef2 + eee131e commit b7d5a20

68 files changed, +1831 -214 lines changed

CHANGELOG.md

+98
@@ -1,15 +1,113 @@
+## [1.27.0-beta.10](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.9...v1.27.0-beta.10) (2024-10-25)
+
+
+### Bug Fixes
+
+* fix export function ([c8a000f](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/c8a000f1d943734a921b34e91498b2f29c8c9422))
+
+## [1.27.0-beta.9](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.8...v1.27.0-beta.9) (2024-10-24)
+
+
+### Features
+
+* add model integration gpt4 ([51c55eb](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/51c55eb3a2984ba60572edbcdea4c30620e18d76))
+
+## [1.27.0-beta.8](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.7...v1.27.0-beta.8) (2024-10-24)
+
+
+### Bug Fixes
+
+* removed tokenizer ([a184716](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/a18471688f0b79f06fb7078b01b68eeddc88eae4))
+
+
+### CI
+
+* **release:** 1.26.7 [skip ci] ([ec9ef2b](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/ec9ef2bcda9aa81f66b943829fcdb22fe265976e))
+
+## [1.27.0-beta.7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.6...v1.27.0-beta.7) (2024-10-24)
+
+
+### Features
+
+* refactoring of get_probable_tags node ([f658092](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/f658092dffb20ea111cc00950f617057482788f4))
+
+## [1.27.0-beta.6](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.5...v1.27.0-beta.6) (2024-10-23)
+
+
+### Features
+
+* add integration with scrape.do ([ae275ec](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/ae275ec5e86c0bb8fdbeadc2e5f69816d1dea635))
+
+## [1.27.0-beta.5](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.4...v1.27.0-beta.5) (2024-10-22)
+
+
+### Features
+
+* refactoring of export functions ([0ea00c0](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/0ea00c078f2811f0d1b356bd84cafde80763c703))
+
+## [1.27.0-beta.4](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.3...v1.27.0-beta.4) (2024-10-21)
+
+
+### Features
+
+* refactoring of ScrapeGraph to SmartScraperLiteGraph ([52b6bf5](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/52b6bf5fb8c570aa8ef026916230c5d52996f887))
+
+## [1.27.0-beta.3](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.2...v1.27.0-beta.3) (2024-10-20)
+
+
+### Features
+
+* implement ScrapeGraph class for only web scraping automation ([612c644](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/612c644623fa6f4fe77a64a5f1a6a4d6cd5f4254))
+* Implement SmartScraperMultiParseMergeFirstGraph class that scrapes a list of URLs and merge the content first and finally generates answers to a given prompt. ([3e3e1b2](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/3e3e1b2f3ae8ed803d03b3b44b199e139baa68d4))
+=======
 ## [1.26.7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.6...v1.26.7) (2024-10-19)
 
 
 ### Bug Fixes
 
+* fix the example variable name ([69ff649](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/69ff6495564a5c670b89c0f802ebb1602f0e7cfa))
+
+
+### chore
+
+* fix example ([9cd9a87](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/9cd9a874f91bbbb2990444818e8ab2d0855cc361))
+
+
+### Test
+
+* Add scrape_graph test ([cdb3c11](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/cdb3c1100ee1117afedbc70437317acaf7c7c1d3))
+* Add smart_scraper_multi_parse_merge_first_graph test ([464b8b0](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/464b8b04ea0d51280849173d5eda92d4d4db8612))
+
+## [1.27.0-beta.2](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.27.0-beta.1...v1.27.0-beta.2) (2024-10-18)
+
+
+### Bug Fixes
+
+* refactoring of gpt2 tokenizer ([44c3f9c](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/44c3f9c98939c44caa86dc582242819a7c6a0f80))
+
+
+### CI
+
+* **release:** 1.26.6 [skip ci] ([a4634c7](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/a4634c73312b5c08581a8d670d53b7eebe8dadc1))
+
+## [1.27.0-beta.1](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.6-beta.1...v1.27.0-beta.1) (2024-10-16)
+
+
+### Features
+
+* add conditional node structure to the smart_scraper_graph and implemented a structured way to check condition ([cacd9cd](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/cacd9cde004dace1a7dcc27981245632a78b95f3))
+
+
 * removed tokenizer ([a184716](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/a18471688f0b79f06fb7078b01b68eeddc88eae4))
 
 ## [1.26.6](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.5...v1.26.6) (2024-10-18)
 
+## [1.26.6-beta.1](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.5...v1.26.6-beta.1) (2024-10-14)
 
 ### Bug Fixes
 
+* remove variable "max_result" not being used in the code ([e76a68a](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/e76a68a782e5bce48d421cb620d0b7bffa412918))
+
 * refactoring of gpt2 tokenizer ([44c3f9c](https://github.com/ScrapeGraphAI/Scrapegraph-ai/commit/44c3f9c98939c44caa86dc582242819a7c6a0f80))
 
 ## [1.26.5](https://github.com/ScrapeGraphAI/Scrapegraph-ai/compare/v1.26.4...v1.26.5) (2024-10-13)
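Reviewer note: the 1.27.0-beta.3 changelog entry describes SmartScraperMultiParseMergeFirstGraph as scraping a list of URLs, merging the content first, and only then answering the prompt. The ordering can be sketched in plain Python as below. This is not the library's implementation: `merge_documents`, `merge_first_pipeline`, `FAKE_PAGES`, and `fake_answer` are hypothetical names, and the fetcher and the answer step are offline stand-ins for a real scraper and LLM call.

```python
from typing import Callable, Iterable


def merge_documents(docs: Iterable[tuple[str, str]]) -> str:
    """Join (url, text) pairs into one document, labelling each source."""
    return "\n\n".join(f"[source: {url}]\n{text}" for url, text in docs)


def merge_first_pipeline(
    prompt: str,
    urls: list[str],
    fetch_text: Callable[[str], str],
    answer_prompt: Callable[[str, str], str],
) -> str:
    """Fetch every URL, merge the texts, then ask the question once."""
    docs = [(url, fetch_text(url)) for url in urls]
    merged = merge_documents(docs)
    return answer_prompt(prompt, merged)


# Hypothetical stand-ins so the sketch runs without network or API access.
FAKE_PAGES = {
    "https://perinim.github.io/": "home page text",
    "https://perinim.github.io/cv/": "cv page text",
}
calls: list[str] = []


def fake_answer(prompt: str, context: str) -> str:
    # Record the context to show the answer step runs once, on merged text.
    calls.append(context)
    return f"{prompt} -> {len(context)} characters of context"


answer = merge_first_pipeline(
    "Who is Marco Perini?",
    list(FAKE_PAGES),
    FAKE_PAGES.__getitem__,
    fake_answer,
)
print(answer)
```

The point of the sketch is the call count: the answer step is invoked once over the merged text rather than once per URL.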
@@ -0,0 +1,32 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import os
+import json
+from dotenv import load_dotenv
+from scrapegraphai.graphs import SmartScraperLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.getenv("ANTHROPIC_API_KEY"),
+        "model": "anthropic/claude-3-haiku-20240307",
+    },
+    "verbose": True,
+    "headless": False,
+}
+
+smart_scraper_lite_graph = SmartScraperLiteGraph(
+    prompt="Who is Marco Perini?",
+    source="https://perinim.github.io/",
+    config=graph_config
+)
+
+result = smart_scraper_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
+
@@ -0,0 +1,35 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import os
+import json
+from dotenv import load_dotenv
+from scrapegraphai.graphs import SmartScraperMultiLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.getenv("ANTHROPIC_API_KEY"),
+        "model": "anthropic/claude-3-haiku-20240307",
+    },
+    "verbose": True,
+    "headless": False,
+}
+
+smart_scraper_multi_lite_graph = SmartScraperMultiLiteGraph(
+    prompt="Who is Marco Perini?",
+    source= [
+        "https://perinim.github.io/",
+        "https://perinim.github.io/cv/"
+    ],
+    config=graph_config
+)
+
+result = smart_scraper_multi_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_multi_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
+
@@ -0,0 +1,31 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import os
+import json
+from dotenv import load_dotenv
+from scrapegraphai.graphs import SmartScraperLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.environ["AZURE_OPENAI_KEY"],
+        "model": "azure_openai/gpt-4o"
+    },
+    "verbose": True,
+    "headless": False
+}
+
+smart_scraper_lite_graph = SmartScraperLiteGraph(
+    prompt="Who is Marco Perini?",
+    source="https://perinim.github.io/",
+    config=graph_config
+)
+
+result = smart_scraper_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
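Reviewer note: the examples in this commit read API keys in two ways. `os.getenv("...")` returns `None` when the variable is missing, while `os.environ["..."]` raises `KeyError`. One possible way to make the behavior uniform is a small helper that fails early with a readable message. The sketch below is an assumption, not part of the commit: `require_env` is a hypothetical helper, and the placeholder key is set only so the snippet runs on its own.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Placeholder value for demonstration; a real key would come from the
# environment or a .env file loaded with load_dotenv().
os.environ["AZURE_OPENAI_KEY"] = "placeholder-key"

graph_config = {
    "llm": {
        "api_key": require_env("AZURE_OPENAI_KEY"),
        "model": "azure_openai/gpt-4o",
    },
    "verbose": True,
    "headless": False,
}
```

With this shape a missing key surfaces when the config is built, rather than later as a `None` value handed to the provider client.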
@@ -0,0 +1,35 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import os
+import json
+from dotenv import load_dotenv
+from scrapegraphai.graphs import SmartScraperMultiLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.environ["AZURE_OPENAI_KEY"],
+        "model": "azure_openai/gpt-4o"
+    },
+    "verbose": True,
+    "headless": False
+}
+
+smart_scraper_multi_lite_graph = SmartScraperMultiLiteGraph(
+    prompt="Who is Marco Perini?",
+    source= [
+        "https://perinim.github.io/",
+        "https://perinim.github.io/cv/"
+    ],
+    config=graph_config
+)
+
+result = smart_scraper_multi_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_multi_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
+
@@ -0,0 +1,26 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import json
+from scrapegraphai.graphs import SmartScraperLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+graph_config = {
+    "llm": {
+        "client": "client_name",
+        "model": "bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
+        "temperature": 0.0
+    }
+}
+
+smart_scraper_lite_graph = SmartScraperLiteGraph(
+    prompt="Who is Marco Perini?",
+    source="https://perinim.github.io/",
+    config=graph_config
+)
+
+result = smart_scraper_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
@@ -0,0 +1,29 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import json
+from scrapegraphai.graphs import SmartScraperMultiLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+graph_config = {
+    "llm": {
+        "client": "client_name",
+        "model": "bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
+        "temperature": 0.0
+    }
+}
+
+smart_scraper_multi_lite_graph = SmartScraperMultiLiteGraph(
+    prompt="Who is Marco Perini?",
+    source= [
+        "https://perinim.github.io/",
+        "https://perinim.github.io/cv/"
+    ],
+    config=graph_config
+)
+
+result = smart_scraper_multi_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_multi_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
@@ -0,0 +1,31 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import os
+import json
+from dotenv import load_dotenv
+from scrapegraphai.graphs import SmartScraperLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.getenv("DEEPSEEK_API_KEY"),
+        "model": "deepseek/deepseek-coder-33b-instruct",
+    },
+    "verbose": True,
+    "headless": False,
+}
+
+smart_scraper_lite_graph = SmartScraperLiteGraph(
+    prompt="Who is Marco Perini?",
+    source="https://perinim.github.io/",
+    config=graph_config
+)
+
+result = smart_scraper_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
@@ -0,0 +1,35 @@
+"""
+Basic example of scraping pipeline using SmartScraper
+"""
+import os
+import json
+from dotenv import load_dotenv
+from scrapegraphai.graphs import SmartScraperMultiLiteGraph
+from scrapegraphai.utils import prettify_exec_info
+
+load_dotenv()
+
+graph_config = {
+    "llm": {
+        "api_key": os.getenv("DEEPSEEK_API_KEY"),
+        "model": "deepseek/deepseek-coder-33b-instruct",
+    },
+    "verbose": True,
+    "headless": False,
+}
+
+smart_scraper_multi_lite_graph = SmartScraperMultiLiteGraph(
+    prompt="Who is Marco Perini?",
+    source= [
+        "https://perinim.github.io/",
+        "https://perinim.github.io/cv/"
+    ],
+    config=graph_config
+)
+
+result = smart_scraper_multi_lite_graph.run()
+print(json.dumps(result, indent=4))
+
+graph_exec_info = smart_scraper_multi_lite_graph.get_execution_info()
+print(prettify_exec_info(graph_exec_info))
+
