
Commit 59b3075

author
smudap
committed
eighth term of office
1 parent 74df84f commit 59b3075

File tree

102 files changed

+684
-522
lines changed


INSTRUCTION.Rmd

+58-47
@@ -7,7 +7,7 @@ output: pdf_document
77
# Description
88

99
**sejmRP** package enables scraping data from Polish Diet's webpage [sejm.gov.pl](http://www.sejm.gov.pl/)
10-
about votings and deputies in actual term of office. All data is storaged in database.
10+
about votings and deputies from the seventh to the eighth term of office. All data is stored in a database.
1111

1212
# Installation
1313

@@ -41,6 +41,7 @@ install.packages("sejmRP")
4141
* statements_update_table
4242
* statements_get_statement
4343
* statements_get_statements_data
44+
* statements_get_statements_table
4445

4546
# Access functions
4647

@@ -69,79 +70,79 @@ After that we will get four tables:
6970

7071
1. *deputies* with columns:
7172
1) *id_deputy* - deputy's id,
72-
2) *surname_name* - deputy's names and surnames,
73+
2) *nr_term_of_office* - Polish Diet's number of term of office,
74+
3) *surname_name* - deputy's names and surnames,
7375
2. *votings* with columns:
7476
1) *id_voting* - voting's id,
75-
2) *nr_meeting* - meeting's number,
76-
3) *date_meeting* - meeting's date,
77-
4) *nr_voting* - voting's number,
78-
5) *topic_voting* - voting's topic,
79-
6) *link_results* - link with voting's results,
77+
2) *nr_term_of_office* - Polish Diet's number of term of office,
78+
3) *nr_meeting* - meeting's number,
79+
4) *date_meeting* - meeting's date,
80+
5) *nr_voting* - voting's number,
81+
6) *topic_voting* - voting's topic,
82+
7) *link_results* - link with voting's results,
8083
3. *votes* with columns:
8184
1) *id_vote* - vote's id,
82-
2) *id_deputy* - deputy's id,
83-
3) *id_voting* - voting's id,
84-
4) *vote* - deputy's vote, one of: "Za","Przeciw","Wstrzymał się","Nieobecny",
85-
5) *club* - deputy's club,
85+
2) *nr_term_of_office* - Polish Diet's number of term of office,
86+
3) *id_deputy* - deputy's id,
87+
4) *id_voting* - voting's id,
88+
5) *vote* - deputy's vote, one of: "Za","Przeciw","Wstrzymał się","Nieobecny",
89+
6) *club* - deputy's club,
8690
4. *statements* with columns:
8791
1) *id_statement* - statement's id,
88-
2) *surname_name* - author of statement,
89-
3) *date_statement* - statement's date,
90-
4) *titles_order_points* - title of order points,
91-
5) *statement* - content of statement.
92+
2) *nr_term_of_office* - Polish Diet's number of term of office,
93+
3) *surname_name* - author of statement,
94+
4) *date_statement* - statement's date,
95+
5) *titles_order_points* - title of order points,
96+
6) *statement* - content of statement.
9297

9398
\newpage
9499
Now when we have empty database, we can start completing it with data. First of all we complete the *deputies* table.
95100
To do that we use:
96101
```{r, eval = FALSE}
97-
deputies_create_table(dbname, user, password, host)
102+
deputies_create_table(dbname, user, password, host, nr_term_of_office = 8)
98103
```
99104
This function scrapes active and inactive deputies' data from the Polish Diet's webpage and puts it into the table.
105+
If you want to create the *deputies* table for the seventh term of office, choose *nr_term_of_office = 7*.
100106
If you want only the deputies' data, you can use:
101107

102108
```{r, eval = FALSE}
103-
deputies_get_data(type)
109+
deputies_get_data(type, nr_term_of_office = 8)
104110
```
105-
where you can choose between active and inactive deputies.
111+
where you can choose between active and inactive deputies from the eighth term of office.
106112
To update *deputies* table use:
107113
```{r, eval = FALSE}
108-
deputies_update_table(dbname, user, password, host)
114+
deputies_update_table(dbname, user, password, host, nr_term_of_office = 8)
109115
```
110-
116+
Remember to choose a proper value of the *nr_term_of_office* argument.
111117
After that we should complete *votings* table with:
112118
```{r, eval = FALSE}
113-
votings_create_table(dbname, user, password, host,
114-
home_page = "http://www.sejm.gov.pl/Sejm7.nsf/",
115-
page = "http://www.sejm.gov.pl/Sejm7.nsf/agent.xsp?symbol=posglos&NrKadencji=7")
119+
votings_create_table(dbname, user, password, host, nr_term_of_office = 8)
116120
```
117-
This function scraps all information about votings from [http://www.sejm.gov.pl/Sejm7.nsf/agent.xsp?symbol=posglos&NrKadencji=7](http://www.sejm.gov.pl/Sejm7.nsf/agent.xsp?symbol=posglos&NrKadencji=7). If you want update *votings* table, try:
121+
This function scrapes all information about votings from [http://www.sejm.gov.pl/Sejm7.nsf/agent.xsp?symbol=posglos&NrKadencji=7](http://www.sejm.gov.pl/Sejm7.nsf/agent.xsp?symbol=posglos&NrKadencji=7) for *nr_term_of_office = 7* and from [http://www.sejm.gov.pl/Sejm8.nsf/agent.xsp?symbol=posglos&NrKadencji=8](http://www.sejm.gov.pl/Sejm8.nsf/agent.xsp?symbol=posglos&NrKadencji=8) for *nr_term_of_office = 8*. If you want to update the *votings* table, try:
118122
```{r, eval = FALSE}
119-
votings_update_table(dbname, user, password, host,
120-
home_page = "http://www.sejm.gov.pl/Sejm7.nsf/",
121-
page = "http://www.sejm.gov.pl/Sejm7.nsf/agent.xsp?symbol=posglos&NrKadencji=7")
123+
votings_update_table(dbname, user, password, host, nr_term_of_office = 8,
124+
verbose = FALSE)
122125
```
123-
124-
If you are interested in extra information, you can use additional functions:
126+
Again, remember to choose a proper value of the *nr_term_of_office* argument. The last argument, *verbose*, tells the function whether additional information should be printed. If you are interested in extra information, you can use additional functions:
125127
```{r, eval = FALSE}
126128
votings_get_meetings_table(
127-
page = "http://www.sejm.gov.pl/Sejm7.nsf/agent.xsp?symbol=posglos&NrKadencji=7")
129+
page = "http://www.sejm.gov.pl/Sejm8.nsf/agent.xsp?symbol=posglos&NrKadencji=8")
128130
votings_get_votings_table(page)
129131
```
130-
First of them enables downloading table with information about meetings during diet's term of office. The second one does the same with votings during meeting.
132+
The first of them downloads a table with information about meetings during the Diet's eighth term of office. The second does the same for votings during a meeting.
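
The two listing URLs quoted in this hunk differ only in the term number, so the *page* argument could be derived from *nr_term_of_office*. Below is a minimal sketch of that idea. The helper name `votings_page_url` is hypothetical and not part of sejmRP; the URL pattern is simply copied from the two addresses shown in this diff and is assumed to hold only for terms 7 and 8.

```r
# Hypothetical helper (not part of sejmRP): builds the votings listing URL
# for a given term of office, following the two URLs quoted in this diff.
votings_page_url <- function(nr_term_of_office) {
  # Only the seventh and eighth terms are covered by this commit.
  stopifnot(length(nr_term_of_office) == 1,
            nr_term_of_office %in% c(7, 8))
  paste0("http://www.sejm.gov.pl/Sejm", nr_term_of_office,
         ".nsf/agent.xsp?symbol=posglos&NrKadencji=", nr_term_of_office)
}

# Address of the listing page for the eighth term of office.
page <- votings_page_url(8)
```

The resulting string could then be passed as the *page* argument of `votings_get_meetings_table()`.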
131133

132134
Then we need to complete *votes* table. To do that we use:
133135
```{r, eval = FALSE}
134-
votes_create_table(dbname, user, password, host,
135-
home_page = "http://www.sejm.gov.pl/Sejm7.nsf/",
136-
windows = .Platform$OS.type == 'windows')
136+
votes_create_table(dbname, user, password, host, nr_term_of_office = 8,
137+
windows = .Platform$OS.type == 'windows')
137138
```
138139
The last argument tells the function whether you are running Windows, because of an encoding issue on that operating system.
139140

140141
To update *votes* table use:
141142
```{r, eval = FALSE}
142-
votes_update_table(dbname, user, password, host,
143-
home_page = "http://www.sejm.gov.pl/Sejm7.nsf/",
144-
windows = .Platform$OS.type == 'windows')
143+
votes_update_table(dbname, user, password, host, nr_term_of_office = 8,
144+
windows = .Platform$OS.type == "windows",
145+
verbose = FALSE)
145146
```
146147

147148
If you want to know how deputies from chosen club voted try:
@@ -152,15 +153,20 @@ As *page* argument you should put page with this club's voting's results ([examp
152153

153154
Finally we should complete *statements* table with:
154155
```{r, eval = FALSE}
155-
statements_create_table(dbname, user, password, host,
156-
home_page = "http://www.sejm.gov.pl/Sejm7.nsf/")
156+
statements_create_table(dbname, user, password, host, nr_term_of_office = 8)
157157
```
158158

159159
To update *statements* table use:
160160
```{r, eval = FALSE}
161-
statements_update_table(dbname, user, password, host,
162-
home_page = "http://www.sejm.gov.pl/Sejm7.nsf/")
161+
statements_update_table(dbname, user, password, host, nr_term_of_office = 8,
162+
verbose = FALSE)
163+
```
164+
As before, if you are interested in extra information, you can use additional functions:
165+
```{r, eval = FALSE}
166+
statements_get_statements_table(page)
167+
statements_get_statement(page, ...)
163168
```
169+
The first of them downloads a table with information about statements during a chosen meeting ([example](http://www.sejm.gov.pl/Sejm7.nsf/posiedzenie.xsp?posiedzenie=99&dzien=2)). The second gets a statement's content ([example](http://www.sejm.gov.pl/Sejm7.nsf/wypowiedz.xsp?posiedzenie=99&dzien=2&wyp=10)).
164170

165171
# How to read tables from database
166172

@@ -204,8 +210,9 @@ There is also a function:
204210
```{r, eval = FALSE}
205211
get_filtered_votes(dbname = 'sejmrp', user = 'reader', password = 'qux94874',
206212
host = 'services.mini.pw.edu.pl', windows = .Platform$OS.type == 'windows',
207-
clubs = character(0), dates = character(0), meetings = integer(0),
208-
votings = integer(0), deputies = character(0), topics = character(0))
213+
clubs = character(0), dates = character(0), terms_of_office = integer(0),
214+
meetings = integer(0), votings = integer(0), deputies = character(0),
215+
topics = character(0))
209216
```
210217
that retrieves joined *deputies*, *votes* and *votings* tables
211218
with filtered data. As you can see, there are several possible filters:
@@ -216,20 +223,24 @@ like for example: "PO", "PiS", "SLD". It is possible to choose more than one clu
216223
in date format "YYYY-MM-DD", where the first describes left boundary of period and
217224
the second right boundary. It is possible to choose only one day, just try the same
218225
date as first and second element of vector.
219-
3. *meetings* - range of meetings' numbers. This filter is a integer vector with two
226+
3. *terms_of_office* - range of terms of office's numbers. This filter is an integer
227+
vector with two elements, where the first describes a left boundary of range
228+
and the second a right boundary. It is possible to choose only one term of office,
229+
just try the same number as first and second element of vector.
230+
4. *meetings* - range of meetings' numbers. This filter is an integer vector with two
220231
elements, where the first describes a left boundary of range and the second a right
221232
boundary. It is possible to choose only one meeting, just try the same number
222233
as first and second element of vector.
223-
4. *votings* - range of votings' numbers. This filter is a integer vector with two
234+
5. *votings* - range of votings' numbers. This filter is an integer vector with two
224235
elements, where the first describes a left boundary of range and the second a right
225236
boundary. It is possible to choose only one voting, just try the same number
226237
as first and second element of vector.
227-
5. *deputies* - full names of deputies. This filter is a character vector with full
238+
6. *deputies* - full names of deputies. This filter is a character vector with full
228239
names of deputies in format: "surname first_name second_name". If you are not sure
229240
if the deputy you were thinking about has second name, try "surname first_name" or
230241
just "surname". There is high probability that proper deputy will be chosen.
231242
It is possible to choose more than one deputy.
232-
6. *topics* - text patterns. This filter is a character vector with text patterns of
243+
7. *topics* - text patterns. This filter is a character vector with text patterns of
233244
topics that you are interested about. Note that the votings' topics are written like
234245
sentences, so remember about case inflection of nouns and adjectives and use stems of
235246
words as patterns. For example if you want to find votings about education (in Polish:

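The range-style filters described in the hunk above (*terms_of_office*, *meetings*, *votings*) each take a two-element integer vector, and a single value is selected by repeating it. A minimal sketch of that convention follows. The helper `as_range_filter` is hypothetical and not part of sejmRP; it only illustrates how such a vector could be prepared.

```r
# Hypothetical helper (not part of sejmRP): turns one or two numbers into the
# two-element integer range that the range-style filters expect.
as_range_filter <- function(x) {
  x <- as.integer(x)
  stopifnot(length(x) %in% c(1L, 2L), !anyNA(x))
  # A single value becomes a range whose boundaries coincide.
  if (length(x) == 1L) x <- c(x, x)
  # The left boundary must not exceed the right boundary.
  stopifnot(x[1] <= x[2])
  x
}

only_eighth <- as_range_filter(8)        # only the eighth term of office
both_terms  <- as_range_filter(c(7, 8))  # seventh through eighth term
```

Such a vector could then be supplied as, for example, `terms_of_office = only_eighth` in a call to `get_filtered_votes()`.
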
INSTRUCTION.pdf

4.3 KB
Binary file not shown.

MANUAL.pdf

68.9 KB
Binary file not shown.

all_statements.rda

-21.7 MB
Binary file not shown.

all_votes.rda

-4.83 MB
Binary file not shown.

cron/update_data.R

+5-5
@@ -6,17 +6,17 @@ args <- commandArgs(TRUE)
66

77
#connecting to database
88
drv <- dbDriver("PostgreSQL")
9-
database_diet <- dbConnect(drv,dbname = dbname,user = user,password = password,host = host)
9+
database_diet <- dbConnect(drv, dbname = dbname, user = user, password = password, host = host)
1010

1111
#updating deputies table
12-
tryCatch(deputies_update_table(dbname,user,password,host),
12+
tryCatch(deputies_update_table(dbname = dbname, user = user, password = password, host = host, nr_term_of_office = 8),
1313
error = function(err){
1414
suppressWarnings(dbDisconnect(database_diet))
1515
stop("Error during updating deputies table")
1616
})
1717

1818
#updating votings table
19-
tryCatch(votings_update_table(dbname,user,password,host, verbose=TRUE),
19+
tryCatch(votings_update_table(dbname = dbname, user = user, password = password, host = host, nr_term_of_office = 8, verbose = TRUE),
2020
error = function(err){
2121
suppressWarnings(dbDisconnect(database_diet))
2222
#removing a flag file if error occured
@@ -25,14 +25,14 @@ tryCatch(votings_update_table(dbname,user,password,host, verbose=TRUE),
2525
})
2626

2727
#updating votes table
28-
tryCatch(votes_update_table(dbname,user,password,host, verbose=TRUE),
28+
tryCatch(votes_update_table(dbname = dbname, user = user, password = password, host = host, nr_term_of_office = 8, verbose = TRUE),
2929
error = function(err){
3030
suppressWarnings(dbDisconnect(database_diet))
3131
stop("Error during updating votes table")
3232
})
3333

3434
#updating statements table
35-
tryCatch(statements_update_table(dbname,user,password,host, verbose=TRUE),
35+
tryCatch(statements_update_table(dbname = dbname, user = user, password = password, host = host, nr_term_of_office = 8, verbose = TRUE),
3636
error = function(err){
3737
suppressWarnings(dbDisconnect(database_diet))
3838
stop("Error during updating statements table")

deputies.rda

-5.97 KB
Binary file not shown.

instalacja komenda.txt

-1
This file was deleted.

sejmRP/DESCRIPTION

+17-13
@@ -1,24 +1,28 @@
11
Package: sejmRP
2-
Title: An Information About Deputies and Votings in Polish Diet
3-
Version: 1.2
4-
Date: 2015-10-05
2+
Title: An Information About Deputies and Votings in Polish Diet from seventh to eighth term of office
3+
Version: 1.3
4+
Date: 2016-01-03
55
Authors@R: c(
66
person("Piotr","Smuda",email="piotrsmuda@gmail.com",role=c("aut","cre")),
77
person("Przemyslaw","Biecek",email="przemyslaw.biecek@gmail.com",role="aut"),
88
person("Tomasz","Mikolajczyk",email="t.mikolajczyk@gmail.com",role="ctb"))
99
Maintainer: Piotr Smuda <piotrsmuda@gmail.com>
10-
Description: Set of functions that access information about deputies and votings in Polish diet from webpage http://www.sejm.gov.pl/.
11-
The package was developed as a result of an internship in MI2 Group - http://mi2.mini.pw.edu.pl/, Faculty of Mathematics and Information Science, Warsaw University of Technology.
10+
Description: Set of functions that access information about deputies and votings
11+
in Polish diet from webpage http://www.sejm.gov.pl/. The package was developed
12+
as a result of an internship in MI2 Group - http://mi2.mini.pw.edu.pl/,
13+
Faculty of Mathematics and Information Science, Warsaw University of Technology.
1214
BugReports: http://github.com/mi2-warsaw/sejmRP/issues
13-
Depends: R (>= 3.1.0)
15+
Depends:
16+
R (>= 3.1.0)
1417
License: GPL-2
1518
LazyLoad: true
1619
LazyData: true
1720
Imports:
18-
DBI,
19-
dplyr,
20-
RPostgreSQL,
21-
rvest,
22-
stringi,
23-
XML,
24-
xml2
21+
DBI,
22+
dplyr,
23+
RPostgreSQL,
24+
rvest,
25+
stringi,
26+
XML,
27+
xml2
28+
RoxygenNote: 5.0.1

sejmRP/NAMESPACE

+1-1
@@ -1,4 +1,4 @@
1-
# Generated by roxygen2 (4.1.1): do not edit by hand
1+
# Generated by roxygen2: do not edit by hand
22

33
export(create_database)
44
export(deputies_add_new)

sejmRP/R/create_database.R

+38-28
@@ -8,28 +8,32 @@
88
#' Created tables:
99
#' 1. deputies with columns:
1010
#' 1) id_deputy - deputy's id,
11-
#' 2) surname_name - deputy's names and surnames,
11+
#' 2) nr_term_of_office - Polish Diet's number of term of office,
12+
#' 3) surname_name - deputy's names and surnames,
1213
#' 2. votings with columns:
1314
#' 1) id_voting - voting's id,
14-
#' 2) nr_meeting - meeting's number,
15-
#' 3) date_meeting - meeting's date,
16-
#' 4) nr_voting - voting's number,
17-
#' 5) topic_voting - voting's topic,
18-
#' 6) link_results - link with voting's results,
15+
#' 2) nr_term_of_office - Polish Diet's number of term of office,
16+
#' 3) nr_meeting - meeting's number,
17+
#' 4) date_meeting - meeting's date,
18+
#' 5) nr_voting - voting's number,
19+
#' 6) topic_voting - voting's topic,
20+
#' 7) link_results - link with voting's results,
1921
#' 3. votes with columns:
2022
#' 1) id_vote - vote's id,
21-
#' 2) id_deputy - deputy's id,
22-
#' 3) id_voting - voting's id,
23-
#' 4) vote - deputy's vote, one of: 'Za','Przeciw',
23+
#' 2) nr_term_of_office - Polish Diet's number of term of office,
24+
#' 3) id_deputy - deputy's id,
25+
#' 4) id_voting - voting's id,
26+
#' 5) vote - deputy's vote, one of: 'Za','Przeciw',
2427
#' 'Wstrzymal sie','Nieobecny',
25-
#' 5) club - deputy's club,
28+
#' 6) club - deputy's club,
2629
#' 4. statements with columns:
2730
#' 1) id_statement - statement's id, like:
2831
#' (meeting's number).(voting's number).(statement's number),
29-
#' 2) surname_name - author of statement,
30-
#' 3) date_statement - statement's date,
31-
#' 4) titles_order_points - title of order points,
32-
#' 5) statement - content of statement.}
32+
#' 2) nr_term_of_office - Polish Diet's number of term of office,
33+
#' 3) surname_name - author of statement,
34+
#' 4) date_statement - statement's date,
35+
#' 5) titles_order_points - title of order points,
36+
#' 6) statement - content of statement.}
3337
#'
3438
#' @usage create_database(dbname, user, password, host)
3539
#'
@@ -64,26 +68,32 @@ create_database <- function(dbname, user, password, host) {
6468
database_diet <- dbConnect(drv, dbname = dbname, user = user, password = password, host = host)
6569

6670
# creating table with deputies data
67-
dbSendQuery(database_diet, "CREATE TABLE deputies (id_deputy varchar(4) NOT NULL PRIMARY KEY,
68-
surname_name varchar(50) NOT NULL,
69-
CONSTRAINT uq_surname_name UNIQUE (surname_name))")
71+
dbSendQuery(database_diet, "CREATE TABLE deputies (id_deputy varchar(4) NOT NULL,
72+
nr_term_of_office int NOT NULL, surname_name varchar(50) NOT NULL,
73+
PRIMARY KEY (id_deputy, nr_term_of_office),
74+
CONSTRAINT uq_surname_name UNIQUE (nr_term_of_office, surname_name))")
7075

7176
# creating table with voting data
72-
dbSendQuery(database_diet, "CREATE TABLE votings (id_voting int NOT NULL PRIMARY KEY,
73-
nr_meeting int NOT NULL, date_meeting date NOT NULL,
74-
nr_voting int NOT NULL, topic_voting text NOT NULL,
75-
link_results varchar(200))")
77+
dbSendQuery(database_diet, "CREATE TABLE votings (id_voting int NOT NULL,
78+
nr_term_of_office int NOT NULL, nr_meeting int NOT NULL,
79+
date_meeting date NOT NULL, nr_voting int NOT NULL,
80+
topic_voting text NOT NULL, link_results varchar(200),
81+
PRIMARY KEY (id_voting, nr_term_of_office))")
7682

7783
# creating table with votes data
78-
dbSendQuery(database_diet, "CREATE TABLE votes (id_vote int NOT NULL PRIMARY KEY,
79-
id_deputy varchar(4) NOT NULL, id_voting int NOT NULL, vote varchar(20) NOT NULL,
80-
club varchar(50), FOREIGN KEY (id_deputy) REFERENCES deputies(id_deputy),
81-
FOREIGN KEY (id_voting) REFERENCES votings(id_voting))")
84+
dbSendQuery(database_diet, "CREATE TABLE votes (id_vote int NOT NULL,
85+
nr_term_of_office int NOT NULL, id_deputy varchar(4) NOT NULL,
86+
id_voting int NOT NULL, vote varchar(20) NOT NULL, club varchar(50),
87+
PRIMARY KEY (id_vote, nr_term_of_office),
88+
FOREIGN KEY (id_deputy, nr_term_of_office) REFERENCES deputies(id_deputy, nr_term_of_office),
89+
FOREIGN KEY (id_voting, nr_term_of_office) REFERENCES votings(id_voting, nr_term_of_office))")
8290

8391
# creating table with statements data
84-
dbSendQuery(database_diet, "CREATE TABLE statements (id_statement varchar(11) NOT NULL PRIMARY KEY,
85-
surname_name varchar(100) NOT NULL, date_statement date NOT NULL, titles_order_points text NOT NULL,
86-
statement text NOT NULL)")
92+
dbSendQuery(database_diet, "CREATE TABLE statements (id_statement varchar(11) NOT NULL,
93+
nr_term_of_office int NOT NULL, surname_name varchar(100) NOT NULL,
94+
date_statement date NOT NULL, titles_order_points text NOT NULL,
95+
statement text NOT NULL,
96+
PRIMARY KEY (id_statement, nr_term_of_office))")
8797

8898
# creating table with counter data
8999
dbSendQuery(database_diet, "CREATE TABLE counter (id SERIAL PRIMARY KEY, what varchar(10) NOT NULL,

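The schema change in this file replaces single-column primary keys with composite keys that include *nr_term_of_office*, and widens the *deputies* uniqueness constraint to the pair (*nr_term_of_office*, *surname_name*). The sketch below illustrates why, without touching a database. The toy rows are invented for illustration and assume that a deputy id or a name may recur across terms.

```r
# Toy data, invented for illustration: one id and one name recur across terms.
deputies <- data.frame(
  id_deputy = c("001", "001", "002"),
  nr_term_of_office = c(7L, 8L, 8L),
  surname_name = c("Kowalski Jan", "Nowak Anna", "Kowalski Jan"),
  stringsAsFactors = FALSE
)

# Old schema: id_deputy alone had to be unique. These rows would violate it.
old_key_ok <- !any(duplicated(deputies$id_deputy))

# New schema: the pair (id_deputy, nr_term_of_office) must be unique.
new_key_ok <- !any(duplicated(deputies[, c("id_deputy", "nr_term_of_office")]))

# New UNIQUE constraint: (nr_term_of_office, surname_name) must be unique.
name_ok <- !any(duplicated(deputies[, c("nr_term_of_office", "surname_name")]))
```

Here `old_key_ok` is `FALSE` while `new_key_ok` and `name_ok` are `TRUE`, which is the situation the composite keys are meant to allow.
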